{"id":91,"date":"2006-03-05T21:12:00","date_gmt":"2006-03-05T21:12:00","guid":{"rendered":"http:\/\/blog.trungson.com\/?p=91"},"modified":"2006-03-05T21:12:00","modified_gmt":"2006-03-05T21:12:00","slug":"convert-vietnamese-from-utf8-to-ascii","status":"publish","type":"post","link":"http:\/\/blog.trungson.com\/?p=91","title":{"rendered":"Convert Vietnamese from UTF8 to ASCII"},"content":{"rendered":"<p>If you need to convert UTF-8\/Unicode Vietnamese text with tone and accent marks (acute, grave, circumflex, tilde, dot below, hook above, and more) to plain ASCII without any diacritics, or to VIQR format, here is a PHP class that performs the conversion. Some examples:<\/p>\n<pre>\nan to\u00e0n => an toan\n\u00e1o gi\u00e1p => ao giap\nx\u00fac ph\u1ea1m => xuc pham\n<\/pre>\n<p>Source Code:<br \/>\n<textarea name=\"code\" class=\"php\">\n<?php\n\/** \n* @author Son Nguyen\n* @since 3\/3\/2006\n* @package Framework\n* @subpackage String\n*\/\nclass CStringVI {\n private $mTxt;\n private $mMap;\n \/** constructor *\/\n function __construct($pTxt) {\n  $this->mTxt = $pTxt;\n  $this->initMapping();\n }\n\n \/** from file vumaps.js in package vietuni8 *\/\n function initMapping() {\n  $this->mMap['Unicode'] = array(\n   97,226,259,101,234,105,111,244,417,117,432,121,\n   65,194,258,69,202,73,79,212,416,85,431,89,\n   225,7845,7855,233,7871,237,243,7889,7899,250,7913,253,\n   193,7844,7854,201,7870,205,211,7888,7898,218,7912,221,\n   224,7847,7857,232,7873,236,242,7891,7901,249,7915,7923,\n   192,7846,7856,200,7872,204,210,7890,7900,217,7914,7922,\n   7841,7853,7863,7865,7879,7883,7885,7897,7907,7909,7921,7925,\n   7840,7852,7862,7864,7878,7882,7884,7896,7906,7908,7920,7924,\n   7843,7849,7859,7867,7875,7881,7887,7893,7903,7911,7917,7927,\n   7842,7848,7858,7866,7874,7880,7886,7892,7902,7910,7916,7926,\n   
227,7851,7861,7869,7877,297,245,7895,7905,361,7919,7929,\n   195,7850,7860,7868,7876,296,213,7894,7904,360,7918,7928,\n   100,273,68,272\n  );\n\n  $this->mMap['ASCII'] = array(\n   'a','a','a','e','e','i','o','o','o','u','u','y',\n   'A','A','A','E','E','I','O','O','O','U','U','Y',\n   'a','a','a','e','e','i','o','o','o','u','u','y',\n   'A','A','A','E','E','I','O','O','O','U','U','Y',\n   'a','a','a','e','e','i','o','o','o','u','u','y',\n   'A','A','A','E','E','I','O','O','O','U','U','Y',\n   'a','a','a','e','e','i','o','o','o','u','u','y',\n   'A','A','A','E','E','I','O','O','O','U','U','Y',\n   
'a','a','a','e','e','i','o','o','o','u','u','y',\n   'A','A','A','E','E','I','O','O','O','U','U','Y',\n   'a','a','a','e','e','i','o','o','o','u','u','y',\n   'A','A','A','E','E','I','O','O','O','U','U','Y',\n   'd','d','D','D'\n  );\n\n  $this->mMap['VIRQ'] = array(\n   \"a\",\"a^\",\"a(\",\"e\",\"e^\",\"i\",\"o\",\"o^\",\"o+\",\"u\",\"u+\",\"y\",\n   \"A\",\"A^\",\"A(\",\"E\",\"E^\",\"I\",\"O\",\"O^\",\"O+\",\"U\",\"U+\",\"Y\",\n   \"a'\",\"a^'\",\"a('\",\"e'\",\"e^'\",\"i'\",\"o'\",\"o^'\",\"o+'\",\"u'\",\"u+'\",\"y'\",\n   \"A'\",\"A^'\",\"A('\",\"E'\",\"E^'\",\"I'\",\"O'\",\"O^'\",\"O+'\",\"U'\",\"U+'\",\"Y'\",\n   
\"a`\",\"a^`\",\"a(`\",\"e`\",\"e^`\",\"i`\",\"o`\",\"o^`\",\"o+`\",\"u`\",\"u+`\",\"y`\",\n   \"A`\",\"A^`\",\"A(`\",\"E`\",\"E^`\",\"I`\",\"O`\",\"O^`\",\"O+`\",\"U`\",\"U+`\",\"Y`\",\n   \"a.\",\"a^.\",\"a(.\",\"e.\",\"e^.\",\"i.\",\"o.\",\"o^.\",\"o+.\",\"u.\",\"u+.\",\"y.\",\n   \"A.\",\"A^.\",\"A(.\",\"E.\",\"E^.\",\"I.\",\"O.\",\"O^.\",\"O+.\",\"U.\",\"U+.\",\"Y.\",\n   \"a?\",\"a^?\",\"a(?\",\"e?\",\"e^?\",\"i?\",\"o?\",\"o^?\",\"o+?\",\"u?\",\"u+?\",\"y?\",\n   \"A?\",\"A^?\",\"A(?\",\"E?\",\"E^?\",\"I?\",\"O?\",\"O^?\",\"O+?\",\"U?\",\"U+?\",\"Y?\",\n   \"a~\",\"a^~\",\"a(~\",\"e~\",\"e^~\",\"i~\",\"o~\",\"o^~\",\"o+~\",\"u~\",\"u+~\",\"y~\",\n   \"A~\",\"A^~\",\"A(~\",\"E~\",\"E^~\",\"I~\",\"O~\",\"O^~\",\"O+~\",\"U~\",\"U+~\",\"Y~\",\n   \"d\",\"dd\",\"D\",\"DD\"\n  );\n }\n\n \/** check whether $pVar lies between $pStart and $pEnd (inclusive) *\/\n private function between($pStart,$pVar,$pEnd) {\n  return 
($pVar>=$pStart &#038;&#038; $pVar<=$pEnd);\n }\n\n \/** map from one charset to another *\/\n function map($pFrom,$pTo) {\n  $vStr = $this->mTxt;\n  $vLen = strlen($this->mTxt);\n  $vOutput = '';\n  for ($i=0;$i<$vLen;$i++) {\n   $vOrd = 0;\n   $vOrds = array();\n   for ($j=0;$j<6;$j++) {\n    \/\/if ($i+$j<$vLen) {\n    if (isset($vStr[$i+$j])) {\n     $vOrds[$j] = ord($vStr[$i+$j]);\n    } \/\/ fi\n   } \/\/ rof\n\n   \/\/ http:\/\/www1.tip.nl\/~t876506\/utf8tbl.html\n   if ($this->between(0,$vOrds[0],127)) {\n    $vOrd = $vOrds[0];\n   } elseif ($this->between(192,$vOrds[0],223)) {\n    $vOrd = ($vOrds[0]-192)*64+($vOrds[1]-128);\n    $i = $i+1;\n   } elseif ($this->between(224,$vOrds[0],239)) {\n    $vOrd = ($vOrds[0]-224)*4096+($vOrds[1]-128)*64+($vOrds[2]-128);\n    $i = $i+2;\n   } elseif ($this->between(240,$vOrds[0],247)) {\n    $vOrd = ($vOrds[0]-240)*262144+($vOrds[1]-128)*4096+($vOrds[2]-128)*64+($vOrds[3]-128);\n    $i = $i+3;\n   } elseif ($this->between(248,$vOrds[0],251)) {\n    $vOrd = ($vOrds[0]-248)*16777216+($vOrds[1]-128)*262144+($vOrds[2]-128)*4096+($vOrds[3]-128)*64+($vOrds[4]-128);\n    $i = $i+4;\n   } elseif ($this->between(252,$vOrds[0],253)) {\n    $vOrd = ($vOrds[0]-252)*1073741824+($vOrds[1]-128)*16777216+($vOrds[2]-128)*262144+($vOrds[3]-128)*4096+($vOrds[4]-128)*64+($vOrds[5]-128);\n    $i = $i+5;\n   } elseif ($this->between(254,$vOrds[0],255)) { \/\/ error\n    $vOrd = 0;\n   } \/\/ fi\n\n   if ($vOrd > 127 ) {\n    $vKey = array_search($vOrd,$this->mMap[$pFrom]);\n    \/\/ array_search() returns false for an unmapped code point; emit '?' rather than falling through to index 0 ('a')\n    $vOutput .= ($vKey !== false) ? $this->mMap[$pTo][$vKey] : '?';\n   } else {\n    $vOutput .= chr($vOrd);\n   } \/\/ fi\n  } \/\/ rof\n  return $vOutput;\n }\n\n \/** convert from UTF-8 to plain ASCII text *\/\n function uni2ascii() {\n  return 
$this->map('Unicode','ASCII');\n }\n}\n?>\n<\/textarea><\/p>\n<p>Sample Usage:<br \/>\n<textarea name=\"code\" class=\"php\">\n$vStr = new CStringVI('an to\u00e0n');\necho ($vStr->uni2ascii()); \/\/ prints out \"an toan\"\n<\/textarea><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you need to convert UTF-8\/Unicode Vietnamese text with tone and accent marks (acute, grave, circumflex, tilde, dot below, hook above, and more) to plain ASCII without any diacritics, or to VIQR format, here is a PHP class that performs the conversion. Some examples: an to\u00e0n => an toan \u00e1o gi\u00e1p => ao giap [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/blog.trungson.com\/index.php?rest_route=\/wp\/v2\/posts\/91"}],"collection":[{"href":"http:\/\/blog.trungson.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/blog.trungson.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/blog.trungson.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/blog.trungson.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=91"}],"version-history":[{"count":0,"href":"http:\/\/blog.trungson.com\/index.php?rest_route=\/wp\/v2\/posts\/91\/revisions"}],"wp:attachment":[{"href":"http:\/\/blog.trungson.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=91"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/blog.trungson.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=91"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/blog.trungson.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=91"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","t
emplated":true}]}}