Code fragment:
$string = "%u4E2D%u56FD%u5E7B%u60F3%u6587%u5B66%u57FA%u573";
$new_string = utf8RawUrlDecode($string);
echo "encoded string is " . $new_string;
function utf8RawUrlDecode($source) {
    $decodedStr = '';
    $pos = 0;
    $len = strlen($source);
    while ($pos < $len) {
        $charAt = substr($source, $pos, 1);
        if ($charAt == '%') {
            $pos++;
            $charAt = substr($source, $pos, 1);
            if ($charAt == 'u') {
                // we got a unicode character: %uXXXX (four hex digits)
                $pos++;
                $unicodeHexVal = substr($source, $pos, 4);
                $unicode = hexdec($unicodeHexVal);
                // emit a numeric HTML entity; it is plain ASCII, so no
                // utf8_encode() call is needed (deprecated as of PHP 8.2)
                $decodedStr .= "&#" . $unicode . ';';
                $pos += 4;
            }
            else {
                // we have an escaped ascii character: %XX (two hex digits)
                $hexVal = substr($source, $pos, 2);
                $decodedStr .= chr(hexdec($hexVal));
                $pos += 2;
            }
        }
        else {
            // ordinary character, copy it through unchanged
            $decodedStr .= $charAt;
            $pos++;
        }
    }
    return $decodedStr;
}
The function above converts an “escaped” Unicode string, for example from a query string parameter, to HTML Unicode entities (numeric character references such as &#20013;). If instead you want to get a Unicode string in UTF-8 or other encodings, the function at this URL will do the trick.
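For the UTF-8 case, here is a minimal sketch of how such a decoder might look. It is not the linked function: the names utf8RawUrlDecodeToUtf8 and codePointToUtf8 are made up for this example, and it assumes every %uXXXX escape is a single code point in the Basic Multilingual Plane (surrogate pairs are not combined).

```php
<?php
// Hypothetical helper (not from the original post): decodes %uXXXX and %XX
// escapes straight to a UTF-8 string instead of HTML entities.
function utf8RawUrlDecodeToUtf8(string $source): string
{
    // One pass over both escape forms, so decoded output is never re-decoded.
    return preg_replace_callback(
        '/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/',
        function (array $m): string {
            if ($m[1] !== '') {
                // %uXXXX: a 16-bit code point, re-encoded as UTF-8
                return codePointToUtf8(hexdec($m[1]));
            }
            // %XX: a single raw byte
            return chr(hexdec($m[2]));
        },
        $source
    );
}

// Encodes a code point below 0x10000 as 1 to 3 UTF-8 bytes.
function codePointToUtf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xE0 | ($cp >> 12))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

echo utf8RawUrlDecodeToUtf8('%u4E2D%u56FD'), "\n";
```

Handling both escape forms in a single regex pass avoids double decoding: a %u0025 that decodes to a literal % is not treated as the start of another escape.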
You’re awesome! I’ve been fighting with this for hours. I’ve been using encodeURI() in JavaScript. It doesn’t fix &. That’s really annoying. Good work.