This is a discussion on Encoding String within the PHP Programming forums, part of the Web Development category.
PHP's Problem with Character Encoding

The basic problem PHP has with character encoding is that it has a very simple idea of what a character is: one character equals one byte. To be more precise, most of PHP's string-related functionality (see common_problem_areas_with_utf-8 for further details) makes this assumption, but to support a wide range of characters (or all characters, ever, as Unicode does), you need more than one byte to represent a character.

An example in code. In his i18n Survival Guide, Sam Ruby recommends using the string Iñtërnâtiônàlizætiøn for testing. Counting by eye, you can see it contains 20 characters:

Iñtërnâtiônàlizætiøn
12345678901234567890

But counted with PHP's strlen function...

<?php
echo strlen('Iñtërnâtiônàlizætiøn');
?>

PHP will report 27 characters. That's because the string, encoded as UTF-8, contains multi-byte characters, and strlen counts every byte as a separate character. Here 7 of the 20 characters take two bytes each, so the total is 13 + 14 = 27.

Life gets even more interesting if you run the following:

<?php
// Ask the browser to read the output as ISO-8859-1 (one byte per character)
header('Content-Type: text/plain; charset=ISO-8859-1');
$str = 'Iñtërnâtiônàlizætiøn';
$out = '';
$pos = '';
// Walk the string one byte at a time
for ($i = 0, $j = 1; $i < strlen($str); $i++, $j++) {
    $out .= $str[$i];
    // Position ruler: 1 to 9, then 0
    if ($j == 10) {
        $j = 0;
    }
    $pos .= $j;
}
echo $out . "\n" . $pos;
?>

You should see something like:

IÃ±tÃ«rnÃ¢tiÃ´nÃ lizÃ¦tiÃ¸n
123456789012345678901234567

Each two-byte UTF-8 character shows up as two ISO-8859-1 characters, so the output is 27 characters wide.
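To make the byte-versus-character difference concrete, here is a minimal sketch that counts both. The helper name utf8_char_count is hypothetical (it is not from the post above or from PHP itself), and it assumes the input is valid UTF-8 and that you are on PHP 7 or later. The test string is written with byte escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// Hypothetical helper for illustration: counts characters in a UTF-8 string
// by skipping continuation bytes, which always have the bit pattern 10xxxxxx.
// Assumes $str is valid UTF-8; malformed input is not detected.
function utf8_char_count(string $str): int
{
    $count = 0;
    $bytes = strlen($str); // strlen() returns the number of bytes
    for ($i = 0; $i < $bytes; $i++) {
        // A byte starts a new character unless its top two bits are 10.
        if ((ord($str[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

// "Iñtërnâtiônàlizætiøn" spelled out as UTF-8 bytes.
$str = "I\xC3\xB1t\xC3\xABrn\xC3\xA2ti\xC3\xB4n\xC3\xA0liz\xC3\xA6ti\xC3\xB8n";

echo strlen($str), "\n";          // 27 (bytes)
echo utf8_char_count($str), "\n"; // 20 (characters)

// If the optional mbstring extension is loaded, mb_strlen() gives the
// same character count when told the encoding explicitly.
if (function_exists('mb_strlen')) {
    echo mb_strlen($str, 'UTF-8'), "\n"; // 20
}
?>
```

The hand-rolled loop is only there to show why the two numbers differ. In real code you would more likely reach for mb_strlen with an explicit 'UTF-8' argument, provided mbstring is available on your host.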