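I currently don't understand why it is so important to use the mbstring functions in PHP when dealing with UTF-8.
My locale on Linux is already set to UTF-8, so why don't functions like strlen, preg_replace and so on work properly by default?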
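Regardless of your operating system's locale, none of PHP's standard
string functions handle multibyte strings: they treat a string as a plain sequence of bytes. That is why you need to use the multibyte string functions.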
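To make the byte/character difference concrete, here is a minimal sketch. It assumes the mbstring extension is loaded; the sample string and variable names are just for illustration, and the bytes are written with `\x` escapes so the result does not depend on how the file itself is saved.

```php
<?php
// "héllo": the 'é' (U+00E9) is encoded in UTF-8 as the two bytes C3 A9.
$s = "h\xC3\xA9llo";

// strlen() counts bytes, whatever the locale is set to.
$bytes = strlen($s);

// mb_strlen() counts characters in the encoding it is told to use.
$chars = mb_strlen($s, 'UTF-8');

echo $bytes, "\n"; // 6
echo $chars, "\n"; // 5
```

The locale setting does not change either result, because `strlen` never looks at it.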
From the Multibyte String Introduction:
When you manipulate (trim, split, splice, etc.) strings encoded in a multibyte encoding, you need to use special functions since two or more consecutive bytes may represent a single character in such encoding schemes. Otherwise, if you apply a non-multibyte-aware string function to the string, it probably fails to detect the beginning or ending of the multibyte character and ends up with a corrupted garbage string that most likely loses its original meaning.
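The corruption described in that passage is easy to reproduce. The following is a rough sketch, again assuming mbstring is loaded; the sample string is made up for illustration. It also touches on the preg functions from the question: as far as PCRE is concerned, a pattern only treats the subject as UTF-8 when the `u` modifier is set.

```php
<?php
// "naïve": the 'ï' (U+00EF) is encoded in UTF-8 as the two bytes C3 AF.
$s = "na\xC3\xAFve";

// substr() cuts at byte offsets, so taking 3 bytes splits the 'ï' in half.
$broken = substr($s, 0, 3);            // "na\xC3", no longer valid UTF-8
// mb_substr() cuts at character offsets and keeps the 'ï' intact.
$clean  = mb_substr($s, 0, 3, 'UTF-8'); // "naï"

$brokenIsValid = mb_check_encoding($broken, 'UTF-8');
$cleanIsValid  = mb_check_encoding($clean, 'UTF-8');

var_dump($brokenIsValid); // bool(false)
var_dump($cleanIsValid);  // bool(true)

// Without the u modifier, '.' matches one byte; with it, one character.
$byteMatches = preg_match_all('/./', $s);
$charMatches = preg_match_all('/./u', $s);

echo $byteMatches, "\n"; // 6
echo $charMatches, "\n"; // 5
```

So for the preg functions the usual fix is the `u` modifier rather than an mb_* replacement, while for functions such as `strlen` and `substr` the mb_* counterparts are the ones to reach for.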