What does PHP's mb_internal_encoding actually do?

According to the PHP website, it does this:

encoding is the character encoding name used for the HTTP input
character encoding conversion, HTTP output character encoding
conversion, and the default character encoding for string functions
defined by the mbstring module. You should notice that the internal
encoding is totally different from the one for multibyte regex.

Can someone explain this in simpler terms?

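1. HTTP input character encoding conversion
2. HTTP output character encoding conversion
3. The default character encoding for string functions
4. What does "the internal encoding is totally different from the one for multibyte regex" mean?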

My guess is:

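1. means GET and POST data are treated as being in that encoding.
2. means output is produced in that encoding.
3. means that encoding is used by all multibyte string functions.
4. I don't know. Why would regex be different from the normal string functions?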

If point 2 is correct, do you need to do this?

ini_set('default_charset','UTF-8');

If I understand this correctly, it means that if you do this:

mb_internal_encoding('UTF-8');

you don't need to do this:

mb_strtolower($str, 'UTF-8');

Just:

mb_strtolower($str);

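I did read in another SO post that mb_strtolower($str) should not be trusted and that you need to set the encoding for each multibyte string function. Is this true?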

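The mbstring extension added the glorious idea (</sarcasm>) of automatically converting all incoming data and all output data from one encoding to another. See mbstring HTTP Input and Output. It is configured through the mbstring.http_input ini setting and by using mb_output_handler as an output buffer callback. mb_internal_encoding influences this conversion. IMO you should turn those settings off and never touch them; I have not found any problem they solve in an elegant way, and hidden encoding conversion is a terrible idea overall. Especially so if it is all controlled through one global flag (mb_internal_encoding) that is used in several different contexts.
That covers 1. and 2.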
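
As a contrast to the hidden conversion described above, here is a minimal sketch of converting input explicitly at the point where you receive it. The input bytes and the variable names are made up for illustration, and it assumes you actually know the source encoding; it is one possible approach, not the only one.

```php
<?php
// Hypothetical input: pretend a legacy client sent these bytes as ISO-8859-1.
$rawInput = "caf\xE9"; // "café" encoded as ISO-8859-1 (4 bytes)

// Convert once, explicitly, naming both the target and the source encoding,
// so nothing depends on mb_internal_encoding or on any ini setting.
$utf8 = mb_convert_encoding($rawInput, 'UTF-8', 'ISO-8859-1');

// Sanity check that the result is valid UTF-8.
$isValid = mb_check_encoding($utf8, 'UTF-8');
```

Because both encodings are named in the call, the result does not change if some other code changes the internal encoding.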

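As for 3., yes: mb_internal_encoding basically sets the default value for all mb_ functions that accept an $encoding parameter. Essentially it just sets a global variable (internally) that the other functions read, and that is all.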
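
A small sketch of that default in action, assuming the mbstring extension is loaded and PHP 7 or later (for the "\u{...}" escapes). The sample string is arbitrary.

```php
<?php
$word = "\u{00C9}t\u{00E9}"; // "Été" as UTF-8: 5 bytes, 3 characters

mb_internal_encoding('UTF-8');
$explicit = mb_strlen($word, 'UTF-8'); // encoding passed explicitly
$implicit = mb_strlen($word);          // falls back to the internal encoding

mb_internal_encoding('ISO-8859-1');
$asLatin1 = mb_strlen($word);          // same bytes, now one character per byte

mb_internal_encoding('UTF-8');         // restore the setting for later code

var_dump($explicit, $implicit, $asLatin1); // int(3), int(3), int(5)
```

The string never changes; only the global default that mb_strlen reads does.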

The last part refers to the fact that there is a separate mb_regex_encoding function, which sets the internal encoding for the mb_ereg_ functions.
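
To illustrate that the two settings are stored separately, here is a rough sketch. It assumes mbstring was built with multibyte regex support (otherwise mb_regex_encoding and mb_ereg do not exist).

```php
<?php
mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');

// Changing the internal encoding at runtime leaves the regex encoding alone.
mb_internal_encoding('ISO-8859-1');

$internal = mb_internal_encoding(); // read back the current internal encoding
$regex    = mb_regex_encoding();    // read back the current regex encoding

// mb_ereg uses the regex encoding, so "." still matches one UTF-8 character.
$matched = (bool) mb_ereg('^.$', "\u{00E9}");

mb_internal_encoding('UTF-8');      // restore the setting for later code

var_dump($internal, $regex, $matched);
```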

I did read in another SO post that mb_strtolower($str) should not be trusted and that you need to set the encoding for each multibyte string function. Is this true?

I'd agree with that insofar as all global state is untrustworthy. This is pretty trustworthy:

mb_internal_encoding('UTF-8');
mb_strtolower($string);

This, however, is not really:

mb_strtolower($string);

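See the difference? If you depend on global state being set correctly somewhere else, you can never be sure it actually is correct. All it takes is a call to some third-party library that sets mb_internal_encoding to something else without your knowledge, and your mb_strtolower call will suddenly behave quite differently.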
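
Here is a sketch of that failure mode. The function thirdPartyHelper is a made-up stand-in for library code you do not control; everything else is plain mbstring.

```php
<?php
// Hypothetical stand-in for a third-party library that changes global state.
function thirdPartyHelper(): void
{
    mb_internal_encoding('ISO-8859-1');
}

$string = "\u{00C9}COLE"; // "ÉCOLE" as UTF-8

mb_internal_encoding('UTF-8');
$before = mb_strtolower($string);            // "école"

thirdPartyHelper();                          // silently flips the global default

$after    = mb_strtolower($string);          // bytes now read as ISO-8859-1
$explicit = mb_strtolower($string, 'UTF-8'); // unaffected by the global setting

mb_internal_encoding('UTF-8');               // restore the setting for later code

var_dump($before === $after);    // bool(false)
var_dump($before === $explicit); // bool(true)
```

Passing the encoding on every call is more typing, but the result then depends only on what is visible at the call site.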
