I've been coding in .NET for years, yet this makes me feel like a n00b. Why does the following code fail?
    byte[] a = Guid.NewGuid().ToByteArray();     // 16 bytes in array
    string b = new UTF8Encoding().GetString(a);
    byte[] c = new UTF8Encoding().GetBytes(b);
    Guid d = new Guid(c);                        // Throws exception (32 bytes received from c)

Update

I accepted CodeInChaos's answer. It explains why the 16 bytes turn into 32 bytes. The answer also states:
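"the default constructor of UTF8Encoding has error checking disabled"

IMHO, the UTF-8 encoder should throw an exception when asked to decode a byte array that contains invalid bytes into a string. To make the .NET Framework behave properly, the code should be written as follows: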
    byte[] a = Guid.NewGuid().ToByteArray();
    string b = new UTF8Encoding(false, true).GetString(a);   // Throws exception as expected
    byte[] c = new UTF8Encoding(false, true).GetBytes(b);
    Guid d = new Guid(c);
Solution
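Not every byte sequence is a valid UTF-8 encoded string.

A GUID can contain almost any byte sequence. UTF-8, however, has specific rules about which byte sequences are allowed once a byte value is greater than 127, and a GUID usually does not follow those rules.

Decoding such bytes therefore yields a corrupted string. When you then encode that string back into a byte array, you get an array longer than 16 bytes, because each invalid sequence was decoded to the replacement character U+FFFD, which takes three bytes in UTF-8. The Guid constructor does not accept an array of that length.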
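To see where the extra bytes come from, here is a minimal sketch. It assumes the default (non-throwing) UTF8Encoding substitutes U+FFFD for an invalid byte, which is how current .NET runtimes behave; the class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class ReplacementDemo
{
    // Decodes bytes leniently as UTF-8, then encodes the resulting string back to bytes.
    public static byte[] RoundTripLenient(byte[] input)
    {
        var utf8 = new UTF8Encoding();            // error detection disabled
        string decoded = utf8.GetString(input);   // invalid bytes become U+FFFD (assumed default fallback)
        return utf8.GetBytes(decoded);
    }

    public static void Main()
    {
        byte[] invalid = { 0xFF };                // 0xFF never occurs in valid UTF-8
        byte[] result = RoundTripLenient(invalid);

        // One input byte comes back as the three-byte encoding of U+FFFD.
        Console.WriteLine(BitConverter.ToString(result));   // EF-BF-BD
    }
}
```

A GUID with several such bytes grows accordingly, so the exact length of the re-encoded array depends on the GUID's content.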
The documentation of UTF8Encoding.GetString states:

"With error detection, an invalid sequence causes this method to throw an ArgumentException. Without error detection, invalid sequences are ignored, and no exception is thrown."

And the default constructor of UTF8Encoding has error checking disabled (don't ask me why):

"This constructor creates an instance that does not provide a Unicode byte order mark and does not throw an exception when an invalid encoding is detected.

Note: For security reasons, your applications are recommended to enable error detection by using the constructor that accepts a throwOnInvalidBytes parameter and setting that parameter to true."

You probably want to use Base64 encoding instead of UTF-8. That way you can map any byte sequence to a string and back.
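The Base64 suggestion from the answer can be sketched as follows. This is only an illustration using Convert.ToBase64String and Convert.FromBase64String; the GuidBase64 class and its Encode/Decode helpers are hypothetical names, not part of the framework.

```csharp
using System;

public static class GuidBase64
{
    // Turns the 16 GUID bytes into a Base64 string.
    public static string Encode(Guid value)
    {
        return Convert.ToBase64String(value.ToByteArray());
    }

    // Recovers the GUID from a string produced by Encode.
    public static Guid Decode(string text)
    {
        return new Guid(Convert.FromBase64String(text));
    }

    public static void Main()
    {
        Guid original = Guid.NewGuid();
        string text = Encode(original);
        Guid restored = Decode(text);

        Console.WriteLine(text.Length);            // 24 characters for 16 bytes
        Console.WriteLine(original == restored);   // True
    }
}
```

Because Base64 is defined for arbitrary bytes, the round trip always returns exactly the 16 bytes that went in, so the Guid constructor accepts the result.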