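I am seeing odd behavior from the java.util.zip.Deflater.deflate(byte[] b, int off, int len, int flush) method when it is used with small output buffers.
(I am working on some low-level networking code related to an upcoming WebSocket extension, so small buffers are realistic for me.)
Example code: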
    package deflate;

    import java.nio.charset.StandardCharsets;
    import java.util.zip.Deflater;

    public class DeflaterSmallBufferBug {
        public static void main(String[] args) {
            boolean nowrap = true;
            Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);

            byte[] input = "Hello".getBytes(StandardCharsets.UTF_8);
            System.out.printf("input is %,d bytes - %s%n", input.length, getHex(input, 0, input.length));

            deflater.setInput(input);

            // output buffer deliberately as small as the input
            byte[] output = new byte[input.length];

            // break out of infinite loop seen with bug
            int maxloops = 10;

            // Compress the data
            while (maxloops-- > 0) {
                int compressed = deflater.deflate(output, 0, output.length, Deflater.SYNC_FLUSH);
                System.out.printf("compressed %,d bytes - %s%n", compressed, getHex(output, 0, compressed));

                // a partially filled buffer means all pending output was written
                if (compressed < output.length) {
                    System.out.printf("Compress success");
                    return;
                }
            }

            System.out.printf("Exited compress (maxloops left %d)%n", maxloops);
        }

        // Format len bytes of buf, starting at offset, as space-separated hex
        private static String getHex(byte[] buf, int offset, int len) {
            StringBuilder hex = new StringBuilder();
            hex.append('[');
            for (int i = offset; i < (offset + len); i++) {
                if (i > offset) {
                    hex.append(' ');
                }
                hex.append(String.format("%02X", buf[i]));
            }
            hex.append(']');
            return hex.toString();
        }
    }
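In the example above, I am trying to generate the compressed bytes for the input "Hello" using an output buffer that is 5 bytes long.
I expected the following resulting bytes:
    buffer 1 [ F2 48 CD C9 C9 ]
    buffer 2 [ 07 00 00 00 FF ]
    buffer 3 [ FF ]
Which translates to:
    [ F2 48 CD C9 C9 07 00 ] <-- the compressed data
    [ 00 00 FF FF ]          <-- the deflate tail bytes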
However, when Deflater.deflate() is used with a small buffer, this normal loop continues indefinitely, returning 5 bytes of compressed data on every call (this seems to show up only with buffers of 5 bytes or smaller).
The resulting output of running the demo above:
    input is 5 bytes - [48 65 6C 6C 6F]
    compressed 5 bytes - [F2 48 CD C9 C9]
    compressed 5 bytes - [07 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    compressed 5 bytes - [FF 00 00 00 FF]
    Exited compress (maxloops left -1)
If the input and output are larger than 5 bytes, the problem seems to go away. (Just change the input string to "Hellox" to test this yourself.)
Results with a 6-byte buffer (input is "Hellox"):
    input is 6 bytes - [48 65 6C 6C 6F 78]
    compressed 6 bytes - [F2 48 CD C9 C9 AF]
    compressed 6 bytes - [00 00 00 00 FF FF]
    compressed 5 bytes - [00 00 00 FF FF]
    Compress success
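Even these results look a little odd to me, since two deflate tail-byte sequences appear to be present.
So my ultimate question is: am I missing something about Deflater usage that would explain this, or does it point to a possible bug in the JVM Deflater implementation itself?
Update: August 7, 2015
This finding has been accepted as bugs.java.com/JDK-8133170
Solution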
This behavior is documented in zlib.h:
    In the case of a Z_FULL_FLUSH or Z_SYNC_FLUSH, make sure that
    avail_out is greater than six to avoid repeated flush markers due to
    avail_out == 0 on return.
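What is happening is that every call to deflate() with Z_SYNC_FLUSH inserts a five-byte flush marker. Since you are not providing enough output space to receive the marker, you call again to get more output, but in doing so you ask it to insert another flush marker at the same time.
What you should do is call deflate() with Z_SYNC_FLUSH once, and then collect all of the available output with additional deflate() calls, if necessary, that use Z_NO_FLUSH (NO_FLUSH in Java).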
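As a minimal sketch of the zlib.h note quoted above, the following keeps avail_out from reaching zero by giving a single SYNC_FLUSH call an output buffer comfortably larger than the expected output. The class name SyncFlushSketch, the helpers compress and decompress, the bufferSize parameter, and the 64-byte size are all hypothetical choices made for illustration; they are not part of the original question or of any library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class SyncFlushSketch {

    // Raw-deflate the input with one SYNC_FLUSH call.
    // bufferSize is a hypothetical parameter; it must leave room to spare.
    static byte[] compress(byte[] input, int bufferSize) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // nowrap
        try {
            deflater.setInput(input);
            byte[] buf = new byte[bufferSize];
            int n = deflater.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
            if (n == buf.length) {
                // A completely filled buffer means the flush may be incomplete.
                // This sketch rejects that case instead of calling deflate() again.
                throw new IllegalStateException("output buffer too small: " + bufferSize);
            }
            return Arrays.copyOf(buf, n);
        } finally {
            deflater.end();
        }
    }

    // Inflate raw deflate data; maxLen is the largest expected result size.
    static byte[] decompress(byte[] compressed, int maxLen) {
        Inflater inflater = new Inflater(true); // nowrap, matching compress()
        try {
            inflater.setInput(compressed);
            byte[] buf = new byte[maxLen];
            int n = inflater.inflate(buf);
            return Arrays.copyOf(buf, n);
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] input = "Hello".getBytes(StandardCharsets.UTF_8);
        byte[] compressed = compress(input, 64);
        byte[] restored = decompress(compressed, input.length);
        System.out.printf("compressed %d bytes, round trip ok: %b%n",
                compressed.length, Arrays.equals(input, restored));
    }
}
```

Treating a full buffer as an error is a simplification that suits small messages. Code that must handle arbitrarily large input would need a draining loop instead, and that loop has to take the flush-marker behavior described above into account.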