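What is the fastest way to write a bitstream on x86/x86-64? (codeword <= 32 bits)

By writing a bitstream, I mean the process of concatenating variable-bit-length symbols into a contiguous memory buffer. At the moment I have a standard container with a 32-bit intermediate buffer that I write into: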
    void write_bits(SomeContainer<unsigned int>& dst, unsigned int& buffer,
                    unsigned int& bits_left_in_buffer, int codeword, short bits_to_write) {
        if (bits_to_write < bits_left_in_buffer) {
            buffer |= codeword << (32 - bits_left_in_buffer);
            bits_left_in_buffer -= bits_to_write;
        } else {
            unsigned int full_bits = bits_to_write - bits_left_in_buffer;
            unsigned int towrite = buffer | (codeword << (32 - bits_left_in_buffer));
            buffer = full_bits ? (codeword >> bits_left_in_buffer) : 0;
            dst.push_back(towrite);
            bits_left_in_buffer = 32 - full_bits;
        }
    }
Does anyone know of any good optimizations, fast instructions, or other information that might be useful?

Cheers,
Solution
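I wrote a fairly fast implementation, but it has some limitations: it targets 32-bit x86, and it handles both writing and reading the bitstream. I don't check buffer limits here; instead I allocate a larger buffer than needed and check the position from the calling code from time to time.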
    unsigned char* membuff;
    unsigned bit_pos; // current BIT position in the buffer, so its max size is 512 MB

    // Input bit buffer: round the byte address down to an even value, so the
    // DWORD read from that address always has at least 17 usable bits.
    // BIT_MASKS is not shown here; BIT_MASKS[n] is presumably (1u << n) - 1.
    inline unsigned int get_bits(unsigned int bit_cnt) { // bit_cnt MUST be in range 0..17
        unsigned int byte_offset = bit_pos >> 3;
        byte_offset &= ~1; // round down to a multiple of 2
        unsigned int bits = *(unsigned int*)(membuff + byte_offset);
        bits >>= bit_pos & 0xF;
        bit_pos += bit_cnt;
        return bits & BIT_MASKS[bit_cnt];
    }

    // Output buffer: the whole destination must be memset to 0 beforehand,
    // because bits are only OR-ed in. val must not have bits set above bit_cnt.
    inline void put_bits(unsigned int val, unsigned int bit_cnt) {
        unsigned int byte_offset = bit_pos >> 3;
        byte_offset &= ~1;
        *(unsigned int*)(membuff + byte_offset) |= val << (bit_pos & 0xF);
        bit_pos += bit_cnt;
    }
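For anyone who wants to try the scheme above in isolation, here is a minimal self-contained sketch of a round trip. It is not the code from the answer: the `BitStream` struct, its `put`/`get` members and `round_trip_demo` are hypothetical names, the `BIT_MASKS` lookup is replaced by a computed mask, and the pointer cast is swapped for `std::memcpy` so the 32-bit access is well-defined regardless of alignment (compilers typically turn this into a plain load/store on x86). Like the original, it assumes a little-endian target and at most 17 bits per call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical wrapper around the "even byte offset + 32-bit window" scheme.
struct BitStream {
    std::vector<unsigned char> buf;  // starts zeroed; put() only ORs bits in
    std::uint32_t bit_pos = 0;       // current position, in bits

    // 4 bytes of padding so the last 32-bit window never runs past the end.
    explicit BitStream(std::size_t bytes) : buf(bytes + 4, 0) {}

    // Append the low bit_cnt bits of val; bit_cnt must be in 0..17.
    void put(std::uint32_t val, unsigned bit_cnt) {
        unsigned char* p = buf.data() + ((bit_pos >> 3) & ~1u);  // even byte offset
        std::uint32_t window;
        std::memcpy(&window, p, sizeof window);                  // alignment-safe load
        window |= (val & ((1u << bit_cnt) - 1)) << (bit_pos & 0xF);
        std::memcpy(p, &window, sizeof window);                  // alignment-safe store
        bit_pos += bit_cnt;
    }

    // Read the next bit_cnt bits; bit_cnt must be in 0..17.
    std::uint32_t get(unsigned bit_cnt) {
        const unsigned char* p = buf.data() + ((bit_pos >> 3) & ~1u);
        std::uint32_t window;
        std::memcpy(&window, p, sizeof window);
        window >>= (bit_pos & 0xF);
        bit_pos += bit_cnt;
        return window & ((1u << bit_cnt) - 1);
    }
};

// Writes a few variable-width codewords, rewinds, and reads them back.
bool round_trip_demo() {
    const std::uint32_t values[] = {5, 0x1FFFF, 0, 1, 0x2AB};
    const unsigned widths[]      = {3, 17, 1, 1, 10};
    BitStream bs(16);
    for (int i = 0; i < 5; ++i) bs.put(values[i], widths[i]);
    assert(bs.bit_pos == 32);  // 3 + 17 + 1 + 1 + 10 bits written
    bs.bit_pos = 0;            // rewind to the start for reading
    for (int i = 0; i < 5; ++i)
        if (bs.get(widths[i]) != values[i]) return false;
    return true;
}
```

Calling `round_trip_demo()` from `main` should return `true`. Masking `val` inside `put` costs one extra AND; it can be dropped if the caller guarantees that no bits above `bit_cnt` are set, which is what the original `put_bits` relies on.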