c – Fastest way to write a bitstream on modern x86 hardware

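What is the fastest way to write a bitstream on x86 / x86-64 (codewords <= 32 bits)? By writing a bitstream I mean the process of concatenating variable-bit-length symbols into a contiguous memory buffer. Currently I have a standard container with a 32-bit intermediate buffer that I write to: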
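// codeword is unsigned so that the shifts below never sign-extend or overflow a signed int.
// bits_to_write must be in 1..32; bits_left_in_buffer starts at 32 and stays in 1..32.
void write_bits(SomeContainer<unsigned int>& dst,unsigned int& buffer,unsigned int& bits_left_in_buffer,unsigned int codeword,unsigned int bits_to_write){
    if(bits_to_write < bits_left_in_buffer){
        // The codeword fits with room to spare: append it above the bits already stored.
        buffer|= codeword << (32-bits_left_in_buffer);
        bits_left_in_buffer -= bits_to_write;

    }else{
        // The codeword fills the buffer; full_bits is the number of bits that spill over.
        unsigned int full_bits = bits_to_write - bits_left_in_buffer;
        unsigned int towrite = buffer|(codeword<<(32-bits_left_in_buffer));
        buffer= full_bits ? (codeword >> bits_left_in_buffer) : 0;
        dst.push_back(towrite);
        bits_left_in_buffer = 32-full_bits;
    }
}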
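
As a hedged illustration of the scheme described above, the sketch below restates the 32-bit intermediate-buffer writer as a self-contained type. `BitWriter`, `write`, and `flush` are hypothetical names; `std::vector<std::uint32_t>` stands in for `SomeContainer`; the masking step and the final flush are assumptions that do not appear in the question's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical wrapper around the question's state (names are illustrative only).
struct BitWriter {
    std::vector<std::uint32_t> words;  // completed 32-bit words
    std::uint32_t buffer = 0;          // partially filled word
    unsigned bits_left = 32;           // free bits in buffer, always 1..32

    // Append the low n bits of codeword (1 <= n <= 32), least significant bit first.
    void write(std::uint32_t codeword, unsigned n) {
        assert(n >= 1 && n <= 32);
        // Assumption: stray high bits are dropped rather than trusted to be zero.
        if (n < 32) codeword &= (std::uint32_t{1} << n) - 1u;
        const unsigned used = 32 - bits_left;  // 0..31, so the shift is well defined
        buffer |= codeword << used;
        if (n < bits_left) {
            bits_left -= n;
        } else {
            const unsigned overflow = n - bits_left;  // bits spilling into the next word
            words.push_back(buffer);
            buffer = overflow ? (codeword >> bits_left) : 0;
            bits_left = 32 - overflow;
        }
    }

    // Emit the partially filled word, if any (not part of the question's code).
    void flush() {
        if (bits_left != 32) {
            words.push_back(buffer);
            buffer = 0;
            bits_left = 32;
        }
    }
};
```

Without a final flush, up to 31 trailing bits would remain in `buffer` and never reach the container, which is why the sketch adds one.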

Does anyone know of any good optimizations, fast instructions, or other information that might be useful?

Cheers,

Solution

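I once wrote a fairly fast implementation, but it has some limitations: it works on 32-bit x86 for both writing and reading the bitstream. I don't check buffer limits here; I allocated a larger buffer and checked it from time to time in the calling code.
unsigned char* membuff;
unsigned bit_pos; // current BIT position in the buffer, so the maximum buffer size is 512 MB

// BIT_MASKS is not defined in this answer; it is assumed to be a lookup table
// with BIT_MASKS[n] == (1u << n) - 1 for n in 0..17.

// Input bit buffer: the byte address is rounded down to an even value, so the DWORD
// read from that address always holds at least 17 bits past the current position.
inline unsigned int get_bits(unsigned int bit_cnt){ // bit_cnt MUST be in range 0..17
    unsigned int byte_offset = bit_pos >> 3;
    byte_offset &= ~1u;  // round down to a multiple of 2
    // x86 hardware allows this unaligned access; the cast itself is not strictly portable C++.
    unsigned int bits = *(unsigned int*)(membuff + byte_offset);
    bits >>= bit_pos & 0xF;
    bit_pos += bit_cnt;
    return bits & BIT_MASKS[bit_cnt];
}

// Output buffer: the whole destination must be memset to 0 beforehand,
// and val must have no bits set at or above position bit_cnt.
inline void put_bits(unsigned int val,unsigned int bit_cnt){ // bit_cnt MUST be in range 0..17
    unsigned int byte_offset = bit_pos >> 3;
    byte_offset &= ~1u;
    *(unsigned int*)(membuff + byte_offset) |= val << (bit_pos & 0xf);
    bit_pos += bit_cnt;
}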
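
The `get_bits` / `put_bits` pair above depends on a type-punned unaligned access and on a `BIT_MASKS` table that is not shown. The following is a minimal sketch of the same even-byte-offset technique using `std::memcpy` for the 32-bit access, assuming a little-endian target such as x86. `BitBuffer`, `mask_of`, and `round_trip_demo` are hypothetical names introduced for illustration, and the computed mask merely stands in for `BIT_MASKS`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical holder for the answer's two globals (membuff and bit_pos).
struct BitBuffer {
    std::vector<unsigned char> bytes;  // zero-initialised, as put_bits requires
    std::uint32_t bit_pos = 0;         // current position in bits

    // Four slack bytes keep the last 32-bit access inside the allocation.
    explicit BitBuffer(std::size_t payload_bytes) : bytes(payload_bytes + 4, 0) {}
};

// Stand-in for BIT_MASKS[bit_cnt]; valid for bit_cnt in 0..17.
inline std::uint32_t mask_of(unsigned bit_cnt) {
    return (std::uint32_t{1} << bit_cnt) - 1u;
}

inline void put_bits(BitBuffer& b, std::uint32_t val, unsigned bit_cnt) {
    assert(bit_cnt <= 17);
    // Even byte offset, so the in-word shift is at most 15 bits.
    const std::size_t byte_offset = (b.bit_pos >> 3) & ~std::size_t{1};
    assert(byte_offset + 4 <= b.bytes.size());
    std::uint32_t word;
    std::memcpy(&word, b.bytes.data() + byte_offset, sizeof word);
    word |= (val & mask_of(bit_cnt)) << (b.bit_pos & 0xF);
    std::memcpy(b.bytes.data() + byte_offset, &word, sizeof word);
    b.bit_pos += bit_cnt;
}

inline std::uint32_t get_bits(BitBuffer& b, unsigned bit_cnt) {
    assert(bit_cnt <= 17);
    const std::size_t byte_offset = (b.bit_pos >> 3) & ~std::size_t{1};
    assert(byte_offset + 4 <= b.bytes.size());
    std::uint32_t word;
    std::memcpy(&word, b.bytes.data() + byte_offset, sizeof word);
    word >>= (b.bit_pos & 0xF);
    b.bit_pos += bit_cnt;
    return word & mask_of(bit_cnt);
}

// Writes a few symbols, rewinds, and checks that they read back unchanged.
inline bool round_trip_demo() {
    const std::uint32_t values[] = {0x5u, 0x1FFFFu, 0x0u, 0x2ABu, 0x1u};
    const unsigned widths[] = {3, 17, 1, 10, 1};
    BitBuffer b(16);
    for (int i = 0; i < 5; ++i) put_bits(b, values[i], widths[i]);
    const std::uint32_t written = b.bit_pos;
    b.bit_pos = 0;
    for (int i = 0; i < 5; ++i) {
        if (get_bits(b, widths[i]) != values[i]) return false;
    }
    return b.bit_pos == written;
}
```

Compilers for x86 typically lower a fixed-size `std::memcpy` like this to a single load or store, so the sketch is not expected to cost more than the pointer cast, though that should be confirmed by measurement on the target compiler.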
