c++ - Least restrictive memory ordering for a spin-lock with two atomics

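I have some worker threads performing time-critical processing at regular intervals (approximately 1 kHz). Each cycle, the workers are woken up to do a chore, each of which should (on average) be complete before the next cycle begins. They operate on the same object, which can occasionally be modified by the main thread.

To prevent races, but still allow a modification of the object to occur before the next cycle, I have used a spin-lock along with an atomic counter that records how many threads are still doing work: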

class Foo {
public:
    void Modify();
    void DoWork( SomeContext& );
private:
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
    std::atomic<int> workers_busy{ 0 };   // brace-init: "= 0" is ill-formed before C++17
};

void Foo::Modify()
{
    while( locked.test_and_set( std::memory_order_acquire ) ) ;   // spin
    while( workers_busy.load() != 0 ) ;                           // spin

    // Modifications happen here ....

    locked.clear( std::memory_order_release );
}

void Foo::DoWork( SomeContext& )
{
    while( locked.test_and_set( std::memory_order_acquire ) ) ;   // spin
    ++workers_busy;
    locked.clear( std::memory_order_release );

    // Processing happens here ....

    --workers_busy;
}

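This allows all remaining work to be completed immediately, provided at least one thread has begun, and a pending modification will always block any worker from beginning work for the next cycle.

The atomic_flag is accessed with "acquire" and "release" memory orders, which seems to be an accepted way of implementing spin-locks with C++11. According to the documentation at cppreference.com: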

memory_order_acquire : A load operation with this memory order performs the acquire operation on the affected memory location: no memory accesses in the current thread can be reordered before this load. This ensures that all writes in other threads that release the same atomic variable are visible in the current thread.

memory_order_release : A store operation with this memory order performs the release operation: no memory accesses in the current thread can be reordered after this store. This ensures that all writes in the current thread are visible in other threads that acquire the same atomic variable and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic.

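As I understand it, this is sufficient to synchronise protected accesses across threads and provide mutex behaviour, without being overly conservative about memory ordering.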
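To illustrate the acquire/release pairing described above, here is a minimal sketch of a spin-lock protecting a plain, non-atomic counter. The names SpinLock and count_with_spinlock are hypothetical and are not part of Foo; the sketch only assumes that the lock is the sole means of synchronisation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper type, not part of the original Foo class.
class SpinLock {
public:
    void lock() {
        // acquire: accesses inside the critical section cannot be
        // reordered before this read-modify-write
        while (flag_.test_and_set(std::memory_order_acquire)) { /* spin */ }
    }
    void unlock() {
        // release: accesses inside the critical section cannot be
        // reordered after this store
        flag_.clear(std::memory_order_release);
    }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical demo: several threads increment a plain long under the lock.
long count_with_spinlock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    long counter = 0;  // deliberately not atomic; the lock orders all accesses
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If the acquire/release pair provides mutual exclusion as expected, calling count_with_spinlock(4, 10000) should return exactly threads * iterations, with no lost updates.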

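What I want to know is whether the memory ordering can be relaxed further, because a side effect of this pattern is that I am using the spin-lock mutex to synchronise another atomic variable.

The calls to ++workers_busy, --workers_busy and workers_busy.load() all currently use the default memory order, memory_order_seq_cst. Given that the only interesting use of this atomic is to unblock Modify() with --workers_busy (which is not synchronised by the spin-lock mutex), could the same acquire-release memory order be used with this variable, with a "relaxed" increment? i.e.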

void Foo::Modify()
{
    while( locked.test_and_set( std::memory_order_acquire ) ) ;
    while( workers_busy.load( std::memory_order_acquire ) != 0 ) ;  // <--
    // ....
    locked.clear( std::memory_order_release );
}

void Foo::DoWork( SomeContext& )
{
    while( locked.test_and_set( std::memory_order_acquire ) ) ;
    workers_busy.fetch_add( 1,std::memory_order_relaxed );         // <--
    locked.clear( std::memory_order_release );
    // ....
    workers_busy.fetch_sub( 1,std::memory_order_release );         // <--
}

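Is this correct? Is it possible to relax any of these memory orderings further? Does it even matter?

Solution

Since you say you're targeting x86 only, you're guaranteed strongly-ordered memory anyway. Avoiding memory_order_seq_cst is useful (it can trigger expensive and unnecessary memory fences), but beyond that, most other operations impose no special overhead, so additional relaxation gains you nothing except permission for the compiler to make possibly incorrect instruction reorderings. This should be safe, and no slower than any other solution using C++11 atomics: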

void Foo::Modify()
{
    while( locked.test_and_set( std::memory_order_acquire ) ) ;
    while( workers_busy.load( std::memory_order_acquire ) != 0 ) ; // acq to see decrements
    // ....
    locked.clear( std::memory_order_release );
}

void Foo::DoWork( SomeContext& )
{
    while(locked.test_and_set(std::memory_order_acquire)) ;
    workers_busy.fetch_add(1,std::memory_order_relaxed); // Lock provides acq and rel free
    locked.clear(std::memory_order_release);
    // ....
    workers_busy.fetch_sub(1,std::memory_order_acq_rel); // No lock wrapping; acq_rel
}

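At worst, on x86, this imposes a few compiler ordering constraints; it shouldn't introduce additional fences or lock instructions that don't need to be locked.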
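As a rough way to exercise the orderings above, the following sketch wraps the same lock and counter in a small harness. Gate, enter, leave, modify and run_cycles are hypothetical names introduced only for this illustration. The workers read two plain ints that the modifier always updates together, so any observed mismatch would suggest that a worker overlapped a modification. A passing run does not prove the orderings are correct, especially on x86, where the hardware is already strongly ordered.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical harness type; it mirrors the locking used in Foo above.
class Gate {
public:
    void enter() {  // start of DoWork
        while (locked.test_and_set(std::memory_order_acquire)) { /* spin */ }
        workers_busy.fetch_add(1, std::memory_order_relaxed);  // ordered by the lock
        locked.clear(std::memory_order_release);
    }
    void leave() {  // end of DoWork: publishes the worker's accesses
        workers_busy.fetch_sub(1, std::memory_order_acq_rel);
    }
    template <class F>
    void modify(F&& f) {  // body of Modify
        while (locked.test_and_set(std::memory_order_acquire)) { /* spin */ }
        while (workers_busy.load(std::memory_order_acquire) != 0) { /* spin */ }
        f();  // no worker is running and none can start
        locked.clear(std::memory_order_release);
    }
private:
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
    std::atomic<int> workers_busy{0};
};

// Returns how many times a worker saw the two fields out of step.
int run_cycles(int workers, int cycles, int modifications) {
    assert(workers > 0);
    Gate gate;
    int a = 0, b = 0;  // plain data; written only inside modify()
    std::atomic<int> mismatches{0};
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            for (int c = 0; c < cycles; ++c) {
                gate.enter();
                if (a != b) mismatches.fetch_add(1, std::memory_order_relaxed);
                gate.leave();
            }
        });
    }
    for (int m = 0; m < modifications; ++m) {
        gate.modify([&] { ++a; ++b; });  // both fields change together
    }
    for (auto& t : pool) t.join();
    return mismatches.load();
}
```

Calling run_cycles(3, 1000, 200) should return 0 if modifications never overlap with workers. For stronger evidence, the same harness could be built with a data-race detector such as ThreadSanitizer.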
