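To prevent races while still allowing the object to be modified before the next cycle, I use a spin lock and an atomic counter that records how many threads are still working: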
    #include <atomic>

    struct SomeContext;   // defined elsewhere

    class Foo
    {
    public:
        void Modify();
        void DoWork( SomeContext& );
    private:
        std::atomic_flag locked = ATOMIC_FLAG_INIT;
        std::atomic<int> workers_busy{ 0 };   // brace-init: "= 0" needs a copy constructor before C++17
    };

    void Foo::Modify()
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;   // spin
        while( workers_busy.load() != 0 ) ;                           // spin

        // Modifications happen here ....

        locked.clear( std::memory_order_release );
    }

    void Foo::DoWork( SomeContext& )
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;   // spin
        ++workers_busy;
        locked.clear( std::memory_order_release );

        // Processing happens here ....

        --workers_busy;
    }
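This allows all remaining work to complete immediately, provided at least one thread has already started, and it always blocks before another worker can begin work for the next cycle.
The atomic_flag is accessed with "acquire" and "release" memory orders, which seems to be an accepted way of implementing spin locks in C++11. According to the documentation at cppreference.com: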
memory_order_acquire
: A load operation with this memory order performs the acquire operation on the affected memory location: no memory accesses in the current thread can be reordered before this load. This ensures that all writes in other threads that release the same atomic variable are visible in the current thread.
memory_order_release
: A store operation with this memory order performs the release operation: no memory accesses in the current thread can be reordered after this store. This ensures that all writes in the current thread are visible in other threads that acquire the same atomic variable and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic.
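The two definitions quoted above can be illustrated with a minimal message-passing sketch. It is not part of the question's code: the function name `message_passing_demo` and the variables `payload` and `ready` are hypothetical, chosen only to show a release store publishing ordinary data to a thread that performs an acquire load on the same atomic.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo (names are illustrative, not from the question):
// a plain int is published through a release store and read after an
// acquire load of the same atomic flag.
int message_passing_demo()
{
    int payload = 0;                    // ordinary, non-atomic data
    std::atomic<bool> ready{ false };   // flag that publishes the payload
    int seen = 0;

    std::thread producer( [&] {
        payload = 42;                                     // write the data first
        ready.store( true, std::memory_order_release );   // then publish it
    } );

    std::thread consumer( [&] {
        while( !ready.load( std::memory_order_acquire ) )
            ;                 // spin until the flag is published
        seen = payload;       // ordered after the producer's write to payload
    } );

    producer.join();
    consumer.join();
    assert( ready.load( std::memory_order_relaxed ) );   // flag was set
    return seen;
}
```

Because the acquire load that observes `true` synchronises with the release store, the consumer's read of `payload` cannot be a data race and should return the value written by the producer.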
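As far as I understand, this is enough to synchronise protected accesses across threads and provide mutex behaviour, without being overly conservative about memory ordering.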
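As a sketch of that mutex behaviour, the same acquire/release pattern can be wrapped in a small class. The `SpinLock` class and the `count_with_spinlock` helper below are hypothetical additions for illustration; they assume only that `lock()` and `unlock()` are enough for `std::lock_guard`, which requires nothing more of its lock type.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper, not part of the question's code.
class SpinLock
{
public:
    void lock()   { while( flag_.test_and_set( std::memory_order_acquire ) ) ; }   // spin
    void unlock() { flag_.clear( std::memory_order_release ); }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Increments a plain (non-atomic) counter from several threads,
// relying only on the spin lock for mutual exclusion.
long count_with_spinlock( int threads, int iterations )
{
    assert( threads > 0 && iterations >= 0 );
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for( int t = 0; t < threads; ++t )
        pool.emplace_back( [&] {
            for( int i = 0; i < iterations; ++i ) {
                std::lock_guard<SpinLock> guard( lock );
                ++counter;   // protected by the lock
            }
        } );
    for( auto& th : pool )
        th.join();
    return counter;
}
```

If the lock provides mutual exclusion, no increment is lost, so calling `count_with_spinlock( 4, 10000 )` should return 40000.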
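What I want to know is whether the memory ordering can be relaxed further, because a side effect of this pattern is that I am using a spin-lock mutex to synchronise another atomic variable.
The calls to ++workers_busy, --workers_busy and workers_busy.load() currently all use the default memory order, memory_order_seq_cst. Given that the only interesting use of this atomic is to unblock Modify() via --workers_busy (which is not synchronised by the spin-lock mutex), could the same acquire-release memory ordering be used with this variable, together with a "relaxed" increment? That is: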
    void Foo::Modify()
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;
        while( workers_busy.load( std::memory_order_acquire ) != 0 ) ;   // <--
        // ....
        locked.clear( std::memory_order_release );
    }

    void Foo::DoWork( SomeContext& )
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;
        workers_busy.fetch_add( 1, std::memory_order_relaxed );          // <--
        locked.clear( std::memory_order_release );
        // ....
        workers_busy.fetch_sub( 1, std::memory_order_release );          // <--
    }
Is this correct? Can any of these memory orderings be relaxed further? And does it even matter?
Solution
    void Foo::Modify()
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;
        while( workers_busy.load( std::memory_order_acquire ) != 0 ) ;   // acquire, to see the decrements
        // ....
        locked.clear( std::memory_order_release );
    }

    void Foo::DoWork( SomeContext& )
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;
        workers_busy.fetch_add( 1, std::memory_order_relaxed );   // the lock provides acquire and release for free
        locked.clear( std::memory_order_release );
        // ....
        workers_busy.fetch_sub( 1, std::memory_order_acq_rel );   // not wrapped by the lock, so acq_rel
    }
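At worst, on x86, this imposes a few compiler ordering constraints; it should not introduce extra fences, or lock-prefixed instructions where no locking is needed.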
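One way to exercise these orderings is a small stress harness, sketched below under stated assumptions. The class `GuardedPair`, its two plain fields `a` and `b`, and the function `count_inconsistencies` are hypothetical stand-ins for the real object and its work; only the locking and memory orders are copied from the code above. A passing run does not prove the orderings correct, especially on x86, where the hardware is strongly ordered anyway.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for Foo: two plain fields that Modify() always
// changes together, so a worker should never see them differ.
class GuardedPair
{
public:
    void Modify()
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;
        while( workers_busy.load( std::memory_order_acquire ) != 0 ) ;
        ++a;   // no worker is reading at this point
        ++b;
        locked.clear( std::memory_order_release );
    }

    // Returns true if this worker observed a consistent pair.
    bool DoWork()
    {
        while( locked.test_and_set( std::memory_order_acquire ) ) ;
        workers_busy.fetch_add( 1, std::memory_order_relaxed );
        locked.clear( std::memory_order_release );
        const bool consistent = ( a == b );   // read outside the lock
        workers_busy.fetch_sub( 1, std::memory_order_acq_rel );
        return consistent;
    }

private:
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
    std::atomic<int> workers_busy{ 0 };
    long a = 0;
    long b = 0;
};

// Runs several workers against one modifier and counts how often a
// worker saw the two fields out of step.
int count_inconsistencies( int workers, int iterations )
{
    assert( workers > 0 && iterations >= 0 );
    GuardedPair shared;
    std::atomic<int> bad{ 0 };
    std::vector<std::thread> pool;
    for( int w = 0; w < workers; ++w )
        pool.emplace_back( [&] {
            for( int i = 0; i < iterations; ++i )
                if( !shared.DoWork() )
                    bad.fetch_add( 1, std::memory_order_relaxed );
        } );
    std::thread modifier( [&] {
        for( int i = 0; i < iterations; ++i )
            shared.Modify();
    } );
    for( auto& th : pool )
        th.join();
    modifier.join();
    return bad.load();
}
```

With these orderings, the last decrement's release pairs with the acquire load in `Modify()`, so every worker's reads are ordered before the modification and `count_inconsistencies` should return 0.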