The role of a garbage collector (GC) is to reclaim unused memory.
Tracing collectors must identify all live objects by traversing graphs
of objects induced by aggregation relationships. In brief,the GC has
some work-list of tasks to perform. It repeatedly (a) acquires a task
(e.g. an object to inspect),(b) performs the task (e.g. marks the
object unless it is already marked),and (c) generates further tasks
(e.g. adds the children of an unmarked task to the work-list). It is
desirable to parallelise this operation.In a single-threaded
environment,the work-list is usually a single LIFO stack. What would
you have to do to make this safe for a parallel GC? Would this be a
sensible design for a parallel GC? Discuss designs of data structure
to support a parallel GC that would scale better. Explain why you
would expect them to scale better.
解决方法
struct object { vector<object*> references; atomic<bool> is_visited; // for simplicity,or epoch counter // if nothing resets it to false void inspect(); // processing method }; vector<object> objects; // also for simplicity,if it can be for real // things like `parallel_for` would be perfect here
鉴于此数据结构和GC工作方式的描述,它完全适合像divide-and-conquer pattern这样的递归并行:
void object::inspect() { if( ! is_visited.exchange(true) ) { for( object* o : objects ) // alternatively it can be `parallel_for` in some variants cilk_spawn o->inspect(); // for Cilk or `task_group::run` for TBB or PPL // further processing of the object } }
如果问题中的数据结构是任务的组织方式.我推荐一个工作窃取调度程序(如tbb或cilk).关于这个主题有很多论文.简单来说,每个工作线程都有自己但共享的任务副本,当deque为空时,一个线程偷走别人的任务.