我在Postgres有一张名为tasks的表.它记录了机械土耳其式的任务.它包含以下列:
entity_name,text (the thing being reviewed) reviewer_email,text (the email address of the person doing the reviewing) result,boolean (the entry provided by the reviewer)
需要检查的每个实体都会生成两个任务行,每个行分配给不同的审阅者.当两个评论者都不同意时(例如他们的结果值不相等),应用程序将启动第三个任务,分配给主持人.主持人始终拥有相同的电子邮件域.
我试图获得每次评论者的评论,审稿人被主持人否决,或者由主持人确认.我认为我相当接近,但最后一点证明是棘手的:
SELECT reviewer_email,COUNT(*) FILTER( WHERE entity_name IN ( SELECT entity_name FROM tasks GROUP BY entity_name HAVING COUNT(*) FILTER (WHERE result IS NOT NULL) = 3 -- find the entities that have exactly three reviews AND -- this is the tricky part: -- need something like: -- WHERE current_review.result = moderator_review.result ) ) AS overruled_count FROM tasks WHERE result IS NOT NULL GROUP BY reviewer_email HAVING reviewer_email NOT LIKE '%@moderators-domain.net'
样本数据:
id | entity_name | reviewer_email | result 1 | apple | bob@email.net | true 2 | apple | alice@email.net | false 3 | apple | mod@@moderators-domain.net | true 4 | pair | bob@email.net | true 5 | pair | alice@email.net | false 6 | pair | mod@@moderators-domain.net | false 7 | kiwi | bob@email.net | true 8 | kiwi | alice@email.net | true
期望的结果:
reviewer_email | overruled_count | affirmed_count bob@email.net | 1 | 1 alice@email.net | 1 | 1
Bob和Alice各自做了三次评论.在一次审查中,他们同意,因此没有适度.他们对其他两次评论持不同意见并被推翻一次,并由主持人确认一次.
解决方法
我认为这比你可能意识到的更难.以下内容将主持人审核附加到每个非主持人审核:
select t.*,tm.result as moderator_result from tasks t join tasks tm on t.entity_name = tm.entity_name where t.reviewer_email NOT LIKE '%@moderators-domain.net' and tm.reviewer_email LIKE '%@moderators-domain.net';
从这里,我们可以汇总您想要的结果:
select reviewer_email,sum( (result = moderator_result)::int ) as moderator_agrees,sum( (result <> moderator_result)::int ) as moderator_disagrees from (select t.*,tm.result as moderator_result from tasks t join tasks tm on t.entity_name = tm.entity_name where t.reviewer_email NOT LIKE '%@moderators-domain.net' and tm.reviewer_email LIKE '%@moderators-domain.net' ) t group by reviewer_email;
可能有一种方法可以使用过滤器甚至窗口函数来完成此操作.这种方法对我来说似乎是最自然的.
我应该注意,子查询当然没有必要:
select t.reviewer_email,sum( (t.result = tm.result)::int ) as moderator_agrees,sum( (t.result <> tm.result)::int ) as moderator_disagrees from tasks t join tasks tm on t.entity_name = tm.entity_name where t.reviewer_email NOT LIKE '%@moderators-domain.net' and tm.reviewer_email LIKE '%@moderators-domain.net' group by t.reviewer_email;