我有以下查询:
SELECT DISTINCT e.id,folder,subject,in_reply_to,message_id,"references",e.updated_at,( select count(*) from emails where ( select "references"[1] from emails where message_id = e.message_id ) = ANY ("references") or message_id = ( select "references"[1] from emails where message_id = e.message_id ) ) FROM "emails" e INNER JOIN "email_participants" ON ("email_participants"."email_id" = e."id") WHERE (("user_id" = 220) AND ("folder" = 'INBox')) ORDER BY e."updated_at" DESC LIMIT 10 OFFSET 0;
( select count(*) from emails where ( select "references"[1] from emails where message_id = e.message_id ) = ANY ("references") or message_id = ( select "references"[1] from emails where message_id = e.message_id ) )
事实上,我已经尝试过更简单的子查询,似乎是聚合函数本身需要时间.
解决方法
扩展Paul Guyot的答案,您可以将子查询移动到派生表中,该表应该执行得更快,因为它在一次扫描(加一个连接)中获取消息计数,而不是每行一次扫描.
SELECT DISTINCT e.id,e.folder,e.subject,e.message_id,e."references",t1.message_count FROM "emails" e INNER JOIN "email_participants" ON ("email_participants"."email_id" = e."id") INNER JOIN ( SELECT COUNT(e2.id) message_count,e.message_id FROM emails e LEFT JOIN emails e2 ON (ARRAY[e."references"[1]] <@ e2."references" OR e2.message_id = e."references"[1]) GROUP BY e.message_id ) t1 ON t1.message_id = e.message_id WHERE (("user_id" = 220) AND ("folder" = 'INBox')) ORDER BY e."updated_at" DESC LIMIT 10 OFFSET 0;
使用pastebin数据的小提琴 – http://www.sqlfiddle.com/#!15/c6298/7
下面是postgres生成的查询计划,用于通过加入派生表来获取相关子查询中的计数与获取计数.我使用了自己的一张桌子,但我认为结果应该是相似的.
相关子查询
"Limit (cost=0.00..1123641.81 rows=1000 width=8) (actual time=11.237..5395.237 rows=1000 loops=1)" " -> Seq Scan on visit v (cost=0.00..44996236.24 rows=40045 width=8) (actual time=11.236..5395.014 rows=1000 loops=1)" " SubPlan 1" " -> Aggregate (cost=1123.61..1123.62 rows=1 width=0) (actual time=5.393..5.393 rows=1 loops=1000)" " -> Seq Scan on visit v2 (cost=0.00..1073.56 rows=20018 width=0) (actual time=0.002..4.280 rows=21393 loops=1000)" " Filter: (company_id = v.company_id)" " Rows Removed by Filter: 18653" "Total runtime: 5395.369 ms"
加入派生表
"Limit (cost=1173.74..1211.81 rows=1000 width=12) (actual time=21.819..22.629 rows=1000 loops=1)" " -> Hash Join (cost=1173.74..2697.72 rows=40036 width=12) (actual time=21.817..22.465 rows=1000 loops=1)" " Hash Cond: (v.company_id = visit.company_id)" " -> Seq Scan on visit v (cost=0.00..973.45 rows=40045 width=8) (actual time=0.010..0.198 rows=1000 loops=1)" " -> Hash (cost=1173.71..1173.71 rows=2 width=12) (actual time=21.787..21.787 rows=2 loops=1)" " Buckets: 1024 Batches: 1 Memory Usage: 1kB" " -> HashAggregate (cost=1173.67..1173.69 rows=2 width=4) (actual time=21.783..21.784 rows=3 loops=1)" " -> Seq Scan on visit (cost=0.00..973.45 rows=40045 width=4) (actual time=0.003..6.695 rows=40046 loops=1)" "Total runtime: 22.806 ms"