SELECT count(*)
FROM contacts_lists
JOIN plain_contacts ON contacts_lists.contact_id = plain_contacts.contact_id
JOIN contacts ON contacts.id = plain_contacts.contact_id
WHERE plain_contacts.has_email
  AND NOT contacts.email_bad
  AND NOT contacts.email_unsub
  AND contacts_lists.list_id = 67339
How can I optimize this query? Please explain.
Solution
Reformatting your query plan for clarity:
QUERY PLAN
Aggregate  (cost=126377.96..126377.97 rows=1 width=0)
  ->  Hash Join  (cost=6014.51..126225.38 rows=61033 width=0)
        Hash Cond: (contacts_lists.contact_id = plain_contacts.contact_id)
        ->  Hash Join  (cost=3067.30..121828.63 rows=61033 width=8)
              Hash Cond: (contacts_lists.contact_id = contacts.id)
              ->  Index Scan using index_contacts_lists_on_list_id_and_contact_id on contacts_lists  (cost=0.00..116909.97 rows=61033 width=4)
                    Index Cond: (list_id = 66996)
              ->  Hash  (cost=1721.41..1721.41 rows=84551 width=4)
                    ->  Seq Scan on contacts  (cost=0.00..1721.41 rows=84551 width=4)
                          Filter: ((NOT email_bad) AND (NOT email_unsub))
        ->  Hash  (cost=2474.97..2474.97 rows=37779 width=4)
              ->  Seq Scan on plain_contacts  (cost=0.00..2474.97 rows=37779 width=4)
                    Filter: has_email
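The plan above contains planner estimates only. To see actual row counts and timings, the query can be run under EXPLAIN ANALYZE. This is a minimal sketch, assuming PostgreSQL 9.0 or later for the parenthesized option syntax; note that it really executes the query:

```sql
-- Executes the query and reports actual rows, timings, and buffer usage
-- next to the planner's estimates.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM contacts_lists
JOIN plain_contacts ON contacts_lists.contact_id = plain_contacts.contact_id
JOIN contacts ON contacts.id = plain_contacts.contact_id
WHERE plain_contacts.has_email
  AND NOT contacts.email_bad
  AND NOT contacts.email_unsub
  AND contacts_lists.list_id = 67339;
```

Large gaps between estimated and actual row counts usually point to stale statistics.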
Depending on your data distribution, two partial indexes might eliminate the seq scans:
-- if many contacts have bad emails or are unsubscribed:
CREATE INDEX contacts_valid_email_idx
    ON contacts (id)
    WHERE (NOT email_bad AND NOT email_unsub);

-- if many contacts have no email:
CREATE INDEX plain_contacts_valid_email_idx
    ON plain_contacts (id)
    WHERE (has_email);
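Whether these partial indexes are worth creating depends on how selective their predicates are. A rough way to check, using only the columns that appear in the query (the output column names total_rows and matching_rows are made up for illustration):

```sql
-- Share of contacts rows that satisfy the first partial index predicate.
SELECT count(*) AS total_rows,
       sum(CASE WHEN NOT email_bad AND NOT email_unsub THEN 1 ELSE 0 END) AS matching_rows
FROM contacts;

-- Share of plain_contacts rows that satisfy the second predicate.
SELECT count(*) AS total_rows,
       sum(CASE WHEN has_email THEN 1 ELSE 0 END) AS matching_rows
FROM plain_contacts;
```

If matching_rows is close to total_rows, the partial index covers nearly the whole table and the planner will probably keep using a sequential scan.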
You might also be missing an index on the foreign key:
CREATE INDEX plain_contacts_contact_id_idx ON plain_contacts (contact_id);
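Before adding it, it may be worth listing the indexes that already exist on the three tables. A small sketch using the standard pg_indexes view, assuming the tables are visible under these names in your schema:

```sql
-- Lists every index on the tables involved in the query, with its definition.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('contacts', 'plain_contacts', 'contacts_lists')
ORDER BY tablename, indexname;
```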
Last but not least, if you have never analyzed your data, you need to run:
VACUUM ANALYZE;
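To find out whether the tables have been analyzed at all, the statistics collector view pg_stat_user_tables records the last manual and automatic runs. A sketch, assuming the default statistics collection is enabled:

```sql
-- NULL in both columns means the table has never been analyzed
-- since the statistics were last reset.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('contacts', 'plain_contacts', 'contacts_lists');
```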
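If it is still slow after all of that, there is not much left to do short of merging your plain_contacts and contacts tables. Getting the above query plan in spite of the above indexes means that most or all of your subscribers are subscribed to that particular list, in which case the above query plan is the fastest you will get.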
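One way to test that last hypothesis is to compare the size of the list with the total number of contacts. A quick sketch, using the list_id from the question (the output column names are illustrative):

```sql
-- If rows_on_list is close to total_contacts, the join has to touch most of
-- the contacts table anyway, and sequential scans are the cheapest option.
SELECT
    (SELECT count(*) FROM contacts_lists WHERE list_id = 67339) AS rows_on_list,
    (SELECT count(*) FROM contacts) AS total_contacts;
```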