PostgreSQL 联结方式--排序-合并联结

排序-合并联结独立地读取需要联结的两张表，对每张表中的数据行（仅是那些满足各自where子句中条件的数据行）按照联结键进行排序，然后对排序后的数据行集进行合并。对这种联结方法来说排序的开销的非常大的。对于不能够放入内存中的大的数据源来说，可能会使用临时磁盘空间来完成排序。这是非常占内存和时间资源的。但是一旦数据行几点排序完成了，合并的过程是非常快的。为了进行合并，数据库轮流操作两个列表，比较最上面的数据行，丢弃在排序对列中比另一列表中的最上面一行出现得早的数据行，并只返回匹配的行。

在计算成本的时候，有的时候执行计划的总成本比两个表单独扫描+排序的成本要小，这是因为当两个表排完序后，可能两个表只有前几行是对应的，比如：a表有10条记录，b表10000条，b表只有前10行与a表匹配，那么单独扫描b表闭并排序的成本是比较大的，但是在合并的时候，当a表合并完，发现b表已经没有与a表匹配的了，也就不继续扫描b表了，那么b表也是只扫描了10行，b表实际的成本的计算应该是：b表总成本*(10/10000),后面会具体说这个问题。

1. 建表和插入测试数据参考：http://www.jb51.cc/article/p-chkofhzn-zc.html
2. select s.*,c.class_name from student s,class c where s.class_id=c.id;
这个查询将会向下面的伪代码一样来处理结果：
select c.class_name,c.id from class c order by id
select s.id,s.name,s.class_id from student s order by s.class_id
compare the rowsets and return rows where id/class_id in both lists match
for an outer join,compare the rowsets and return all rows from the first list setting column values for the other table to null
3. 执行计划

postgres=# explain(analyze,verbose,buffers) select s.*,class c where s.class_id=c.id;
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=24.59..30.40 rows=384 width=44) (actual time=0.169..0.330 rows=384 loops=1)
   Output: s.id,s.class_id,c.class_name
   Merge Cond: (c.id = s.class_id)
   Buffers: shared hit=4
   ->  <span style="color:#ff0000;">Sort</span>  (cost=1.27..1.29 rows=10 width=36) (actual time=0.029..0.029 rows=3 loops=1)
         Output: c.class_name,c.id
         Sort Key: c.id
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=1
         ->  Seq Scan on public.class c  (cost=0.00..1.10 rows=10 width=36) (actual time=0.007..0.010 rows=10 loops=1)
               Output: c.class_name,c.id
               Buffers: shared hit=1
   ->  <span style="color:#ff0000;">Sort</span>  (cost=23.32..24.28 rows=384 width=12) (actual time=0.134..0.200 rows=384 loops=1)
         Output: s.id,s.class_id
         Sort Key: s.class_id
         Sort Method: quicksort  Memory: 43kB
         Buffers: shared hit=3
         ->  Seq Scan on public.student s  (cost=0.00..6.84 rows=384 width=12) (actual time=0.006..0.052 rows=384 loops=1)
               Output: s.id,s.class_id
               Buffers: shared hit=3
 Total runtime: 0.407 ms
(21 rows)

可以看到两个排序节点 Sort。

PostgreSQL 联结方式--排序-合并联结

猜你在找的Postgre SQL相关文章