postgresql explain query中的materialize

前端之家收集整理的这篇文章主要介绍了postgresql explain query中的materialize前端之家小编觉得挺不错的,现在分享给大家,也给大家做个参考。

今天在GREENPLUM的执行计划中看到了materialize,不知道这个步骤是做什么的,查了下文档,没有相关解释。没事,GREENPLUM不行还有POSTGREsql呢,GOOGLE了一把,找到了答案。

Q:

What does materialize do? I'm joining two tables,not views or anything like that.

A:

A materialize node means the output of whatever is below it in the tree (which can be a scan,or a full set of joins or something like that) is materalized into memory before the upper node is executed. This is usually done when the outer node needs a source that it can re-scan for some reason or other.

So in your case,the planner is determining that the result of a scan on one of your tables will fit in memory,and it till make it possible to choose an upper join operation that requires rescans while still being cheaper.


具体的sql如下

aligputf8=# select count(1) from ttt1;
count
-------
10000
(1 row)

aligputf8=#
aligputf8=# select count(1) from ttt2;
count
-------
10000
(1 row)


aligputf8=# explain select * from ttt1 as a,ttt2 as b where a.id=3;
QUERY PLAN
---------------------------------------------------------------------------------------------
Gather Motion 6:1 (slice2; segments: 6) (cost=0.00..1409.07 rows=1667 width=8)
-> Nested Loop (cost=0.00..1409.07 rows=1667 width=8)
-> Broadcast Motion 1:6 (slice1; segments: 1) (cost=0.00..137.07 rows=6 width=4)
-> Seq Scan on ttt1 a (cost=0.00..137.00 rows=1 width=4)
Filter: id = 3
-> Seq Scan on ttt2 b (cost=0.00..112.00 rows=1667 width=4)
(6 rows)

aligputf8=# explain select * from ttt1 as a,ttt2 as b where a.id=3;
aligputf8=#
aligputf8=# explain select * from ttt1 as a,ttt2 as b where a.id<3000;
QUERY PLAN
------------------------------------------------------------------------------------------------
Gather Motion 6:1 (slice2; segments: 6) (cost=122.00..3599268.93 rows=4998334 width=8)
-> Nested Loop (cost=122.00..3599268.93 rows=4998334 width=8)
-> Broadcast Motion 6:6 (slice1; segments: 6) (cost=0.00..346.93 rows=2999 width=4)
-> Seq Scan on ttt1 a (cost=0.00..137.00 rows=500 width=4)
Filter: id < 3000
-> Materialize (cost=122.00..222.00 rows=1667 width=4)
-> Seq Scan on ttt2 b (cost=0.00..112.00 rows=1667 width=4)
(7 rows)

我的理解就是PG为了加快nested loop循环的速度,把b表的数据缓存在了内存中。我们可以看到前面一个sql,只对b表数据进行1次扫描,因此并不需要进行缓存;后者需要进行2999次扫描。

猜你在找的Postgre SQL相关文章