PostgreSQL index only scan

index only scan,是我们用select选择字段的时候，所选的字段全部都有索引，那么只需在索引中取数据，就不必访问数据块了，从而提高效率。
1. 建表

postgres=# create table tb_index_test(id serial primary key,name character varying);
CREATE TABLE 
postgres=# 
postgres=# \d tb_index_test;
                              Table "public.tb_index_test"
 Column |       Type        |                         Modifiers                          
--------+-------------------+------------------------------------------------------------
 id     | integer           | not null default nextval('tb_index_test_id_seq'::regclass)
 name   | character varying | 
Indexes:
    "tb_index_test_pkey" PRIMARY KEY,btree (id)

2. 插入测试数据

postgres=# insert into tb_index_test values(generate_series(1,10000),'john');
INSERT 0 10000

3. index only scan的启动成本
对于IndexOnlyScan节点,虽然是从index输出结果,但是还要先检查visibility MAP,因此startup_cost也大于0. 但是,它的启动成本计算并未计入这部分开销. 而是和普通的index scan计算方法一样.当你新建表之后，没有进行过vacuum和autovacuum操作，这时还没有VM文件，加上索引并没有保存记录的版本信息，索引index only scan还是需要扫描数据块来获取版本信息，这个时候可能比index scan要慢了。

postgres=# explain(analyze,verbose,buffers)select count(0) from tb_index_test where id<400;
                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=22.29..22.30 rows=1 width=0) (actual time=0.127..0.127 rows=1 loops=1)
   Output: count(0)
   Buffers: shared hit=6
   ->  Index Only Scan using tb_index_test_pkey on public.tb_index_test  (cost=0.29..21.29 rows=400 width=0) (actual time=0.021..0.088 rows=399 loops=1)
         Output: id
         Index Cond: (tb_index_test.id < 400)
         <span style="color:#ff0000;">Heap Fetches: 399  --没有visibility map文件之前，需要fetch所有的heap page。</span>
         Buffers: shared hit=6
 Total runtime: 0.150 ms
(9 rows)

4. 当筛选的数据集变大到一定程度的时候，优化器还是会选择全表扫描

postgres=# explain(analyze,buffers)select id  from tb_index_test where id<8000;
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on public.tb_index_test  (cost=0.00..180.00 rows=8000 width=4) (actual time=0.009..1.526 rows=7999 loops=1)
   Output: id
   Filter: (tb_index_test.id < 8000)
   Rows Removed by Filter: 2001
   Buffers: shared hit=55
 Total runtime: 1.886 ms
(6 rows)


postgres=# set enable_seqscan =off;
SET
postgres=# explain(analyze,buffers)select id  from tb_index_test where id<8000;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using tb_index_test_pkey on public.tb_index_test  (cost=0.29..236.28 rows=8000 width=4) (actual time=0.028..2.342 rows=7999 loops=1)
   Output: id
   Index Cond: (tb_index_test.id < 8000)
   Heap Fetches: 0
   Buffers: shared hit=24
 Total runtime: 3.439 ms
(6 rows)

如果把Seq Scan关闭，强制让优化器使用index only scan，发现成本比全表扫描的大。

5. 这个时候执行min(id),max(id)效率是很高的。

postgres=# explain(analyze,buffers)select min(id),max(id) from tb_index_test;
                                                                                        QUERY PLAN                                                                                         
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.63..0.64 rows=1 width=0) (actual time=0.024..0.024 rows=1 loops=1)
   Output: $0,$1
   Buffers: shared hit=6
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.29..0.31 rows=1 width=4) (actual time=0.017..0.017 rows=1 loops=1)
           Output: tb_index_test.id
           Buffers: shared hit=3
           ->  Index Only Scan using tb_index_test_pkey on public.tb_index_test  (cost=0.29..295.29 rows=10000 width=4) (actual time=0.015..0.015 rows=1 loops=1)
                 Output: tb_index_test.id
                 Index Cond: (tb_index_test.id IS NOT NULL)
                 Heap Fetches: 0
                 Buffers: shared hit=3
   InitPlan 2 (returns $1)
     ->  Limit  (cost=0.29..0.31 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1)
           Output: tb_index_test_1.id
           Buffers: shared hit=3
           ->  Index Only Scan Backward using tb_index_test_pkey on public.tb_index_test tb_index_test_1  (cost=0.29..295.29 rows=10000 width=4) (actual time=0.005..0.005 rows=1 loops=1)
                 Output: tb_index_test_1.id
                 Index Cond: (tb_index_test_1.id IS NOT NULL)
                 Heap Fetches: 0
                 Buffers: shared hit=3
 Total runtime: 0.061 ms
(22 rows)

因为索引是按顺序存储的，只需访问一个索引块就可以得到min(id)，max(id)也是一样的。

PostgreSQL index only scan

猜你在找的Postgre SQL相关文章