比如一张表有1000个数据块,我们建议一个BRIN在ID(假设这个表有ID字段)上的索引。
BRIN默认是每128个连续数据块区间存储一个字段取值的区间,所以这个索引的信息量是将1000个数据块划分为几个连续的128个块的区间,然后存储每个区间ID值的取值范围。
以上是BRIN大概的原理,那么BRIN可以用在什么场景呢?
一个非常好的场景是流式日志数据,比如用户行为,大批量的数据按时间顺序不停的插入数据表。
我们如果要按照时间来访问这样的数据,以往我们需要创建BTREE索引,可以范围查询或者精确匹配。但是BTREE索引需要存储的信息量较大,如果数据量很大,索引也很庞大。
BRIN的话,索引可以变得很小,而且因为数据是按照时间顺序插入的,所以BRIN的信息量也很大,因为每个连续的数据块区间存储的时间范围和其他连续的数据块区间独立性很好,即不会出现大量数据交叉,如果有大量较差,那么使用BRIN检索还不如全表扫描。
BRIN可认为是全表扫描的切片,如果数据值分布和物理值分布的相关性很好,那么BRIN无疑是非常好的选择。
接下来我们测试一下BRIN对于相关性好和相关性差的数据,以及他们的性能。
postgres=# create table t1(id int,info text);CREATE TABLEpostgres=# create table t2(id int,info text);CREATE TABLEpostgres=# insert into t1 select generate_series(1,10000000),md5(random()::text);INSERT 0 10000000
以下数据ID和物理存储相关性非常差。
postgres=# insert into t2 select id,md5(random()::text) from generate_series(1,10000000) as t(id) order by random();INSERT 0 10000000postgres=# analyze t1;ANALYZEpostgres=# analyze t2;ANALYZE
查询他们的相关性。显然T2表的物理存储和实际值顺序相关性很差。
postgres=# select correlation from pg_stats where tablename='t1' and attname='id';correlation-------------1(1 row)postgres=# select correlation from pg_stats where tablename='t2' and attname='id';correlation-------------0.00805771(1 row)
创建索引,创建索引的速度明显比BTREE索引快,因为BRIN只需要存储值区间,瘦得很。
postgres=# create index idx_t1_id on t1 using brin (id);CREATE INDEXpostgres=# create index idx_t2_id on t2 using brin (id);CREATE INDEX
我们看看索引的大小和表的大小,从BRIN的原理我们可以想象索引肯定很小,表650MB,索引才192K。
postgres=# \di+List of relationsSchema | Name | Type | Owner | Table | Size | Description--------+-----------+-------+----------+-------+--------+-------------public | idx_t1_id | index | postgres | t1 | 192 kB |public | idx_t2_id | index | postgres | t2 | 192 kB |(2 rows)postgres=# \dt+ t1List of relationsSchema | Name | Type | Owner | Size | Description--------+------+-------+----------+--------+-------------public | t1 | table | postgres | 650 MB |(1 row)postgres=# \dt+ t2List of relationsSchema | Name | Type | Owner | Size | Description--------+------+-------+----------+--------+-------------public | t2 | table | postgres | 650 MB |(1 row)
来看看实际的查询差别就知道,BRIN有多么适合流式数据了。
postgres=# explain analyze select * from t1 where id>=1000 and id<=5000;QUERY PLAN-------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on t1 (cost=50.98..9767.60 rows=3803 width=37) (actual time=0.351..13.732 rows=4001 loops=1)Recheck Cond: ((id >= 1000) AND (id <= 5000))Rows Removed by Index Recheck: 57567Heap Blocks: lossy=128-> Bitmap Index Scan on idx_t1_id (cost=0.00..50.03 rows=3803 width=0) (actual time=0.104..0.104 rows=1280 loops=1)Index Cond: ((id >= 1000) AND (id <= 5000))Planning time: 0.111 msExecution time: 14.019 ms(8 rows)
对于相关性差的,还不如全表扫描。
postgres=# explain analyze select * from t2 where id>=1000 and id<=5000;QUERY PLAN---------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on t2 (cost=49.78..9549.73 rows=3686 width=37) (actual time=2.806..2268.044 rows=4001 loops=1)Recheck Cond: ((id >= 1000) AND (id <= 5000))Rows Removed by Index Recheck: 9995999Heap Blocks: lossy=20791-> Bitmap Index Scan on idx_t2_id (cost=0.00..48.86 rows=3686 width=0) (actual time=2.019..2.019 rows=208640 loops=1)Index Cond: ((id >= 1000) AND (id <= 5000))Planning time: 0.195 msExecution time: 2268.590 ms(8 rows)
t2全表扫描
postgres=# set enable_bitmapscan=off;SETpostgres=# explain analyze select * from t2 where id>=1000 and id<=5000;QUERY PLAN-----------------------------------------------------------------------------------------------------------Seq Scan on t2 (cost=0.00..170791.00 rows=3686 width=37) (actual time=0.593..1881.929 rows=4001 loops=1)Filter: ((id >= 1000) AND (id <= 5000))Rows Removed by Filter: 9995999Planning time: 0.109 msExecution time: 1882.397 ms(5 rows)
接下来BRIN和BTREE索引对比一下。
postgres=# create index idx_t1_id_bt on t1 using btree (id);CREATE INDEXpostgres=# create index idx_t2_id_bt on t2 using btree (id);CREATE INDEXpostgres=# set enable_bitmapscan=on;SETpostgres=# drop index idx_t1_id;DROP INDEXpostgres=# drop index idx_t2_id;DROP INDEXpostgres=# explain analyze select * from t1 where id>=1000 and id<=5000;QUERY PLAN--------------------------------------------------------------------------------------------------------------------------Index Scan using idx_t1_id_bt on t1 (cost=0.43..102.04 rows=3880 width=37) (actual time=0.023..1.048 rows=4001 loops=1)Index Cond: ((id >= 1000) AND (id <= 5000))Planning time: 0.412 msExecution time: 1.318 ms(4 rows)postgres=# explain analyze select * from t2 where id>=1000 and id<=5000;QUERY PLAN----------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on t2 (cost=53.05..10056.68 rows=3962 width=37) (actual time=1.932..8.304 rows=4001 loops=1)Recheck Cond: ((id >= 1000) AND (id <= 5000))Heap Blocks: exact=3642-> Bitmap Index Scan on idx_t2_id_bt (cost=0.00..52.05 rows=3962 width=0) (actual time=1.143..1.143 rows=4001 loops=1)Index Cond: ((id >= 1000) AND (id <= 5000))Planning time: 0.379 msExecution time: 8.621 ms(7 rows)
postgres=# \di+List of relationsSchema | Name | Type | Owner | Table | Size | Description--------+--------------+-------+----------+-------+--------+-------------public | idx_t1_id_bt | index | postgres | t1 | 213 MB |public | idx_t2_id_bt | index | postgres | t2 | 213 MB |(2 rows)
接下调整brin索引的精度提高查询效率,我们了解到默认的brin是存储128个连续的数据块区间的,这个值越小,精度越高。
postgres=# create index idx_t1_id on t1 using brin (id) with (pages_per_range=1);CREATE INDEXpostgres=# create index idx_t2_id on t2 using brin (id) with (pages_per_range=1);CREATE INDEXpostgres=# \di+List of relationsSchema | Name | Type | Owner | Table | Size | Description--------+--------------+-------+----------+-------+--------+-------------public | idx_t1_id | index | postgres | t1 | 672 kB |public | idx_t1_id_bt | index | postgres | t1 | 213 MB |public | idx_t2_id | index | postgres | t2 | 672 kB |public | idx_t2_id_bt | index | postgres | t2 | 213 MB |(4 rows)postgres=# drop index idx_t1_id_bt;DROP INDEXpostgres=# drop index idx_t2_id_bt;DROP INDEXpostgres=# explain analyze select * from t1 where id>=1000 and id<=5000;QUERY PLAN------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on t1 (cost=110.98..9827.60 rows=3803 width=37) (actual time=9.487..10.571 rows=4001 loops=1)Recheck Cond: ((id >= 1000) AND (id <= 5000))Rows Removed by Index Recheck: 328Heap Blocks: lossy=9-> Bitmap Index Scan on idx_t1_id (cost=0.00..110.03 rows=3803 width=0) (actual time=9.449..9.449 rows=90 loops=1)Index Cond: ((id >= 1000) AND (id <= 5000))Planning time: 0.141 msExecution time: 10.853 ms(8 rows)postgres=# explain analyze select * from t2 where id>=1000 and id<=5000;QUERY PLAN-----------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on t2 (cost=109.78..9609.73 rows=3686 width=37) (actual time=10.407..481.673 rows=4001 loops=1)Recheck Cond: ((id >= 1000) AND (id <= 5000))Rows Removed by Index Recheck: 2125867 # 看看精度不高的后果,取4001条数据却额外扫描了2125867条无用数据Heap Blocks: lossy=4428-> Bitmap Index Scan on idx_t2_id (cost=0.00..108.86 rows=3686 width=0) (actual time=10.364..10.364 rows=44280 loops=1)Index Cond: ((id >= 1000) AND (id <= 5000))Planning time: 0.106 msExecution time: 482.077 ms(8 rows)
精度提高后,扫描效率有一定的提升。(对于相关度不高的就不要用BRIN了,精度提高到1都于事无补的,无用功太多)当然相比btree还有差距,不过对于大数据场景,我们还要考虑数据的插入性能,对于btree插入性能好还是brin的插入性能好呢?
我这里简单的测试了一下,并未涉及并发处理,已经可以明显的了解到btree索引对数据插入带来的开销更大。
postgres=# \d t1Table "public.t1"Column | Type | Modifiers--------+---------+-----------id | integer |info | text |Indexes:"idx_t1_id" brin (id) WITH (pages_per_range=1)postgres=# \timingTiming is on.postgres=# insert into t1 select generate_series(1,1000000);INSERT 0 1000000Time: 2152.527 mspostgres=# drop index idx_t1_id;DROP INDEXTime: 9.527 mspostgres=# create index idx_t1_id_bt on t1 using btree (id);CREATE INDEXTime: 29659.752 mspostgres=# insert into t1 select generate_series(1,1000000);INSERT 0 1000000Time: 5407.971 ms
最后,我们同样可以使用pageinspect来观测brin索引的内容。
postgres=# create extension pageinspect;CREATE EXTENSIONpostgres=# select * from brin_page_items(get_raw_page('idx_t1_id',10),'idx_t1_id');itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | value------------+--------+--------+----------+----------+-------------+----------------------1 | 2176 | 1 | f | f | f | {1046657 .. 1047137}2 | 2177 | 1 | f | f | f | {1047138 .. 1047618}3 | 2178 | 1 | f | f | f | {1047619 .. 1048099}4 | 2179 | 1 | f | f | f | {1048100 .. 1048580}5 | 2180 | 1 | f | f | f | {1048581 .. 1049061}6 | 2181 | 1 | f | f | f | {1049062 .. 1049542}7 | 2182 | 1 | f | f | f | {1049543 .. 1050023}。。。
例如我们看到
2176
这个数据块的ID取值区间是
{1046657 .. 1047137},我们使用ctid来验证一下.
postgres=# select min(id),max(id) from t1 where ctid::text ~ E'^\\(2176,';min | max---------+---------1046657 | 1047137(1 row)# 完全正确
其他还有几个pageinspect的函数:
postgres=# SELECT brin_page_type(get_raw_page('idx_t1_id',id)) from generate_series(0,10) t(id);brin_page_type----------------revmaprevmaprevmaprevmaprevmapregularregularregularregularregular(11 rows)postgres=# SELECT * FROM brin_Metapage_info(get_raw_page('idx_t1_id',0));magic | version | pagesperrange | lastrevmappage------------+---------+---------------+----------------0xA8109CFA | 1 | 1 | 5(1 row)postgres=# SELECT * FROM brin_revmap_data(get_raw_page('idx_t1_id',1)) limit 5;pages-----------(18,1105)(18,1106)(18,1107)(18,1108)(18,1109)(5 rows)
[参考]
BRIN indexes accept a different parameter:pages_per_rangeDefines the number of table blocks that make up one block range for each entry of a BRIN index (see Section 60.1 for more details). The default is 128.