PostgreSQL autovacuum analysis and inspecting a table's garbage data
Reposted from: http://blog.163.com/digoal@126/blog/static/163877040201343031118890/
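Put simply, PostgreSQL implements concurrency control through multiple tuple versions, the tuple infomask bits, transaction commit status, and transaction snapshots.
When a row is deleted, its space is not reclaimed immediately, because other transactions may still need that row version. When a row is updated, the old version is kept and a new version is inserted.
For example: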
digoal=# create table tbl(id int,info text);
CREATE TABLE
digoal=# insert into tbl values (1,'test');
INSERT 0 1
digoal=# delete from tbl;
DELETE 1
digoal=# select ctid,* from tbl;
ctid | id | info
-------+----+------
(0,3) | 1 | test
(1 row)
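After several rounds of delete and insert, the ctid offset has already reached 3, because the two earlier row versions were not physically removed.
UPDATE behaves the same way:
digoal=# update tbl set info='new';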
UPDATE 1
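The old tuple sits at item 3 of block 0; the new tuple was inserted afterwards and occupies slot 4 of block 0.
So how is this garbage reclaimed? That is the job of PostgreSQL's vacuum process.
1. vacuum: cleaning up data.
Three row versions were removed.
digoal=# vacuum verbose tbl;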
INFO: vacuuming "public.tbl"
INFO: "tbl": removed 3 row versions in 1 pages
INFO: "tbl": found 3 removable, 1 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_32771"
INFO: index "pg_toast_32771_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
INFO: "pg_toast_32771": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
VACUUM
Insert data again; the slots that were occupied by garbage can now be reused.
# insert into tbl values (1,1) | 1 | test
(2 rows)
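The behaviour described so far (deletes and updates leave dead row versions behind, and vacuum frees their slots for reuse) can be sketched with a toy model. This is only an illustration, not how PostgreSQL stores pages: the ToyPage class and its methods are hypothetical names, and the model ignores snapshots, visibility rules, HOT chains and the real line-pointer states.

```python
class ToyPage:
    """Toy model of a single heap page (block 0). Hypothetical, for illustration only."""

    def __init__(self):
        # Each slot is None (free) or a dict holding one row version.
        self.slots = []

    def insert(self, row):
        """Store a row version and return its 1-based item offset, like N in ctid (0,N)."""
        for i, slot in enumerate(self.slots):
            if slot is None:  # reuse a slot that vacuum has freed
                self.slots[i] = {"row": row, "dead": False}
                return i + 1
        self.slots.append({"row": row, "dead": False})
        return len(self.slots)

    def delete(self, offset):
        """Mark the version dead; its space is NOT released here."""
        self.slots[offset - 1]["dead"] = True

    def update(self, offset, new_row):
        """An update keeps the old version (now dead) and inserts a new one."""
        self.delete(offset)
        return self.insert(new_row)

    def vacuum(self):
        """Free every dead version and return how many were removed."""
        removed = 0
        for i, slot in enumerate(self.slots):
            if slot is not None and slot["dead"]:
                self.slots[i] = None
                removed += 1
        return removed

    def live_rows(self):
        """Return (offset, row) pairs for versions that are still live."""
        return [(i + 1, s["row"]) for i, s in enumerate(self.slots)
                if s is not None and not s["dead"]]


page = ToyPage()
page.delete(page.insert((1, "test")))        # offset 1, then dead
page.delete(page.insert((1, "test")))        # offset 2, then dead
third = page.insert((1, "test"))             # offset 3
fourth = page.update(third, (1, "new"))      # old version stays at 3, new one at 4
removed = page.vacuum()                      # frees offsets 1, 2 and 3
reused = page.insert((1, "test"))            # lands in the freed offset 1
print(third, fourth, removed, reused)        # prints: 3 4 3 1
```

In this toy run the sequence mirrors the session above: the surviving row reaches offset 3, the update moves it to offset 4, vacuum removes three versions, and the next insert reuses a freed slot.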
The number of dead (garbage) tuples and live tuples in a table can be queried from the statistics view pg_stat_all_tables.
digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;
-[ RECORD 1 ]-----+------------------------------
relid | 32771
schemaname | public
relname | tbl
seq_scan | 6
seq_tup_read | 7
idx_scan |
idx_tup_fetch |
n_tup_ins | 4
n_tup_upd | 1
n_tup_del | 2
n_tup_hot_upd | 1
n_live_tup | 2
n_dead_tup | 0
last_vacuum | 2013-05-27 17:00:17.094391+08
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 1
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0
digoal=# delete from tbl;
DELETE 2
digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;
seq_scan | 7
seq_tup_read | 9
n_tup_del | 4
n_live_tup | 0
n_dead_tup | 2
autoanalyze_count | 0
digoal=# vacuum tbl;
VACUUM
digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;
-[ RECORD 1 ]-----+------------------------------
relid | 32771
schemaname | public
relname | tbl
seq_scan | 7
seq_tup_read | 9
idx_scan |
idx_tup_fetch |
n_tup_ins | 4
n_tup_upd | 1
n_tup_del | 4
n_tup_hot_upd | 1
n_live_tup | 0
n_dead_tup | 0
last_vacuum | 2013-05-27 17:05:17.664564+08
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 2
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0
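When comparing several such snapshots it can help to turn the expanded psql output into numbers. The following is a minimal sketch, assuming the "name | value" layout shown above; parse_expanded_record and dead_fraction are hypothetical helper names, and the sample text only repeats the counters printed after the second delete.

```python
def parse_expanded_record(text):
    """Parse 'name | value' lines of psql expanded output into a dict of strings."""
    record = {}
    for line in text.splitlines():
        # Skip the '-[ RECORD n ]' separator and anything without a column divider.
        if "|" not in line or line.lstrip().startswith("-["):
            continue
        name, _, value = line.partition("|")
        record[name.strip()] = value.strip()
    return record


def dead_fraction(record):
    """Share of dead tuples among all tuples the statistics know about."""
    live = int(record["n_live_tup"])
    dead = int(record["n_dead_tup"])
    total = live + dead
    return dead / total if total else 0.0


# Counters copied from the output shown after 'delete from tbl'.
sample = """
-[ RECORD 1 ]-----+------
n_tup_del | 4
n_live_tup | 0
n_dead_tup | 2
"""
stats = parse_expanded_record(sample)
print(stats["n_dead_tup"], dead_fraction(stats))
```

After the manual vacuum both counters are 0, so the same helper returns 0.0 for the last record.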
The relevant parameters are as follows:
#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
#log_autovacuum_min_duration = -1 # -1 disables, 0 logs all actions and
# their durations, > 0 logs only
# actions running at least this number
# of milliseconds.
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
# (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
#autovacuum_analyze_threshold = 50 # min number of row updates before
# analyze
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
#autovacuum_vacuum_cost_delay = 20ms # default vacuum cost delay for
# autovacuum, in milliseconds;
# -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit
# - Cost-Based Vacuum Delay -
#vacuum_cost_delay = 0 # 0-100 milliseconds
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits
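A brief explanation of what these parameters mean:
log_autovacuum_min_duration controls when autovacuum activity is written to the log. 0 logs every autovacuum action, -1 disables logging, and any other value is a time threshold in milliseconds: only autovacuum actions that run at least that long are logged.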
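The three cases of log_autovacuum_min_duration can be written as one small predicate. This is a sketch of the rule as described here, not server code; autovacuum_is_logged is a hypothetical name.

```python
def autovacuum_is_logged(duration_ms, log_autovacuum_min_duration):
    """Return True if an autovacuum action of the given duration would be logged."""
    if log_autovacuum_min_duration < 0:
        # -1 disables logging of autovacuum actions.
        return False
    # 0 logs everything; a positive value acts as a minimum duration in milliseconds.
    return duration_ms >= log_autovacuum_min_duration


print(autovacuum_is_logged(5, -1))      # False
print(autovacuum_is_logged(5, 0))       # True
print(autovacuum_is_logged(5, 1000))    # False
print(autovacuum_is_logged(1000, 1000)) # True
```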
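autovacuum_max_workers is the maximum number of autovacuum worker processes allowed to run at the same time. Because vacuum incurs I/O overhead and also consumes memory, this should not be set too high.
autovacuum_vacuum_threshold is the minimum number of changes required before autovacuum vacuums a table: if the total number of tuples updated or deleted in the table is below this number, the vacuum operation is not triggered.
Likewise, autovacuum_analyze_threshold is the minimum number of changes required before autovacuum analyzes a table: if the total number of tuples inserted, updated or deleted is below this number, the analyze operation is not triggered.
autovacuum_vacuum_scale_factor sets the proportional part of the vacuum threshold. A vacuum is triggered when the number of tuples updated or deleted in the table is greater than (pg_class.reltuples * autovacuum_vacuum_scale_factor + autovacuum_vacuum_threshold).
autovacuum_analyze_scale_factor sets the proportional part of the analyze threshold. An analyze is triggered when the number of tuples inserted, updated or deleted in the table is greater than (pg_class.reltuples * autovacuum_analyze_scale_factor + autovacuum_analyze_threshold).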
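To make the two trigger conditions concrete, here is a minimal arithmetic sketch. The function names are hypothetical; the default arguments are the values from the configuration excerpt (threshold 50, scale factors 0.2 and 0.1), and reltuples stands for the row estimate in pg_class.reltuples.

```python
def vacuum_trigger_point(reltuples, threshold=50, scale_factor=0.2):
    """Changed-tuple count that must be exceeded before autovacuum vacuums a table."""
    return threshold + scale_factor * reltuples


def analyze_trigger_point(reltuples, threshold=50, scale_factor=0.1):
    """Changed-tuple count that must be exceeded before autovacuum analyzes a table."""
    return threshold + scale_factor * reltuples


def needs_vacuum(changed_tuples, reltuples, threshold=50, scale_factor=0.2):
    """True when updated/deleted tuples exceed the vacuum trigger point."""
    return changed_tuples > vacuum_trigger_point(reltuples, threshold, scale_factor)


def needs_analyze(changed_tuples, reltuples, threshold=50, scale_factor=0.1):
    """True when inserted/updated/deleted tuples exceed the analyze trigger point."""
    return changed_tuples > analyze_trigger_point(reltuples, threshold, scale_factor)


for rows in (0, 1000, 1000000):
    print(rows, vacuum_trigger_point(rows), analyze_trigger_point(rows))

print(needs_vacuum(250, 1000))   # False: 250 does not exceed 50 + 0.2 * 1000
print(needs_vacuum(251, 1000))   # True
```

With the defaults, an empty table needs more than 50 changes before either operation fires, while a table of one million rows needs roughly 200,050 changed tuples for vacuum and 100,050 for analyze, which is why large tables often get per-table settings.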
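autovacuum_freeze_max_age: even when autovacuum is turned off, a vacuum is still triggered automatically to protect against transaction ID (XID) wraparound, which would otherwise make data invisible. The value is the maximum number of transactions allowed to elapse since the oldest transaction information stored in a table; once that age is exceeded, a vacuum is forced to prevent XID wraparound.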
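The age check can be illustrated with simplified arithmetic on a 32-bit XID counter. This is a rough sketch only: the function names are hypothetical, the default of 200000000 comes from the configuration excerpt, and real PostgreSQL also reserves a few special XIDs and uses its own comparison routines, which this model ignores.

```python
XID_SPACE = 2 ** 32  # transaction IDs are 32-bit counters that wrap around


def xid_age(current_xid, oldest_xid):
    """Number of XIDs consumed since oldest_xid, allowing for counter wraparound."""
    return (current_xid - oldest_xid) % XID_SPACE


def wraparound_vacuum_forced(current_xid, oldest_xid, freeze_max_age=200000000):
    """True when the table's oldest XID is older than autovacuum_freeze_max_age."""
    return xid_age(current_xid, oldest_xid) > freeze_max_age


print(xid_age(100, 4294967200))                        # 196: the counter wrapped
print(wraparound_vacuum_forced(250000000, 10000000))   # True: age is 240000000
print(wraparound_vacuum_forced(150000000, 10000000))   # False: age is 140000000
```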
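autovacuum_vacuum_cost_delay: because vacuum incurs some I/O overhead, PostgreSQL lets the administrator make vacuum go to sleep once its accumulated cost reaches a threshold, and then wake up and continue. The exact calculation is governed by the Cost-Based Vacuum Delay settings.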
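The cost-based delay can be pictured as a running balance that is charged for every page touched and triggers a sleep when it reaches the limit. The simulation below is a simplified sketch under stated assumptions: simulate_vacuum_cost is a hypothetical name, each page is charged exactly one of the three costs, every sleep lasts exactly the configured delay, and the defaults are taken from the configuration excerpt (limit 200, delay 20 ms for autovacuum, costs 1/10/20).

```python
def simulate_vacuum_cost(pages, cost_limit=200, cost_delay_ms=20,
                         page_hit=1, page_miss=10, page_dirty=20):
    """Return (number_of_sleeps, total_sleep_ms) for a sequence of page accesses.

    pages is an iterable of 'hit', 'miss' or 'dirty'.
    """
    costs = {"hit": page_hit, "miss": page_miss, "dirty": page_dirty}
    balance = 0
    sleeps = 0
    for kind in pages:
        balance += costs[kind]
        if balance >= cost_limit:
            # Limit reached: sleep for the delay, then start a fresh balance.
            sleeps += 1
            balance = 0
    return sleeps, sleeps * cost_delay_ms


print(simulate_vacuum_cost(["miss"] * 100))   # (5, 100): a sleep every 20 misses
print(simulate_vacuum_cost(["hit"] * 1000))   # (5, 100): a sleep every 200 hits
print(simulate_vacuum_cost(["dirty"] * 10))   # (1, 20)
```

Under these assumptions, raising the limit or lowering the delay lets vacuum finish sooner at the price of heavier I/O bursts, which is the trade-off these settings expose.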
Next, a few examples illustrate how the threshold parameters work.
View the current thresholds:
digoal=# show autovacuum_analyze_scale_factor;
autovacuum_analyze_scale_factor
---------------------------------
0.1
(1 row)
digoal=# show autovacuum_vacuum_scale_factor;
autovacuum_vacuum_scale_factor
--------------------------------
0.2
digoal=# show autovacuum_analyze_threshold;
autovacuum_analyze_threshold
------------------------------
50
digoal=# show autovacuum_vacuum_threshold;
autovacuum_vacuum_threshold
-----------------------------
Change autovacuum_naptime and log_autovacuum_min_duration so that the results are easier to observe in the log or in the statistics views:
pg93@db-17216333-> cd $PGDATA
autovacuum_naptime = 1s
log_autovacuum_min_duration = 0
pg93@db-17216333-> pg_ctl reload
server signaled