Some of our tables had grown past 20 million rows, and a sizeable share of that data was several years old. Users no longer needed it, so we had to consider purging the expired rows.

But when I used DELETE, the deletion turned out to be very slow. The statement was:

DELETE FROM tablename WHERE id < 500000;

Looking at the SQL itself, there is hardly anything left to optimize; the statement is already about as lean as it gets.
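If you want to confirm that the statement itself is not the bottleneck, one option is to look at where the time actually goes. A minimal sketch (plain EXPLAIN only shows the plan; EXPLAIN ANALYZE really executes the delete, so wrap it in a transaction and roll it back; it also reports time spent in foreign-key triggers, which is where the cost was hiding in this case):

BEGIN;
EXPLAIN ANALYZE DELETE FROM tablename WHERE id < 500000;
ROLLBACK;  -- undo the rows deleted by EXPLAIN ANALYZE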
After a lot of searching, I finally found the solution (the English-language results were the ones that helped). It turns out my table uses foreign keys, and the referenced tables use foreign keys of their own, so the constraints are nested. The fix was to disable the table's triggers for the duration of the delete:

ALTER TABLE mytable DISABLE TRIGGER ALL;
ALTER TABLE mytable ENABLE TRIGGER ALL;

With the triggers disabled, the delete went from tens of minutes to under one minute, finally within an acceptable range.

Quoted from the English write-up that pointed me to the fix: "The usual advice when you complain about slow bulk deletions in PostgreSQL is 'make sure the foreign keys (pointing to the table you are deleting from) are indexed'. This is because PostgreSQL doesn't create the indexes automatically for all the foreign keys (FK), which can be considered a degree of freedom or a nuisance, depends how you look at it. Anyway, the indexing usually solves the performance issue. Unless you stumble upon a FK field that is not indexable. Like I did. The field in question has the same value repeated over thousands of rows. Neither B-tree nor hash indexing works, so PostgreSQL is forced to do the slow sequential scan each time it deletes from the table referenced by this FK (because the FK is a constraint and an automated check is triggered). Multiply this by the number of rows deleted and you'll see the minutes adding up."
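For reference, a minimal sketch of the whole workflow, with mytable and the id cutoff as placeholders. Note that DISABLE TRIGGER ALL also suspends the internal foreign-key check triggers (which normally requires superuser privileges), so no concurrent writes should be allowed while they are off, and re-enabling does not re-validate rows changed in the meantime:

BEGIN;
-- suspend all triggers, including the internal FK-check triggers
ALTER TABLE mytable DISABLE TRIGGER ALL;
-- the bulk delete now skips the per-row FK lookups
DELETE FROM mytable WHERE id < 500000;
-- restore the triggers before committing
ALTER TABLE mytable ENABLE TRIGGER ALL;
COMMIT;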
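The cleaner long-term fix, per the quoted advice, is to index every foreign-key column that points at the table you are deleting from. A sketch using the standard system catalogs (referencing_table and fk_column are placeholders for whatever your schema actually contains):

-- list the FK constraints that reference the table being deleted from
SELECT c.conname            AS constraint_name,
       c.conrelid::regclass AS referencing_table,
       a.attname            AS fk_column
FROM   pg_constraint c
JOIN   pg_attribute  a ON a.attrelid = c.conrelid
                      AND a.attnum   = ANY (c.conkey)
WHERE  c.contype = 'f'
  AND  c.confrelid = 'tablename'::regclass;

-- then index each referencing column, for example:
CREATE INDEX ON referencing_table (fk_column);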
Afterwards it is best to run REINDEX and VACUUM; these two operations compact the table's physical storage:

REINDEX TABLE tablename;
VACUUM FULL tablename;
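To verify the effect, you can compare the table's on-disk footprint before and after; pg_size_pretty and pg_total_relation_size are standard PostgreSQL functions, and tablename is the placeholder used above. Bear in mind that VACUUM FULL rewrites the table under an exclusive lock, so run it in a quiet window:

SELECT pg_size_pretty(pg_total_relation_size('tablename'));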
This episode taught me that when you search Google for an English-language solution, you have to get the keywords right. For this problem the keywords were:

postgres bulk delete

or

postgres optimize delete

http://od-eon.com/blogs/stefan/optimizing-particular-case-bulk-deletion-postgresq/
http://www.linuxinsight.com/optimize_postgresql_database_size.html