我需要所有字段(或几乎所有字段),因此仅选择一个子集并不是一个快速解决方案.选择我想要的字段仍然很慢.
这是我的表模式减去名称:
integer | not null default nextval('core_page_id_seq'::regclass) character varying(255) | not null character varying(64) | not null text | default '{}'::text character varying(255) | integer | not null default 0 text | default '{}'::text text | timestamp with time zone | integer | timestamp with time zone | integer |
文本字段的大小可以是任何大小.但是,在最坏的情况下,不超过几千字节.
问题
>有什么关于这个’尖叫’疯狂低效’?
>有没有办法在Postgres命令行测量页面大小来帮我调试?
Postgresql提供了一些Database Object Size Functions,你可以使用.我在这个查询中打包了最有趣的一些,并添加了一些Statistics Access Functions.
这将证明测量“行的大小”的各种方法可能导致非常不同的结果.这完全取决于您想要精确测量的内容.
将public.tbl替换为您的(可选的模式限定的)表名,以获得有关行大小的收集统计信息的简洁视图.
WITH x AS ( SELECT count(*) AS ct,sum(length(t::text)) AS txt_len -- length in characters,'public.tbl'::regclass AS tbl -- provide (qualified) table name here FROM public.tbl t -- ... and here ),y AS ( SELECT ARRAY [pg_relation_size(tbl),pg_relation_size(tbl,'vm'),'fsm'),pg_table_size(tbl),pg_indexes_size(tbl),pg_total_relation_size(tbl),txt_len ] AS val,ARRAY ['core_relation_size','visibility_map','free_space_map','table_size_incl_toast','indexes_size','total_size_incl_toast_and_indexes','live_rows_in_text_representation' ] AS name FROM x ) SELECT unnest(name) AS what,unnest(val) AS "bytes/ct",pg_size_pretty(unnest(val)) AS bytes_pretty,unnest(val) / ct AS bytes_per_row FROM x,y UNION ALL SELECT '------------------------------',NULL,NULL UNION ALL SELECT 'row_count',ct,NULL FROM x UNION ALL SELECT 'live_tuples',pg_stat_get_live_tuples(tbl),NULL FROM x UNION ALL SELECT 'dead_tuples',pg_stat_get_dead_tuples(tbl),NULL FROM x;
我只在数组中包装值和者再次使用unfst(),所以我不必重复拼写每一行的计算.
通用行计数统计信息最后附加了非常规sql-foo,以便在一个查询中获取所有内容.您可以将其包装到plpgsql函数中以供重复使用,将表名作为参数输入并使用EXECUTE.
结果:
what | bytes/ct | bytes_pretty | bytes_per_row -----------------------------------+----------+--------------+--------------- core_relation_size | 44138496 | 42 MB | 91 visibility_map | 0 | 0 bytes | 0 free_space_map | 32768 | 32 kB | 0 table_size_incl_toast | 44179456 | 42 MB | 91 indexes_size | 33128448 | 32 MB | 68 total_size_incl_toast_and_indexes | 77307904 | 74 MB | 159 live_rows_in_text_representation | 29987360 | 29 MB | 62 ------------------------------ | | | row_count | 483424 | | live_tuples | 483424 | | dead_tuples | 2677 | |
附加模块pgstattuple提供更多有用的功能.
Postgres 9.3的更新
我们可以使用unnest()
in pg 9.4 taking multiple parameters的新形式来并行排除数组.
但是使用LATERAL
and a VALUES
expression,可以进一步简化.加上一些其他改进:
SELECT l.what,l.nr AS "bytes/ct",CASE WHEN is_size THEN pg_size_pretty(nr) END AS bytes_pretty,CASE WHEN is_size THEN nr / x.ct END AS bytes_per_row FROM ( SELECT min(tableoid) AS tbl -- same as 'public.tbl'::regclass::oid,count(*) AS ct,sum(length(t::text)) AS txt_len -- length in characters FROM public.tbl t -- provide table name *once* ) x,LATERAL ( VALUES (true,'core_relation_size',pg_relation_size(tbl)),(true,'vm')),'fsm')),pg_table_size(tbl)),pg_indexes_size(tbl)),pg_total_relation_size(tbl)),'live_rows_in_text_representation',txt_len),(false,'------------------------------',NULL),'row_count',ct),'live_tuples',pg_stat_get_live_tuples(tbl)),'dead_tuples',pg_stat_get_dead_tuples(tbl)) ) l(is_size,what,nr);
结果相同.
Q1:效率低下吗?
您可以优化列顺序以保存每行的一些字节,当前浪费在对齐填充上:
integer | not null default nextval('core_page_id_seq'::regclass) integer | not null default 0 character varying(255) | not null character varying(64) | not null text | default '{}'::text character varying(255) | text | default '{}'::text text | timestamp with time zone | timestamp with time zone | integer | integer |
这样可以节省每行8到18个字节.我把它称为“列俄罗斯方块”.细节:
> Configuring PostgreSQL for read performance
> Calculating and saving space in PostgreSQL
还要考虑:
> Would index lookup be noticeably faster with char vs varchar when all values are 36 chars