sql – 使用int8range连接2个大的postgres表不能很好地缩放

我想加入IP路由表信息给IP whois信息.我正在使用Amazon的RDS,这意味着我不能使用Postgres ip4r扩展,所以我反而使用 int8range类型来表示IP地址范围,具有 gist索引.

我的表格如下所示：

=> \d routing_details
     Table "public.routing_details"
  Column  |   Type    | Modifiers
----------+-----------+-----------
 asn      | text      |
 netblock | text      |
 range    | int8range |
Indexes:
    "idx_routing_details_netblock" btree (netblock)
    "idx_routing_details_range" gist (range)


=> \d netblock_details
    Table "public.netblock_details"
   Column   |   Type    | Modifiers
------------+-----------+-----------
 range      | int8range |
 name       | text      |
 country    | text      |
 source     | text      |
Indexes:
    "idx_netblock_details_range" gist (range)

完整的routing_details表包含的行数不到600K,netblock_details包含大约8.25M行.在两个表中都有重叠的范围,但是对于routing_details表中的每个范围,我想从netblock_details表中获得最佳(最小)的匹配.

我想出了两个不同的查询,我认为会返回准确的数据,一个使用窗口函数,一个使用DISTINCT ON：

EXPLAIN SELECT DISTINCT ON (r.netblock) *
FROM routing_details r JOIN netblock_details n ON r.range <@ n.range
ORDER BY r.netblock,upper(n.range) - lower(n.range);
                                              QUERY PLAN
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=118452809778.47..118477166326.22 rows=581300 width=91)
   Output: r.asn,r.netblock,r.range,n.range,n.name,n.country,((upper(n.range) - lower(n.range)))
   ->  Sort  (cost=118452809778.47..118464988052.34 rows=4871309551 width=91)
         Output: r.asn,((upper(n.range) - lower(n.range)))
         Sort Key: r.netblock,((upper(n.range) - lower(n.range)))
         ->  Nested Loop  (cost=0.00..115920727265.53 rows=4871309551 width=91)
               Output: r.asn,(upper(n.range) - lower(n.range))
               Join Filter: (r.range <@ n.range)
               ->  Seq Scan on public.routing_details r  (cost=0.00..11458.96 rows=592496 width=43)
                     Output: r.asn,r.range
               ->  Materialize  (cost=0.00..277082.12 rows=8221675 width=48)
                     Output: n.range,n.country
                     ->  Seq Scan on public.netblock_details n  (cost=0.00..163712.75 rows=8221675 width=48)
                           Output: n.range,n.country
(14 rows)               ->  Seq Scan on netblock_details n  (cost=0.00..163712.75 rows=8221675 width=48)


EXPLAIN VERBOSE SELECT * FROM (
SELECT *,ROW_NUMBER() OVER (PARTITION BY r.range ORDER BY UPPER(n.range) - LOWER(n.range)) AS rank
FROM routing_details r JOIN netblock_details n ON r.range <@ n.range
) a WHERE rank = 1 ORDER BY netblock;

                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=118620775630.16..118620836521.53 rows=24356548 width=99)
   Output: a.asn,a.netblock,a.range,a.range_1,a.name,a.country,a.rank
   Sort Key: a.netblock
   ->  Subquery Scan on a  (cost=118416274956.83..118611127338.87 rows=24356548 width=99)
         Output: a.asn,a.rank
         Filter: (a.rank = 1)
         ->  WindowAgg  (cost=118416274956.83..118550235969.49 rows=4871309551 width=91)
               Output: r.asn,row_number() OVER (?),((upper(n.range) - lower(n.range))),r.range
               ->  Sort  (cost=118416274956.83..118428453230.71 rows=4871309551 width=91)
                     Output: ((upper(n.range) - lower(n.range))),r.asn,n.country
                     Sort Key: r.range,((upper(n.range) - lower(n.range)))
                     ->  Nested Loop  (cost=0.00..115884192443.90 rows=4871309551 width=91)
                           Output: (upper(n.range) - lower(n.range)),n.country
                           Join Filter: (r.range <@ n.range)
                           ->  Seq Scan on public.routing_details r  (cost=0.00..11458.96 rows=592496 width=43)
                                 Output: r.asn,r.range
                           ->  Materialize  (cost=0.00..277082.12 rows=8221675 width=48)
                                 Output: n.range,n.country
                                 ->  Seq Scan on public.netblock_details n  (cost=0.00..163712.75 rows=8221675 width=48)
                                       Output: n.range,n.country
(20 rows)

DISTINCT ON看起来效率稍高,所以我进行了一次.当我针对完整数据集运行查询时,在大约24小时的等待之后,我会发现磁盘空间错误.我创建了一个routing_details_small表,其中包含完整routing_details表的N行的子集,以尝试了解发生了什么.

N = 1000

=> EXPLAIN ANALYZE SELECT DISTINCT ON (r.netblock) *
-> FROM routing_details_small r JOIN netblock_details n ON r.range <@ n.range
-> ORDER BY r.netblock,upper(n.range) - lower(n.range);
                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=4411888.68..4453012.20 rows=999 width=90) (actual time=124.094..133.720 rows=999 loops=1)
   ->  Sort  (cost=4411888.68..4432450.44 rows=8224705 width=90) (actual time=124.091..128.560 rows=4172 loops=1)
         Sort Key: r.netblock,((upper(n.range) - lower(n.range)))
         Sort Method: external sort  Disk: 608kB
         ->  Nested Loop  (cost=0.41..1780498.29 rows=8224705 width=90) (actual time=0.080..101.518 rows=4172 loops=1)
               ->  Seq Scan on routing_details_small r  (cost=0.00..20.00 rows=1000 width=42) (actual time=0.007..1.037 rows=1000 loops=1)
               ->  Index Scan using idx_netblock_details_range on netblock_details n  (cost=0.41..1307.55 rows=41124 width=48) (actual time=0.063..0.089 rows=4 loops=1000)
                     Index Cond: (r.range <@ range)
 Total runtime: 134.999 ms
(9 rows)

N = 100000

=> EXPLAIN ANALYZE SELECT DISTINCT ON (r.netblock) *
-> FROM routing_details_small r JOIN netblock_details n ON r.range <@ n.range
-> ORDER BY r.netblock,upper(n.range) - lower(n.range);
                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=654922588.98..659034941.48 rows=200 width=144) (actual time=28252.677..29487.380 rows=98992 loops=1)
   ->  Sort  (cost=654922588.98..656978765.23 rows=822470500 width=144) (actual time=28252.673..28926.703 rows=454856 loops=1)
         Sort Key: r.netblock,((upper(n.range) - lower(n.range)))
         Sort Method: external merge  Disk: 64488kB
         ->  Nested Loop  (cost=0.41..119890431.75 rows=822470500 width=144) (actual time=0.079..24951.038 rows=454856 loops=1)
               ->  Seq Scan on routing_details_small r  (cost=0.00..1935.00 rows=100000 width=96) (actual time=0.007..110.457 rows=100000 loops=1)
               ->  Index Scan using idx_netblock_details_range on netblock_details n  (cost=0.41..725.96 rows=41124 width=48) (actual time=0.067..0.235 rows=5 loops=100000)
                     Index Cond: (r.range <@ range)
 Total runtime: 29596.667 ms
(9 rows)

N = 250000

=> EXPLAIN ANALYZE SELECT DISTINCT ON (r.netblock) *
-> FROM routing_details_small r JOIN netblock_details n ON r.range <@ n.range
-> ORDER BY r.netblock,upper(n.range) - lower(n.range);
                                                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=1651822953.55..1662103834.80 rows=200 width=144) (actual time=185835.443..190143.266 rows=247655 loops=1)
   ->  Sort  (cost=1651822953.55..1656963394.18 rows=2056176250 width=144) (actual time=185835.439..188779.279 rows=1103850 loops=1)
         Sort Key: r.netblock,((upper(n.range) - lower(n.range)))
         Sort Method: external merge  Disk: 155288kB
         ->  Nested Loop  (cost=0.28..300651962.46 rows=2056176250 width=144) (actual time=19.325..177403.913 rows=1103850 loops=1)
               ->  Seq Scan on netblock_details n  (cost=0.00..163743.05 rows=8224705 width=48) (actual time=0.007..8160.346 rows=8224705 loops=1)
               ->  Index Scan using idx_routing_details_small_range on routing_details_small r  (cost=0.28..22.16 rows=1250 width=96) (actual time=0.018..0.018 rows=0 loops=8224705)
                     Index Cond: (range <@ n.range)
 Total runtime: 190413.912 ms
(9 rows)

对于600k行的全表,大约24小时后查询失败,出现磁盘空间不足的错误,这可能是由外部合并步骤引起的.所以这个查询使用一个小的routing_details表非常快速地工作,但是扩展性非常差.

关于如何改进我的查询的建议,或者甚至可以改变模式,以便这个查询可以在完整的数据集上有效地工作？

解决方法

我正在考虑其他建议方法(例如,Erwin Brandstetter的最后一个查询,他使用简单的int8数据类型和简单的索引),但不想在答案中写下来,因为我认为这不是真的有效率.当您尝试查找覆盖给定范围的所有netblock范围时,索引不会有太大帮助.

我会在这里重复Erwin Brandstetter的查询：

SELECT *  -- only select columns you need to make it faster
FROM   routing_details r,LATERAL (
   SELECT *
   FROM   netblock_details n
   WHERE  n.ip_min <= r.ip_min
   AND    n.ip_max >= r.ip_max
   ORDER  BY n.ip_max - n.ip_min
   LIMIT  1
   ) n;

在netblock_details上有一个索引,像这样：

CREATE INDEX netblock_details_ip_min_max_idx ON netblock_details 
(ip_min,ip_max DESC NULLS LAST);

您可以快速(在O(logN)中)在netblock_details表中找到扫描的起始点 – 最小n.ip_min小于r.ip_min,或最小n.ip_max大于r.ip_max.但是,您必须扫描/读取索引/表的其余部分,并且对于每一行执行检查的第二部分并过滤出大部分行.

换句话说,此索引有助于快速找到满足第一搜索条件的起始行：n.ip_min <= r.ip_min,但是您将继续阅读满足此条件的所有行,并且对于每个此类行执行第二行检查n.ip_max> = r.ip_max.平均(如果数据均匀分布),您必须读取netblock_details表的一半行.优化器可以选择使用索引首先搜索n.ip_max> = r.ip_max,然后应用第二个过滤器n.ip_min< = r.ip_min,但不能使用此索引将两个过滤器应用于一起. 最终结果：
对于route_details中的每一行,我们将从netblock_details读取一半行. 600K * 4M = 2,400,000,000行.
是笛卡尔乘积的2倍.您可以在问题中的EXPLAIN ANALYZE的输出中看到这个数字(笛卡尔乘积).

592,496 * 8,221,675 = 4,871,309,550,800

难怪在尝试实现和排序时,您的查询耗尽磁盘空间.

总体高层次的过程得到最终结果：

>连接两个表(没有找到最好的匹配).在最糟糕的情况下,它是笛卡尔乘积,在最好的情况下,它覆盖范围从netblock_details每个范围从routing_details.你说在routing_details中的每个条目的netblock_details中有多个条目,从3到大约10,所以这个连接的结果可能有〜6M行(不是太多)
>通过routing_details范围对连接的结果进行排序/分区,对于每个这样的范围,从netblock_details找到最佳(最小)的覆盖范围.

我的想法是反转查询.而不是从较小的routing_details表中找到每一行的更大netblock_details的所有覆盖范围,我建议从较大的netblock_details中的每一行的较小的routing_details中找到所有较小的范围.

两步过程

对于每个来自较大netblock_details的行,可以从netblock范围内的routing_details中找到所有范围.

我将在查询中使用以下模式.我已经在两个表中添加了主键ID.

CREATE TABLE routing_details (
ID        int,ip_min   int8,ip_max   int8,asn      text,netblock text
);

CREATE TABLE netblock_details (
ID        int,name     text,country  text,source   text
);

SELECT
    netblock_details.ID AS n_ID,netblock_details.ip_max - netblock_details.ip_min AS n_length,r.ID AS r_ID
FROM
    netblock_details
    INNER JOIN LATERAL
    (
        SELECT routing_details.ID
        FROM routing_details
        WHERE
            routing_details.ip_min >= netblock_details.ip_min
            AND routing_details.ip_min <= netblock_details.ip_max
            -- note how routing_details.ip_min is limited from both sides
            -- this would make it possible to scan only (hopefully) small
            -- portion of the table instead of full or half table
            AND routing_details.ip_max <= netblock_details.ip_max
            -- this clause ensures that the whole routing range
            -- is inside the netblock range
    ) AS r ON true

我们需要在route_details上的索引(ip_min,ip_max).这里的主要内容是ip_min的索引.索引中的第二列有助于消除对ip_max的值进行查找的需要;它在树搜索中没有帮助.

请注意,横向子查询没有限制.这不是最后的结果.此查询应返回〜6M行.将此结果保存在临时表中.
在(r_ID,n_length,n_ID)上添加临时表的索引. n_ID也只是为了删除额外的查找.我们需要一个索引,避免为每个r_ID排序数据.

最后一步

对于route_details的每一行,找到n_length最小的n_ID.现在我们可以用“正确”的顺序使用横向连接.这里temp表是上一步索引的结果.

SELECT
    routing_details.*,t.n_ID,netblock_details.*
FROM
    routing_details
    INNER JOIN LATERAL
    (
        SELECT temp.n_ID
        FROM temp
        WHERE temp.r_ID = routing_details.ID
        ORDER BY temp.n_length
        LIMIT 1
    ) AS t ON true
    INNER JOIN netblock_details ON netblock_details.ID = t.n_ID

这里子查询应该是寻找索引,而不是扫描.优化器应该足够聪明,只需要查找并返回第一个找到的结果,因为LIMIT 1,所以你将在6M行临时表中找到600K的索引.

原来的答案(我会保留它的范围图)：

你可以“欺骗”吗

如果我正确地理解你的查询,对于每个routing_details.range
你想要找到一个最小的netblock_details.range,其覆盖/大于routing_details.range.

给定以下示例,其中r是路由范围,n1,…,n8是netblock范围,正确的答案是n5.

|---|
   n1

     |------------------|
     n2

                           |---------------|
                           n3

                                          |-----|
                                          n4

                  |------------------|
                  n5

                     |--------------------------------------|
                     n6

        |---------------------------|
        n7

                      |-----|
                      n8

                      |------------|
                      r
                     start       end

n.start <= r.start AND n.end >= r.end
order by n.length
limit 1

您的query that took 14 hours 显示索引扫描需要6ms,但按范围长度排序需要80ms.

通过这种搜索,数据没有简单的一维排序.你和n.end一起使用n.start和n.length.但是,也许您的数据不是通用的,或者返回一个有些不同的结果是可行的.

例如,如果结果返回n6是可行的,它可以工作得更快. n6是起始最大的覆盖范围：

n.start <= r.start AND n.end >= r.end
order by n.start desc
limit 1

或者,你可以去n7,最小的结局：

n.start <= r.start AND n.end >= r.end
order by n.end
limit 1

这种搜索将使用n.start(或n.end)上的简单索引,无需额外排序.

第二个完全不同的方法.范围有多大/长？如果它们相对较短(少数),那么可以尝试将它们存储为显式的整数列表,而不是一个范围.例如,范围[5-8]将被存储为4行：(5,6,7,8).使用这种存储模型,可能会更容易找到范围的交集.

sql – 使用int8range连接2个大的postgres表不能很好地缩放

解决方法

猜你在找的MsSQL相关文章