业务是将远程的表的数据同步到本地。
数据库是Oracle 11.2.0.4.0
INSERT INTO GG_ISSUE
(ISSUE_ID,
CONTRACT_ID,
ISSUE_DATE,
PROJECT_ID)
SELECT /*+DRIVING_SITE(ISSUE)*/ /*+DRIVING_SITE(PRO)*/ '09SYN'||ISSUE.ISSUE_ID,
ISSUE.CONTRACT_ID,
ISSUE.ISSUE_DATE,
ISSUE.PROJECT_ID
FROM GG_ISSUE@dblinkname
ISSUE,GG_PROJECT@dblinkname
PRO WHERE ISSUE.PROJECT_ID = PRO.PROJECT_ID
AND ISSUE.ISSUE_TYPE <> 3
AND ISSUE.ISSUE_TYPE > 0
AND ISSUE.ISSUE_ID NOT IN (SELECT SUBSTR(MI.ISSUE_ID,6,LENGTH(ISSUE_ID)) FROM GG_ISSUE MI WHERE SUBSTR(MI.DATA_AREA,2) = '09')
AND ISSUE.ISSUE_STATUS = 'audited';
原sql执行非常缓慢,20分钟都没有反映,发现执行计划相当糟糕:
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 67925 | 31381350 | 457737 | 01:31:33 |
| 1 | LOAD TABLE CONVENTIONAL | GG_ISSUE | | | | |
| * 2 | FILTER | | | | | |
| 3 | NESTED LOOPS | | 67926 | 31381812 | 2973 | 00:00:36 |
| 4 | REMOTE | GG_ISSUE | 67926 | 29547810 | 2855 | 00:00:35 |
| 5 | REMOTE | GG_PROJECT | 1 | 27 | 0 | 00:00:01 |
| * 6 | TABLE ACCESS FULL | GG_ISSUE | 2 | 58 | 7 | 00:00:01 |
Predicate Information (identified by operation id):
------------------------------------------
* 2 - filter( NOT EXISTS (SELECT 0 FROM "GG_ISSUE" "MI" WHERE SUBSTR("MI"."DATA_AREA",2)='09' AND LNNVL(SUBSTR("MI"."ISSUE_ID",LENGTH("ISSUE_ID"))<>:B1)))
* 6 - filter(SUBSTR("MI"."DATA_AREA",LENGTH("ISSUE_ID"))<>:B1))
去掉insert后直接执行select比较快,查询所有数据2千多条需要15s,执行计划如下:
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT REMOTE | | 247088 | 120578944 | 29555 | 00:05:55 |
| * 1 | HASH JOIN RIGHT ANTI NA | | 247088 | 120578944 | 29555 | 00:05:55 |
| 2 | REMOTE | GG_ISSUE | 14672 | 572208 | 25088 | 00:05:02 |
| 3 | NESTED LOOPS | | 247089 | 110942961 | 4466 | 00:00:54 |
| 4 | PARTITION RANGE ALL | | 247938 | 107853030 | 4452 | 00:00:54 |
| * 5 | TABLE ACCESS FULL | GG_ISSUE | 247938 | 107853030 | 4452 | 00:00:54 |
| * 6 | INDEX UNIQUE SCAN | PK_PROJECT_ID_T | 1 | 14 | 0 | 00:00:01 |
---------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("A2"."ISSUE_ID"=SUBSTR("A3"."ISSUE_ID",LENGTH("A3"."ISSUE_ID")))
* 5 - filter("A2"."ISSUE_TYPE"<>3 AND "A2"."ISSUE_STATUS"='audited' AND "A2"."ISSUE_TYPE">0)
* 6 - access("A2"."PROJECT_ID"="A1"."PROJECT_ID")
说明insert 导致/*+DRIVING_SITE(ISSUE)*/失效。如果让sql走上正确的执行计划呢?有几种方案尝试一下:
1.加hint告诉CBO表上有多少数据,观察执行计划是否有变。
2.用merge into,观察执行计划是否有变。
3.将远程的关联表作为视图,然后通过dblink访问视图。
验证方案1:
SELECT /*+use_hash(PRO,ISSUE) CARDINALITY(PRO 1000000) CARDINALITY(ISSUE 2000000)*/ '09SYN'||ISSUE.ISSUE_ID,
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
--------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 3596461335 | 1661565136770 | 24935447682 | 999:59:59 |
| 1 | LOAD TABLE CONVENTIONAL | GG_ISSUE | | | | |
| * 2 | FILTER | | | | | |
| * 3 | HASH JOIN | | 3596475454 | 1661571659748 | 462539 | 01:32:31 |
| 4 | REMOTE | GG_PROJECT | 10000000 | 270000000 | 64 | 00:00:01 |
| 5 | REMOTE | GG_ISSUE | 20000000 | 8700000000 | 3578 | 00:00:43 |
| * 6 | TABLE ACCESS FULL | GG_ISSUE | 2 | 58 | 7 | 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 2 - filter( NOT EXISTS (SELECT 0 FROM "GG_ISSUE" "MI" WHERE SUBSTR("MI"."DATA_AREA",LENGTH("ISSUE_ID"))<>:B1)))
* 3 - access("ISSUE"."PROJECT_ID"="PRO"."PROJECT_ID")
* 6 - filter(SUBSTR("MI"."DATA_AREA",LENGTH("ISSUE_ID"))<>:B1))
SELECT /*+use_hash(PRO,
SELECT /*+use_hash(PRO,ISSUE) CARDINALITY(MI 2000000)*/ SUBSTR(MI.ISSUE_ID,2) = '09'
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 35964613 | 16615651206 | 71116178 | 237:03:15 |
| 1 | LOAD TABLE CONVENTIONAL | GG_ISSUE | | | | |
| * 2 | FILTER | | | | | |
| * 3 | HASH JOIN | | 35964755 | 16615716810 | 47294 | 00:09:28 |
| 4 | REMOTE | GG_PROJECT | 1000000 | 27000000 | 64 | 00:00:01 |
| 5 | REMOTE | GG_ISSUE | 2000000 | 870000000 | 2925 | 00:00:36 |
| * 6 | TABLE ACCESS FULL | GG_ISSUE | 2000000 | 58000000 | 2 | 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 2 - filter( NOT EXISTS (SELECT /*+ OPT_ESTIMATE (TABLE "MI" ROWS=2000000.000000 ) */ 0 FROM "GG_ISSUE" "MI" WHERE SUBSTR("MI"."DATA_AREA",2)='09' AND
LNNVL(SUBSTR("MI"."ISSUE_ID",LENGTH("ISSUE_ID"))<>:B1))
FILTER的算法类似nestloop,这样的执行计划很难执行出结果来
验证方案2:
merge into GG_ISSUE MI
using( SELECT /*+DRIVING_SITE(ISSUE)*/ /*+DRIVING_SITE(PRO)*/
ISSUE.ISSUE_ID,
ISSUE.ISSUE_DATE
FROM GG_ISSUE@dblinkname ISSUE,
GG_PROJECT@dblinkname PRO
WHERE ISSUE.PROJECT_ID = PRO.PROJECT_ID
AND ISSUE.ISSUE_TYPE <> 3
AND ISSUE.ISSUE_TYPE > 0
AND ISSUE.ISSUE_STATUS = 'audited') bb
on(SUBSTR(MI.ISSUE_ID,LENGTH(MI.ISSUE_ID))=bb.ISSUE_ID)
when not matched then
insert(ISSUE_ID,CONTRACT_ID,ISSUE_DATE) values(bb.ISSUE_ID,bb.CONTRACT_ID,bb.ISSUE_DATE)
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-----------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT,GOAL=ALL_ROWS | | 188585335 | 73193 | 00:01:28 |
| 1 | MERGE | GG_ISSUE | | | | |
| * 2 | VIEW | | | | | |
| * 3 | HASH JOIN OUTER | | 347938 | 188585335 | 73193 | 00:01:28 |
| 4 | VIEW | | | 3328374 | 2971 | 00:00:32 |
| 5 | REMOTE | | 1 | | | 00:00:00 |
| * 6 | TABLE ACCESS FULL | GG_ISSUE | 247938 | 635299745 | 39254 | 00:00:54 |
-----------------------------------------------------------------------------------------------
执行计划变得正常,实测12s能够完成。由于方案2解决了问题,方案3就不用验证了。