这是简化为两个索引的性能问题.进行以下设置:
CREATE TABLE ZZ_BASE AS SELECT dbms_random.random AS ID,DBMS_RANDOM.STRING('U',10) AS STR FROM DUAL CONNECT BY LEVEL <=1000000; CREATE INDEX ZZ_B_I ON ZZ_BASE(ID ASC); CREATE TABLE ZZ_CHILD AS SELECT dbms_random.random AS ID,10) AS STR FROM DUAL CONNECT BY LEVEL <=1000000; CREATE INDEX ZZ_C_I ON ZZ_CHILD(ID ASC); -- As @Flado pointed out,the following is required so index scanning can be done ALTER TABLE ZZ_BASE MODIFY (ID CONSTRAINT NN_B NOT NULL); ALTER TABLE ZZ_CHILD MODIFY (ID CONSTRAINT NN_C NOT NULL); -- given the join below not mandatory.
现在我要LEFT OUTER JOIN这两个表,只输出已经编入索引的ID字段.
SELECT ZZ_BASE.ID FROM ZZ_BASE LEFT OUTER JOIN ZZ_CHILD ON (ZZ_BASE.ID = ZZ_CHILD.ID); ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%cpu)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000K| 9765K| | 4894 (2)| 00:00:30 | |* 1 | HASH JOIN OUTER | | 1000K| 9765K| 16M| 4894 (2)| 00:00:30 | | 2 | INDEX FAST FULL SCAN| ZZ_B_I | 1000K| 4882K| | 948 (3)| 00:00:06 | | 3 | INDEX FAST FULL SCAN| ZZ_C_I | 1000K| 4882K| | 948 (3)| 00:00:06 | ----------------------------------------------------------------------------------------
正如你可以看到没有必要的表访问,只有索引访问.但是根据常识,HASH加入并不是加入这两个指标的最佳方式.如果这两个表大得多,那么就需要一个非常大的哈希表.
更有效的方法是将两个索引进行SORT-MERGE.
SELECT /*+ USE_MERGE(ZZ_BASE ZZ_CHILD) */ ZZ_BASE.ID FROM ZZ_BASE LEFT OUTER JOIN ZZ_CHILD ON (ZZ_BASE.ID = ZZ_CHILD.ID); ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%cpu)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000K| 9765K| | 6931 (3)| 00:00:42 | | 1 | MERGE JOIN OUTER | | 1000K| 9765K| | 6931 (3)| 00:00:42 | | 2 | INDEX FULL SCAN | ZZ_B_I | 1000K| 4882K| | 2258 (2)| 00:00:14 | |* 3 | SORT JOIN | | 1000K| 4882K| 22M| 4673 (4)| 00:00:29 | | 4 | INDEX FAST FULL SCAN| ZZ_C_I | 1000K| 4882K| | 948 (3)| 00:00:06 | -----------------------------------------------------------------------------------------
但是看起来第二个索引被排序,即使它已经是(“如果索引存在,那么数据库可以避免排序第一个数据集,但是数据库总是排序第二个数据集,而不考虑索引”1)
基本上,我想要的是使用SORT-MERGE连接并立即开始输出记录的查询,即:
>没有HASH加入,因为它首先必须制作哈希表(IO开销,如果存储在磁盘上),因此不会立即输出.
>没有NESTED LOOP加入,尽管它会输出
在索引较大的情况下,在非顺序索引读取时,具有索引戳和大IO开销的log(N)复杂度.
INDEX_ASC(或只是INDEX)是您可能想要尝试的提示,以便将性能与实际数据进行比较.
由于B * Tree索引找不到NULL键,而ZZ_BASE没有NOT NULL约束,我有点惊讶地得到外部行源的任何类型的索引扫描.添加和提示一些将使您以ZZ_C_I索引的索引顺序进行全面扫描.不幸的是,这并不能节省您的SORT JOIN步骤,但是至少应该比O(n)快得多 – 因为数据已经被排序了.
alter table zz_base modify (id not null); SELECT /*+ leading(zz_base) USE_MERGE(ZZ_CHILD) index_asc(zz_base (id)) index(zz_child (id)) */ ZZ_BASE.ID FROM ZZ_BASE left outer join ZZ_CHILD on zz_base.id=zz_child.id;
此查询使用以下执行计划:
------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%cpu)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1000K| 9765K| | 8241 (3)| 00:00:50 | | 1 | MERGE JOIN OUTER | | 1000K| 9765K| | 8241 (3)| 00:00:50 | | 2 | INDEX FULL SCAN | ZZ_B_I | 1000K| 4882K| | 2258 (2)| 00:00:14 | |* 3 | SORT JOIN | | 1000K| 4882K| 22M| 5983 (3)| 00:00:36 | | 4 | INDEX FULL SCAN| ZZ_C_I | 1000K| 4882K| | 2258 (2)| 00:00:14 | ------------------------------------------------------------------------------------