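Query Optimizer
Oracle's query optimizer (QO) comes in two flavors:
1. RBO: Rule-Based Optimization, the rule-based optimizer;
2. CBO: Cost-Based Optimization, the cost-based optimizer.
Starting with Oracle 10g, the RBO is no longer supported, although it can still be enabled for backward compatibility.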
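Optimizer Modes
The optimizer modes are:
FIRST_ROWS: return the first few rows as quickly as possible;
FIRST_ROWS_n: covers FIRST_ROWS_1000, FIRST_ROWS_100, FIRST_ROWS_10 and FIRST_ROWS_1. These behave like FIRST_ROWS but specify the exact number of rows to return quickly;
ALL_ROWS: return the complete result set as quickly as possible. This is the default optimizer mode.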
SQL> show parameter optimizer_mode;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_mode                       string      ALL_ROWS
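Table Join Methods
Hash Join: the smaller table is loaded into memory and a hash table is built on it; each row of the larger table is then used to probe that hash table. The join columns of the two tables do not need to be indexed, and a hash lookup is generally faster than an index lookup.
A join in which the database uses the smaller of two tables or data sources to build a hash table in memory. The database scans the larger table, probing the hash table for the addresses of the matching rows in the smaller table.
For good performance, hash_area_size must be set large enough. If the hash table needs more memory than hash_area_size allows, it spills to the temporary tablespace, which costs some performance.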
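The build and probe phases described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Oracle's implementation: the function name `hash_join` and the sample rows are hypothetical, and the in-memory dictionary stands in for the hash table that Oracle would keep in the hash work area.

```python
from collections import defaultdict

def hash_join(small_rows, large_rows, small_key, large_key):
    """Equijoin two lists of dict rows: build on the small input, probe with the large one."""
    # Build phase: hash the smaller input on its join key.
    buckets = defaultdict(list)
    for row in small_rows:
        buckets[row[small_key]].append(row)

    # Probe phase: scan the larger input once and look up each join key.
    joined = []
    for row in large_rows:
        for match in buckets.get(row[large_key], []):
            joined.append({**match, **row})
    return joined

# Hypothetical sample data (a small department table and a larger employee table).
depts = [{"deptno": 10, "dname": "ACCOUNTING"}, {"deptno": 20, "dname": "RESEARCH"}]
emps = [
    {"empno": 1, "deptno": 10},
    {"empno": 2, "deptno": 20},
    {"empno": 3, "deptno": 10},
    {"empno": 4, "deptno": 30},  # no matching department, so it produces no output row
]
result = hash_join(depts, emps, "deptno", "deptno")
print(len(result))
```

Note that neither input needs an index or a sort order, and that the lookup is an exact-match operation, which is why this approach only works for equality conditions.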
When does the optimizer choose a hash join?
1. Chosen automatically by the optimizer
The optimizer uses a hash join to join two tables if they are joined using an equijoin and if either of the following conditions is true:
1) A large amount of data must be joined.
2) A large fraction of a small table must be joined.
2. Specified manually
The use_hash hint can be used to force a hash join.
SQL> select /*+ use_hash(amy_emp amy_dept) */ count(*) from amy_emp, amy_dept where amy_emp.deptno = amy_dept.deptno;
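a) This method was introduced later in the Oracle 7 line and is based on more advanced join theory. It is generally more efficient than the other two join methods, but it is only available under the CBO, and hash_area_size must be set appropriately to get good performance.
b) It performs relatively well when joining two large row sources, and even better when one of the row sources is small.
c) It can only be used for equijoins.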
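Nested Loop:
The outer table drives the inner table: every row of the outer table is matched against the inner table. Unlike a hash join, no hash table is built on the inner table, so the inner table should preferably have an index on the join column.
It is important to ensure that the inner table is driven from (dependent on) the outer table. If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably. In such cases, hash joins joining the two independent row sources perform better.
a) This method is efficient when the driving row source (the outer table) is small and the inner row source (the inner table) has a unique index, or a highly selective non-unique index, on the join column.
b) NESTED LOOPS has one advantage that the other join methods lack: it can return rows as soon as they are joined, without waiting for the whole join to finish, which gives a fast response time.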
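A rough Python sketch of a nested loop join follows. It is an illustration under stated assumptions rather than Oracle's actual code: `nested_loop_join` and `build_index` are hypothetical names, and a plain dictionary plays the role of the index on the inner table. Writing the join as a generator shows why the first joined row is available before the rest of the join has run.

```python
def build_index(rows, key):
    """Stand-in for an index on the inner table: join key -> list of rows."""
    index = {}
    for row in rows:
        index.setdefault(row[key], []).append(row)
    return index

def nested_loop_join(outer_rows, inner_index, outer_key):
    """Yield joined rows one at a time, driven by the outer row source."""
    for outer in outer_rows:
        # One index probe on the inner table per outer row.
        for inner in inner_index.get(outer[outer_key], []):
            yield {**outer, **inner}

# Hypothetical sample data: a small driving table and an indexed inner table.
depts = [{"deptno": 10, "dname": "ACCOUNTING"}, {"deptno": 20, "dname": "RESEARCH"}]
emps = [
    {"empno": 1, "deptno": 10},
    {"empno": 2, "deptno": 20},
    {"empno": 3, "deptno": 10},
]
emp_index = build_index(emps, "deptno")

rows = nested_loop_join(depts, emp_index, "deptno")
first = next(rows)   # produced after probing with only the first outer row
rest = list(rows)    # the remaining joined rows
print(first, len(rest))
```

Without the index, each outer row would force a full scan of the inner input, which is why the method degrades quickly when the driving row source is large or the inner table has no suitable index.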
Sort Merge Join:
Sort merge joins can join rows from two independent sources. Hash joins generally perform better than sort merge joins. However, sort merge joins can perform better than hash joins if both of the following conditions exist:
1. The row sources are sorted already.
2. A sort operation does not have to be done.
However, if a sort merge join involves choosing a slower access method (an index scan as opposed to a full table scan), then the benefit of using a sort merge might be lost.
Sort merge joins are useful when the join condition between two tables is an inequality condition such as <, <=, >, or >=.
Sort merge joins perform better than nested loop joins for large data sets.
You cannot use hash joins unless there is an equality condition.
In a merge join, there is no concept of a driving table. The join consists of two steps:
1. Sort join operation: Both the inputs are sorted on the join key.
2. Merge join operation: The sorted lists are merged together.
If the input is sorted by the join column, then a sort join operation is not performed for that row source.
However, a sort merge join always creates a positionable sort buffer for the right side of the join so that it can seek back to the last match in the case where duplicate join key values come out of the left side of the join.
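a) This join method is relatively efficient for non-equijoins.
b) It works even better when both join columns are indexed.
c) For joining two large row sources, it performs somewhat better than a nested loop (NL) join.
d) However, if the row source returned by the sort merge is too large, the many rowid lookups back into the table cause excessive I/O and degrade database performance.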
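The sort step and the merge step of a sort merge join can be sketched in Python as follows. This is a simplified illustration for an equality condition only, with a hypothetical function name and sample data; the slice `right[run_start:j]` plays the role of the positionable buffer on the right side, which is re-read whenever the left side produces a duplicate join key.

```python
def sort_merge_join(left_rows, right_rows, key):
    """Equijoin two lists of dict rows by sorting both and merging them."""
    # Step 1 (sort join): sort both inputs on the join key.
    left = sorted(left_rows, key=lambda r: r[key])
    right = sorted(right_rows, key=lambda r: r[key])

    # Step 2 (merge join): advance through both sorted lists together.
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Locate the run of right rows that share this key.
            run_start = j
            while j < len(right) and right[j][key] == lk:
                j += 1
            # Each left row with the same key re-reads that run of right rows.
            while i < len(left) and left[i][key] == lk:
                for r in right[run_start:j]:
                    joined.append({**left[i], **r})
                i += 1
    return joined

# Hypothetical sample data with duplicate join keys on both sides.
left_rows = [{"k": 2, "l": "a"}, {"k": 1, "l": "b"}, {"k": 2, "l": "c"}]
right_rows = [{"k": 2, "r": "x"}, {"k": 3, "r": "y"}, {"k": 2, "r": "z"}]
result = sort_merge_join(left_rows, right_rows, "k")
pairs = [(row["l"], row["r"]) for row in result]
print(pairs)
```

If an input already arrives sorted on the join key, the corresponding `sorted` call is the work that can be skipped, which is where this method gains its advantage.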
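Differences Between the Three Join Methods and How to Choose
- A hash join is not necessarily faster than the other two methods, and it can only be used for equijoins;
- Hints such as FIRST_ROWS push the CBO strongly toward NESTED LOOPS, because that method can return the first rows soonest.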