min/max函数的index问题
在查询某表时同时使用了min和max函数,结果查询很慢,但是单独执行min或者max时速度很快,建个环境模拟一下:
create table t1_t(a int,b varchar2(50));
insert into t1_t(a,b) (select mod(rownum,50) rn,object_name from user_objects);
insert into t1_t(a,b) (select * from t1_t);
insert into t1_t(a,b) (select * from t1_t);
insert into t1_t(a,b) (select * from t1_t);
insert into t1_t(a,b) (select * from t1_t);
commit;
create index t1_a_index on t1_t(a);
create index t1_b_index on t1_t(b);
select min(a) from t1_t;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3242 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 INDEX (FULL SCAN (MIN/MAX)) OF 'T1_T_INDEX' (INDEX)
--只有min函数时是走索引的
select min(a),max(a) from t1_t;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3242 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3242 Card=3591122 Bytes=10773366)
--两个一起的时候就全表扫描了
INDEX (FULL SCAN (MIN/MAX))算法:如果是Max,首先沿着最右边的Root-Branch Node-Leaf Node,发现最右边的Leaf Block是空的,于是沿着逆向指针往左走,一直走到最左边发现都是空的,于是扫描了所有的Leaf Blocks。如果是Min,首先沿着最左边的Root-Branch Node-Leaf Node,发现最左边的Leaf Block是空的,于是沿着顺向指针往右走,走到最右边发现都是空的,扫描了所有的Leaf blocks。
但是如何让min、max同时查询时走索引?似乎不可以,以下转载一篇非常详细的说明文档:
=============================================================================================
Table sbfi_ctry_flow_curve_wheel有大约1500万条记录,运行下面的sql需要4秒钟左右,developer认为时间太长,想优化。
SQL>select min(trade_dt), max(trade_dt) from sbfi_ctry_flow_curve_wheel;
developer很奇怪,trade_dt列上建有一个索引,但是执行的时候,oracle总是选择走primary key,而不选择那个索引。
经检查,发现表sbfi_ctry_flow_curve_wheel的索引情况如下:
SQL> list
1 select index_name,column_name,column_position from user_ind_columns
2 where table_name=upper('sbfi_ctry_flow_curve_wheel')
3 order by 1
4* ,3
SQL> /
INDEX_NAME COLUMN_NAME COLUMN_POSITION
------------------------------ ------------ ---------------
SBFI_CTRY_FLOW_CURVE_WHEEL_PK TRADE_DT 1
SBFI_CTRY_FLOW_CURVE_WHEEL_PK CTRY_CODE 2
SBFI_CTRY_FLOW_CURVE_WHEEL_PK MONTH 3
TRADE_DT_INDEX TRADE_DT 1
尝试加hint /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */,让oracle选择走索引TRADE_DT_INDEX。结果发现运行时间没有缩短,反而从4秒增加到7秒。
SQL> select /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt), max(trade_dt) from bfi_ctry_flow_curve_wheel;
MIN(TRADE MAX(TRADE
--------- ---------
01-JAN-01 07-SEP-07
Elapsed: 00:00:06.91
为什么会出现这种情况?我们尝试用TRACE去跟踪执行过程。
==session 1, don't use hint
alter session set timed_statistics=true
/
alter session set max_dump_file_size=unlimited
/
alter session set tracefile_identifier='PRIMARY_KEY'
/
alter session set events '10046 trace name context forever, level 12'
/
select min(trade_dt), max(trade_dt) from sbfi_ctry_flow_curve_wheel
/
alter session set events '10046 trace name context off'
/
==session 2, use hint
alter session set timed_statistics=true
/
alter session set max_dump_file_size=unlimited
/
alter session set tracefile_identifier='TRADE_DT_INDEX'
/
alter session set events '10046 trace name context forever, level 12'
/
select /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt), max(trade_dt) from sbfi_ctry_flow_curve_wheel
/
alter session set events '10046 trace name context off'
/
两种情况的trace用tkprof分析后,主要部分结果如下:
==session1, don't use hint
********************************************************************************
select min(trade_dt), max(trade_dt)
from
sbfi_ctry_flow_curve_wheel
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.34 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 2.14 3.84 26044 26067 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 2.14 4.18 26044 26067 0 1
Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 89
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
7538400 INDEX FAST FULL SCAN SBFI_CTRY_FLOW_CURVE_WHEEL_PK (object id 35844)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
db file scattered read 1649 0.14 2.26
SQL*Net message from client 2 4.15 4.15
********************************************************************************
==session2, use hint
********************************************************************************
select /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt),
max(trade_dt)
from
sbfi_ctry_flow_curve_wheel
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 3.21 8.84 19945 19945 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 3.21 8.85 19945 19945 0 1
Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 89
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
7538400 INDEX FULL SCAN TRADE_DT_INDEX (object id 35830)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
db file sequential read 19945 0.05 5.93
SQL*Net message from client 2 6.10 6.10
********************************************************************************
对比后主要的不同罗列如下:
走primary key, consistent read,即query值为 26067,fetch时间为3.84秒,访问主键索引的方法为INDEX FAST FULL SCAN。
走TRADE_DT_INDEX索引,consistent read值为19945, fetch时间为8.84秒,访问索引TRADE_DT_INDEX的方法为INDEX FULL SCAN。
关于INDEX FAST FULL SCAN,oracle文档中解释如下:
Fast full index scans are an alternative to a full table scan when the index contains all the columns that are needed for the query, and at least one column in the index key has the NOT NULL constraint. A fast full scan accesses the data in the index itself, without accessing the table. It cannot be used to eliminate a sort operation, because the data is not ordered by the index key. It reads the entire index using multiblock reads, unlike a full index scan, and can be parallelized.
You can specify fast full index scans with the initialization parameter OPTIMIZER_FEATURES_ENABLE or the INDEX_FFS hint. Fast full index scans cannot be performed against bitmap indexes.
A fast full scan is faster than a normal full index scan in that it can use multiblock I/O and can be parallelized just like a table scan.
关于INDEX FAST FULL SCAN:
A full scan is available if a predicate references one of the columns in the index. The predicate does not need to be an index driver. A full scan is also available when there is no predicate, if both the following conditions are met:
All of the columns in the table referenced in the query are included in the index.
At least one of the index columns is not null.
A full scan can be used to eliminate a sort operation, because the data is ordered by the index key. It reads the blocks singly.
上面明确讲到了fast full scan比full scan要快,因为它用multiblock I/O,而且可以parallelized。
顺便也注意到要调优的这句sql,只返回trade_dt列的值,满足index (fast) full scan的条件,即返回结果的列全都包含在索引里,非空。所以该sql只要扫描索引就能返回需要的结果,不需要再根据rowid去访问表。
既然要扫描整个索引,FAST FULL SCAN 比 FULL SCAN 快,TRADE_DT_INDEX 的 size 比 PK 的 size 小,那对 RADE_DT_INDEX 做FFS应该是最快的访问路径。用index_ffs hint:
SQL> select /*+ index_ffs(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt), max(trade_dt) from bfi_ctry_flow_curve_wheel;
MIN(TRADE MAX(TRADE
--------- ---------
01-JAN-01 07-SEP-07
Elapsed: 00:00:02.61
相比上面的4秒和7秒是快了一些。
但是根据一般的理解,象min(),max()这样的函数,Oracle应该直接访问索引的最左边或者最右边,这样的访问速度才是最快的。尝试SQL>select min(trade_dt) from sbfi_ctry_flow_curve_wheel并生成10046 trace文件,用tkprof格式化后结果如下:
==session 3, single function
********************************************************************************
select max(trade_dt)
from
sbfi_ctry_flow_curve_wheel
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 4 0.00 0.00 0 6 0 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 8 0.00 0.01 0 6 0 2
Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 89
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
1 INDEX FULL SCAN (MIN/MAX) SBFI_CTRY_FLOW_CURVE_WHEEL_PK (object id 35844)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 4 0.00 0.00
SQL*Net message from client 4 494.34 501.34
********************************************************************************
一些重要的信息: consistent read值为6, 相比以前的26067(PK)/19945(index),fetch时间<0.01秒,相比3.84s(PK)/8.84s(index)。访问索引的方法为INDEX FULL SCAN (MIN/MAX)。这是oracle文档库里没有提到的访问方法,但是
http://www.juliandyke.com/Optimisation/Operations/IndexFullScanMinMax.html
有一些介绍:Returns the first or last entry in the index。
看来oracle对于单个的min(),max()函数,能直接访问索引的最左边或者最右边取到结果,但是如果两个函数同时出现在一个sql里,oracle就只能扫描整个索引。这一点上还是不够智能。
=============================================================================================