min/max函数的index问题

min/max函数的index问题

在查询某表时同时使用了min和max函数,结果查询很慢，但是单独执行min或者max时速度很快，建个环境模拟一下:

create table t1_t(a int,b varchar2(50));

insert into t1_t(a,b) (select mod(rownum,50) rn,object_name from user_objects);

insert into t1_t(a,b) (select * from t1_t);
insert into t1_t(a,b) (select * from t1_t);
insert into t1_t(a,b) (select * from t1_t);
insert into t1_t(a,b) (select * from t1_t);

commit;

create index t1_a_index on t1_t(a);
create index t1_b_index on t1_t(b);

select min(a) from t1_t;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3242 Card=1 Bytes=3)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FULL SCAN (MIN/MAX)) OF 'T1_T_INDEX' (INDEX)

--只有min函数时是走索引的

select min(a),max(a) from t1_t;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3242 Card=1 Bytes=3)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3242 Card=3591122 Bytes=10773366)

--两个一起的时候就全表扫描了

INDEX (FULL SCAN (MIN/MAX))算法：如果是Max,首先沿着最右边的Root-Branch Node-Leaf Node，发现最右边的Leaf Block是空的，于是沿着逆向指针往左走，一直走到最左边发现都是空的，于是扫描了所有的Leaf Blocks。如果是Min,首先沿着最左边的Root-Branch Node-Leaf Node，发现最左边的Leaf Block是空的，于是沿着顺向指针往右走，走到最右边发现都是空的，扫描了所有的Leaf blocks。

但是如何让min、max同时查询时走索引？似乎不可以，以下转载一篇非常详细的说明文档：

http://zhyuh.itpub.net/get/334/mix_max_index

=============================================================================================

Table sbfi_ctry_flow_curve_wheel有大约1500万条记录，运行下面的sql需要4秒钟左右，developer认为时间太长，想优化。
SQL>select min(trade_dt), max(trade_dt) from sbfi_ctry_flow_curve_wheel;
developer很奇怪，trade_dt列上建有一个索引，但是执行的时候，oracle总是选择走primary key，而不选择那个索引。

经检查，发现表sbfi_ctry_flow_curve_wheel的索引情况如下：
SQL> list
1 select index_name,column_name,column_position from user_ind_columns
2 where table_name=upper('sbfi_ctry_flow_curve_wheel')
3 order by 1
4* ,3
SQL> /

INDEX_NAME                     COLUMN_NAME COLUMN_POSITION
------------------------------ ------------ ---------------
SBFI_CTRY_FLOW_CURVE_WHEEL_PK TRADE_DT                   1
SBFI_CTRY_FLOW_CURVE_WHEEL_PK CTRY_CODE                  2
SBFI_CTRY_FLOW_CURVE_WHEEL_PK MONTH                      3
TRADE_DT_INDEX                 TRADE_DT                   1

尝试加hint /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */，让oracle选择走索引TRADE_DT_INDEX。结果发现运行时间没有缩短，反而从4秒增加到7秒。
SQL> select /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt), max(trade_dt) from bfi_ctry_flow_curve_wheel;

MIN(TRADE MAX(TRADE
--------- ---------
01-JAN-01 07-SEP-07

Elapsed: 00:00:06.91

为什么会出现这种情况？我们尝试用TRACE去跟踪执行过程。
==session 1, don't use hint
alter session set timed_statistics=true
/
alter session set max_dump_file_size=unlimited
/
alter session set tracefile_identifier='PRIMARY_KEY'
/
alter session set events '10046 trace name context forever, level 12'
/
select min(trade_dt), max(trade_dt) from sbfi_ctry_flow_curve_wheel
/
alter session set events '10046 trace name context off'
/

==session 2, use hint
alter session set timed_statistics=true
/
alter session set max_dump_file_size=unlimited
/
alter session set tracefile_identifier='TRADE_DT_INDEX'
/
alter session set events '10046 trace name context forever, level 12'
/
select /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt), max(trade_dt) from sbfi_ctry_flow_curve_wheel
/
alter session set events '10046 trace name context off'
/

两种情况的trace用tkprof分析后，主要部分结果如下:
==session1, don't use hint
********************************************************************************
select min(trade_dt), max(trade_dt)
from
sbfi_ctry_flow_curve_wheel

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.34          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.14       3.84      26044      26067          0           1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        4      2.14       4.18      26044      26067          0           1

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 89

Rows     Row Source Operation
------- ---------------------------------------------------
      1 SORT AGGREGATE
7538400   INDEX FAST FULL SCAN SBFI_CTRY_FLOW_CURVE_WHEEL_PK (object id 35844)

Elapsed times include waiting on following events:
Event waited on                             Times   Max. Wait Total Waited
----------------------------------------   Waited ---------- ------------
SQL*Net message to client                       2        0.00          0.00
db file scattered read                       1649        0.14          2.26
SQL*Net message from client                     2        4.15          4.15
********************************************************************************

==session2, use hint
********************************************************************************
select /*+ index(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt),
max(trade_dt)
from
sbfi_ctry_flow_curve_wheel

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      3.21       8.84      19945      19945          0           1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        4      3.21       8.85      19945      19945          0           1

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 89

Rows     Row Source Operation
------- ---------------------------------------------------
      1 SORT AGGREGATE
7538400   INDEX FULL SCAN TRADE_DT_INDEX (object id 35830)

对比后主要的不同罗列如下：
走primary key, consistent read，即query值为 26067，fetch时间为3.84秒，访问主键索引的方法为INDEX FAST FULL SCAN。
走TRADE_DT_INDEX索引，consistent read值为19945, fetch时间为8.84秒，访问索引TRADE_DT_INDEX的方法为INDEX FULL SCAN。

关于INDEX FAST FULL SCAN，oracle文档中解释如下：
Fast full index scans are an alternative to a full table scan when the index contains all the columns that are needed for the query, and at least one column in the index key has the NOT NULL constraint. A fast full scan accesses the data in the index itself, without accessing the table. It cannot be used to eliminate a sort operation, because the data is not ordered by the index key. It reads the entire index using multiblock reads, unlike a full index scan, and can be parallelized.
You can specify fast full index scans with the initialization parameter OPTIMIZER_FEATURES_ENABLE or the INDEX_FFS hint. Fast full index scans cannot be performed against bitmap indexes.
A fast full scan is faster than a normal full index scan in that it can use multiblock I/O and can be parallelized just like a table scan.

关于INDEX FAST FULL SCAN：
A full scan is available if a predicate references one of the columns in the index. The predicate does not need to be an index driver. A full scan is also available when there is no predicate, if both the following conditions are met:
All of the columns in the table referenced in the query are included in the index.
At least one of the index columns is not null.
A full scan can be used to eliminate a sort operation, because the data is ordered by the index key. It reads the blocks singly.

上面明确讲到了fast full scan比full scan要快，因为它用multiblock I/O，而且可以parallelized。
顺便也注意到要调优的这句sql，只返回trade_dt列的值，满足index (fast) full scan的条件，即返回结果的列全都包含在索引里，非空。所以该sql只要扫描索引就能返回需要的结果，不需要再根据rowid去访问表。

既然要扫描整个索引，FAST FULL SCAN 比 FULL SCAN 快，TRADE_DT_INDEX 的 size 比 PK 的 size 小，那对 RADE_DT_INDEX 做FFS应该是最快的访问路径。用index_ffs hint:
SQL> select /*+ index_ffs(sbfi_ctry_flow_curve_wheel TRADE_DT_INDEX) */ min(trade_dt), max(trade_dt) from bfi_ctry_flow_curve_wheel;

MIN(TRADE MAX(TRADE
--------- ---------
01-JAN-01 07-SEP-07

Elapsed: 00:00:02.61

相比上面的4秒和7秒是快了一些。

但是根据一般的理解，象min(),max()这样的函数，Oracle应该直接访问索引的最左边或者最右边，这样的访问速度才是最快的。尝试SQL>select min(trade_dt) from sbfi_ctry_flow_curve_wheel并生成10046 trace文件，用tkprof格式化后结果如下：

==session 3, single function
********************************************************************************
select max(trade_dt)
from
sbfi_ctry_flow_curve_wheel

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch        4      0.00       0.00          0          6          0           2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        8      0.00       0.01          0          6          0           2

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 89

Rows     Row Source Operation
------- ---------------------------------------------------
      1 SORT AGGREGATE
      1   INDEX FULL SCAN (MIN/MAX) SBFI_CTRY_FLOW_CURVE_WHEEL_PK (object id 35844)

Elapsed times include waiting on following events:
Event waited on                             Times   Max. Wait Total Waited
----------------------------------------   Waited ---------- ------------
SQL*Net message to client                       4        0.00          0.00
SQL*Net message from client                     4      494.34        501.34
********************************************************************************
一些重要的信息: consistent read值为6，相比以前的26067(PK)/19945(index)，fetch时间<0.01秒，相比3.84s(PK)/8.84s(index)。访问索引的方法为INDEX FULL SCAN (MIN/MAX)。这是oracle文档库里没有提到的访问方法，但是 http://www.juliandyke.com/Optimisation/Operations/IndexFullScanMinMax.html 有一些介绍：Returns the first or last entry in the index。

看来oracle对于单个的min(),max()函数，能直接访问索引的最左边或者最右边取到结果，但是如果两个函数同时出现在一个sql里，oracle就只能扫描整个索引。这一点上还是不够智能。
=============================================================================================

posted on 2009-01-27 21:38 decode360 阅读(509) 评论(0) 编辑收藏所属分类: 05.SQL

新用户注册刷新评论列表


只有注册用户登录后才能发表评论。




网站导航: 博客园 IT新闻 Chat2DB C++博客博问管理
相关文章: 一段很绕的SQL(双重否定) SQLPlus中的COPY指令学习关于Views的Updatable问题 Update的另一种写法两段好玩的SQL[AskTom] 关于Break On [SQLPlus] ratio_to_report函数关于MERGE操作的一些建议 SQLPlus { sqlterminator \| escape } Oracle的转义字符

Decode360's Blog

公告

常用链接

留言簿(13)

随笔分类(397)

随笔档案(397)

文章分类(33)

新闻分类(15)

收藏夹(74)

Blog_List

IT_Web

My_Link

最新随笔

最新评论