An important note for this example - the placement of the ALTER SESSION SET SQL_TRACE=TRUE is important. I needed to set it before running the query the first time. As an exercise - move it to just be before the second execution of the query and you'll find (from the tkprof) that the query is hard parsed the second time - and the row source operation in the tkprof will be a full scan. That is because the first time a query is executed with sql_trace=true (as opposed to the default of false), it will be hard parsed - as of right now.
Explain plan is blind to the bind
Explain plan does not "bind peek". This is pretty easy to observe:
ops$tkyte%ORA10GR2> create table t
2 as
3 select a.*, 1 id
4 from all_objects a
5 where rownum <= 5000;
Table created.
ops$tkyte%ORA10GR2> update t
2 set id = 99
3 where rownum = 1;
1 row updated.
ops$tkyte%ORA10GR2> create index t_idx on t(id);
Index created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size 254' );
PL/SQL procedure successfully completed.
So we have created some skewed data. If we say "where id=1", we would expect a full scan (index would be inefficient). If we say "where id = 99", we would expect an index range scan - as id=99 returns a single row. Using two queries that differ only in bind names (which is sufficient to prevent cursor sharing - these are two DIFFERENT queries to Oracle!), we'll execute a query with a bind set to the value 99 and then another with a bind set to 1.
ops$tkyte%ORA10GR2> variable x_is_99_first number
ops$tkyte%ORA10GR2> variable x_is_1_first number
ops$tkyte%ORA10GR2> exec :x_is_99_first := 99; :x_is_1_first := 1;
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> alter session set sql_trace=true;
Session altered.
ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_99_first;
ID OBJECT_NAME
---------- ------------------------------
99 ICOL$
ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_1_first;
ID OBJECT_NAME
---------- ------------------------------
1 I_USER1
....
1 USER_SCHEDULER_PROGRAM_ARGS4999 rows selected.
Now we'll just flip flop the values and re-execute the queries. Note that they will soft parse, just reuse the existing plans generated from above.
ops$tkyte%ORA10GR2> exec :x_is_99_first := 1; :x_is_1_first := 99;
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_99_first;
ID OBJECT_NAME
---------- ------------------------------
1 I_USER1
...
1 USER_SCHEDULER_PROGRAM_ARGS
4999 rows selected.
ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_1_first;
ID OBJECT_NAME
---------- ------------------------------
99 ICOL$
ops$tkyte%ORA10GR2> alter session set sql_trace=false;
Session altered.
Reviewing the TKPROF report first for the "x is 99 first" query we see:
select id, object_name from t where id = :x_is_99_first
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 3 0 1
Misses in library cache during parse: 1
Misses in library cache during execute: 1O
ptimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)
Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY INDEX ROWID T (cr=3 pr=0 pw=0 time=59 us)
1 INDEX RANGE SCAN T_IDX (cr=2 pr=0 pw=0 time=32 us)(object id 70394)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)
********************************************************************************
select id, object_name from t where id = :x_is_99_first
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 335 0.02 0.02 0 739 0 4999
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 337 0.02 0.02 0 739 0 4999
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)
Rows Row Source Operation
------- ---------------------------------------------------
4999 TABLE ACCESS BY INDEX ROWID T (cr=739 pr=0 pw=0 time=50042 us)
4999 INDEX RANGE SCAN T_IDX (cr=344 pr=0 pw=0 time=30018 us)(object id 70394)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
4999 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)
So, the "real plan" used is an index range scan - both times. But, explain plan - which cannot, does not bind peek - will say "full scan". The reason? explain plan is optimizing "select * from t where id = ?" - and it says "5,000 rows, 2 values of id, id is not selective, full scan". But the optimizer is optimizing the query "select * from t where id = 99" - because it peeked at the bind the first time! The soft parse won't peek (else it would be a hard parse!) and just reused the existing plan - the inefficient range scan to read every row out.
On the other hand, looking at the "x is 1 first" query:
select id, object_name from t where id = :x_is_1_first
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 335 0.01 0.01 0 398 0 4999
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 337 0.01 0.01 0 398 0 4999
Misses in library cache during parse: 1
Misses in library cache during execute: 1O
ptimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)
Rows Row Source Operation
------- ---------------------------------------------------
4999 TABLE ACCESS FULL T (cr=398 pr=0 pw=0 time=15094 us)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
4999 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)
********************************************************************************
select id, object_name from t where id = :x_is_1_first
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 67 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 67 0 1
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)
Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS FULL T (cr=67 pr=0 pw=0 time=82 us)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)
Explain plan appears to have gotten it right - but only by accident. It is just a coincidence that the plans "match" - they were arrived at by very different thought processes. The optimizer optimized 'where id=1' and said "about 5,000 rows, about 4,999 will be returned, full scan". The explain plan optimized "where id=?" and said "about 5,000 rows in the table, two values for ID, about 50% of the table will be returned, full scan".
So, that example shows explain plan "getting it wrong" because it is blind to the bind - and shows the effect of bind variable peeking (which you can read more about on asktom using the link above...)
Explain plan doesn't see your datatype...
The last bit about explain plan I'll look at is the fact that explain plan doesn't see your bind datatype. It presumes all binds are varchar2's regardless of how the developer is binding. Consider:
ops$tkyte%ORA10GR2> create table t
2 ( x varchar2(10) primary key,
3 y date
4 );
Table created.
ops$tkyte%ORA10GR2> insert into t values ( 1, sysdate );
1 row created.
ops$tkyte%ORA10GR2> variable x number
ops$tkyte%ORA10GR2> exec :x := 1
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> alter session set sql_trace=true;
Session altered.
ops$tkyte%ORA10GR2> select * from t where x = :x;
X Y
---------- ---------
1 06-APR-07
ops$tkyte%ORA10GR2> alter session set sql_trace=false;
Session altered.
So, we have a table with a varchar2 datatype for the primary key - but we only stuff numbers in there. End users and developers know it is always a number and then presume the type is a number (makes sense) - but someone used the wrong datatype (just in case maybe....). When we look at the TKPROF we'll see the explain plan mismatch:
select * from t where x = :x
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 1 0 0
Fetch 2 0.00 0.00 0 7 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 8 0 1
Misses in library cache during parse: 1O
ptimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)
Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS FULL T (cr=7 pr=0 pw=0 time=76 us)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE)
0 INDEX (UNIQUE SCAN) OF 'SYS_C0013586' (INDEX (UNIQUE))
Explain plan - the 'execution plan' shows an index unique scan, but reality (the row source operation) shows we full scanned. DBMS_XPLAN (autotrace in 10gr2 uses these new package introduced in 9ir2, you can use it directly if you like) shows us why we are full scanning:
ops$tkyte%ORA10GR2> select * from t where x = to_number(:x);
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 16 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("X")=TO_NUMBER(:X))
Note
-----
- dynamic sampling used for this statement
So, when I told explain plan "hey, we have a NUMBER here" using to_number(), we can see what happened. In the predicate information, we see that when you compare a number to a string, Oracle will TO_NUMBER(the string). That of course makes using the index on the string not possible!
So, this example shows two things. Firstly, that explain plan assumes varchar2 (so use to_number or literals!! and to_date to get the right type conveyed to explain plan). Secondly, that implicit conversions are evil and should always be avoided.