Update: In Oracle 11g we now see extended optimizer statistics, an
alternative to dynamic_sampling for estimating result set sizes.
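As a rough illustration of what extended statistics look like in practice, the sketch below creates a column-group extension with dbms_stats.create_extended_stats and then regathers the table. The SCOTT.CUSTOMERS table and its STATE and COUNTRY columns are hypothetical names chosen for the example; they do not come from this article.

```sql
-- Assumption: SCOTT.CUSTOMERS exists and STATE and COUNTRY are correlated columns.
-- Create a column-group extension; the function returns the system-generated name.
SELECT dbms_stats.create_extended_stats('SCOTT', 'CUSTOMERS', '(STATE, COUNTRY)')
         AS extension_name
FROM   dual;

-- Regather table statistics so the new column group gets its own statistics.
BEGIN
  dbms_stats.gather_table_stats(
    ownname    => 'SCOTT',
    tabname    => 'CUSTOMERS',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO');
END;
/

-- List the extensions now defined on the table.
SELECT extension_name, extension
FROM   dba_stat_extensions
WHERE  owner = 'SCOTT'
AND    table_name = 'CUSTOMERS';
```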
PART 2 - CBO Statistics
The most important key to success with the CBO is to carefully define
and manage your statistics. For the CBO to make an intelligent
decision about the best execution plan for your SQL, it must have
information about the tables and indexes that participate in the
query. When the CBO knows the size of the tables and the distribution,
cardinality, and selectivity of column values, it can make an
informed decision and almost always generates the best execution plan.
As a review, the CBO gathers information from many sources, and it has
the lofty goal of using DBA-provided metadata to always make the "best"
execution plan decision:
[Figure: Oracle uses data from many sources to make an execution plan]
Let's examine the following areas of CBO
statistics and see how to gather top-quality statistics for the CBO
and how to create an appropriate CBO environment for your database.
Getting top-quality statistics for the CBO. The execution plans
chosen by the CBO are only as good as the statistics available to it.
The old-fashioned analyze table and dbms_utility methods for
generating CBO statistics are obsolete and somewhat dangerous to SQL
performance. Remember that the CBO relies on object statistics to
choose the best execution plan for every SQL statement.
The dbms_stats utility does a far better job of estimating statistics,
especially for large partitioned tables, and the better statistics
result in faster SQL execution plans. Here is a sample execution of
dbms_stats with the OPTIONS clause:
exec dbms_stats.gather_schema_stats( -
ownname => 'SCOTT', -
options => 'GATHER AUTO', -
estimate_percent => dbms_stats.auto_sample_size, -
method_opt => 'for all columns size repeat', -
degree => 34 -
)
Here is another dbms_stats example. With SIZE SKEWONLY, it creates histograms only on those indexed columns whose values are unevenly distributed:
BEGIN
dbms_stats.gather_schema_stats(
ownname=>'TPCC',
METHOD_OPT=>'FOR ALL INDEXED COLUMNS SIZE SKEWONLY',
CASCADE=>TRUE,
ESTIMATE_PERCENT=>100);
END;
/
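One way to confirm which columns actually received histograms after a run like the one above is to query the column statistics view. This is a minimal sketch that assumes a release where dba_tab_col_statistics exposes the HISTOGRAM column (Oracle 10g and later) and reuses the TPCC schema name from the example.

```sql
-- Show columns in the TPCC schema that currently have a histogram.
SELECT table_name,
       column_name,
       num_distinct,
       num_buckets,
       histogram
FROM   dba_tab_col_statistics
WHERE  owner = 'TPCC'
AND    histogram <> 'NONE'
ORDER  BY table_name, column_name;
```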
There are several values for the OPTIONS parameter that we need to
know about:
- GATHER reanalyzes the whole schema.
- GATHER EMPTY only analyzes tables that have no existing statistics.
- GATHER STALE only reanalyzes tables with more than 10 percent
modifications (inserts, updates, deletes).
- GATHER AUTO reanalyzes objects that currently have no statistics and
objects with stale statistics. Using GATHER AUTO is like combining
GATHER STALE and GATHER EMPTY.
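Before running one of these options against a large schema, it can help to preview which objects Oracle considers stale. The sketch below uses the LIST STALE value of the OPTIONS parameter, which is not discussed in this article, together with the objlist output argument of gather_schema_stats. The SCOTT schema name is only an example, and nothing is reanalyzed by this call.

```sql
SET SERVEROUTPUT ON

DECLARE
  -- Collection type supplied by dbms_stats for returning object lists.
  l_stale dbms_stats.ObjectTab;
BEGIN
  -- LIST STALE only reports stale objects; it does not gather statistics.
  dbms_stats.gather_schema_stats(
    ownname => 'SCOTT',
    options => 'LIST STALE',
    objlist => l_stale);

  FOR i IN 1 .. l_stale.COUNT LOOP
    dbms_output.put_line(l_stale(i).objtype || ' ' || l_stale(i).objname);
  END LOOP;
END;
/
```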
Note that both GATHER STALE and GATHER AUTO require monitoring. If you
issue the ALTER TABLE XXX MONITORING command, Oracle tracks changed
tables in the dba_tab_modifications view. Below we see that the number
of inserts, updates and deletes is tracked since the last analysis of
statistics:
SQL> desc dba_tab_modifications;
Name                 Type
--------------------------------
TABLE_OWNER          VARCHAR2(30)
TABLE_NAME           VARCHAR2(30)
PARTITION_NAME       VARCHAR2(30)
SUBPARTITION_NAME    VARCHAR2(30)
INSERTS              NUMBER
UPDATES              NUMBER
DELETES              NUMBER
TIMESTAMP            DATE
TRUNCATED            VARCHAR2(3)
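To relate these counters to the 10 percent staleness rule, the tracked changes can be compared with num_rows from the last statistics collection. The query below is a sketch under a few assumptions: the SCOTT schema name is an example, only table-level rows are considered, and the in-memory monitoring data is flushed first so the view is current.

```sql
-- Write pending monitoring data to the dictionary so the view is up to date.
EXEC dbms_stats.flush_database_monitoring_info;

-- Approximate percentage of rows changed since statistics were last gathered.
SELECT m.table_name,
       t.num_rows,
       m.inserts,
       m.updates,
       m.deletes,
       ROUND(100 * (m.inserts + m.updates + m.deletes) / t.num_rows, 1)
         AS pct_changed
FROM   dba_tab_modifications m
JOIN   dba_tables t
ON     t.owner = m.table_owner
AND    t.table_name = m.table_name
WHERE  m.table_owner = 'SCOTT'
AND    m.partition_name IS NULL   -- table-level rows only
AND    t.num_rows > 0             -- skip empty or never-analyzed tables
ORDER  BY pct_changed DESC;
```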
The most interesting of these options is GATHER STALE. Its rule is
that a table is reanalyzed once more than 10 percent of its rows have
changed (based on num_rows at statistics collection time). In a busy
OLTP database statistics become stale quickly, so almost every table
except read-only tables will be reanalyzed and GATHER STALE saves
little work. The option is therefore best for systems that are largely
read-only. For example, if only five percent of the database tables
get significant updates, then only five percent of the tables will be
reanalyzed with the GATHER STALE option.
Automating sample size with dbms_stats. The better the quality of the
statistics, the better the job that the CBO will do when determining
your execution plans. Unfortunately, doing a complete analysis on a
large database could take days, so most shops must sample their
database to get CBO statistics. The goal is to take a large enough
sample of the database to provide top-quality data for the CBO.
Now that we have seen how the dbms_stats OPTIONS parameter works,
let's see how to specify an adequate sample size for dbms_stats. In
earlier releases, the DBA had to guess what percentage of the database
provided the best sample size and sometimes underanalyzed the schema.
Starting with Oracle9i Database, the estimate_percent argument can be
set to dbms_stats.auto_sample_size, which lets dbms_stats
automatically estimate the "best" percentage of a segment to sample
when gathering statistics:
estimate_percent => dbms_stats.auto_sample_size
After collecting automatic sample sizes, you
can verify the accuracy of the automatic statistics sampling by looking at the
sample_size
column on any of these data dictionary views:
- DBA_ALL_TABLES
- DBA_INDEXES
- DBA_IND_PARTITIONS
- DBA_IND_SUBPARTITIONS
- DBA_OBJECT_TABLES
- DBA_PART_COL_STATISTICS
- DBA_SUBPART_COL_STATISTICS
- DBA_TABLES
- DBA_TAB_COLS
- DBA_TAB_COLUMNS
- DBA_TAB_COL_STATISTICS
- DBA_TAB_PARTITIONS
- DBA_TAB_SUBPARTITIONS
Note that Oracle generally chooses a
sample_size
from 5 to 20 percent when using automatic sampling, depending on the
size of the tables and the distribution of column values. Remember, the
better the quality of your statistics, the better the decision of the
CBO.
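As a quick check of what automatic sampling actually chose, the sample_size column can be compared with num_rows. This is a minimal sketch against DBA_TABLES only; the SCOTT schema name is an example, and the same comparison can be adapted to the other views listed above.

```sql
-- Percentage of each table's rows that was sampled at the last gather.
SELECT table_name,
       num_rows,
       sample_size,
       ROUND(100 * sample_size / num_rows, 1) AS pct_sampled,
       last_analyzed
FROM   dba_tables
WHERE  owner = 'SCOTT'
AND    num_rows > 0          -- avoid dividing by zero for empty tables
ORDER  BY table_name;
```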
Now that we understand the value of CBO
statistics, let's look at ways that the CBO statistics are managed in
a successful Oracle shop.
Original article: http://www.dba-oracle.com/art_otn_cbo_p2.htm