There Is No Time Like ‘%NOW%’ To Use Dynamic Sampling

I recently came across a query for which the Optimizer was making a poor cardinality estimate, which in turn caused an inefficient join method, which in turn caused the query to run excessively long. This post is a reenactment of my troubleshooting.

The Suspect SQL

The original SQL was quite large and had a fairly complex plan so I simplified it down to this test case for the purpose of this blog post:

select [...]
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       al1.business_type in ('002', '003', '007', '009') and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%' or
             al1.cust_po like '%TEST%'));
----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     |     9 |
|   3 |    PARTITION HASH ALL                |                     |     9 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          |     9 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%SUMMER OF CREATIVITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%TEST%')
   6 - access("AL1"."ORDER_TYPE"='Z010')

Looking at the plan I would surmise that the cardinality estimate for the fact table (line 4) must surely be greater than 9. To find out how far off the Optimizer’s guess is, I ran this query, which contains the filter predicates for FACT_TABLE:

select count(*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%' or
             al1.cust_po like '%TEST%'));

  COUNT(*)
----------
 1,324,510

As you can see, the cardinality estimate in this case is way off. The Optimizer estimated 9 rows and in reality the query returns 1.3 million rows, only a difference of 5 orders of magnitude (a factor of roughly 147,000). How could this be? Let’s try to understand why and where the cardinality estimate went wrong.
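To put a number on the miss, here is the back-of-the-envelope arithmetic (Python used purely as a calculator; the row counts are the ones shown above):

```python
# Quantify how far off the optimizer's estimate is, using the
# figures from the plan (9 rows) and the count(*) (1,324,510 rows).
import math

estimated = 9
actual = 1_324_510

ratio = actual / estimated          # ~147,168x off
orders = math.log10(ratio)          # ~5.2 orders of magnitude

print(f"off by {ratio:,.0f}x ({orders:.1f} orders of magnitude)")
```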

Bite Size Chunks

I find the easiest way to debug these issues is to start with one predicate and then add one predicate at a time, noting the cardinality estimate and comparing it to the actual cardinality.

One Predicate, Two Predicate, Red Predicate, Blue Predicate

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003'
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));
--------------------------------------------------------------
| Id  | Operation              | Name                | Rows  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |     1 |
|   1 |  SORT AGGREGATE        |                     |     1 |
|   2 |   PARTITION LIST SINGLE|                     |   141M|
|   3 |    PARTITION HASH ALL  |                     |   141M|
|   4 |     TABLE ACCESS FULL  | FACT_TABLE          |   141M|
--------------------------------------------------------------

   COUNT(*)
-----------
141,821,991

Looks good so far. REGION is the list partition key so we’d expect an accurate estimate.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010'
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));
------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  |
------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |     1 |
|   1 |  SORT AGGREGATE               |            |     1 |
|   2 |   PARTITION LIST SINGLE       |            |  1456K|
|   3 |    PARTITION HASH ALL         |            |  1456K|
|   4 |     BITMAP CONVERSION COUNT   |            |  1456K|
|*  5 |      BITMAP INDEX SINGLE VALUE| FACT_BX10  |       |
------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("AL1"."ORDER_TYPE"='Z010')

  COUNT(*)
----------
 1,324,642

No issues here; 1,456,000 is statistically equivalent to 1,324,642.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));
----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     | 72803 |
|   3 |    PARTITION HASH ALL                |                     | 72803 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          | 72803 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%')
   6 - access("AL1"."ORDER_TYPE"='Z010')

  COUNT(*)
----------
 1,324,642

With the addition of the NOT LIKE predicate we start to see a difference. This plan shows a 20x reduction from the previous cardinality estimate (1,456,000 / 72,803 ≈ 20). Let’s add one more NOT LIKE predicate and see what we get.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));
----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     |  3640 |
|   3 |    PARTITION HASH ALL                |                     |  3640 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          |  3640 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%')
   6 - access("AL1"."ORDER_TYPE"='Z010')


  COUNT(*)
----------
 1,324,642

With the addition of a second NOT LIKE predicate the cardinality estimate has dropped to 3,640 from 72,803, a 20x reduction.

explain plan for
select count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE'));
----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |     1 |
|   1 |  SORT AGGREGATE                      |                     |     1 |
|   2 |   PARTITION LIST SINGLE              |                     |   182 |
|   3 |    PARTITION HASH ALL                |                     |   182 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| FACT_TABLE          |   182 |
|   5 |      BITMAP CONVERSION TO ROWIDS     |                     |       |
|*  6 |       BITMAP INDEX SINGLE VALUE      | FACT_BX10           |       |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%SUMMER OF CREATIVITY%')
   6 - access("AL1"."ORDER_TYPE"='Z010')

   COUNT(*)
----------
 1,324,642

With the addition of the third NOT LIKE predicate the cardinality estimate has dropped from 3,640 to 182, another 20x reduction. Looks like we may have found the issue: each NOT LIKE predicate appears to result in a 20x reduction (5% selectivity) from the previous estimate. The original query had all four NOT LIKE predicates and a cardinality estimate of 9. If we work the math: 182 * 5% ≈ 9.

Looking at the query we can see there are four NOT LIKE predicates, each with a leading and trailing wildcard (%). Since DBMS_STATS does not gather column statistics on parts of strings, each NOT LIKE predicate gets a default selectivity guess of 5%. Given this query has four NOT LIKE predicates, the total reduction for those four predicates is 5%^4 = 1/160,000 = 0.00000625. That reduction is dramatic, in this case not at all representative of the data, and the root cause of the original query’s suboptimal access path and join method.
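The arithmetic can be sanity-checked in a few lines (again, Python just as a calculator; the base row count is the 1,456,000 estimate from the earlier plan):

```python
# Reproduce the optimizer's arithmetic: each wildcard NOT LIKE
# predicate gets a fixed 5% selectivity guess, and four of them
# multiply together.
base_estimate = 1_456_000      # estimate after REGION + ORDER_TYPE predicates
not_like_selectivity = 0.05    # default guess per wildcard NOT LIKE

combined = not_like_selectivity ** 4        # ~6.25e-06 = 1/160,000
final_estimate = base_estimate * combined   # the 9-row estimate in the plan

print(round(final_estimate))   # 9
```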

Dynamic Sampling To The Rescue

Dynamic Sampling was designed with cases like this in mind: cases where the Optimizer has to resort to selectivity guesses and could very well guess poorly. There is no simple heuristic for substring predicates, since the substring could appear anywhere in the string. Let’s see what the cardinality estimate is when I add a dynamic_sampling hint to the query.

explain plan for
select /*+ dynamic_sampling(al1 4) */
       count (*)
from   fact_table al1
where  al1.region = '003' and
       al1.order_type = 'Z010' and
       (not (al1.cust_po like '%MATERIAL HANDLING%' or
             al1.cust_po like '%SECURITY%' or
             al1.cust_po like '%SUMMER OF CREATIVITY%' or
             al1.cust_po like '%TEST%'))
/
select *
from table(dbms_xplan.display(format=>'BASIC ROWS PREDICATE NOTE'));
--------------------------------------------------------------
| Id  | Operation              | Name                | Rows  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |     1 |
|   1 |  SORT AGGREGATE        |                     |     1 |
|   2 |   PARTITION LIST SINGLE|                     |  1606K|
|   3 |    PARTITION HASH ALL  |                     |  1606K|
|*  4 |     TABLE ACCESS FULL  | FACT_TABLE          |  1606K|
--------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("AL1"."ORDER_TYPE"='Z010' AND
              "AL1"."CUST_PO" NOT LIKE '%MATERIAL HANDLING%' AND
              "AL1"."CUST_PO" NOT LIKE '%SECURITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%SUMMER OF CREATIVITY%' AND
              "AL1"."CUST_PO" NOT LIKE '%TEST%')

Note
-----
   - dynamic sampling used for this statement

With a level 4 dynamic_sampling hint the Optimizer estimates 1.6 million rows, very close to the actual value of 1.3 million rows. This estimate is surely close enough to yield the optimal access path and join method, such that the original query should perform well.
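To illustrate the principle (a toy Python sketch, not Oracle internals; the column values and sample size are invented for the example), sampling measures a wildcard predicate’s selectivity directly instead of applying a fixed 5% guess:

```python
# Toy illustration of sampling-based selectivity estimation:
# estimate how many rows survive a NOT LIKE '%TEST%'-style predicate
# by probing a random sample rather than relying on a fixed guess.
import random

random.seed(42)

# Synthetic CUST_PO values: 99% ordinary POs, 1% containing 'TEST'.
rows = ["PO-%06d" % i for i in range(99_000)] + \
       ["TEST-%04d" % i for i in range(1_000)]

sample = random.sample(rows, 1_000)   # sample instead of a full scan
est_selectivity = sum("TEST" not in r for r in sample) / len(sample)

# A fixed 5% guess would predict 5,000 surviving rows;
# the sample-based estimate lands close to the true 99,000.
print(round(est_selectivity * len(rows)))
```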

Summary

There are cases where the Optimizer guesses, and sometimes it guesses poorly. When that happens, dynamic sampling can be used to give the Optimizer a better cardinality estimate. Dynamic sampling is probably best suited for queries that run for minutes or longer, so that the overhead of the sampling query is only a fraction of the total run time. In those cases, the minimal overhead of dynamic sampling can well outweigh the cost of a suboptimal plan.

All tests were performed on 11.1.0.6.0.

11 comments

  1. Steve Catmull

    I work on a relatively large data warehouse (multi-terabyte). The number one tuning hint I tell people is to help Oracle get the cardinality right along each step of the operation. If Oracle gets that right, it will usually come up with a good plan as far as join orders, join types and access paths.

    This hint is one of my favorites because often we have data that violates the core assumption that no two columns are correlated (hence a reason for extended statistics in 11g).

    It’s not perfect. It only helps improve cardinalities of a single table. It does not affect (near as I can tell in 9i/10g) the join cardinality calculation.

    The thing I love about this hint though is that it is basically maintenance proof. We are not “locking” the optimizer into a certain order, join type or access path. We’re just saying maybe you should sample the data first before you decide.

    When new functionality is added to the optimizer, we don’t have to re-evaluate hinted queries because this hint does not advocate a single approach. Purely delightful.

  2. Greg Rahn

    You are completely spot on Steve. Sometimes sampling data is the only way to get the true relationship between the columns.

    Currently multi-column/extended statistics only work for single equality predicates so dynamic sampling may be one of the only ways to get the cardinality estimate close.

  3. David Aldridge

    I find dynamic sampling to be very seductive. For quite a while I’ve been telling report developers who have problems with execution plans to turn up the dynamic sampling setting through a hint to see if the plan improves; sometimes the default sampling rate is not high enough, and frankly it’s often an easy win.

    If they’d just learn to apply dynamic sampling by default then I’d be practically out of a job.

  4. Pingback: Choosing An Optimal Stats Gathering Strategy | Structured Data
  5. Todor Botev

    Great post! I liked the “methodology” of adding the predicates one at a time.

    Why did you choose level 4 for dynamic sampling? According to the docs (11g), when you use a hint like in your case, level 1 could be enough.

  6. Greg Rahn

    Todor-

    I had chosen 4 because I had initially used optimizer_dynamic_sampling = 4. I do recognize that they are not equivalent. I just ran a quick test through all 10 levels of the hint. Here are the results:

    DS Level  Rows
    -------- -----
    1            9
    2         355K
    3        1874K
    4        1606K
    5        1249K
    6         654K
    7        1419K
    8         953K
    9        1232K
    10       1324K
    

    Interestingly enough, a larger sample does not always mean a more accurate estimate. This is probably data dependent. For some reason, level 1 did not produce a note in the explain plan that dynamic sampling was used, hence the low estimate of 9. I’ll investigate this.

  7. Todor Botev

    OK, so playing around with the level seems to be worth the effort.

    As for level 1 – could it be that level 1 (OPTIMIZER HINT) only applies to unanalyzed tables and that this is missing from the docs?

  8. Rainer Stenzel

    My first question was (and still is): why is the optimizer not regarding the NOT operator?
    In my opinion the estimated selectivity for the negated LIKE predicate should be:
    s(NOT LIKE) = 1 - s(LIKE)
    but instead the two seem to be equal.
    Applying this to the presented query, the total reduction for those four predicates should only be 95%^4 (about 0.8) instead of 5%^4, resulting in an approximately 100,000 times greater cardinality estimate.
    Regards,
    Rainer Stenzel
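Rainer’s arithmetic can be checked quickly (the 5% figure is the fixed guess implied by the plans above; the 95% figure assumes his complement rule):

```python
# Compare the complement rule s(NOT LIKE) = 1 - s(LIKE) against the
# fixed 5% guess the optimizer appears to use for each NOT LIKE.
complement = 0.95 ** 4   # ~0.8145: four NOT LIKEs barely filter anything
observed   = 0.05 ** 4   # 6.25e-06: what the plans above imply

print(round(complement, 3))          # 0.815
print(round(complement / observed))  # 130321, i.e. roughly 100,000x greater
```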

  9. Rainer Stenzel

    Additional question and remark after some tests:
    Was the presented scenario investigated with Oracle 9i?
    With Oracle 10g I found modified selectivities
    with literals:
    5% for column LIKE wild_card_prefixed_pattern
    50% for column NOT LIKE wild_card_prefixed_pattern
    respectively, with bind variables:
    1% for column LIKE pattern_bind_variable
    5% for column NOT LIKE pattern_bind_variable
    undocumented and consequently quite weird.
    Regards,
    Rainer Stenzel

  10. Greg Rahn

    @Rainer

    Q: Why is the optimizer not regarding the NOT operator?
    In my opinion the estimated selectivity for the negated LIKE predicate should be:
    s(NOT LIKE) = 1 - s(LIKE)

    A: Because that is not currently the logic in the code. They are not equivalent today.

    I only performed my tests on 11.1.0.6.

    If you check out Cost-Based Oracle Fundamentals by Jonathan Lewis, page 133, table 6-6 “Fixed Percentages Used for Selectivity on Character Expressions”, you can see his observations.

  11. Rainer Stenzel

    Hello Greg,
    unfortunately I did not see your mention of 11.1.0.6 the first time, and I cannot currently test on 11g myself. I did some tests on 10g again and found:
    1) if using literals (or bind peeking ?) the optimizer
    1.1) is applying s(column NOT LIKE pattern)=1-s(column LIKE pattern) as expected and
    1.2) is using different fixed selectivities for wildcard prefixed patterns and not wildcard prefixed patterns.
    1.3) is using different fixed selectivities for non wildcard prefixed patterns with varying length.
    2) is using the same selectivity (5%) for LIKE :v as for NOT LIKE :v when using bind variables and (however) not using bind peeking.
    Maybe things changed again with 11g and adaptive cursor sharing,
    ==> but at least on 10g I would expect a better cardinality estimate from the optimizer than your example presents.
    Have a look at the different cardinality assumptions at the example below:

    select * from T where UID not like 'LMN%'
    union all
    select * from T where UID like 'LMN%'
    union all
    select * from T where UID not like '%LMN%'
    union all
    select * from T where UID like '%LMN%'
    union all
    select * from T;
    
    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=14476 Card=6013368 Bytes=234521352)
       1    0   UNION-ALL
       2    1     TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=3624 Card=2004453 Bytes=78173667)
       3    1     TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=1 Card=3 Bytes=117)
       4    3       INDEX (RANGE SCAN) OF 'PIT' (INDEX (UNIQUE)) (Cost=1 Card=3)
       5    1     TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=3624 Card=1904233 Bytes=74265087)
       6    1     TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=3618 Card=100223 Bytes=3908697)
       7    1     TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=3609 Card=2004456 Bytes=78173784)
    
    variable b1 varchar2(20);
    
    select * from T where UID not like :b1
    union all
    select * from T where UID like :b1
    union all
    select * from T;
    
    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=7929 Card=2204902 Bytes=85991178)
       1    0   UNION-ALL
       2    1     TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=3618 Card=100223 Bytes=3908697)
       3    1     TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=702 Card=100223 Bytes=3908697)
       4    3       INDEX (RANGE SCAN) OF 'PIT' (INDEX (UNIQUE)) (Cost=10 Card=18040)
       5    1     TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=3609 Card=2004456 Bytes=78173784)
    
