Bloom Filters

October 23rd, 2007

The other day I was reading the 11g Database VLDB and Partitioning Guide and came across the below execution plan for a partial partition-wise join between sales and customers.

---------------------------------------------------------------------------
| Id  | Operation                             | Name      | Pstart| Pstop |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |           |       |       |
|   1 |  PX COORDINATOR                       |           |       |       |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10002  |       |       |
|*  3 |    FILTER                             |           |       |       |
|   4 |     HASH GROUP BY                     |           |       |       |
|   5 |      PX RECEIVE                       |           |       |       |
|   6 |       PX SEND HASH                    | :TQ10001  |       |       |
|   7 |        HASH GROUP BY                  |           |       |       |
|*  8 |         HASH JOIN                     |           |       |       |
|   9 |          PART JOIN FILTER CREATE      | :BF0000   |       |       |
|  10 |           PX RECEIVE                  |           |       |       |
|  11 |            PX SEND PARTITION (KEY)    | :TQ10000  |       |       |
|  12 |             PX BLOCK ITERATOR         |           |       |       |
|  13 |              TABLE ACCESS FULL        | CUSTOMERS |       |       |
|  14 |          PX PARTITION HASH JOIN-FILTER|           |:BF0000|:BF0000|
|* 15 |           TABLE ACCESS FULL           | SALES     |:BF0000|:BF0000|
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(COUNT(SYS_OP_CSR(SYS_OP_MSR(COUNT(*)),0))>100)
   8 - access("S"."CUST_ID"="C"."CUST_ID")
  15 - filter("S"."TIME_ID"<=TO_DATE(' 1999-10-01 00:00:00',
                                     'syyyy-mm-dd hh24:mi:ss') AND
              "S"."TIME_ID">=TO_DATE(' 1999-07-01 00:00:00',
                                     'syyyy-mm-dd hh24:mi:ss'))

At first, it may seem just like another parallel execution plan, but if you look again you may notice that the Pstart and Pstop values are :BF0000 (id 15) and :BF0000 also appears on as part of the PART JOIN FILTER CREATE (id 9). So what exactly is :BF0000? This is a bloom filter being applied as part of the DFO (data flow operator).

It’s not necessary to completely understand bloom filters to take advantage of them, but the next time you see :BF0000 in your parallel execution plan, you will be able to recognize what it is.

Post to Twitter Tweet This Post

Execution Plans, Optimizer, Oracle, SQL Tuning

  1. Trackbacks

  2. No trackbacks yet.
  1. Comments

  2. Christian Antognini
    October 23rd, 2007 at 23:04 | #1

    Hi Greg

    I guess you know it… But for the other readers that might ignore it just a short information from my part. Bloom filters are already available in 10gR2. Unfortunately, for some unknown reasons (at least for me), Oracle never documented such a wonderful feature!

    Take care,
    Chris

  3. October 24th, 2007 at 08:13 | #2

    You are correct Christian, bloom filters are in 10gR2 and I don’t believe they were specifically documented but probably fell under “Optimizer Enhancements” or similar.