
Oracle Sun FAQ

On Monday, April 20, 2009, Oracle announced that it had agreed to acquire Sun Microsystems. Since then there has been much speculation and many questions raised around numerous areas of the deal. There is an official FAQ that discusses many of them, but I thought I would highlight three that seem to be fairly popular around the blogosphere:

Will the ownership of Solaris change Oracle’s position on Linux?
No. This transaction enhances our commitment to open standards and choice. Oracle is as committed as ever to Linux and other platforms and will continue to support and enhance our strong industry partnerships.

What does Oracle plan to do with MySQL?
MySQL will be an addition to Oracle’s existing suite of database products, which already includes Oracle Database 11g, TimesTen, Berkeley DB open source database, and the open source transactional storage engine, InnoDB.

What impact does this announcement have on the HP Oracle Exadata Storage Server and HP Oracle Database Machine products?
There is no impact. Oracle remains fully committed to the HP Oracle Exadata Storage Server and HP Oracle Database Machine products.

Database Customer Benchmarketing Reports

A few weeks ago I read Curt Monash’s report on interpreting the results of data warehouse proofs-of-concept (POCs) and I have to say, I’m quite surprised that this topic hasn’t been covered more by analysts in the data warehousing space. I understand that analysts are not database performance engineers, but where do they think the performance claims of 10x to 100x or more come from? Do they actually investigate these claims or just report on them? I cannot say that I have ever seen any database analyst offer technical insight into these boasts of performance. If some exist, be sure to leave a comment and point me to them.

Oracle Exadata Performance Architect Kevin Closson has blogged about a 485x performance increase of Oracle Exadata vs. Oracle, and his follow-up post explaining exactly where the 485x performance gain comes from gave me the nudge to finish this post, which had been sitting in my drafts folder since I first read Curt’s post.

Customer Benchmarketing Claims

I thought I would compile a list of what the marketing folks at other database vendors are saying about the performance of their products. Each of these statements has been taken from the given vendor’s website.

  • Netezza: 10-100 times faster than traditional solutions…but it is not uncommon to see performance differences as large as 200x to even 400x or more when compared to existing Oracle systems
  • Greenplum: often 10 to 100 times faster than traditional solutions
  • DATAllegro: 10-100x performance over traditional platforms
  • Vertica: Performs 30x-200x faster than other solutions
  • ParAccel: 20X – 200X performance gains
  • EXASolution: can perform up to 100 times faster than with traditional databases
  • Kognitio WX2: Tests have shown to out-perform other database / data warehouse solutions by 10-60 times

It certainly seems these vendors are positioning themselves against “traditional” database solutions, whatever that means. And differences as large as 400x against Oracle? What exactly is it they are comparing?

Investigative Research On Netezza’s Performance Claims

Using my favorite Internet search engine I came across this presentation by Netezza, dated October 2007. On slide 21 Netezza compares an NPS 8150 (112 SPUs, up to 4.5 TB of user data) server to IBM DB2 UDB on a p680 with 12 CPUs (the existing solution). Not being extremely familiar with the IBM hardware mentioned, I thought I’d research exactly what an IBM p680 server consists of. The first link in my search results took me to a page which states:

The IBM eServer pSeries 680 has been withdrawn from the market, effective March 28, 2003.

Searching a bit more I came across this page, which states that the 12 CPUs in the pSeries 680 are RS64 IV microprocessors. According to Wikipedia, the “RS64-IV or Sstar was introduced in 2000 at 600 MHz, later increased to 750 MHz”. Given that, at best, the p680 had 12 CPUs running at 750 MHz while the NPS 8150 had 112 440GX PowerPC processors, I would give the compute advantage to Netezza by a significant margin. I guess it is cool to brag how your most current hardware beat up on some old, used and abused server that has already been served its end-of-life notice. I found it especially intriguing that Netezza is boasting about beating an IBM p680 server that was end-of-lifed more than four years prior to the presentation’s date. Perhaps they don’t have any more recent bragging to do?

Going back one slide to #20 you will notice a comparison of Netezza and Oracle. Netezza clearly states they used an NPS 8250 (224 SPUs, up to 9 TB of user data) against Oracle 10g RAC running on Sun/EMC. Well ok…Sun/EMC what??? Obviously there were at least 2 Sun servers, since Oracle 10g RAC is involved, but they don’t mention the server models at all, nor the storage, nor the storage connectivity to the hosts. Was this two or more Sun Netra X1s or what??? Netezza boasts a 449x improvement in a “direct comparison on one day’s worth of data”. What exactly is being compared is left to the imagination. It could be one query or many queries, but the marketeers intentionally fail to say. They don’t even mention the data set size being compared. Given that Netezza can read data off the 224 drives at 60-70 MB/s, the NPS 8250 has a total scan rate of over 13 GB/s. I can tell you firsthand that there are very few Sun/EMC solutions configured to support 13 GB/s of I/O bandwidth. Most configurations of that vintage probably don’t support 1/10th of that (1.3 GB/s).

Here are a few more comparisons that I have seen in Netezza presentations:

  • NPS 8100 (112 SPUs/4.5 TB max) vs. SAS on Sun E5500/6 CPUs/6GB RAM
  • NPS 8100 (112 SPUs/4.5 TB max) vs. Oracle 8i on Sun E6500/12 CPUs/8 GB RAM
  • NPS 8400 (448 SPUs/18 TB max) vs. Oracle on Sun (exact hardware not mentioned)
  • NPS 8100 (112 SPUs/4.5 TB max) vs. IBM SP2 (database not mentioned)
  • NPS 8150z (112 SPUs/5.5 TB max) vs. Oracle 9i on Sun/8 CPUs
  • NPS 8250z (224 SPUs/11 TB max) vs. Oracle 9i on Sun/8 CPUs

As you can see, Netezza has a way of finding the oldest hardware around and then comparing it to its latest, greatest NPS. Just like Netezza’s slogan, [The Power to ]Question Everything™, I suggest you question these benchmarketing reports. Database software is only as capable as the hardware it runs on, and when Netezza targets the worst-performing and oldest systems out there, they are bound to get some good marketing numbers. When they compete against the latest, greatest database software running on the latest, greatest hardware, sized competitively for the NPS being used, the results are drastically different. I can vouch for that firsthand, having done several POCs against Netezza.

One Benchmarketing Claim To Rule Them All

Now, one of my favorite benchmarketing reports comes from Vertica. Michael Stonebraker’s blog post on customer benchmarks contains the following table:

[Image: vertica_benchmark_table.png – Vertica customer benchmark comparison table]

Take a good look at the Query 2 results. Vertica takes a query running in the current row store from 4.5 hours (16,200 seconds) down to 1 second, for a performance gain of 16,200x. Great googly moogly batman, that is reaching ludicrous speed. Heck, who needs 100x or 400x when you can do 16,200x? That surely warrants an explanation of the techniques involved; it’s much, much more than simply column store vs. row store. It does raise the question (at least to me): why doesn’t Vertica run every query in 1 second? I mean, come on, why does that 19-minute row store query score no better than a 30x gain? Obviously there is a bit of magic pixie dust going on here with, what I would refer to as, “creative solutions” (in reality it is likely just a very well designed projection/materialized view, but showing the query and explaining how it was achieved would make it far less impressive).
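To illustrate the kind of trick I mean (this is an Oracle analogy of my own, not a claim about what Vertica actually did; the table and object names are hypothetical), a precomputed summary can turn an hours-long aggregation into a sub-second lookup:

-- Hypothetical Oracle analogy: precompute the answer so the "query"
-- becomes a lookup against a tiny summary instead of a huge fact table.
CREATE MATERIALIZED VIEW sales_by_day_mv
ENABLE QUERY REWRITE
AS
SELECT   sale_date, SUM(amount) AS total_amount, COUNT(*) AS row_cnt
FROM     sales
GROUP BY sale_date;

-- With query rewrite, this aggregate is answered from the small summary
-- in well under a second, regardless of how large SALES is:
SELECT   sale_date, SUM(amount)
FROM     sales
WHERE    sale_date = DATE '2008-01-15'
GROUP BY sale_date;

A benchmark that compares such a precomputed structure against a cold full-table scan on the old system says very little about the engines themselves.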

What Is Really Going On Here

First of all, you will notice that not one of these benchmarketing claims is against a vendor-run system. Each and every one of these claims is against an existing customer system. The main reason for this is that most vendors’ licensing agreements prohibit publishing benchmark results without prior consent from the vendor. It seems the creative types have found that taking the numbers from the existing production system is not prohibited by the license agreement, so they compare those numbers to their latest, greatest hardware/software and execute (or supervise the execution of) a benchmark on their own solution. Obviously this is a one-sided, apples-to-bicycles comparison, but quite favorable for bragging rights for the new guy.

I’ve been doing customer benchmarks and proofs of concept (POCs) for almost 5 years at Oracle. I can guarantee you that Netezza has never even come close to getting 10x-100x the performance over Oracle running on a competitive hardware platform. I can also say it is not uncommon for Oracle running on a balanced system to perform 10x to 1000x (ok, in extreme cases) over an existing, poorly performing Oracle system. All it takes is a very unbalanced system with no I/O bandwidth, no use of parallel query, no compression, and poor or no use of partitioning, and you have created a springboard for any vendor to look good.

One More Juicy Marketing Tidbit

While searching the Internet for creative marketing reports I have to admit that the crew at ParAccel probably takes the cake (and not in an impressive way). On one of their web pages they have these bullet points (plus a few more uninteresting ones):

  • All operations are done in parallel (A non-parallel DBMS must scan all of the data sequentially)
  • Adaptive compression makes disks faster…

Ok, so I can kinda, sorta see the point that a non-parallel DBMS must do something sequentially…not sure how else it would do it, but then again, I don’t know any enterprise database that is not capable of parallel operations. However, I’m going to need a bit of help on the second point there…how exactly does compression make disks faster? Disks are disks. Whether or not compression is involved has nothing to do with how fast a disk is. Perhaps they mean that compression can increase the logical read rate from a disk given that compression allows more data to be stored in the same “space” on the disk, but that clearly is not what they have written. Reminds me of DATAllegro’s faster-than-wirespeed claims on scan performance. Perhaps these marketing guys should have their numbers and wording validated by some engineers.

Do You Believe In Magic Or Word Games?

Credible performance claims need to be accounted for and explained. Neil Raden from Hired Brains Research offers guidance for evaluating benchmarks and interpreting market messaging in his paper, Questions to Ask a Data Warehouse Appliance Vendor. I think Neil shares the same opinion of these silly benchmarketing claims. Give his paper a read.

Preprocessor For External Tables

Before External Tables existed in the Oracle database, loading from flat files was done via SQL*Loader. One option some used was to keep the text file compressed and load it with SQL*Loader via a named pipe. This meant the file never had to be extracted to disk, where it could take up several times the space of the compressed file. As of 11.1.0.7, a similar feature is available for External Tables (and it will be in 10.2.0.5 as well). This enhancement is the result of Bug 6522622, which is mentioned in the Bugs fixed in the 11.1.0.7 Patch Set note. Unfortunately it appears there aren’t any notes on how to actually use the External Table Preprocessor, so allow me to give some insight into its use.
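For those unfamiliar with the old named-pipe trick, it looked roughly like this (a minimal sketch; the control file name and credentials are hypothetical):

$ mkfifo /tmp/load_pipe
$ gunzip -c customer_address.dat.gz > /tmp/load_pipe &
$ sqlldr userid=scott/tiger control=customer_address.ctl data=/tmp/load_pipe
$ rm /tmp/load_pipe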

The PREPROCESSOR clause is part of the record_format_info clause. The syntax of the PREPROCESSOR clause is as follows:

PREPROCESSOR [directory_spec:] file_spec [preproc_options_spec]

It’s pretty straightforward when you see an example. Note the PREPROCESSOR line in the ACCESS PARAMETERS clause below.

create or replace directory load_dir as '/data/tpc-ds/flat_files/1gb';
create or replace directory log_dir  as '/tmp';
create or replace directory exec_dir as '/bin';
--
-- ET_CUSTOMER_ADDRESS
--
DROP TABLE ET_CUSTOMER_ADDRESS;
CREATE TABLE ET_CUSTOMER_ADDRESS
(
    "CA_ADDRESS_SK"                  NUMBER
   ,"CA_ADDRESS_ID"                  CHAR(16)
   ,"CA_STREET_NUMBER"               CHAR(10)
   ,"CA_STREET_NAME"                 VARCHAR2(60)
   ,"CA_STREET_TYPE"                 CHAR(15)
   ,"CA_SUITE_NUMBER"                CHAR(10)
   ,"CA_CITY"                        VARCHAR2(60)
   ,"CA_COUNTY"                      VARCHAR2(30)
   ,"CA_STATE"                       CHAR(2)
   ,"CA_ZIP"                         CHAR(10)
   ,"CA_COUNTRY"                     VARCHAR2(20)
   ,"CA_GMT_OFFSET"                  NUMBER
   ,"CA_LOCATION_TYPE"               CHAR(20)
)
ORGANIZATION EXTERNAL
(
   TYPE oracle_loader
   DEFAULT DIRECTORY load_dir
   ACCESS PARAMETERS
   (
      RECORDS DELIMITED BY NEWLINE
      PREPROCESSOR exec_dir:'gunzip' OPTIONS '-c'
      BADFILE log_dir: 'CUSTOMER_ADDRESS.bad'
      LOGFILE log_dir: 'CUSTOMER_ADDRESS.log'
      FIELDS TERMINATED BY '|'
      MISSING FIELD VALUES ARE NULL
      (
          "CA_ADDRESS_SK"
         ,"CA_ADDRESS_ID"
         ,"CA_STREET_NUMBER"
         ,"CA_STREET_NAME"
         ,"CA_STREET_TYPE"
         ,"CA_SUITE_NUMBER"
         ,"CA_CITY"
         ,"CA_COUNTY"
         ,"CA_STATE"
         ,"CA_ZIP"
         ,"CA_COUNTRY"
         ,"CA_GMT_OFFSET"
         ,"CA_LOCATION_TYPE"
      )
   )
   LOCATION ('customer_address.dat.gz')
)
REJECT LIMIT UNLIMITED
;

SQL> select count(*) from ET_CUSTOMER_ADDRESS;

  COUNT(*)
----------
     50000

Now let’s double check:

$ gunzip -c customer_address.dat.gz | wc -l
50000

Note: For security reasons, the preprocessor option does not allow the use of the |, &, and $ characters.

This is a great enhancement for those who transport compressed files around their networks and want to load them directly into their database via External Tables. One advantage of this feature is that when loading flat files from an NFS staging area, the network traffic is reduced by N, where N is the compression ratio of the file. For example, if your flat file compresses 10x (which is not uncommon), then you get an effective gain of 10x the throughput for the same network bandwidth. Or if you like, the required network bandwidth is reduced 10x to transfer the same logical data set. In this case the compression rate was 4x.

There are a few things to be aware of when using this feature. If the external table is parallel, then the number of files in the External Table LOCATION clause should be equal to or greater than the degree of parallelism (DOP). This is because the preprocessor outputs a stream, and that stream cannot be broken down into granules for multiple Parallel Query slaves to work on the way a normal uncompressed text file can. Each PQ slave can work on at most one file/stream. For example, if you have a DOP of 16 set on the external table but only 10 files, 10 PQ slaves will be busy and 6 will be idle, as there are more slaves than files. This means that to get optimal throughput the number of files should be a multiple of the DOP. Obviously this is not always possible, so the recommendation is to have more, smaller files vs. fewer, larger files. This will limit the skew in the workload for the PQ slaves if/when there are “remainder” files.
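As a sketch of what that looks like in practice (the split file names here are hypothetical), one might split the source file into 8 compressed pieces and set the DOP to match:

-- Hypothetical: one compressed file per PQ slave at DOP 8.
ALTER TABLE ET_CUSTOMER_ADDRESS
   LOCATION ('customer_address_1.dat.gz'
            ,'customer_address_2.dat.gz'
            ,'customer_address_3.dat.gz'
            ,'customer_address_4.dat.gz'
            ,'customer_address_5.dat.gz'
            ,'customer_address_6.dat.gz'
            ,'customer_address_7.dat.gz'
            ,'customer_address_8.dat.gz');

ALTER TABLE ET_CUSTOMER_ADDRESS PARALLEL 8;

-- Each of the 8 PQ slaves gunzips and scans exactly one file/stream.
CREATE TABLE CUSTOMER_ADDRESS PARALLEL 8 NOLOGGING
AS SELECT * FROM ET_CUSTOMER_ADDRESS;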

Hopefully you will find this enhancement very useful. I sure do.

Oracle Exadata: In Response to Chuck Hollis

Chuck Hollis, VP and Global Marketing CTO at EMC, has written a couple of blog posts offering his thoughts on Oracle Exadata. The first was “Oracle Does Hardware”, which he wrote the day after the product launch. The second, unimpressively titled “I Annoy Kevin Closson at Oracle”, came on Monday, October 20th, in response to a blog post by Exadata Performance Architect Kevin Closson, who had commented on Chuck’s first post and on some comments left on Kevin’s blog.

Clearly Stated Intentions

Since Chuck has disabled comments for his “I Annoy Kevin” post, I’m going to write my comments here. I have no intention of getting into a fact-less debate turned flame war, but I will make some direct comments, with supporting facts and numbers, while keeping it professional.

Storage Arrays: Bottleneck or Not?

Chuck thinks:

“…array-based storage technology is not the bottleneck; our work with Oracle [on the Oracle Optimized Warehouse Initiative] and other DW/BI environments routinely shows that we can feed data to a server just as fast as it can take it.”

First let me comment on the Optimized Warehouse Initiative. There have been some good things that have come out of this effort. I believe it has increased the level of awareness when it comes to sizing storage for BI/DW workloads. All too often storage sizing for BI/DW is done by capacity, not I/O bandwidth. The focus is on building balanced systems: systems that can execute queries and workloads such that no one component (CPU/storage connectivity/disk array/disk drives) becomes the bottleneck prematurely. The industry seems to agree: IBM has the Balanced Warehouse and Microsoft has a reference architecture for Project Madison as well.

So the question comes back to: is array-based storage technology the bottleneck or not? I would argue it is. Perhaps I would use a word other than “bottleneck”, but let’s be clear on the overall challenge here: to read data off disk quickly and return it efficiently to the database host so it can be processed as fast as possible.

Let’s start at the bottom of the stack: hard disk drives. If the challenge is to scan lots of data fast, then how fast data can be read off disk is the first important metric to consider. In the white paper Deploying EMC CLARiiON CX4-960 for Data Warehouse/Decision Support System (DSS) Workloads EMC reports a drive scan rate (for a BI/DW workload) of 20 MB/s using 8+1 RAID-5 and 33 MB/s using a 2+1 RAID-5 LUN configuration. Oracle Exadata delivers drive scan rates around 85 MB/s, a difference of 2.5X to 4.25X. To understand the performance impact of this I’ve put together a few tables of data based on these real workload numbers.
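To make the arithmetic behind the following tables explicit: a configuration’s total scan rate is simply the per-drive scan rate multiplied by the number of drives, so the drive count required for a target scan rate falls out directly. Using the per-drive rates above:

\[
\text{drives needed} = \frac{\text{target scan rate}}{\text{per-drive scan rate}},\qquad
\frac{3000\ \text{MB/s}}{85\ \text{MB/s}} \approx 36\ \text{(Exadata)},\qquad
\frac{3000\ \text{MB/s}}{33\ \text{MB/s}} \approx 90\ \text{(CX4, 2+1 RAID 5)}
\]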

Hardware Specs and Numbers for Data Warehouse Workloads

Storage          RAID           Raw:Usable Ratio   Disk Drives          Disk Scan Rate
EMC CX4-960      8+1 RAID 5     9:8                146 GB FC 15k RPM    20 MB/s
EMC CX4-960      2+1 RAID 5     3:2                146 GB FC 15k RPM    33 MB/s
EMC CX4-960      8+1 RAID 5     9:8                300 GB FC 15k RPM    20 MB/s
EMC CX4-960      2+1 RAID 5     3:2                300 GB FC 15k RPM    33 MB/s
Oracle Exadata   ASM Mirroring  2:1                450 GB SAS 15k RPM   85 MB/s

Sizing By Capacity

Storage          RAID           Total Usable Space   Disk Drive   Number of Drives   Total Scan Rate
EMC CX4-960      8+1 RAID 5     18 TB                146 GB       139                2.8 GB/s
EMC CX4-960      2+1 RAID 5     18 TB                146 GB       185                6.1 GB/s*
EMC CX4-960      8+1 RAID 5     18 TB                300 GB       68                 1.4 GB/s
EMC CX4-960      2+1 RAID 5     18 TB                300 GB       90                 3.0 GB/s
Oracle Exadata   ASM Mirroring  18 TB                450 GB       80                 6.8 GB/s

* I’m not sure that the CX4-960 array head is capable of 6.1 GB/s, so it likely takes at least 2 CX4-960 array heads to deliver this throughput to the host(s).

Sizing By Scan Rate

Storage          RAID           Total Scan Rate   Disk Drive   Number of Drives   Total Usable Space
EMC CX4-960      8+1 RAID 5     3.00 GB/s         146 GB       150                19.46 TB
EMC CX4-960      2+1 RAID 5     3.00 GB/s         146 GB       90                 8.76 TB
EMC CX4-960      8+1 RAID 5     3.00 GB/s         300 GB       150                40.00 TB
EMC CX4-960      2+1 RAID 5     3.00 GB/s         300 GB       90                 18.00 TB
Oracle Exadata   ASM Mirroring  3.00 GB/s         450 GB       36                 8.10 TB

A Few Comments On The Above Data Points

Please note that “Total Usable Space” is a rough number for the total protected disk space one can use for a database if you filled each drive to capacity. It does not take into consideration loss from formatting, space for sort/temp, and so on. I would use a 60% rule when estimating database data space vs. total usable space. This means that 18 TB of total usable space would equate to 10 TB (max) of space for database data (compression not accounted for).

I’d also like to note that in the Sizing By Capacity table, the “Total Scan Rate” is a disk-only calculation. Whether or not a single CX4-960 array head can move data at that rate is in question. Based on the numbers in the EMC white paper, it would appear a CX4-960 head is capable of 3 GB/s, but I would question whether it is capable of much more than that, hence the asterisk (*) above.

Looking At The Numbers

If you look at the numbers for Sizing By Capacity, you can see that for the given fixed capacity, Exadata provides the fastest scan rate while using only 80 disk drives. The next closest scan rate is just 700 MB/s less, but it uses 105 more disk drives (185 vs. 80). Quite a big difference.

When it comes to delivering I/O bandwidth, Exadata clearly stands out. Targeting a scan rate of 3 GB/s, Exadata delivers this using only 36 drives, just 3 Exadata Storage Servers. If one wanted to deliver this scan rate with the CX4 it would take 2.5X as many drives (90 vs. 36) using 2+1 RAID 5.

So are storage arrays the bottleneck? You can draw your own conclusions, but I think the numbers speak to the performance advantage of Oracle Exadata when it comes to delivering I/O bandwidth and fast scan rates. Consider this: what would the storage topology look like if you wanted to deliver a scan rate of 74 GB/s, as we did for Oracle OpenWorld with 84 HP Oracle Exadata Storage Servers (6 HP Oracle Database Machines)? Honestly, I would struggle to think where I would put the 185 or so 4Gb HBAs needed to achieve that.

Space Saving RAID or Wasteful Mirroring

This leads me to another comment by Chuck in his second post:

“[with Exadata] The disk is mirrored, no support of any space-saving RAID options — strange, for such a large machine”

And this one in his first post:

“If it were me, I’d want a RAID 5 (or 6) option.”

And his comment on Kevin’s blog:

“The fixed ratio of 12 disks (6 usable) per server element strikes us as a bit wasteful….And, I know this only matters to storage people, but there’s the minor matter of having two copies of everything, rather than the more efficient parity RAID approaches. Gets your attention when you’re talking 10-40TB usable, it does.”

Currently Exadata uses ASM mirroring for fault tolerance, so there is a 2:1 ratio of raw to usable disk; however, I don’t think it matters much. The logic behind that comment is that when one is sizing for a given scan rate, Exadata uses fewer spindles than the other configurations even though the disk protection is mirroring and not space-saving RAID 5. I think it is strange to worry about space savings when disks just keep getting bigger and many keep the same performance characteristics as their predecessors. Space is cheap. Spindles are expensive. When one builds a configuration that satisfies the I/O scan rate requirement, chances are you have well exceeded the storage capacity requirement, even when using mirroring.

Perhaps Chuck likes space-saving RAID 5, but I think using fewer drives (0.4 times as many, 36 vs. 90) to deliver the same scan rate is hardly wasteful. You know what really gets my attention? Having 40 TB of total usable space on 15 HP Oracle Exadata Storage Servers (180 450 GB SAS drives) and being able to scan it at 15 GB/s, compared to, say, a CX4 with 200 drives @ 300 GB using 2+1 RAID 5 that can only scan them at 6.6 GB/s. I’d also be willing to bet the latter would require at least 2 if not 3 CX4-960 array heads and at least 30 4Gb HBAs running at wire speed (400 MB/s).

Exadata Is Smart Storage

Chuck comments:

“Leaving hardware issues aside, how much of the software functionality shown here is available on generic servers, operating systems and storage that Oracle supports today? I was under the impression that most of this great stuff was native to Oracle products, and not a function of specific tin …

If the Exadata product has unique and/or specialized Oracle logic, well, that’s a different case.”

After reading that, I would say Chuck has not read the Technical Overview of the HP Oracle Exadata Storage Server. Not only does Exadata have a very fast scan rate, it has intelligence: a combination of brawn and brains that is not available with other storage platforms. The Oracle Exadata Storage Server Software (say that 5 times fast!!!) is not an Oracle database; it is storage software, not database software. The intelligence and specialized logic is that Exadata Smart Scans return only the relevant rows and columns of a query, allowing for better use of I/O bandwidth and increased database performance, because the database host(s) are not issuing I/O requests for data that is not needed for the query and then discarding it after the fact. There are a couple of slides (18 & 19) with a simple example of the benefits of Smart Scans in the HP Oracle Exadata Storage Server technical overview slide deck. It is worth the read.
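To make that concrete, here is a sketch of the idea (the table and query are hypothetical, for illustration only):

-- Hypothetical query against a wide, multi-billion row fact table.
SELECT order_id, amount                    -- only 2 of many columns needed
FROM   orders
WHERE  order_date = DATE '2008-09-24';    -- selective predicate
--
-- Conventional storage: every scanned block is shipped to the database
-- host, which then discards the non-matching rows and unneeded columns.
-- Exadata Smart Scan: the storage cells apply the filter and the column
-- projection, so only the matching (ORDER_ID, AMOUNT) values cross the wire.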

It Will Be Interesting Indeed

Chuck concludes his second post with:

“The real focus here should be software, not hardware.”

Personally I think the focus should be on solutions that perform and scale and I think the HP Oracle Exadata Storage Server is a great solution for Oracle data warehouses that require large amounts of I/O bandwidth.

Ending On A Good Note

While many of Chuck’s comments do not seem to be well researched, I will say that having a conventional mid-range storage array that can deliver 3 GB/s is not a bad thing at all. I’ve seen many Oracle customers that have only a fraction of that, and there are probably some small data warehouses out there that may run fine with 3 GB/s of I/O bandwidth. However, I think those would run even faster with Oracle Exadata, and I’ve never had a customer complain about queries running too fast.

Extreme Buzz About Oracle OpenWorld

Last week I read Mark Rittman’s post, and today I was reading this InfoWorld article and a few others speculating about what the big news will be this year at Oracle OpenWorld.

I guess the first teaser was given out by Larry Ellison on the F4Q08 earnings call on June 25:

We have a major database innovation that we will announce in September of this year. It is going to be a very big and important announcement for us, so we are not standing still in database.

Then on today’s earnings call for F1Q09 Charles Phillips mentions:

There will be an additional 11g update this quarter. It will contain some ease of use enhancements and a major dimension to the product line to be discussed next week. It’s more than a feature, let’s say — it puts Oracle into a new segment of the database market, which should help us sustain the consistent market share gains we’ve enjoyed the last several years, so if you want to hear the details, please come on out to OpenWorld in San Francisco next week.

So what is Oracle’s new database accelerator? How will query processing be revolutionized? I guess we’ll all have to wait for Larry Ellison’s keynote entitled “Extreme. Performance.” on Wednesday, September 24 at 2:30 p.m. to find out.

What Are Your System Statistics?

I’ve been working on a few test cases and I’m in search of some real-world data. If your production Oracle database uses system statistics, either Workload Statistics or Noworkload Statistics, and you are willing to share them, please post a comment with the output from the following two queries:

select version from v$instance;
select pname, pval1 from sys.aux_stats$ where sname = 'SYSSTATS_MAIN';

For example, my noworkload system statistics look like this:

SQL> select version from v$instance;

VERSION
-----------------
11.1.0.6.0

SQL> select pname, pval1 from sys.aux_stats$ where sname = 'SYSSTATS_MAIN';

PNAME                               PVAL1
------------------------------ ----------
CPUSPEED
CPUSPEEDNW                        726.951
IOSEEKTIM                           4.683
IOTFRSPEED                       36625.24
MAXTHR
MBRC
MREADTIM
SLAVETHR
SREADTIM

To help with fixed width formatting (pretty printing), please surround your results in the comment text box with a pre tag like such:
<pre>
blah blah blah
</pre>

Thanks for participating!

Quick link to 10.2 System Statistics Documentation for those unfamiliar with it.

Oracle 11g: Extended Statistics

In the Real-World Performance Roundtable, Part 2: The Optimizer, Schema Statistics, SQL Tuning at Oracle OpenWorld 2006, I worked an example of how the optimizer can have difficulty estimating the correct cardinality when there is data correlation. (The Zodiac example can be found on pages 46-49 of the presentation.) In Oracle 11g, there have been some enhancements to help the optimizer deal with data correlation.

DBMS_STATS.CREATE_EXTENDED_STATS

Previously I blogged about the 11g enhancement to DBMS_STATS.AUTO_SAMPLE_SIZE and the new algorithm for gathering NDV. One of the other enhancements to DBMS_STATS is the CREATE_EXTENDED_STATS function. It is this function that will allow us to tell the Optimizer that two or more columns have data that is correlated.

Zodiac Calendar Example

Let’s turn to the Zodiac calendar example to demonstrate where the functionality of DBMS_STATS.CREATE_EXTENDED_STATS can be applied. As you may know, there is a correlation between the Zodiac sign and the calendar month. Below are the Zodiac signs and their corresponding date ranges.

  • Aries - March 21 – April 20
  • Taurus - April 21 – May 21
  • Gemini - May 22 – June 21
  • Cancer - June 22 – July 22
  • Leo - July 23 – August 21
  • Virgo - August 22 – September 23
  • Libra - September 24 – October 23
  • Scorpio - October 24 – November 22
  • Sagittarius - November 23 – December 22
  • Capricorn - December 23 – January 20
  • Aquarius - January 21 – February 19
  • Pisces - February 20 – March 20

For this test case I am going to load two tables, CALENDAR and PERSON. Below is a description of each.

SQL> desc calendar
 Name              Null?    Type
 ----------------- -------- ------------
 DATE_ID           NOT NULL NUMBER(8)
 MONTH             NOT NULL VARCHAR2(16)
 ZODIAC            NOT NULL VARCHAR2(16)

SQL> desc person
 Name              Null?    Type
 ----------------- -------- ------------
 PERSON_ID         NOT NULL NUMBER(10)
 DATE_ID           NOT NULL NUMBER(8)

The CALENDAR table has 365 rows, one row for every day of the calendar year. The PERSON table has 32,768 rows for each DATE_ID (each day of the year), for a total of 11,960,320 rows.
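For reference, here is roughly how such tables could be built and populated (a sketch; my actual load scripts differ in the details):

CREATE TABLE CALENDAR
( DATE_ID NUMBER(8)    NOT NULL
, MONTH   VARCHAR2(16) NOT NULL
, ZODIAC  VARCHAR2(16) NOT NULL
);
-- CALENDAR gets one row per day of the year: DATE_ID (e.g. 20080521),
-- the lowercase month name, and the matching zodiac sign.

CREATE TABLE PERSON
( PERSON_ID NUMBER(10) NOT NULL
, DATE_ID   NUMBER(8)  NOT NULL
);

-- 32,768 PERSON rows per calendar day: 365 x 32,768 = 11,960,320 rows.
INSERT /*+ APPEND */ INTO PERSON
SELECT ROWNUM, c.DATE_ID
FROM   CALENDAR c
      ,(SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= 32768);
COMMIT;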

There are a few indexes I’m building on the tables:

  • Unique index on PERSON(PERSON_ID)
  • Unique index on CALENDAR(DATE_ID)
  • Non-Unique index on PERSON(DATE_ID)

Now that the tables are loaded and the indexes created, it’s time to create the Extended Stats. Below is a portion of the documentation.


CREATE_EXTENDED_STATS Function

This function creates a column statistics entry in the system for a user specified column group or an expression in a table. Statistics for this extension will be gathered when user or auto statistics gathering job gathers statistics for the table. We call statistics for such an extension, “extended statistics”. This function returns the name of this newly created entry for the extension.

Syntax

DBMS_STATS.CREATE_EXTENDED_STATS (
   ownname    VARCHAR2,
   tabname    VARCHAR2,
   extension  VARCHAR2)
 RETURN VARCHAR2;

Parameters

Table 127-8 CREATE_EXTENDED_STATS Function Parameters

Parameter   Description
ownname     Owner name of a table
tabname     Name of the table
extension   Can be either a column group or an expression. Suppose the specified table has two columns c1, c2. An example column group can be “(c1, c2)” and an example expression can be “(c1 + c2)”.

Return Values
This function returns the name of this newly created entry for the extension.


Since there is a correlation between the MONTH and ZODIAC columns in the CALENDAR table, the column group for the extended statistics will be (MONTH, ZODIAC).

Here is the command to create the extended stats:
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(USER, 'CALENDAR', '(MONTH, ZODIAC)') FROM DUAL;
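The function returns the system-generated name of the extension (the SYS_STU… virtual column you will see below). If you want to confirm what was created, the USER_STAT_EXTENSIONS view exposes it, and DROP_EXTENDED_STATS removes it should you change your mind:

-- Verify the extension and its system-generated virtual column name.
SELECT EXTENSION_NAME, EXTENSION
FROM   USER_STAT_EXTENSIONS
WHERE  TABLE_NAME = 'CALENDAR';

-- To remove the extension later, if desired:
-- EXEC DBMS_STATS.DROP_EXTENDED_STATS(USER, 'CALENDAR', '(MONTH, ZODIAC)');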

Now that we have the extended stats definition created, it’s time to gather stats. Here are the commands I’m using to gather stats:

BEGIN
 DBMS_STATS.GATHER_TABLE_STATS
 (
  OWNNAME => USER
 ,TABNAME => 'CALENDAR'
 ,ESTIMATE_PERCENT => NULL
 ,METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY'
 );
END;
/
BEGIN
 DBMS_STATS.GATHER_TABLE_STATS
 (
  OWNNAME => USER
 ,TABNAME => 'PERSON'
 ,ESTIMATE_PERCENT => NULL
 );
END;
/

Let’s look at the column stats on the two tables:

SELECT
   TABLE_NAME,
   COLUMN_NAME,
   NUM_DISTINCT as NDV,
   NUM_BUCKETS,
   SAMPLE_SIZE,
   HISTOGRAM
FROM
   USER_TAB_COL_STATISTICS
ORDER BY 1,2;
TABLE_NAME COLUMN_NAME                    NDV      NUM_BUCKETS SAMPLE_SIZE HISTOGRAM
---------- ------------------------------ -------- ----------- ----------- ---------------
CALENDAR   DATE_ID                             365         254         365 HEIGHT BALANCED
CALENDAR   MONTH                                12          12         365 FREQUENCY
CALENDAR   SYS_STUWHPY_ZSVI_W3#C$I3EUUYB4       24          24         365 FREQUENCY
CALENDAR   ZODIAC                               12          12         365 FREQUENCY
PERSON     DATE_ID                             365           1    11960320 NONE
PERSON     PERSON_ID                      11960320           1    11960320 NONE

As you can see, column statistics have been gathered on the (MONTH, ZODIAC) column group of CALENDAR, represented by the SYS_STUWHPY_ZSVI_W3#C$I3EUUYB4 column.

The Moment of Truth

Will the extended statistics be enough to give the optimizer the information it needs to estimate an accurate number of rows? Let’s test it by running three test cases:

  1. How many people have a birth month of May?
  2. How many people have a Zodiac sign of Taurus?
  3. How many people have a birth month of May and a Zodiac sign of Taurus?

Each query is run with a /*+ gather_plan_statistics */ hint followed by
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
The goal is for the E-Rows (Optimizer Estimated Rows) to be an accurate estimate of the A-Rows (Actual Rows).

Below is the output from DBMS_XPLAN.DISPLAY_CURSOR for each of the three test cases.

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
SQL_ID  55qv2rt3k8b3w, child number 0
-------------------------------------
select /*+ gather_plan_statistics */  count(*)
from  person p ,calendar c
where p.date_id = c.date_id and month = 'may'

Plan hash value: 1463406140

--------------------------------------------------------------------
| Id  | Operation           | Name      | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |           |      1 |      1 |      1 |
|   2 |   NESTED LOOPS      |           |      1 |   1015K|   1015K|
|*  3 |    TABLE ACCESS FULL| CALENDAR  |      1 |     31 |     31 |
|*  4 |    INDEX RANGE SCAN | PERSON_N1 |     31 |  32768 |   1015K|
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("MONTH"='may')
   4 - access("P"."DATE_ID"="C"."DATE_ID")



PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
SQL_ID  8y54wtmy228r0, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*)
from  person p ,calendar c
where p.date_id = c.date_id and zodiac = 'taurus'

Plan hash value: 1463406140

--------------------------------------------------------------------
| Id  | Operation           | Name      | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |           |      1 |      1 |      1 |
|   2 |   NESTED LOOPS      |           |      1 |   1015K|   1015K|
|*  3 |    TABLE ACCESS FULL| CALENDAR  |      1 |     31 |     31 |
|*  4 |    INDEX RANGE SCAN | PERSON_N1 |     31 |  32768 |   1015K|
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("ZODIAC"='taurus')
   4 - access("P"."DATE_ID"="C"."DATE_ID")


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
SQL_ID  8ntkxs4ztb2rz, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*)
from  person p ,calendar c
where p.date_id = c.date_id and zodiac = 'taurus' and month = 'may'

Plan hash value: 1463406140

--------------------------------------------------------------------
| Id  | Operation           | Name      | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |           |      1 |      1 |      1 |
|   2 |   NESTED LOOPS      |           |      1 |    688K|    688K|
|*  3 |    TABLE ACCESS FULL| CALENDAR  |      1 |     21 |     21 |
|*  4 |    INDEX RANGE SCAN | PERSON_N1 |     21 |  32768 |    688K|
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(("ZODIAC"='taurus' AND "MONTH"='may'))
   4 - access("P"."DATE_ID"="C"."DATE_ID")

Summary

As demonstrated, adding Extended Statistics and using Histograms allowed the Optimizer to accurately estimate the number of rows, even when there was data correlation. This is a very useful enhancement to assist the Optimizer when there is known data correlation.