On Monday, April 20, 2009, Oracle announced that it had agreed to acquire Sun Microsystems. Since then there has been much speculation and many questions raised about numerous aspects of the deal. There is an official FAQ that discusses many areas, but I thought I would highlight three that seem to be fairly popular around the blogosphere:
Will the ownership of Solaris change Oracle’s position on Linux?
No. This transaction enhances our commitment to open standards and choice. Oracle is as committed as ever to Linux and other platforms and will continue to support and enhance our strong industry partnerships.
What does Oracle plan to do with MySQL?
MySQL will be an addition to Oracle’s existing suite of database products, which already includes Oracle Database 11g, TimesTen, Berkeley DB open source database, and the open source transactional storage engine, InnoDB.
What impact does this announcement have on the HP Oracle Exadata Storage Server and HP Oracle Database Machine products?
There is no impact. Oracle remains fully committed to the HP Oracle Exadata Storage Server and HP Oracle Database Machine products.
After my post about how ridiculous some of the benchmarketing claims made by database vendors are, Dave Menninger, VP of Marketing & Product Management at Vertica, posted a comment that one of their customers reported a 40,400x gain on a single query (this, of course, after I openly joked about the 16,200x Vertica claim). So I made my way over to check out this claim, and sure enough, someone reported it. Here is the table presented in the webcast:
To this database performance engineer, this is yet another unimpressive performance claim; it is, rather, a very creative use of numbers, or better put, a good case of bad math. Or better yet, big fun with small numbers. Honestly, measuring a BI query response time in milliseconds?!?! I don't even know if OLTP database users measure their query response times in milliseconds. I simply can't stop laughing at the fact that there needs to be precision below 1 second. Obviously BI users could not possibly tell that their query ran in less than 1 second, because network latency would mask it. Not only that, it seems there were 154 queries to choose from, and the Vertica marketing crew chose to mention this one. Brilliant, I say. So yes, Dave, this is even more ludicrous than the 16,200x claim. At best it is a 202x gain. You won't get credit from me (and probably others) for fractional seconds, but thanks for mentioning it. It was a good chuckle. By the way, why add two extra places of precision for this query and not all the others?
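To see how the headline number falls apart, here is the arithmetic as I read it. The exact times below are my reconstruction from the published ratios, not figures from the webcast: a roughly 202-second baseline query reported as finishing in 5 milliseconds produces the 40,400x headline, while refusing to credit anything below 1 second leaves only 202x.

```python
# Hypothetical times reconstructed from the published ratios (40,400x and 202x);
# these are NOT actual numbers from the Vertica webcast.
baseline_s = 202.0   # assumed row-store query time, in seconds
reported_s = 0.005   # assumed Vertica time, reported with millisecond precision

headline_speedup = baseline_s / reported_s          # 40,400x headline
floored_speedup = baseline_s / max(reported_s, 1)   # credit nothing below 1 second

print(f"headline: {headline_speedup:,.0f}x")           # headline: 40,400x
print(f"with a 1-second floor: {floored_speedup:.0f}x")  # with a 1-second floor: 202x
```

The entire difference between "40,400x" and "202x" lives in the fractional second, which no BI user would ever notice.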
I think it is also worth mentioning that the data set size for this case is 84GB raw and 10.5GB in the Vertica DB (an 8x compression ratio). Given that the server running the database has 32GB of RAM, it easily classifies as an in-memory database, so response times should certainly be in the seconds. I don't know about you, but performance claims on a database in which the uncompressed data fits on an Apple iPod don't excite me.
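The in-memory point is simple arithmetic, using the sizes given above:

```python
# Sizes quoted for the Vertica customer case.
raw_gb = 84.0         # uncompressed data set
compressed_gb = 10.5  # size inside the Vertica DB
ram_gb = 32.0         # RAM on the database server

compression_ratio = raw_gb / compressed_gb  # 8x
fits_in_ram = compressed_gb < ram_gb        # the whole compressed DB fits in memory

print(f"{compression_ratio:.0f}x compression; fits in RAM: {fits_in_ram}")
```

Once the compressed database fits comfortably in RAM, sub-second scans are expected, not remarkable.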
Dave Menninger also mentions:
One other piece of information in an effort of full (or at least more) disclosure is the following blog post that breaks down the orders of magnitude differences between row stores and column stores to their constituent parts.
Debunking Yet Another Myth: Column-Stores As A Storage-Layer Only Optimization
Column stores have been a topic of many research papers. The one that has caught my attention most recently is the paper by Allison Holloway and David DeWitt (Go Badgers!) entitled Read-Optimized Databases, In Depth and the VLDB 2008 presentation which has an alternate title of Yet Another Row Store vs Column Store Paper. I might suggest that you give them a read. Perhaps the crew at The Database Column will offer some comments on Allison and David’s research. I’m surprised that they haven’t already.
Well, that’s enough fun for a Friday. Time to kick off some benchmark queries on my HP Oracle Database Machine.
Oracle OpenWorld 2008 is now in the books and it surely was a busy and exciting one. The launch of Oracle Exadata Storage Server and the HP Oracle Database Machine was the highlight for me and I hope for many of you as well.
Real-World Performance Group Presentations
This year the Real-World Performance Group did three sessions and the slides for those sessions have been uploaded to the OpenWorld Content Catalog as well as the presentations page on my blog. Hopefully you were able to attend the sessions and found them informative and had some good take-aways. If you did not attend, the slides are available for your reading enjoyment. If you have any questions, please post a comment.
If you haven’t been under a rock, you know that Larry Ellison announced the Oracle Exadata Storage Server and the HP Oracle Database Machine at Oracle OpenWorld 2008. There seems to be quite a bit of interest and excitement about the product, and I for one will say that I am extremely excited about it, especially after having used it. If you were an OOW attendee, hopefully you were able to see the HP Oracle Database Machine live demo in the Moscone North lobby. Kevin Closson and I were both working the live demo Thursday morning, and Doug Burns snapped a few photos of Kevin and me doing the demo.
HP Oracle Database Machine Demos
In order to demonstrate Oracle Exadata, we had an HP Oracle Database Machine set up with some live demos. This Database Machine was split into two configurations: the first had two Oracle database servers and two Oracle Exadata servers; the second had six Oracle database servers and 12 Oracle Exadata servers. A table scan query was started on the two-Exadata-server configuration, and the same query was then started on the 12-Exadata-server configuration. The scan rates were displayed on screen, and one could see that each Exadata cell was scanning at a rate of around 1GB/s, for a total aggregate of around 14GB/s across the 14 cells. Not too bad for a single 42U rack of gear. This demo also showed that the table scan time was linear with the number of Exadata cells: 60 seconds versus 10 seconds. With six times the number of Exadata cells, the table scan time was cut by a factor of six.
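The demo numbers above hang together as a quick sanity check (a sketch using the figures from the demo; the per-cell rate is approximate):

```python
# Figures from the live demo; per-cell scan rate is approximate.
per_cell_gb_s = 1.0
small_config_cells = 2
large_config_cells = 12

# Aggregate across all 14 cells on the rack.
aggregate_gb_s = per_cell_gb_s * (small_config_cells + large_config_cells)  # ~14 GB/s

# Linear scaling: 6x the cells should cut the scan time by ~6x.
small_config_scan_s = 60
expected_large_scan_s = small_config_scan_s * small_config_cells / large_config_cells  # 10 s

print(aggregate_gb_s, expected_large_scan_s)
```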
The second live demo executed a query consisting of a four-table join (PRODUCTS, STORES, ORDERS, ORDER_ITEMS) on data based on one of the trial customers. The query found how many items were sold yesterday in four southwestern states where the item name contained the string “chili sauce”. The ORDER_ITEMS table contained just under 2 billion rows for that day and the ORDERS table contained 130 million rows for the day. This query’s execution time was less than 20 seconds. The execution plan consisted entirely of table scans; no indexes were used.
When One HP Oracle Database Machine Is Not Enough
As a demonstration of the linear scalability of Oracle Exadata, a configuration of six (6) HP Oracle Database Machines, for a total of 84 Exadata cells, was assembled. Fourteen days’ worth of POS (point of sale) data were loaded onto one Database Machine, and a query was executed to full table scan the entire 14 days. Another 14 days of data were loaded and a second Database Machine was added to the configuration. The query was run again, now against 28 days across two Database Machines. This process was repeated, loading 14 more days of data and adding another Database Machine each time, until 84 days were loaded across six Database Machines. As expected, all six executions of the query were nearly identical in execution time, demonstrating the scalability of the product. The amazing bit about all this was that with six Database Machines and 84 days of data (around 163 billion rows), the physical I/O scan rate was over 74 GB/s (266.4 TB/hour) sustained. To put that in perspective, it equates to scanning 1 TB of uncompressed data in just 13.5 seconds. In this case, Oracle’s compression was used, so the time to scan 1 TB of user data was just over 3 seconds. Now that is extreme performance!!!
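Those throughput figures are easy to verify from the sustained scan rate alone (the 4x compression ratio below is my assumption to match the "just over 3 seconds" figure; it was not stated explicitly):

```python
# Sustained physical I/O scan rate from the six-machine demo.
scan_rate_gb_s = 74.0

tb_per_hour = scan_rate_gb_s * 3600 / 1000  # 266.4 TB/hour
seconds_per_tb = 1000 / scan_rate_gb_s      # ~13.5 s to scan 1 TB of raw data

# Assumed ~4x compression: 1 TB of user data occupies ~250 GB on disk,
# so scanning it takes roughly a quarter of the raw-scan time.
assumed_compression = 4.0
seconds_per_tb_compressed = seconds_per_tb / assumed_compression  # just over 3 s

print(tb_per_hour, seconds_per_tb, seconds_per_tb_compressed)
```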
As I’m getting ready to post this, I see Kevin has beat me to it. Man, that guy is an extreme blogging machine.
Initial Customer Experiences
Several Oracle customers had a 1/2 HP Oracle Database Machine* (see Kevin’s comments below) to do testing with their data and their workloads. These are the ones that were highlighted in Larry’s keynote.
- Currently runs on two IBM P570s with EMC CX-30 storage
- 4.5TB of Call Data Records
- Exadata speedup: 10x to 72x (average 28x)
- “Every query was faster on Exadata compared to our current systems. The smallest performance improvement was 10x and the biggest one was 72x.”
- Currently runs on HP Superdome and XP24000 storage
- 220TB of Call Data Records
- “Call Data Records queries that used to run over 30 minutes now complete in under 1 minute. That’s extreme performance.”
- “Oracle Exadata outperforms anything we’ve tested to date by 10 to 15 times. This product flat out screams.”
- Currently runs on IBM P570 (13 CPUs) and EMC CLARiiON and DMX storage
- 5TB of retail data
- Exadata speedup: 3x to 50x (average 16x)