The Best Benchmarketing I’ve Seen Yet: Measure BI Queries In Milliseconds

After I posted about how ridiculous some of the benchmarketing claims made by database vendors are, Dave Menninger, VP of Marketing & Product Management at Vertica, posted a comment that one of their customers reported a 40,400x gain in one query (this of course is after I openly joked about the 16,200x Vertica claim). So I made my way over to check out this claim, and sure enough, someone reported this. Here is the table presented in the webcast:


To this database performance engineer, this is yet another unimpressive performance claim, or rather a very creative use of numbers, or maybe better put, a good case of bad math. Or better yet, big fun with small numbers. Honestly, measuring a BI query response time in milliseconds?!?! I don’t even know if OLTP database users measure their query response time in milliseconds. I simply can’t stop laughing at the fact that there needs to be precision below 1 second. Obviously BI users could not possibly tell that their query ran in less than 1 second because the network latency would mask this. Not only that, it seems there were 154 queries to choose from and the Vertica marketing crew chose to mention this one. Brilliant, I say. So yes, Dave, this is even more ludicrous than the 16,200x claim. At best it is a 202x gain. You won’t get credit from me (and probably others) for fractional seconds, but thanks for mentioning it. It was a good chuckle. By the way, why add two extra places of precision for this query and not all the others?
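The arithmetic behind the two figures can be sketched in a few lines of Python. The webcast table is not reproduced here, so the 202-second baseline is an assumption inferred from the 202x figure (a 202-second query against a response time rounded up to one second). The function name `speedup` and the one-second floor are illustrative only.

```python
def speedup(baseline_s: float, new_s: float, floor_s: float = 0.0) -> float:
    """Return baseline/new, treating any new time below floor_s as floor_s."""
    return baseline_s / max(new_s, floor_s)


# Assumption: baseline inferred from the 202x figure, not read from the table.
baseline_s = 202.0
reported_gain = 40_400

# Response time implied by the reported gain: a few milliseconds.
implied_new_s = baseline_s / reported_gain

# Gain with millisecond precision versus gain with no credit below 1 second.
raw_gain = speedup(baseline_s, implied_new_s)
clamped_gain = speedup(baseline_s, implied_new_s, floor_s=1.0)

print(f"implied response time: {implied_new_s * 1000:.1f} ms")
print(f"gain at millisecond precision: {raw_gain:,.0f}x")
print(f"gain with a 1-second floor: {clamped_gain:,.0f}x")
```

Under that assumption, the whole difference between the two claims comes from the last fraction of a second, which a person waiting on a BI report would never notice.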

I think it is also worth mentioning that the data set size for this case is 84GB (raw) and 10.5GB in the Vertica DB (8x compression). Given that the server running the database has 32GB of RAM, it easily classifies as an in-memory database, so response time should certainly be in the seconds. I don’t know about you, but performance claims on a database in which the uncompressed data fits on an Apple iPod don’t excite me.
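As a quick sanity check on those sizes, here is a minimal sketch using only the three numbers quoted above. The variable names are mine, and "fits in RAM" is a plain size comparison that ignores whatever memory the database and operating system need for themselves.

```python
raw_gb = 84.0      # raw data set size
stored_gb = 10.5   # size once loaded into the Vertica DB
ram_gb = 32.0      # memory on the database server

# Compression ratio claimed for the loaded data.
compression_ratio = raw_gb / stored_gb

# Simplified check: does the stored data fit in memory with room to spare?
fits_in_ram = stored_gb <= ram_gb
ram_headroom_gb = ram_gb - stored_gb

print(f"compression: {compression_ratio:.0f}x")
print(f"fits in RAM: {fits_in_ram} ({ram_headroom_gb:.1f}GB to spare)")
```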

Dave Menninger also mentions:

One other piece of information in an effort of full (or at least more) disclosure is the following blog post that breaks down the orders of magnitude differences between row stores and column stores to their constituent parts.
Debunking Yet Another Myth: Column-Stores As A Storage-Layer Only Optimization

Column stores have been a topic of many research papers. The one that has caught my attention most recently is the paper by Allison Holloway and David DeWitt (Go Badgers!) entitled Read-Optimized Databases, In Depth and the VLDB 2008 presentation which has an alternate title of Yet Another Row Store vs Column Store Paper. I might suggest that you give them a read. Perhaps the crew at The Database Column will offer some comments on Allison and David’s research. I’m surprised that they haven’t already.

Well, that’s enough fun for a Friday. Time to kick off some benchmark queries on my HP Oracle Database Machine.


  1. Greg

    “I simply can’t stop laughing at the fact that there needs to be precision below 1 second.” Look into the very large areas of financial trading…anything from stocks to futures, options, and so on (in any country in the world). Almost every exchange in the world measures their query performance in milliseconds. I thought it a joke, until I saw how competitive it was to ‘beat’ out other brokers in a live trade. Trading houses that try to gain competitive advantages against one another sometimes have small 2-4 man doctoral-level research teams to get their response times down and do measure the times at this level. And it’s not just the database queries they look at…it’s a whole group of other things as well! As for other industries, I don’t know of any that would require it, let alone put lots of energy into attaining that response time.

  2. Greg Rahn

    In context, I was referring to BI queries, submitted to a database by a human. But you do have a point. The response time for the financial trading systems is very low, probably milli- or centiseconds. I’ll have to ping some of the customers I work with in Chicago to find out more on this. I guess the reason I poked fun at this is that Oracle’s SQL*Plus only reports time to two places, centiseconds.

  3. Greg

    “I was referring to BI queries”.

    Point taken there…and I’d agree. I cannot think of anything that would be ms within BI. Unless it is a top-secret gov’t project!

  4. Tanel Poder

    Another key thing in the trading systems millisecond response time war is that the “client” systems requiring these response times are automated algorithmic trading apps, not people sitting behind computers directly.

  5. Val

    Another point is that arguably only an in-memory database can consistently ensure this kind of query response time. Whenever disk IO is involved, all bets are off!

    Parenthetically, everyone is in awe today of millisecond-trading brokers and doctoral-level modelling chaps, just by looking at the shape the economy is in nowadays! The fruits of their toil are here for everyone to observe.
