
The Core Performance Fundamentals Of Oracle Data Warehousing – Balanced Hardware Configuration


If you want to build a house that will stand the test of time, you need to build it on a solid foundation. The same goes for architecting computer systems that run databases. If the underlying hardware is not sized appropriately, people will likely end up blaming the software. All too often I see data warehouse systems that are poorly architected for their workload requirements. As I frequently tell people, “you can’t squeeze blood from a turnip”: if the hardware resources are not there for the software to use, how can you expect the software to scale?

Undersizing data warehouse systems has become an epidemic on open platforms – platforms that let you run on any brand and configuration of hardware. The problem has been magnified over time as database sizes have grown significantly, generally outpacing the experience of those managing them. This is what prompted the “big three” database vendors to come up with suggested or recommended hardware configurations for their database platforms:

 

Simply put, the reasoning behind those initiatives was to help customers architect systems that are well balanced and sized appropriately for the size of their data warehouse.

Balanced Hardware Configurations

The foundation for a well performing data warehouse (or any system for that matter) is the hardware that it runs on. There are three main hardware resources to consider when sizing your data warehouse hardware. Those are:

  1. Number of CPUs
  2. Number of storage devices (HDDs or SSDs)
  3. I/O bandwidth between CPUs and storage devices

 

NB: I’ve purposely left off memory (RAM), as most systems are reasonably well sized at 2 GB or 4 GB per CPU core these days.

A balanced system has the following characteristics:

[Figure: a balanced configuration – CPU, I/O channel, and disk throughput sized proportionally to one another]

 

As you can see, each of the three components is sized proportionally to the others. This allows for maximum system throughput, as no single resource becomes the bottleneck before any other. This was one of the critical design decisions that went into the Oracle Database Machine.
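The balance principle is easy to check on paper: the deliverable throughput of the whole system is the minimum of its component throughputs. A minimal sketch of that arithmetic (all rates below are hypothetical placeholders, not vendor figures):

```python
# Back-of-the-envelope balance check: the system can scan no faster than
# its slowest resource. All rates are hypothetical placeholders.

def system_throughput(cpu_rate_mbs, io_channel_rate_mbs, disk_scan_rate_mbs):
    """Return (deliverable throughput in MB/s, the limiting resource)."""
    rates = {
        "CPU consumption": cpu_rate_mbs,
        "I/O channel": io_channel_rate_mbs,
        "disk scan": disk_scan_rate_mbs,
    }
    limiter = min(rates, key=rates.get)
    return rates[limiter], limiter

# Balanced: all three resources sized to the same 3000 MB/s
print(system_throughput(3000, 3000, 3000))

# Unbalanced: plenty of disk and CPU, but a starved I/O channel
print(system_throughput(3000, 1200, 3000))  # capped at 1200 MB/s
```

Sizing so that all three arguments come out roughly equal is precisely what “balanced” means here.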

Most DBAs and system admins know the disk capacity numbers for their systems, but when it comes to I/O bandwidth or scan rates, most are unaware of what the system is capable of in theory, let alone in practice. Perhaps I/O bandwidth utilization should be included in the system metrics that are collected for your databases. You do collect system metrics, right?

There are several “exchanges” that data must flow through from storage devices to host CPUs, many of which could become bottlenecks. Those include:

  • Back-end Fibre Channel loops (the fibre between the drive shelves and the storage array server processor)
  • Front-end Fibre Channel ports
  • Storage array server processors (SP)
  • Host HBAs

One should understand the throughput capacity of each of these components to ensure that none of them restricts the flow of data to the CPUs prematurely.
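Each of those exchanges has an aggregate capacity of its own (unit count times per-unit rate), and the path can move data no faster than its weakest stage. A sketch, with counts and per-unit rates that are purely illustrative and not taken from any specific array:

```python
# Path bandwidth from drive shelves to host CPUs is capped by the weakest
# "exchange". Counts and per-unit rates below are illustrative assumptions.

path = [
    # (stage, unit count, MB/s per unit)
    ("back-end FC loops",  4, 400.0),   # 4Gb FC ~ 400 MB/s per loop
    ("storage array SPs",  2, 1500.0),
    ("front-end FC ports", 8, 400.0),
    ("host HBAs",          3, 400.0),   # 3 x 4Gb HBAs
]

stage_bw = {stage: count * per_unit for stage, count, per_unit in path}
limiter = min(stage_bw, key=stage_bw.get)
print(f"path bandwidth {stage_bw[limiter]:.0f} MB/s, limited by {limiter}")
# Here three 4Gb HBAs (1200 MB/s) choke a path whose disks and SPs could do more.
```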

 

Unbalanced Hardware Configurations

All too frequently, systems are not architected in a balanced fashion and end up constrained in one of the following three scenarios:

[Figure: the three unbalanced configurations – CPU, I/O channel, or disk throughput undersized relative to the rest]

 

From the production systems I have seen, the main deficiency is in I/O bandwidth (both I/O channel and HDD). I believe there are several reasons for this. First, too many companies capacity plan for their data warehouse based solely on the space the data occupies on disk. That is, they purchase the number of HDDs for the system based on drive capacity, not on the I/O bandwidth requirement. Think of it like this: if you were to purchase 2 TB of mirrored disk capacity (4 TB total), would you rather purchase 28 x 146 GB drives or 14 x 300 GB drives (or even 4 x 1 TB drives)? You may ask: what is the difference (other than price)? In each case you have the same net capacity, correct? Indeed, both configurations have the same capacity, but I/O bandwidth (how fast you can read data off the HDDs) is proportional to the number of HDDs, not to their capacity. It should be obvious, then, that 28 HDDs can deliver 2X the disk I/O bandwidth of 14 HDDs, which means it will take 2X as long to read the same amount of data off 14 HDDs as off 28.
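The drive-count arithmetic from that example can be made concrete. Assuming a round 25 MB/s sequential scan rate per drive (an illustrative figure, not a spec):

```python
# Same raw capacity, very different aggregate scan bandwidth.
# 25 MB/s per drive is an assumed round number for illustration.

PER_DRIVE_SCAN_MBS = 25.0

configs = [
    # (label, drive count, drive size in GB)
    ("28 x 146 GB", 28, 146),
    ("14 x 300 GB", 14, 300),
    ("4 x 1 TB",     4, 1000),
]

for label, drives, size_gb in configs:
    raw_tb = drives * size_gb / 1000.0
    scan_mbs = drives * PER_DRIVE_SCAN_MBS
    print(f"{label}: ~{raw_tb:.1f} TB raw, {scan_mbs:.0f} MB/s aggregate scan")
```

With 28 drives the aggregate is 700 MB/s; halve the spindle count and the scan rate halves with it, even though the capacity is unchanged.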

Unfortunately what tends to happen is that the bean counter types will see only two things:

  1. The disk capacity (space) you want to purchase (or the capacity that is required)
  2. The price per MB/GB/TB

This is where someone worthy of the title systems architect needs to stand up and explain the concept of I/O bandwidth and the impact it has on data warehouse performance (your systems architect does know this, correct?). This is generally a difficult discussion because I/O bandwidth is not a line item on a purchase order; it is a derived metric that requires both thought and engineering (which means someone has to do some thinking about the requirements for the system!).
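One way to make that discussion concrete is to put the derived metric next to the one the bean counters already see: dollars per GB versus dollars per MB/s of scan bandwidth. The prices and per-drive rates below are entirely hypothetical:

```python
# $/GB is on the purchase order; $ per MB/s of scan bandwidth is the derived
# metric that matters for a data warehouse. All figures are hypothetical.

drives = [
    # (model, unit price $, capacity GB, scan MB/s per drive)
    ("146 GB 15k RPM", 300, 146, 25.0),
    ("300 GB 15k RPM", 450, 300, 25.0),
    ("1 TB 7.2k RPM",  500, 1000, 20.0),
]

for model, price, gb, mbs in drives:
    print(f"{model}: ${price / gb:.2f}/GB vs ${price / mbs:.0f} per MB/s")
```

The big slow drive looks best on $/GB and worst on $ per MB/s, which is exactly the trade-off that gets lost when sizing by capacity alone.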

 

Summary

When sizing the hardware for your data warehouse, consider your workload and understand the following (and calculate numbers for them!):

  1. What rate (in MB/GB per second) can the CPUs consume data?
  2. What rate can storage devices produce data (scan rate)?
  3. What rate can the data be delivered from the storage array(s) to the host HBAs?

 

If you are unable to answer these questions in theory, then you need to sit down and do some calculations. Then you need to use some micro-benchmarks (like Oracle ORION) to prove out those calculations. This will give you the “speed limit” and a metric against which you can measure your database workload. All computer systems must obey the laws of physics! There is no way around that.
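Once the three rates are calculated (and validated with a micro-benchmark), comparing a measured workload against the ceiling is trivial. A sketch with placeholder numbers; substitute your own calculations and benchmark results:

```python
# The "speed limit" is the minimum of the three calculated rates.
# All numbers below are placeholders for your own figures.

theoretical_mbs = {
    "CPU consumption rate": 4000.0,   # question 1
    "storage scan rate":    3400.0,   # question 2
    "array-to-host rate":   3200.0,   # question 3
}

speed_limit = min(theoretical_mbs.values())
measured = 2100.0  # MB/s observed under a real workload or benchmark

print(f"speed limit: {speed_limit:.0f} MB/s")
print(f"workload achieves {measured / speed_limit:.0%} of the ceiling")
```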

Additional Readings

Kevin Closson has several good blog posts on related topics including:

as well as numerous others.

 

Oracle Documentation References:

 

Oracle Exadata: In Response to Chuck Hollis

Chuck Hollis, VP and Global Marketing CTO at EMC, has written a couple of blog posts offering his thoughts on Oracle Exadata. The first was “Oracle Does Hardware”, which he wrote the day after the product launch. The second, unimpressively titled “I Annoy Kevin Closson at Oracle”, appeared on Monday, October 20th, in response to a blog post by Exadata Performance Architect Kevin Closson, who had commented on Chuck’s first post and on some comments left on Kevin’s blog.

Clearly Stated Intentions

Since Chuck has disabled comments on his “I Annoy Kevin” post, I’m going to write my comments here. I have no intention of getting into a fact-less debate turned flame war, but I will make some direct comments, with supporting facts and numbers, while keeping it professional.

Storage Arrays: Bottleneck or Not?

Chuck thinks:

“…array-based storage technology is not the bottleneck; our work with Oracle [on the Oracle Optimized Warehouse Initiative] and other DW/BI environments routinely shows that we can feed data to a server just as fast as it can take it.”

First, let me comment on the Optimized Warehouse Initiative. Some good things have come out of this effort; I believe it has increased the level of awareness when it comes to sizing storage for BI/DW workloads. All too often, storage sizing for BI/DW is done by capacity, not by I/O bandwidth. The focus is on building balanced systems: systems that can execute queries and workloads such that no one component (CPU/storage connectivity/disk array/disk drives) becomes the bottleneck prematurely. The industry seems to agree: IBM has the Balanced Warehouse, and Microsoft has a reference architecture for Project Madison as well.

So the question comes back to: is array-based storage technology the bottleneck or not? I would argue it is. Perhaps I would use a word other than “bottleneck”, but let’s be clear on the overall challenge here: to read data off disk quickly and return it efficiently to the database host so it can be processed as fast as possible.

Let’s start at the bottom of the stack: hard disk drives. If the challenge is to scan lots of data fast, then how fast data can be read off disk is the first important metric to consider. In the white paper Deploying EMC CLARiiON CX4-960 for Data Warehouse/Decision Support System (DSS) Workloads EMC reports a drive scan rate (for a BI/DW workload) of 20 MB/s using 8+1 RAID-5 and 33 MB/s using a 2+1 RAID-5 LUN configuration. Oracle Exadata delivers drive scan rates around 85 MB/s, a difference of 2.5X to 4.25X. To understand the performance impact of this I’ve put together a few tables of data based on these real workload numbers.

Hardware Specs and Numbers for Data Warehouse Workloads

Storage          RAID            Raw:Usable Ratio   Disk Drives           Disk Scan Rate
EMC CX4-960      8+1 RAID 5      9:8                146 GB FC 15k RPM     20 MB/s
EMC CX4-960      2+1 RAID 5      3:2                146 GB FC 15k RPM     33 MB/s
EMC CX4-960      8+1 RAID 5      9:8                300 GB FC 15k RPM     20 MB/s
EMC CX4-960      2+1 RAID 5      3:2                300 GB FC 15k RPM     33 MB/s
Oracle Exadata   ASM Mirroring   2:1                450 GB SAS 15k RPM    85 MB/s

Sizing By Capacity

Storage          RAID            Total Usable Space   Disk Drive   Number of Drives   Total Scan Rate
EMC CX4-960      8+1 RAID 5      18 TB                146 GB       139                2.8 GB/s
EMC CX4-960      2+1 RAID 5      18 TB                146 GB       185                6.1 GB/s*
EMC CX4-960      8+1 RAID 5      18 TB                300 GB       68                 1.4 GB/s
EMC CX4-960      2+1 RAID 5      18 TB                300 GB       90                 3.0 GB/s
Oracle Exadata   ASM Mirroring   18 TB                450 GB       80                 6.8 GB/s

* I’m not sure that the CX4-960 array head is capable of 6.1 GB/s, so it likely takes at least two CX4-960 array heads to deliver this throughput to the host(s).

Sizing By Scan Rate

Storage          RAID            Total Scan Rate   Disk Drive   Number of Drives   Total Usable Space
EMC CX4-960      8+1 RAID 5      3.00 GB/s         146 GB       150                19.46 TB
EMC CX4-960      2+1 RAID 5      3.00 GB/s         146 GB       90                 8.76 TB
EMC CX4-960      8+1 RAID 5      3.00 GB/s         300 GB       150                40.00 TB
EMC CX4-960      2+1 RAID 5      3.00 GB/s         300 GB       90                 18.00 TB
Oracle Exadata   ASM Mirroring   3.00 GB/s         450 GB       36                 8.10 TB
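Both sizing tables fall straight out of the per-drive numbers in the first table. A sketch that reproduces them (drive counts are rounded up, so a row can differ by one drive from the published figures):

```python
import math

# Reproduce the two sizing tables from per-drive capacity, the raw:usable
# ratio, and the per-drive scan rate given in the first table.
configs = {
    "CX4 8+1 RAID 5, 146 GB":   (146, 8 / 9, 20.0),
    "CX4 2+1 RAID 5, 146 GB":   (146, 2 / 3, 33.0),
    "CX4 8+1 RAID 5, 300 GB":   (300, 8 / 9, 20.0),
    "CX4 2+1 RAID 5, 300 GB":   (300, 2 / 3, 33.0),
    "Exadata mirrored, 450 GB": (450, 1 / 2, 85.0),
}

def size_by_capacity(usable_tb, cap_gb, usable_frac, scan_mbs):
    """Drives needed for usable_tb of space, and the scan rate they deliver."""
    drives = math.ceil(usable_tb * 1000 / (cap_gb * usable_frac))
    return drives, drives * scan_mbs / 1000  # (drive count, GB/s)

def size_by_scan_rate(scan_gbs, cap_gb, usable_frac, scan_mbs):
    """Drives needed for scan_gbs of bandwidth, and the space they provide."""
    drives = math.ceil(scan_gbs * 1000 / scan_mbs)
    return drives, drives * cap_gb * usable_frac / 1000  # (drive count, TB)

for name, cfg in configs.items():
    d1, gbs = size_by_capacity(18, *cfg)
    d2, tb = size_by_scan_rate(3.0, *cfg)
    print(f"{name}: 18 TB -> {d1} drives @ {gbs:.1f} GB/s; "
          f"3 GB/s -> {d2} drives @ {tb:.2f} TB")
```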

A Few Comments On The Above Data Points

Please note that “Total Usable Space” is a rough number for the total protected disk space one can use for a database if each drive were filled to capacity. It does not take into consideration things like loss to formatting, space for sort/temp, etc. I would use a 60% rule for estimating database data space vs. total usable space, meaning that 18 TB of total usable space would equate to roughly 10.8 TB (max) of space for database data (compression not accounted for).

I’d also like to note that in the Sizing By Capacity table, the “Total Scan Rate” is a disk-only calculation. Whether a single CX4-960 array head can move data at that rate is in question. Based on the numbers in the EMC white paper, it would appear the CX4-960 head is capable of 3 GB/s, but I would question whether it is capable of much more than that; hence the asterisk (*).

Looking At The Numbers

If you look at the numbers for Sizing By Capacity, you can see that for the given fixed capacity, Exadata provides the fastest scan rate while using only 80 disk drives. The next-closest scan rate is just 700 MB/s less, but it uses 105 more disk drives (185 vs. 80). Quite a big difference.

When it comes to delivering I/O bandwidth, Exadata clearly stands out. Targeting a scan rate of 3 GB/s, Exadata delivers it using only 36 drives, just 3 Exadata Storage Servers. If one wanted to deliver this scan rate with the CX4, it would take 2.5X as many drives (90 vs. 36) using 2+1 RAID 5.

So, are storage arrays the bottleneck? You can draw your own conclusions, but I think the numbers speak to the performance advantage of Oracle Exadata when it comes to delivering I/O bandwidth and fast scan rates. Consider this: what would the storage topology look like if you wanted to deliver a scan rate of 74 GB/s, as we did at Oracle OpenWorld with 84 HP Oracle Exadata Storage Servers (6 HP Oracle Database Machines)? Honestly, I would struggle to think of where I would put the 185 or so 4Gb HBAs required to achieve that.

Space Saving RAID or Wasteful Mirroring

This leads me to another comment by Chuck in his second post:

“[with Exadata] The disk is mirrored, no support of any space-saving RAID options — strange, for such a large machine”

And this one in his first post:

“If it were me, I’d want a RAID 5 (or 6) option.”

And his comment on Kevin’s blog:

“The fixed ratio of 12 disks (6 usable) per server element strikes us as a bit wasteful….And, I know this only matters to storage people, but there’s the minor matter of having two copies of everything, rather than the more efficient parity RAID approaches. Gets your attention when you’re talking 10-40TB usable, it does.”

Currently Exadata uses ASM mirroring for fault tolerance, so there is a 2:1 ratio of raw to usable disk; however, I don’t think it matters much. The logic behind that comment is that when one is sizing for a given scan rate, Exadata uses fewer spindles than the other configurations, even though the disk protection is mirroring and not space-saving RAID 5. I think it is strange to worry about space savings when disks just keep getting bigger while many keep the same performance characteristics as their predecessors. Space is cheap. Spindles are expensive. When one builds a configuration that satisfies the I/O scan rate requirement, chances are you have well exceeded the storage capacity requirement, even with mirroring.

Perhaps Chuck likes space-saving RAID 5, but I think using fewer drives (0.4X as many, 36 vs. 90) to deliver the same scan rate is hardly wasteful. You know what really gets my attention? Having 40 TB of total usable space on 15 HP Oracle Exadata Storage Servers (180 x 450 GB SAS drives) and being able to scan it at 15 GB/s, compared to, say, a CX4 with 200 x 300 GB drives using 2+1 RAID 5 that can only scan them at 6.6 GB/s. I’d also be willing to bet that the latter would require at least 2, if not 3, CX4-960 array heads and at least 30 4Gb HBAs running at wire speed (400 MB/s).

Exadata Is Smart Storage

Chuck comments:

“Leaving hardware issues aside, how much of the software functionality shown here is available on generic servers, operating systems and storage that Oracle supports today? I was under the impression that most of this great stuff was native to Oracle products, and not a function of specific tin …

If the Exadata product has unique and/or specialized Oracle logic, well, that’s a different case.”

After reading that, I would say Chuck has not read the Technical Overview of the HP Oracle Exadata Storage Server. Not only does Exadata have a very fast scan rate, it has intelligence: a combination of brawn and brains that is not available with other storage platforms. The Oracle Exadata Storage Server Software (say that 5 times fast!!!) is not an Oracle database; it is storage software, not database software. The intelligence and specialized logic is that Exadata Smart Scans return only the relevant rows and columns of a query, allowing for better use of I/O bandwidth and increased database performance, because the database host(s) do not issue I/O requests for data that is not needed for the query and then process it after the fact. There are a couple of slides (18 & 19) with a simple example of the benefits of Smart Scans in the HP Oracle Exadata Storage Server technical overview slide deck. It is worth the read.
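The bandwidth argument behind Smart Scan can be illustrated with simple arithmetic (this is a toy model, not Exadata code; the table size, selectivity, and column fraction are invented for illustration):

```python
# Toy model: predicate filtering plus column projection at the storage tier
# shrinks what crosses the I/O path to the database host. Numbers invented.

table_gb        = 1000.0   # size of the table scanned at the storage tier
row_selectivity = 0.01     # fraction of rows that satisfy the predicate
col_fraction    = 0.10     # fraction of columns the query references

shipped_conventional = table_gb  # dumb storage ships every scanned block
shipped_smart = table_gb * row_selectivity * col_fraction

print(f"conventional storage: {shipped_conventional:.0f} GB over the wire")
print(f"smart scan:           {shipped_smart:.1f} GB over the wire")
print(f"reduction:            {shipped_conventional / shipped_smart:.0f}x")
```

The host still gets every row it needs; it just stops paying I/O path bandwidth for rows and columns it would have discarded anyway.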

It Will Be Interesting Indeed

Chuck concludes his second post with:

“The real focus here should be software, not hardware.”

Personally I think the focus should be on solutions that perform and scale and I think the HP Oracle Exadata Storage Server is a great solution for Oracle data warehouses that require large amounts of I/O bandwidth.

Ending On A Good Note

While many of Chuck’s comments do not seem to be well researched, I will say that having a conventional mid-range storage array that can deliver 3 GB/s is not a bad thing at all. I’ve seen many Oracle customers that have only a fraction of that, and there are probably some small data warehouses out there that would run fine with 3 GB/s of I/O bandwidth. However, I think those would run even faster with Oracle Exadata, and I’ve never had a customer complain about queries running too fast.