<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
><channel><title>Structured Data &#187; Data Warehousing</title> <atom:link href="http://structureddata.org/category/oracle/data-warehousing/feed/" rel="self" type="application/rss+xml" /><link>http://structureddata.org</link> <description>Oracle Database Performance and Scalability Blog</description> <lastBuildDate>Mon, 06 Sep 2010 04:50:38 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.0.1</generator> <item><title>Oracle Exadata and Netezza TwinFin Compared – An Engineer’s Analysis</title><link>http://structureddata.org/2010/08/10/oracle-exadata-and-netezza-twinfin-compared-%e2%80%93-an-engineer%e2%80%99s-analysis/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=oracle-exadata-and-netezza-twinfin-compared-%25e2%2580%2593-an-engineer%25e2%2580%2599s-analysis</link> <comments>http://structureddata.org/2010/08/10/oracle-exadata-and-netezza-twinfin-compared-%e2%80%93-an-engineer%e2%80%99s-analysis/#comments</comments> <pubDate>Wed, 11 Aug 2010 04:07:15 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Exadata]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Netezza]]></category> <category><![CDATA[Teradata]]></category> <category><![CDATA[TwinFin]]></category><guid
isPermaLink="false">http://structureddata.org/?p=1053</guid> <description><![CDATA[There seems to be little debate that Oracle&#8217;s launch of the Oracle Exadata Storage Server and the Sun Oracle Database Machine has created buzz in the database marketplace. Apparently there is so much buzz and excitement around these products that two competing vendors, Teradata and Netezza, have both authored publications that contain a significant amount of discussion about the Oracle Database with Real Application Clusters (RAC) and Oracle Exadata. Both of these vendor papers are well structured but make no mistake, these are marketing publications written with the intent to be critical of Exadata and discuss how their product is potentially better. Hence, both of these papers are obviously biased to support their purpose. My intent with this blog post is simply to discuss some of the claims, analyze them for factual accuracy, and briefly comment on them. After all, Netezza clearly states in their publication: The information shared in this paper is made available in the spirit of openness. Any inaccuracies result from our mistakes, not an intent to mislead. 
In the interest of full disclosure, my employer is Oracle Corporation, however, this is a personal blog and what I write here are my own ideas and words (see [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/08/10/oracle-exadata-and-netezza-twinfin-compared-%e2%80%93-an-engineer%e2%80%99s-analysis/feed/</wfw:commentRss> <slash:comments>11</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing – Set Processing vs Row Processing</title><link>http://structureddata.org/2010/07/20/the-core-performance-fundamentals-of-oracle-data-warehousing-%e2%80%93-set-processing-vs-row-processing/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-%25e2%2580%2593-set-processing-vs-row-processing</link> <comments>http://structureddata.org/2010/07/20/the-core-performance-fundamentals-of-oracle-data-warehousing-%e2%80%93-set-processing-vs-row-processing/#comments</comments> <pubDate>Tue, 20 Jul 2010 09:00:38 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Exadata]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[SQL Tuning]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[Oracle Exadata]]></category> <category><![CDATA[row processing]]></category> <category><![CDATA[set processing]]></category><guid
isPermaLink="false">http://structureddata.org/?p=939</guid> <description><![CDATA[[back to Introduction] In over six years of doing data warehouse POCs and benchmarks for clients there is one area that I frequently see as problematic: &#8220;batch jobs&#8221;.  Most of the time these &#8220;batch jobs&#8221; take the form of some PL/SQL procedures and packages that generally perform some data load, transformation, processing or something similar.  The reason these are so problematic is that developers have hard-coded &#8220;slow&#8221; into them.  I&#8217;m generally certain these developers didn&#8217;t know they had done this when they coded their PL/SQL, but nonetheless it happened. So How Did &#8220;Slow&#8221; Get Hard-Coded Into My PL/SQL? Generally &#8220;slow&#8221; gets hard-coded into PL/SQL because the PL/SQL developer(s) took the business requirements and did a &#8220;literal translation&#8221; of each rule/requirement one at a time instead of looking at the &#8220;before picture&#8221; and the &#8220;after picture&#8221; and determining the most efficient way to make those data changes.  Many times this can surface as cursor-based row-by-row processing, but it also can appear as PL/SQL just running a series of often poorly thought-out SQL commands. Hard-Coded Slow Case Study The following is based on a true story. Only the names have been changed to protect the innocent. 
Here is [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/07/20/the-core-performance-fundamentals-of-oracle-data-warehousing-%e2%80%93-set-processing-vs-row-processing/feed/</wfw:commentRss> <slash:comments>21</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Data Loading</title><link>http://structureddata.org/2010/04/23/the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading</link> <comments>http://structureddata.org/2010/04/23/the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading/#comments</comments> <pubDate>Fri, 23 Apr 2010 16:00:33 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[data loading]]></category> <category><![CDATA[external tables]]></category> <category><![CDATA[sql*loader]]></category> <category><![CDATA[sqlldr]]></category><guid
isPermaLink="false">http://structureddata.org/?p=878</guid> <description><![CDATA[[back to Introduction] Getting flat file data into your Oracle data warehouse is likely a daily (or possibly more frequent) task, but it certainly does not have to be a difficult one.  Bulk loading data rates are governed by the following operations and hardware resources: How fast can the data be read? How fast can data be written out? How much CPU power is available? I&#8217;m always a bit amazed (and depressed) when I hear people complain that their data loading rates are slow and they proceed to tell me things like: The source files reside on a shared NFS filer (or similar) and it has just a single GbE (1 Gigabit Ethernet) network path to the Oracle database host(s). The source files reside on this internal disk volume which consists of a two disk mirror (or a volume with very few spindles). Maybe it&#8217;s not entirely obvious so let me spell it out (as I did in this tweet): One cannot load data into a database faster than it can be delivered from the source. Database systems must obey the laws of physics! Or putting it another way: Don&#8217;t fall victim to slow data loading because of a slow-performing data source. 
[...]]]></description> <wfw:commentRss>http://structureddata.org/2010/04/23/the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading/feed/</wfw:commentRss> <slash:comments>13</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Parallel Execution</title><link>http://structureddata.org/2010/04/19/the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution</link> <comments>http://structureddata.org/2010/04/19/the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution/#comments</comments> <pubDate>Mon, 19 Apr 2010 15:00:25 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Parallel Execution]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[parallel query]]></category> <category><![CDATA[scalability]]></category><guid
isPermaLink="false">http://structureddata.org/?p=818</guid> <description><![CDATA[[back to Introduction] Leveraging Oracle&#8217;s Parallel Execution (PX) in your Oracle data warehouse is probably the most important feature/technology one can use to speed up operations on large data sets.  PX is not, however, &#8220;go fast&#8221; magic pixie dust for any old operation (if that&#8217;s what you think, you probably don&#8217;t understand the parallel computing paradigm). With Oracle PX, a large task is broken up into smaller parts, sub-tasks if you will, and each sub-task is then worked on in parallel.  The goal of Oracle PX: divide and conquer.  This allows a significant amount of hardware resources to be engaged in solving a single problem and is what allows the Oracle database to scale up and out when working with large data sets. I thought I&#8217;d touch on some basics and add my observations, but this is by no means an exhaustive write-up on Oracle&#8217;s Parallel Execution.  There is an entire chapter in the Oracle Database documentation on PX as well as several white papers.  I&#8217;ve listed all these in the Resources section at the bottom of this post.  Read them, but as always, feel free to post questions/comments here.  Discussion adds great value. 
A Basic Example of Parallel Execution [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/04/19/the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution/feed/</wfw:commentRss> <slash:comments>8</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Partitioning</title><link>http://structureddata.org/2010/01/25/the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning</link> <comments>http://structureddata.org/2010/01/25/the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning/#comments</comments> <pubDate>Mon, 25 Jan 2010 12:00:01 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[managability]]></category> <category><![CDATA[partitioning]]></category><guid
isPermaLink="false">http://structureddata.org/?p=816</guid> <description><![CDATA[[back to Introduction] Partitioning is an essential performance feature for an Oracle data warehouse because partition elimination (or partition pruning) generally results in the elimination of a significant amount of table data to be scanned. This results in a need for fewer system resources and improved query performance. Someone once told me &#8220;the fastest I/O is the one that never happens.&#8221; This is precisely the reason that partitioning is a must for Oracle data warehouses &#8211; it&#8217;s a huge I/O eliminator. I frequently refer to partition elimination as the anti-index. An index is used to find a small amount of data that is required; partitioning is used to eliminate vast amounts of data that are not required. Main Uses For Partitioning I would classify the main reasons to use partitioning in your Oracle data warehouse into these four areas: Data Elimination; Partition-Wise Joins; Manageability (Partition Exchange Load, Local Indexes, etc.); Information Lifecycle Management (ILM). Partitioning Basics The most common partitioning design pattern found in Oracle data warehouses is to partition the fact tables by range (or interval) on the event date/time column. This allows for partition elimination of all the data not in the desired time window in queries. 
For example: If I have a [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/01/25/the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning/feed/</wfw:commentRss> <slash:comments>10</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Table Compression</title><link>http://structureddata.org/2010/01/19/the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression</link> <comments>http://structureddata.org/2010/01/19/the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression/#comments</comments> <pubDate>Tue, 19 Jan 2010 12:00:55 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[compression]]></category> <category><![CDATA[data warehouse]]></category><guid
isPermaLink="false">http://structureddata.org/?p=787</guid> <description><![CDATA[[back to Introduction] Editor&#8217;s note: This blog post does not cover Exadata Hybrid Columnar Compression. The first thing that comes to most people&#8217;s minds when database table compression is mentioned is the savings it yields in terms of disk space. While reducing the footprint of data on disk is relevant, I would argue it is the lesser of the benefits for data warehouses. Disk capacity is very cheap and generally plentiful; however, disk bandwidth (scan speed) is proportional to the number of spindles, no matter what the disk capacity, and thus is more expensive. Table compression reduces the footprint on the disk drives that a given data set occupies so the amount of physical data that must be read off the disk platters is reduced when compared to the uncompressed version. For example, if 4000 GB of raw data can compress to 1000 GB, it can be read off the same disk drives 4X as fast because it is reading and transferring 1/4 of the data off the spindles (relative to the uncompressed size). 
Likewise, table compression allows for the database buffer cache to contain more data without having to increase the memory allocation because more rows can be stored [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/01/19/the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression/feed/</wfw:commentRss> <slash:comments>8</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing – Balanced Hardware Configuration</title><link>http://structureddata.org/2009/12/22/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration</link> <comments>http://structureddata.org/2009/12/22/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/#comments</comments> <pubDate>Tue, 22 Dec 2009 22:00:54 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[capacity planing]]></category> <category><![CDATA[data warehouse]]></category> <category><![CDATA[io bandwidth]]></category> <category><![CDATA[scan rate]]></category><guid
isPermaLink="false">http://structureddata.org/2009/12/13/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/</guid> <description><![CDATA[[back to Introduction] If you want to build a house that will stand the test of time, you need to build on a solid foundation. The same goes for architecting computer systems that run databases. If the underlying hardware is not sized appropriately it will likely lead to people blaming software. All too often I see data warehouse systems that are poorly architected for the given workload requirements. I frequently tell people, &#8220;you can&#8217;t squeeze blood from a turnip&#8221;, meaning if the hardware resources are not there for the software to use, how can you expect the software to scale? Undersizing data warehouse systems has become an epidemic with open platforms &#8211; platforms that let you run on any brand and configuration of hardware. This problem has been magnified over time as the size of databases has grown significantly, generally outpacing the experience of those managing them. 
This has caused the &#8220;big three&#8221; database vendors to come up with suggested or recommended hardware configurations for their database platforms: Oracle: Optimized Warehouse Initiative Microsoft: SQL Server Fast Track Data Warehouse IBM: Balanced Configuration Unit (BCU)   Simply put, the reasoning behind those initiatives was to help customers architect systems that [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/12/22/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/feed/</wfw:commentRss> <slash:comments>16</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Introduction</title><link>http://structureddata.org/2009/12/14/the-core-performance-fundamentals-of-oracle-data-warehousing-introduction/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-introduction</link> <comments>http://structureddata.org/2009/12/14/the-core-performance-fundamentals-of-oracle-data-warehousing-introduction/#comments</comments> <pubDate>Mon, 14 Dec 2009 16:00:20 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Exadata]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[data warehouse]]></category><guid
isPermaLink="false">http://structureddata.org/?p=668</guid> <description><![CDATA[At the 2009 Oracle OpenWorld Unconference back in October I led a chalk and talk session entitled The Core Performance Fundamentals Of Oracle Data Warehousing. Since this was a chalk and talk I spared the audience any PowerPoint slides but I had several people request that I make it into a presentation so they could share it with others. After some thought, I decided that a series of blog posts would probably be a better way to share this information, especially since I tend to use slides as a speaking outline, not a condensed version of a white paper. This will be the first of a series of posts discussing what I consider to be the key features and technologies behind well-performing Oracle data warehouses. Introduction As an Oracle database performance engineer who has done numerous customer data warehouse benchmarks and POCs over the past 5+ years, I&#8217;ve seen many data warehouse systems that have been plagued with problems on nearly every DBMS commonly used in data warehousing. Interestingly enough, many of these systems were facing many of the same problems. 
I&#8217;ve compiled a list of topics that I consider to be key features and/or technologies for Oracle data warehouses: [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/12/14/the-core-performance-fundamentals-of-oracle-data-warehousing-introduction/feed/</wfw:commentRss> <slash:comments>16</slash:comments> </item> <item><title>Oracle Parallel Execution: Interconnect Myths And Misunderstandings</title><link>http://structureddata.org/2009/07/06/oracle-parallel-execution-interconnect-myths-and-misunderstandings/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=oracle-parallel-execution-interconnect-myths-and-misunderstandings</link> <comments>http://structureddata.org/2009/07/06/oracle-parallel-execution-interconnect-myths-and-misunderstandings/#comments</comments> <pubDate>Tue, 07 Jul 2009 00:00:17 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Parallel Execution]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[interconnect traffic]]></category> <category><![CDATA[parallel query]]></category><guid
isPermaLink="false">http://structureddata.org/?p=602</guid> <description><![CDATA[A number of weeks back I had come across a paper/presentation by Riyaj Shamsudeen entitled Battle of the Nodes: RAC Performance Myths (available here). As I was looking through it I saw one example that struck me as very odd (Myth #3 &#8211; Interconnect Performance) and I contacted him about it. After further review Riyaj commented that he had made a mistake in his analysis and offered up a new example. I thought I&#8217;d take the time to discuss this as parallel execution seems to be one of those areas where many misconceptions and misunderstandings exist. The Original Example I thought I&#8217;d quickly discuss why I questioned the initial example. The original query Riyaj cited is this one: select /*+ full(tl) parallel (tl,4) */ avg (n1), max (n1), avg (n2), max (n2), max (v1) from t_large tl; As you can see this is a very simple single table aggregation without a group by. The reason that I questioned the validity of this example in the context of interconnect performance is that the parallel execution servers (parallel query slaves) will each return exactly one row from the aggregation and then send that single row to the query coordinator (QC) which will [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/07/06/oracle-parallel-execution-interconnect-myths-and-misunderstandings/feed/</wfw:commentRss> <slash:comments>15</slash:comments> </item> <item><title>Exadata Snippits From Oracle F4Q09 Earnings Call</title><link>http://structureddata.org/2009/06/23/exadata-snippits-from-oracle-f4q09-earnings-call/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=exadata-snippits-from-oracle-f4q09-earnings-call</link> <comments>http://structureddata.org/2009/06/23/exadata-snippits-from-oracle-f4q09-earnings-call/#comments</comments> <pubDate>Wed, 24 Jun 2009 04:51:00 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> 
<category><![CDATA[Exadata]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[database machine]]></category> <category><![CDATA[Netezza]]></category> <category><![CDATA[Teradata]]></category><guid
isPermaLink="false">http://structureddata.org/?p=590</guid> <description><![CDATA[Oracle Corporation had its F4Q09 earnings call today and the Exadata comments started right away with the earnings press release: “The Exadata Database Machine is well on its way to being the most successful new product launch in Oracle’s 30 year history,” said Oracle CEO Larry Ellison. “Several of Teradata’s largest customers are performance testing &#8212; then buying &#8212; Oracle Exadata Database Machines. In a recent competitive benchmark, a Teradata machine took over six hours to process a query that our Exadata Database Machine ran in less than 30 minutes. They bought Exadata.” During the earnings call Larry Ellison discusses Exadata and the competition: &#8230;I’m going to talk about Exadata again. I said last quarter that Exadata is shaping up to be our most exciting and successful new product introduction in Oracle’s 30 year history and [in the] last quarter Exadata continues to grow and win competitive deals in the marketplace against our three primarily competitors. It&#8217;s turning out that Teradata is our number one competitor&#8230;Netezza and IBM are kind of tied for second. Ellison describes some of the Exadata sales from this quarter which include: A well-known California SmartPhone and computer manufacturer (win vs. Netezza) who commented that Exadata [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/06/23/exadata-snippits-from-oracle-f4q09-earnings-call/feed/</wfw:commentRss> <slash:comments>8</slash:comments> </item> </channel> </rss>