<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
><channel><title>Structured Data &#187; VLDB</title> <atom:link href="http://structureddata.org/category/oracle/vldb/feed/" rel="self" type="application/rss+xml" /><link>http://structureddata.org</link> <description>Oracle Database Performance and Scalability Blog</description> <lastBuildDate>Mon, 06 Sep 2010 04:50:38 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.0.1</generator> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing – Set Processing vs Row Processing</title><link>http://structureddata.org/2010/07/20/the-core-performance-fundamentals-of-oracle-data-warehousing-%e2%80%93-set-processing-vs-row-processing/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-%25e2%2580%2593-set-processing-vs-row-processing</link> <comments>http://structureddata.org/2010/07/20/the-core-performance-fundamentals-of-oracle-data-warehousing-%e2%80%93-set-processing-vs-row-processing/#comments</comments> <pubDate>Tue, 20 Jul 2010 09:00:38 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Exadata]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[SQL Tuning]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[Oracle Exadata]]></category> <category><![CDATA[row processing]]></category> <category><![CDATA[set processing]]></category><guid
isPermaLink="false">http://structureddata.org/?p=939</guid> <description><![CDATA[[back to Introduction] In over six years of doing data warehouse POCs and benchmarks for clients there is one area that I frequently see as problematic: &#8220;batch jobs&#8221;.  Most of the time these &#8220;batch jobs&#8221; take the form of some PL/SQL procedures and packages that generally perform some data load, transformation, processing or something similar.  The reason these are so problematic is that developers have hard-coded &#8220;slow&#8221; into them.  I&#8217;m generally certain these developers didn&#8217;t know they had done this when they coded their PL/SQL, but nonetheless it happened. So How Did &#8220;Slow&#8221; Get Hard-Coded Into My PL/SQL? Generally &#8220;slow&#8221; gets hard-coded into PL/SQL because the PL/SQL developer(s) took the business requirements and did a &#8220;literal translation&#8221; of each rule/requirement one at a time instead of looking at the &#8220;before picture&#8221; and the &#8220;after picture&#8221; and determining the most efficient way to make those data changes.  Many times this can surface as cursor-based row-by-row processing, but it also can appear as PL/SQL just running a series of often poorly thought out SQL commands. Hard-Coded Slow Case Study The following is based on a true story. Only the names have been changed to protect the innocent. 
Here is [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/07/20/the-core-performance-fundamentals-of-oracle-data-warehousing-%e2%80%93-set-processing-vs-row-processing/feed/</wfw:commentRss> <slash:comments>21</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Data Loading</title><link>http://structureddata.org/2010/04/23/the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading</link> <comments>http://structureddata.org/2010/04/23/the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading/#comments</comments> <pubDate>Fri, 23 Apr 2010 16:00:33 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[data loading]]></category> <category><![CDATA[external tables]]></category> <category><![CDATA[sql*loader]]></category> <category><![CDATA[sqlldr]]></category><guid
isPermaLink="false">http://structureddata.org/?p=878</guid> <description><![CDATA[[back to Introduction] Getting flat file data into your Oracle data warehouse is likely a daily (or possibly more frequent) task, but it certainly does not have to be a difficult one.  Bulk loading data rates are governed by the following operations and hardware resources: How fast can the data be read? How fast can data be written out? How much CPU power is available? I&#8217;m always a bit amazed (and depressed) when I hear people complain that their data loading rates are slow and they proceed to tell me things like: The source files reside on a shared NFS filer (or similar) and it has just a single GbE (1 Gigabit Ethernet) network path to the Oracle database host(s). The source files reside on this internal disk volume which consists of a two disk mirror (or a volume with very few spindles). Maybe it&#8217;s not entirely obvious so let me spell it out (as I did in this tweet): One cannot load data into a database faster than it can be delivered from the source. Database systems must obey the laws of physics! Or putting it another way: Don&#8217;t fall victim to slow data loading because of a slow-performing data source. 
[...]]]></description> <wfw:commentRss>http://structureddata.org/2010/04/23/the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading/feed/</wfw:commentRss> <slash:comments>13</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Parallel Execution</title><link>http://structureddata.org/2010/04/19/the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution</link> <comments>http://structureddata.org/2010/04/19/the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution/#comments</comments> <pubDate>Mon, 19 Apr 2010 15:00:25 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Parallel Execution]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[parallel query]]></category> <category><![CDATA[scalability]]></category><guid
isPermaLink="false">http://structureddata.org/?p=818</guid> <description><![CDATA[[back to Introduction] Leveraging Oracle&#8217;s Parallel Execution (PX) in your Oracle data warehouse is probably the most important feature/technology one can use to speed up operations on large data sets.  PX is not, however, &#8220;go fast&#8221; magic pixie dust for any old operation (if that&#8217;s what you think, you probably don&#8217;t understand the parallel computing paradigm). With Oracle PX, a large task is broken up into smaller parts, sub-tasks if you will, and each sub-task is then worked on in parallel.  The goal of Oracle PX: divide and conquer.  This allows a significant amount of hardware resources to be engaged in solving a single problem and is what allows the Oracle database to scale up and out when working with large data sets. I thought I&#8217;d touch on some basics and add my observations but this is by far not an exhaustive write-up on Oracle&#8217;s Parallel Execution.  There is an entire chapter in the Oracle Database documentation on PX as well as several white papers.  I&#8217;ve listed all these in the Resources section at the bottom of this post.  Read them, but as always, feel free to post questions/comments here.  Discussion adds great value. 
A Basic Example of Parallel Execution [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/04/19/the-core-performance-fundamentals-of-oracle-data-warehousing-parallel-execution/feed/</wfw:commentRss> <slash:comments>8</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Partitioning</title><link>http://structureddata.org/2010/01/25/the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning</link> <comments>http://structureddata.org/2010/01/25/the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning/#comments</comments> <pubDate>Mon, 25 Jan 2010 12:00:01 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[managability]]></category> <category><![CDATA[partitioning]]></category><guid
isPermaLink="false">http://structureddata.org/?p=816</guid> <description><![CDATA[[back to Introduction] Partitioning is an essential performance feature for an Oracle data warehouse because partition elimination (or partition pruning) generally results in the elimination of a significant amount of table data to be scanned. This results in a need for fewer system resources and improved query performance. Someone once told me &#8220;the fastest I/O is the one that never happens.&#8221; This is precisely the reason that partitioning is a must for Oracle data warehouses &#8211; it&#8217;s a huge I/O eliminator. I frequently refer to partition elimination as the anti-index. An index is used to find a small amount of data that is required; partitioning is used to eliminate vast amounts of data that are not required. Main Uses For Partitioning I would classify the main reasons to use partitioning in your Oracle data warehouse into these four areas: Data Elimination Partition-Wise Joins Manageability (Partition Exchange Load, Local Indexes, etc.) Information Lifecycle Management (ILM) Partitioning Basics The most common partitioning design pattern found in Oracle data warehouses is to partition the fact tables by range (or interval) on the event date/time column. This allows for partition elimination of all the data not in the desired time window in queries. 
For example: If I have a [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/01/25/the-core-performance-fundamentals-of-oracle-data-warehousing-partitioning/feed/</wfw:commentRss> <slash:comments>10</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Table Compression</title><link>http://structureddata.org/2010/01/19/the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression</link> <comments>http://structureddata.org/2010/01/19/the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression/#comments</comments> <pubDate>Tue, 19 Jan 2010 12:00:55 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[compression]]></category> <category><![CDATA[data warehouse]]></category><guid
isPermaLink="false">http://structureddata.org/?p=787</guid> <description><![CDATA[[back to Introduction] Editor&#8217;s note: This blog post does not cover Exadata Hybrid Columnar Compression. The first thing that comes to most people&#8217;s mind when database table compression is mentioned is the savings it yields in terms of disk space. While reducing the footprint of data on disk is relevant, I would argue it is the lesser of the benefits for data warehouses. Disk capacity is very cheap and generally plentiful; however, disk bandwidth (scan speed) is proportional to the number of spindles, no matter what the disk capacity, and thus is more expensive. Table compression reduces the footprint on the disk drives that a given data set occupies so the amount of physical data that must be read off the disk platters is reduced when compared to the uncompressed version. For example, if 4000 GB of raw data can compress to 1000 GB, it can be read off the same disk drives 4X as fast because it is reading and transferring 1/4 of the data off the spindles (relative to the uncompressed size). 
Likewise, table compression allows for the database buffer cache to contain more data without having to increase the memory allocation because more rows can be stored [...]]]></description> <wfw:commentRss>http://structureddata.org/2010/01/19/the-core-performance-fundamentals-of-oracle-data-warehousing-table-compression/feed/</wfw:commentRss> <slash:comments>10</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing – Balanced Hardware Configuration</title><link>http://structureddata.org/2009/12/22/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration</link> <comments>http://structureddata.org/2009/12/22/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/#comments</comments> <pubDate>Tue, 22 Dec 2009 22:00:54 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[capacity planing]]></category> <category><![CDATA[data warehouse]]></category> <category><![CDATA[io bandwidth]]></category> <category><![CDATA[scan rate]]></category><guid
isPermaLink="false">http://structureddata.org/2009/12/13/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/</guid> <description><![CDATA[[back to Introduction] If you want to build a house that will stand the test of time, you need to build on a solid foundation. The same goes for architecting computer systems that run databases. If the underlying hardware is not sized appropriately it will likely lead to people blaming software. All too often I see data warehouse systems that are poorly architected for the given workload requirements. I frequently tell people, &#8220;you can&#8217;t squeeze blood from a turnip&#8221;, meaning if the hardware resources are not there for the software to use, how can you expect the software to scale? Undersizing data warehouse systems has become an epidemic with open platforms &#8211; platforms that let you run on any brand and configuration of hardware. This problem has been magnified over time as the size of databases has grown significantly, generally outpacing the experience of those managing them. 
This has caused the &#8220;big three&#8221; database vendors to come up with suggested or recommended hardware configurations for their database platforms: Oracle: Optimized Warehouse Initiative Microsoft: SQL Server Fast Track Data Warehouse IBM: Balanced Configuration Unit (BCU)   Simply put, the reasoning behind those initiatives was to help customers architect systems that [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/12/22/the-core-performance-fundamentals-of-oracle-data-warehousing-balanced-hardware-configuration/feed/</wfw:commentRss> <slash:comments>16</slash:comments> </item> <item><title>The Core Performance Fundamentals Of Oracle Data Warehousing &#8211; Introduction</title><link>http://structureddata.org/2009/12/14/the-core-performance-fundamentals-of-oracle-data-warehousing-introduction/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=the-core-performance-fundamentals-of-oracle-data-warehousing-introduction</link> <comments>http://structureddata.org/2009/12/14/the-core-performance-fundamentals-of-oracle-data-warehousing-introduction/#comments</comments> <pubDate>Mon, 14 Dec 2009 16:00:20 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Exadata]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[data warehouse]]></category><guid
isPermaLink="false">http://structureddata.org/?p=668</guid> <description><![CDATA[At the 2009 Oracle OpenWorld Unconference back in October I led a chalk and talk session entitled The Core Performance Fundamentals Of Oracle Data Warehousing. Since this was a chalk and talk I spared the audience any PowerPoint slides but I had several people request that I make it into a presentation so they could share it with others. After some thought, I decided that a series of blog posts would probably be a better way to share this information, especially since I tend to use slides as a speaking outline, not a condensed version of a white paper. This will be the first of a series of posts discussing what I consider to be the key features and technologies behind well-performing Oracle data warehouses. Introduction As an Oracle database performance engineer who has done numerous customer data warehouse benchmarks and POCs over the past 5+ years, I&#8217;ve seen many data warehouse systems that have been plagued with problems on nearly every DBMS commonly used in data warehousing. Interestingly enough, many of these systems were facing many of the same problems. 
I&#8217;ve compiled a list of topics that I consider to be key features and/or technologies for Oracle data warehouses: [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/12/14/the-core-performance-fundamentals-of-oracle-data-warehousing-introduction/feed/</wfw:commentRss> <slash:comments>16</slash:comments> </item> <item><title>Oracle OpenWorld 2009: The Real-World Performance Group</title><link>http://structureddata.org/2009/07/20/oracle-openworld-2009-the-real-world-performance-group/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=oracle-openworld-2009-the-real-world-performance-group</link> <comments>http://structureddata.org/2009/07/20/oracle-openworld-2009-the-real-world-performance-group/#comments</comments> <pubDate>Tue, 21 Jul 2009 04:25:00 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Exadata]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[openworld 2009]]></category> <category><![CDATA[oracle database machine]]></category> <category><![CDATA[Real-World Performance Group]]></category><guid
isPermaLink="false">http://structureddata.org/?p=647</guid> <description><![CDATA[Even though Oracle OpenWorld 2009 is a few months away, I thought I would take a moment to mention that the Oracle Real-World Performance Group will again be hosting three sessions. Hopefully you are no stranger to our Oracle database performance sessions and this year we have what I think will be a very exciting and enlightening session: The Terabyte Hour with the Real-World Performance Group. If you are the slightest bit interested in seeing just how fast the Oracle Database Machine really is and how it can devour flat files in no time, rip through and bend data at amazing speeds, this is the session for you. All the operations will be done live for you to observe. No smoke. No mirrors. Pure Exadata performance revealed.]]></description> <wfw:commentRss>http://structureddata.org/2009/07/20/oracle-openworld-2009-the-real-world-performance-group/feed/</wfw:commentRss> <slash:comments>6</slash:comments> </item> <item><title>Oracle Parallel Execution: Interconnect Myths And Misunderstandings</title><link>http://structureddata.org/2009/07/06/oracle-parallel-execution-interconnect-myths-and-misunderstandings/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=oracle-parallel-execution-interconnect-myths-and-misunderstandings</link> <comments>http://structureddata.org/2009/07/06/oracle-parallel-execution-interconnect-myths-and-misunderstandings/#comments</comments> <pubDate>Tue, 07 Jul 2009 00:00:17 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> <category><![CDATA[Data Warehousing]]></category> <category><![CDATA[Oracle]]></category> <category><![CDATA[Parallel Execution]]></category> <category><![CDATA[Performance]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[interconnect traffic]]></category> <category><![CDATA[parallel query]]></category><guid
isPermaLink="false">http://structureddata.org/?p=602</guid> <description><![CDATA[A number of weeks back I had come across a paper/presentation by Riyaj Shamsudeen entitled Battle of the Nodes: RAC Performance Myths (available here). As I was looking through it I saw one example that struck me as very odd (Myth #3 &#8211; Interconnect Performance) and I contacted him about it. After further review Riyaj commented that he had made a mistake in his analysis and offered up a new example. I thought I&#8217;d take the time to discuss this as parallel execution seems to be one of those areas where many misconceptions and misunderstandings exist. The Original Example I thought I&#8217;d quickly discuss why I questioned the initial example. The original query Riyaj cited is this one: select /*+ full(tl) parallel (tl,4) */ avg (n1), max (n1), avg (n2), max (n2), max (v1) from t_large tl; As you can see this is a very simple single table aggregation without a group by. The reason that I questioned the validity of this example in the context of interconnect performance is that the parallel execution servers (parallel query slaves) will each return exactly one row from the aggregation and then send that single row to the query coordinator (QC) which will [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/07/06/oracle-parallel-execution-interconnect-myths-and-misunderstandings/feed/</wfw:commentRss> <slash:comments>15</slash:comments> </item> <item><title>Facebook: Hive &#8211; A Petabyte Scale Data Warehouse Using Hadoop</title><link>http://structureddata.org/2009/06/10/facebook-hive-a-petabyte-scale-data-warehouse-using-hadoop/?utm_source=rss&amp;utm_medium=rss&amp;utm_campaign=facebook-hive-a-petabyte-scale-data-warehouse-using-hadoop</link> <comments>http://structureddata.org/2009/06/10/facebook-hive-a-petabyte-scale-data-warehouse-using-hadoop/#comments</comments> <pubDate>Wed, 10 Jun 2009 23:21:59 +0000</pubDate> <dc:creator>Greg Rahn</dc:creator> 
<category><![CDATA[Data Warehousing]]></category> <category><![CDATA[VLDB]]></category> <category><![CDATA[data warehouse]]></category> <category><![CDATA[facebook]]></category> <category><![CDATA[hadoop]]></category> <category><![CDATA[hive]]></category> <category><![CDATA[MapReduce]]></category> <category><![CDATA[petabyte scale]]></category><guid
isPermaLink="false">http://structureddata.org/?p=584</guid> <description><![CDATA[Today, June 10th, marks the Yahoo! Hadoop Summit &#8217;09 and the crew at Facebook have a writeup on the Facebook Engineering page entitled: Hive &#8211; A Petabyte Scale Data Warehouse Using Hadoop. I found this a very interesting read given some of the Hadoop/MapReduce comments from David J. DeWitt and Michael Stonebraker as well as their SIGMOD 2009 paper, A Comparison of Approaches to Large-Scale Data Analysis. Now I&#8217;m not about to jump into this whole dbms-is-better-than-mapreduce argument but I found Facebook&#8217;s story line interesting: When we started at Facebook in 2007 all of the data processing infrastructure was built around a data warehouse built using a commercial RDBMS. The data that we were generating was growing very fast &#8211; as an example we grew from a 15TB data set in 2007 to a 2PB data set today. The infrastructure at that time was so inadequate that some daily data processing jobs were taking more than a day to process and the situation was just getting worse with every passing day. We had an urgent need for infrastructure that could scale along with our data and it was at that time we then started exploring Hadoop as a way to [...]]]></description> <wfw:commentRss>http://structureddata.org/2009/06/10/facebook-hive-a-petabyte-scale-data-warehouse-using-hadoop/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> </channel> </rss>