<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: The Impact Of Good Table And Query Design</title>
	<atom:link href="http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/feed/" rel="self" type="application/rss+xml" />
	<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-impact-of-good-table-and-query-design</link>
	<description>Data, Databases, Performance &#38; Scalability</description>
	<lastBuildDate>Mon, 30 Jan 2012 17:05:12 -0500</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Greg Rahn</title>
		<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/#comment-535</link>
		<dc:creator>Greg Rahn</dc:creator>
		<pubDate>Mon, 22 Jun 2009 23:09:40 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=436#comment-535</guid>
		<description>&lt;a href=&quot;#comment-9080&quot; rel=&quot;nofollow&quot;&gt;@Paul Kelley&lt;/a&gt;

I would agree - a 125,000-column table might be a bit of a burden.  =)

I&#039;ve not heard of the EAV model, nor seen it, but I may have to do some research to see what it is all about.</description>
		<content:encoded><![CDATA[<p><a href="#comment-9080" rel="nofollow">@Paul Kelley</a></p>
<p>I would agree &#8211; a 125,000-column table might be a bit of a burden.  =)</p>
<p>I&#8217;ve not heard of the EAV model, nor seen it, but I may have to do some research to see what it is all about.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paul Kelley</title>
		<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/#comment-534</link>
		<dc:creator>Paul Kelley</dc:creator>
		<pubDate>Sun, 21 Jun 2009 06:35:10 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=436#comment-534</guid>
		<description>At a different level, I work for a retail company that sells many thousands of different products.  There might be a few column_tab spreadsheets to be found here and there on buyers&#039; desktops, but aside from that I think it&#039;s safe to say that the sheer number of products makes column_tab as described above impossible to implement in this company. A division with 125,000 products would need a table with 125,000 product columns.   We do just fine with time/location/product tables.

Just guessing, but I suppose that the 11g UNPIVOT operator would be useful if one encounters data in a column_tab model that must be translated to rowtab.

Have you encountered the EAV (Entity-Attribute-Value) data model?  That&#039;s the model where each row contains data and metadata - it contains both the values and the names of the data (the joins are horrible to look at).  Supposedly this model is good for lab experiments, e.g. mass screening, where - oversimplifying - no two assays have the same shape.  So it would be possible to create a new table for each assay, or else use EAV and put all the assays and their results in one table.  I worked with such an application before I knew the model had a name.  In 1994 we weren&#039;t convinced that it was the best way to store experimental data.  I switched over to retail in &#039;96, so I don&#039;t know whether they have changed the model since then.</description>
		<content:encoded><![CDATA[<p>At a different level, I work for a retail company that sells many thousands of different products.  There might be a few column_tab spreadsheets to be found here and there on buyers&#8217; desktops, but aside from that I think it&#8217;s safe to say that the sheer number of products makes column_tab as described above impossible to implement in this company. A division with 125,000 products would need a table with 125,000 product columns.   We do just fine with time/location/product tables.</p>
<p>Just guessing, but I suppose that the 11g UNPIVOT operator would be useful if one encounters data in a column_tab model that must be translated to rowtab.</p>
<p>Have you encountered the EAV (Entity-Attribute-Value) data model?  That&#8217;s the model where each row contains data and metadata &#8211; it contains both the values and the names of the data (the joins are horrible to look at).  Supposedly this model is good for lab experiments, e.g. mass screening, where &#8211; oversimplifying &#8211; no two assays have the same shape.  So it would be possible to create a new table for each assay, or else use EAV and put all the assays and their results in one table.  I worked with such an application before I knew the model had a name.  In 1994 we weren&#8217;t convinced that it was the best way to store experimental data.  I switched over to retail in &#8217;96, so I don&#8217;t know whether they have changed the model since then.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg Rahn</title>
		<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/#comment-530</link>
		<dc:creator>Greg Rahn</dc:creator>
		<pubDate>Tue, 24 Mar 2009 20:35:58 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=436#comment-530</guid>
		<description>&lt;a href=&quot;#comment-6787&quot; rel=&quot;nofollow&quot;&gt;@Randolf Geist&lt;/a&gt;
I would comment that a column-wise design should have no benefit when it comes to cardinality estimates.  I would have to think long and hard about changing the design of a table simply for cardinality.  That is why there are hints and SQL profiles.

Also consider this: How would you partition a column-wise pivot table?  How could you possibly index it?  You really cannot.  The row-wise design lends itself very well to partitioning.  One could easily partition by list on the PRODUCT column or even partition by date range/interval on the RECENCY_TS column, or both.

The other point that I did not dive into was loading.  The PL/SQL code that loaded the original pivot table was cursor-loop based and did not scale.  It also did not support having more than one date for any product for a given customer.  At this site I was able to load the row-wise table and generate a matching pivot table orders of magnitude faster than the PL/SQL could load the pivot table.</description>
		<content:encoded><![CDATA[<p><a href="#comment-6787" rel="nofollow">@Randolf Geist</a><br />
I would comment that a column-wise design should have no benefit when it comes to cardinality estimates.  I would have to think long and hard about changing the design of a table simply for cardinality.  That is why there are hints and SQL profiles.</p>
<p>Also consider this: How would you partition a column-wise pivot table?  How could you possibly index it?  You really cannot.  The row-wise design lends itself very well to partitioning.  One could easily partition by list on the PRODUCT column or even partition by date range/interval on the RECENCY_TS column, or both.</p>
<p>The other point that I did not dive into was loading.  The PL/SQL code that loaded the original pivot table was cursor-loop based and did not scale.  It also did not support having more than one date for any product for a given customer.  At this site I was able to load the row-wise table and generate a matching pivot table orders of magnitude faster than the PL/SQL could load the pivot table.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Maggie Nelson</title>
		<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/#comment-529</link>
		<dc:creator>Maggie Nelson</dc:creator>
		<pubDate>Mon, 23 Mar 2009 18:37:53 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=436#comment-529</guid>
		<description>&lt;blockquote&gt;The problem with such designs is that it severely limits the usefulness of that data, as queries that were not known at the time of design often time become problematic.&lt;/blockquote&gt;

Thank you for posting these examples.  When designing databases for clients, our biggest challenge tends to be the client changing their mind about business rules, often and unexpectedly.  Understanding that it happens helps us build databases defensively so they can take the strain of completely different approaches.

Also, it&#039;s great to see an example of pretty complicated business logic represented in more-or-less &quot;standard&quot; SQL so it&#039;s portable between Oracle and other RDBMS.  (Non-Oracle peeps tend to complain that Oracle&#039;s features are a hindrance.)</description>
		<content:encoded><![CDATA[<blockquote><p>The problem with such designs is that it severely limits the usefulness of that data, as queries that were not known at the time of design often time become problematic.</p></blockquote>
<p>Thank you for posting these examples.  When designing databases for clients, our biggest challenge tends to be the client changing their mind about business rules, often and unexpectedly.  Understanding that it happens helps us build databases defensively so they can take the strain of completely different approaches.</p>
<p>Also, it&#8217;s great to see an example of pretty complicated business logic represented in more-or-less &#8220;standard&#8221; SQL so it&#8217;s portable between Oracle and other RDBMS.  (Non-Oracle peeps tend to complain that Oracle&#8217;s features are a hindrance.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Randolf Geist</title>
		<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/#comment-532</link>
		<dc:creator>Randolf Geist</dc:creator>
		<pubDate>Fri, 20 Mar 2009 15:40:46 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=436#comment-532</guid>
		<description>Greg,

I second your points in general, but I think one important aspect of column-wise storage is that you can gather statistics on these columns and therefore potentially get more accurate selectivity/cardinality estimates from the cost-based optimizer.

If your example was part of a larger query then the accuracy of the cardinality estimate can be very important for the overall plan generated.

Although there are cases where you can alleviate this by using dynamic sampling, I don&#039;t think that it&#039;s going to help in this particular case, since you don&#039;t have a single-table predicate when using the pivot approach.

So, as always, it depends, and one has to weigh flexibility and assess the performance of the different approaches for each individual situation.

Regards,
Randolf</description>
		<content:encoded><![CDATA[<p>Greg,</p>
<p>I second your points in general, but I think one important aspect of column-wise storage is that you can gather statistics on these columns and therefore potentially get more accurate selectivity/cardinality estimates from the cost-based optimizer.</p>
<p>If your example was part of a larger query then the accuracy of the cardinality estimate can be very important for the overall plan generated.</p>
<p>Although there are cases where you can alleviate this by using dynamic sampling, I don&#8217;t think that it&#8217;s going to help in this particular case, since you don&#8217;t have a single-table predicate when using the pivot approach.</p>
<p>So, as always, it depends, and one has to weigh flexibility and assess the performance of the different approaches for each individual situation.</p>
<p>Regards,<br />
Randolf</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: coskan</title>
		<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/#comment-531</link>
		<dc:creator>coskan</dc:creator>
		<pubDate>Fri, 20 Mar 2009 10:30:08 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=436#comment-531</guid>
		<description>Thank you for this brilliant explanation of how to design a table and how to query it depending on the design.</description>
		<content:encoded><![CDATA[<p>Thank you for this brilliant explanation of how to design a table and how to query it depending on the design.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Oracle Infogram: Exadata, Business, ACM, Design, PeopleSoft, Scripts, Performance, Security</title>
		<link>http://structureddata.org/2009/03/19/the-impact-of-good-table-and-query-design/#comment-533</link>
		<dc:creator>Oracle Infogram: Exadata, Business, ACM, Design, PeopleSoft, Scripts, Performance, Security</dc:creator>
		<pubDate>Thu, 19 Mar 2009 21:26:45 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=436#comment-533</guid>
		<description>[...] business needs so a computer can understand what you want and give it to you, gift-wrapped. The posting discusses some of the many factors involved in the process of design. PeopleSoft A link to the links. Some valuable resources on editing PeopleCode.&#160;at [...]</description>
		<content:encoded><![CDATA[<p>[...] business needs so a computer can understand what you want and give it to you, gift-wrapped. The posting discusses some of the many factors involved in the process of design. PeopleSoft A link to the links. Some valuable resources on editing PeopleCode.&#160;at [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>

