<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Partway Researched With A Chance Of FUD</title>
	<atom:link href="http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/feed/" rel="self" type="application/rss+xml" />
	<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=partway-researched-with-a-chance-of-fud</link>
	<description>Data, Databases, Performance &#38; Scalability</description>
	<lastBuildDate>Mon, 30 Jan 2012 17:05:12 -0500</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Kevin Closson</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11084</link>
		<dc:creator>Kevin Closson</dc:creator>
		<pubDate>Wed, 13 Jan 2010 20:13:03 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11084</guid>
		<description>&lt;a href=&quot;#comment-11080&quot; rel=&quot;nofollow&quot;&gt;@Barry Zane&lt;/a&gt; 

Barry and company stated that our knowledge (not belief) of how 3.5-inch 600GB 15K RPM SAS drives perform was in error. Barry further said:

&quot;If we see in the neighborhood of 125MB/sec from the newer drives, I’m buying the beer. :)&quot;

Barry, I feel you got a Get Out of Jail Free card, and you do owe me a beer. You folks should not have closed down responses on the Partway There post, especially after admitting that testing had finally proven to you what everyone else already knew.

Bad mojo. But I still want my beer.






The views expressed in this comment are my own and do not necessarily reflect the views of Oracle. The views and opinions expressed by others on this comment thread are theirs, not mine.</description>
		<content:encoded><![CDATA[<p><a href="#comment-11080" rel="nofollow">@Barry Zane</a> </p>
<p>Barry and company stated that our knowledge (not belief) of how 3.5-inch 600GB 15K RPM SAS drives perform was in error. Barry further said:</p>
<p>&#8220;If we see in the neighborhood of 125MB/sec from the newer drives, I’m buying the beer. :)&#8221;</p>
<p>Barry, I feel you got a Get Out of Jail Free card, and you do owe me a beer. You folks should not have closed down responses on the Partway There post, especially after admitting that testing had finally proven to you what everyone else already knew.</p>
<p>Bad mojo. But I still want my beer.</p>
<p>The views expressed in this comment are my own and do not necessarily reflect the views of Oracle. The views and opinions expressed by others on this comment thread are theirs, not mine.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg Rahn</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11082</link>
		<dc:creator>Greg Rahn</dc:creator>
		<pubDate>Wed, 13 Jan 2010 06:41:03 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11082</guid>
		<description>@Barry

Are you referring to the point of this blog post?  If so, it baffles me that you don&#039;t (or choose not to) follow my point, but let me spell it out for you as simply as I can: Rick made and you supported incorrect assertions, pushed them as fact, without any research or data.  If you don&#039;t have either experience or data on something, you certainly should not be blogging on it unless you really have a desire for zero credibility.  Take the time, do the research, get some numbers from the lab and then speak/blog intelligently.

You specifically mention:&lt;blockquote&gt;&quot;We see higher numbers [than 75MB/s] in disk tests, but never anywhere near 125MB/sec.&quot;&lt;/blockquote&gt;

Perhaps you would like to elaborate on how ParAccel runs their disk tests (and on what hardware).  To demonstrate how poorly researched your comment is, I&#039;ve run a simple disk test (output below) on a drive in an Exadata V1 HP ProLiant DL180 G5 server, and that drive sustains over 170MB/s, almost 100MB/s more than you have claimed to have observed in tests!!!  So it really raises the question: Why are your numbers so off base?  Granted, application performance won&#039;t achieve what a micro benchmark can, but still...  It&#039;s simply a matter of physics, not fiction.

There really is nothing interesting in that &quot;white paper&quot; from Teradata other than the amount of rubbish and FUD it contains.  Papers are written by people, not companies, so the author&#039;s experience and knowledge have little to nothing to do with how many dollars of hardware Teradata has sold.   Have you any idea of the I/O rates a Teradata system gets per drive?  They&#039;d be lucky to get 20MB/s with their I/O patterns (around 64K and random).  Curt Monash reports &quot;&lt;strong&gt;15 MB/second on their fastest disks&lt;/strong&gt;&quot; in &lt;a href=&quot;http://www.dbms2.com/2009/10/25/teradata-hardware-strategy-and-tactics/&quot; rel=&quot;nofollow&quot;&gt;this post&lt;/a&gt;.  Why do you think that the Teradata 5555 system has 100 146GB (and maybe now the 300GB) 15K RPM FCAL drives per node?  (A 3/5 clique has 3 two-socket quad-core Harpertown nodes [plus 1 spare] with 5 arrays, each with 60 HDDs in it.)  Perhaps it would now seem obvious why Teradata is excited about SSD.  With I/O patterns like that I would be too!

I digress...I&#039;ve probably done more than enough technical research for you at this point... 

 
&lt;pre&gt;
[root@exadata-v1 ~]# dd if=/dev/cciss/c0d11 of=/dev/null bs=1M iflag=direct count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB) copied, 302.569 seconds, 173 MB/s

[root@exadata-v1 ~]# collectl -sD &#124; grep c0d11
# DISK STATISTICS (/sec)
#           Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
cciss/c0d11 170604      0 1499  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 173180      0 1522  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 172684      0 1518  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 171488      0 1507  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 172156      0 1513  114       0      0    0    0     113     5     3      0   98
cciss/c0d11 172932      0 1520  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 172700      0 1517  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 170512      0 1499  114       0      0    0    0     113     5     3      0   98
cciss/c0d11 172776      0 1518  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 173584      0 1526  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 170356      0 1497  114       0      0    0    0     113     5     3      0   99
&lt;/pre&gt;</description>
		<content:encoded><![CDATA[<p>@Barry</p>
<p>Are you referring to the point of this blog post?  If so, it baffles me that you don&#8217;t (or choose not to) follow my point, but let me spell it out for you as simply as I can: Rick made and you supported incorrect assertions, pushed them as fact, without any research or data.  If you don&#8217;t have either experience or data on something, you certainly should not be blogging on it unless you really have a desire for zero credibility.  Take the time, do the research, get some numbers from the lab and then speak/blog intelligently.</p>
<p>You specifically mention:</p>
<blockquote><p>&#8220;We see higher numbers [than 75MB/s] in disk tests, but never anywhere near 125MB/sec.&#8221;</p></blockquote>
<p>Perhaps you would like to elaborate on how ParAccel runs their disk tests (and on what hardware).  To demonstrate how poorly researched your comment is, I&#8217;ve run a simple disk test (output below) on a drive in an Exadata V1 HP ProLiant DL180 G5 server, and that drive sustains over 170MB/s, almost 100MB/s more than you have claimed to have observed in tests!!!  So it really raises the question: Why are your numbers so off base?  Granted, application performance won&#8217;t achieve what a micro benchmark can, but still&#8230;  It&#8217;s simply a matter of physics, not fiction.</p>
<p>There really is nothing interesting in that &#8220;white paper&#8221; from Teradata other than the amount of rubbish and FUD it contains.  Papers are written by people, not companies, so the author&#8217;s experience and knowledge have little to nothing to do with how many dollars of hardware Teradata has sold.   Have you any idea of the I/O rates a Teradata system gets per drive?  They&#8217;d be lucky to get 20MB/s with their I/O patterns (around 64K and random).  Curt Monash reports &#8220;<strong>15 MB/second on their fastest disks</strong>&#8221; in <a href="http://www.dbms2.com/2009/10/25/teradata-hardware-strategy-and-tactics/" rel="nofollow">this post</a>.  Why do you think that the Teradata 5555 system has 100 146GB (and maybe now the 300GB) 15K RPM FCAL drives per node?  (A 3/5 clique has 3 two-socket quad-core Harpertown nodes [plus 1 spare] with 5 arrays, each with 60 HDDs in it.)  Perhaps it would now seem obvious why Teradata is excited about SSD.  With I/O patterns like that I would be too!</p>
<p>I digress&#8230;I&#8217;ve probably done more than enough technical research for you at this point&#8230; </p>
<pre>
[root@exadata-v1 ~]# dd if=/dev/cciss/c0d11 of=/dev/null bs=1M iflag=direct count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB) copied, 302.569 seconds, 173 MB/s

[root@exadata-v1 ~]# collectl -sD | grep c0d11
# DISK STATISTICS (/sec)
#           Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
cciss/c0d11 170604      0 1499  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 173180      0 1522  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 172684      0 1518  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 171488      0 1507  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 172156      0 1513  114       0      0    0    0     113     5     3      0   98
cciss/c0d11 172932      0 1520  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 172700      0 1517  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 170512      0 1499  114       0      0    0    0     113     5     3      0   98
cciss/c0d11 172776      0 1518  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 173584      0 1526  114       0      0    0    0     113     5     3      0   99
cciss/c0d11 170356      0 1497  114       0      0    0    0     113     5     3      0   99
</pre>
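<p>As a rough cross-check of the output above, the sketch below recomputes the two headline numbers from figures copied out of the <code>dd</code> and <code>collectl</code> lines. The function names are made up for illustration, and it assumes the <code>KBytes</code> and <code>IOs</code> columns are per-second read counters for the same interval.</p>

```python
# Figures copied from the dd and collectl output in this comment.
BYTES_COPIED = 52_428_800_000   # dd: total bytes read
ELAPSED_S = 302.569             # dd: elapsed wall-clock seconds


def dd_rate_mb_s(nbytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second (1 MB = 10**6 bytes)."""
    return nbytes / seconds / 1_000_000


def avg_io_size_kb(kbytes_per_s: int, ios_per_s: int) -> float:
    """Average size of one read, from the KBytes and IOs columns."""
    return kbytes_per_s / ios_per_s


rate = dd_rate_mb_s(BYTES_COPIED, ELAPSED_S)
# First collectl sample: 170604 KBytes/s across 1499 read I/Os per second.
first_sample = avg_io_size_kb(170604, 1499)

print(f"dd rate: {rate:.1f} MB/s")
print(f"average read size: {first_sample:.1f} KB")
```

<p>Both results line up with what the tools printed: the rate rounds to the 173 MB/s that <code>dd</code> reported, and the average read size matches the 114 shown in the <code>Size</code> column.</p>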
]]></content:encoded>
	</item>
	<item>
		<title>By: Barry Zane</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11080</link>
		<dc:creator>Barry Zane</dc:creator>
		<pubDate>Tue, 12 Jan 2010 22:57:52 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11080</guid>
		<description>Greg, I’m not sure I follow what your point is. As described in the full response, for any database query that is disk-bound, faster disks will improve performance proportionally for any database. So, if one database system is 10X faster than another database system, then it is still 10X faster if each leverages the continuous improvements made in hardware (e.g. disks, CPUs, controllers, etc). I’m not naming names, though. 

Interesting that Teradata’s white paper also quotes 80 MB/sec. This is a company that has sold billions of dollars’ worth of hardware systems with disk drives. It reinforces that their experience matches my experience, though those figures are now eclipsed by the newer drives. Additionally, it wouldn’t be in their interest to downplay the performance of the drives they rely on either. We should all thank Seagate and the other drive vendors for their improvements. These are the class of drives our customers deploy.

However, it is certainly unclear what he means by “saturate”. Certainly, if a drive delivers 125MB/sec and there are 5 concurrent requests, then each request will get 25MB/sec, more or less. On the other hand, combining that with the point of 8MB blocks is a little confusing unless he’s saying that big reads allow the percentage of time spent seeking to be ignored, which is true.

Thanks for posting the full link. I invite any reader to go to it.  Good, bad or indifferent, you probably should have included a link to the Teradata document to allow the reader to draw their own conclusions - http://www.teradata.com/t/assets/0/206/276/87e1747c-7ccf-4be3-b812-1dba03dce5d7.pdf</description>
		<content:encoded><![CDATA[<p>Greg, I’m not sure I follow what your point is. As described in the full response, for any database query that is disk-bound, faster disks will improve performance proportionally for any database. So, if one database system is 10X faster than another database system, then it is still 10X faster if each leverages the continuous improvements made in hardware (e.g. disks, CPUs, controllers, etc). I’m not naming names, though. </p>
<p>Interesting that Teradata’s white paper also quotes 80 MB/sec. This is a company that has sold billions of dollars’ worth of hardware systems with disk drives. It reinforces that their experience matches my experience, though those figures are now eclipsed by the newer drives. Additionally, it wouldn’t be in their interest to downplay the performance of the drives they rely on either. We should all thank Seagate and the other drive vendors for their improvements. These are the class of drives our customers deploy.</p>
<p>However, it is certainly unclear what he means by “saturate”. Certainly, if a drive delivers 125MB/sec and there are 5 concurrent requests, then each request will get 25MB/sec, more or less. On the other hand, combining that with the point of 8MB blocks is a little confusing unless he’s saying that big reads allow the percentage of time spent seeking to be ignored, which is true.</p>
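<p>The point about big reads hiding seek time can be sketched with a simple model: each read pays a fixed positioning cost (seek plus rotational delay) and then transfers at the media rate. The 125MB/sec media rate is the figure from this thread; the 5.5 ms positioning cost is purely an assumed round number for illustration, not a measurement of any drive discussed here.</p>

```python
def effective_mb_s(io_size_mb: float, media_rate_mb_s: float,
                   positioning_s: float) -> float:
    """Delivered throughput when every read pays one positioning delay."""
    transfer_s = io_size_mb / media_rate_mb_s
    return io_size_mb / (positioning_s + transfer_s)


MEDIA_RATE_MB_S = 125.0   # figure quoted in this thread
POSITIONING_S = 0.0055    # ASSUMED seek + rotational delay, illustrative only

# Small random reads (about 64K) versus large 8MB reads.
small = effective_mb_s(0.064, MEDIA_RATE_MB_S, POSITIONING_S)
large = effective_mb_s(8.0, MEDIA_RATE_MB_S, POSITIONING_S)

print(f"64K reads: {small:.1f} MB/s")
print(f"8MB reads: {large:.1f} MB/s")
```

<p>Under these assumptions the small reads spend most of their time positioning and deliver only a small fraction of the media rate, while the 8MB reads get within roughly ten percent of it.</p>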
<p>Thanks for posting the full link. I invite any reader to go to it.  Good, bad or indifferent, you probably should have included a link to the Teradata document to allow the reader to draw their own conclusions &#8211; <a href="http://www.teradata.com/t/assets/0/206/276/87e1747c-7ccf-4be3-b812-1dba03dce5d7.pdf" rel="nofollow">http://www.teradata.com/t/assets/0/206/276/87e1747c-7ccf-4be3-b812-1dba03dce5d7.pdf</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: chet</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11073</link>
		<dc:creator>chet</dc:creator>
		<pubDate>Thu, 07 Jan 2010 03:25:38 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11073</guid>
		<description>Yes Obi Wan.  :)</description>
		<content:encoded><![CDATA[<p>Yes Obi Wan.  :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg Rahn</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11072</link>
		<dc:creator>Greg Rahn</dc:creator>
		<pubDate>Wed, 06 Jan 2010 06:12:58 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11072</guid>
		<description>@Chet  “Always two there are, a master and an apprentice.” -Yoda</description>
		<content:encoded><![CDATA[<p>@Chet  “Always two there are, a master and an apprentice.” -Yoda</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: chet</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11071</link>
		<dc:creator>chet</dc:creator>
		<pubDate>Wed, 06 Jan 2010 02:22:46 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11071</guid>
		<description>can I do an internship with you or something?  do you have an opening as a mentor?

Now you&#039;re going to make me learn hardware.  Like I don&#039;t have enough to do?  :)</description>
		<content:encoded><![CDATA[<p>can I do an internship with you or something?  do you have an opening as a mentor?</p>
<p>Now you&#8217;re going to make me learn hardware.  Like I don&#8217;t have enough to do?  :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg Rahn</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11070</link>
		<dc:creator>Greg Rahn</dc:creator>
		<pubDate>Mon, 04 Jan 2010 21:46:48 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11070</guid>
		<description>@Chet
Unfortunately in the data warehousing arena, it seems common (and seemingly acceptable) to make up some FUD and sling it at some other vendor.  Much of the stuff is just so silly, it isn&#039;t worth an engineer&#039;s time to blast holes in it.  Take for instance these statements from a &quot;white paper&quot; authored by Richard Burns, Senior Consultant at Teradata that was published around this time last year relating to Exadata V1:
&lt;blockquote&gt;
The enterprise class SAS disks used by Exadata, rotating at 15K RPM, are capable of delivering data to requestors at about 80 MBps. To maximize I/O throughput, Exadata reads data off disk in large chunks. Exadata defaults to 8MB data blocks, with an option for 4MB blocks. At an 8MB block size, ten concurrent I/Os saturate a drive, even without allowances for seek time.
&lt;/blockquote&gt;
Out of those four sentences, three of them contain inaccurate (wrong) assertions.  The most entertaining bit is the last sentence.  Richard Burns apparently thinks that the throughput capacity of a HDD (in MB/s) divided by the I/O size results in the number of concurrent I/Os...but he would be very wrong.  This is, however, some very creative math.  I guess that is the difference between a consultant and an engineer.  Engineers usually write papers for SIGMOD, VLDB, or similar and consultants write &quot;white paper&quot; FUD.</description>
		<content:encoded><![CDATA[<p>@Chet<br />
Unfortunately in the data warehousing arena, it seems common (and seemingly acceptable) to make up some FUD and sling it at some other vendor.  Much of the stuff is just so silly, it isn&#8217;t worth an engineer&#8217;s time to blast holes in it.  Take for instance these statements from a &#8220;white paper&#8221; authored by Richard Burns, Senior Consultant at Teradata that was published around this time last year relating to Exadata V1:</p>
<blockquote><p>
The enterprise class SAS disks used by Exadata, rotating at 15K RPM, are capable of delivering data to requestors at about 80 MBps. To maximize I/O throughput, Exadata reads data off disk in large chunks. Exadata defaults to 8MB data blocks, with an option for 4MB blocks. At an 8MB block size, ten concurrent I/Os saturate a drive, even without allowances for seek time.
</p></blockquote>
<p>Out of those four sentences, three of them contain inaccurate (wrong) assertions.  The most entertaining bit is the last sentence.  Richard Burns apparently thinks that the throughput capacity of a HDD (in MB/s) divided by the I/O size results in the number of concurrent I/Os&#8230;but he would be very wrong.  This is, however, some very creative math.  I guess that is the difference between a consultant and an engineer.  Engineers usually write papers for SIGMOD, VLDB, or similar and consultants write &#8220;white paper&#8221; FUD.</p>
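<p>One way to read the objection to that last sentence, sketched under the paper&#8217;s own figures (80 MBps, 8MB I/Os, no seek time): dividing throughput by I/O size gives a rate in I/Os per second, not a count of concurrent I/Os. The model below is deliberately idealized and the function names are hypothetical; it assumes the drive services one request at a time and never idles while at least one request is queued.</p>

```python
DRIVE_RATE_MB_S = 80.0   # figure quoted in the white paper
IO_SIZE_MB = 8.0         # figure quoted in the white paper

# Throughput divided by I/O size is a RATE: I/Os completed per second.
ios_per_second = DRIVE_RATE_MB_S / IO_SIZE_MB
# Time to transfer a single I/O at the full media rate.
transfer_s = IO_SIZE_MB / DRIVE_RATE_MB_S


def aggregate_mb_s(outstanding: int) -> float:
    """Delivered throughput with `outstanding` I/Os kept queued (idealized)."""
    return DRIVE_RATE_MB_S if outstanding >= 1 else 0.0


def completion_latency_s(outstanding: int) -> float:
    """Time for the last of `outstanding` queued I/Os to finish."""
    return outstanding * transfer_s


for n in (1, 5, 10):
    print(n, aggregate_mb_s(n), completion_latency_s(n))
```

<p>In this model a single outstanding 8MB I/O already keeps the drive at 80 MBps; queuing ten of them delivers the same aggregate throughput and only lengthens the wait for the last one.</p>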
]]></content:encoded>
	</item>
	<item>
		<title>By: chet</title>
		<link>http://structureddata.org/2010/01/04/partway-researched-with-a-chance-of-fud/#comment-11069</link>
		<dc:creator>chet</dc:creator>
		<pubDate>Mon, 04 Jan 2010 14:28:19 +0000</pubDate>
		<guid isPermaLink="false">http://structureddata.org/?p=708#comment-11069</guid>
		<description>Since I have zero expertise on the top 90% of your post, I won&#039;t say anything.  :)

&lt;b&gt;And Isn&#039;t It Ironic...&lt;/b&gt;

I&#039;m not sure why people don&#039;t get this yet.  If you do offer an opinion on something you should clearly state, IANAExpert or something along those lines.

I&#039;ve never had a problem with not knowing something, but I tend to keep my mouth shut about it.

BTW, I think you should do more &quot;editorial&quot; type pieces...very well done.</description>
		<content:encoded><![CDATA[<p>Since I have zero expertise on the top 90% of your post, I won&#8217;t say anything.  :)</p>
<p><b>And Isn&#8217;t It Ironic&#8230;</b></p>
<p>I&#8217;m not sure why people don&#8217;t get this yet.  If you do offer an opinion on something you should clearly state, IANAExpert or something along those lines.</p>
<p>I&#8217;ve never had a problem with not knowing something, but I tend to keep my mouth shut about it.</p>
<p>BTW, I think you should do more &#8220;editorial&#8221; type pieces&#8230;very well done.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

