<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: SAN vs DAS(JBOD) on data node in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131502#M94186</link>
    <description>&lt;P&gt;That said, I have heard the argument that, over time, the cost of replacing and managing DAS disks under 3x replication makes SAN cheaper from a TCO perspective.&lt;/P&gt;</description>
    <pubDate>Tue, 09 Feb 2016 02:46:45 GMT</pubDate>
    <dc:creator>amcbarnett</dc:creator>
    <dc:date>2016-02-09T02:46:45Z</dc:date>
    <item>
      <title>SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131497#M94181</link>
      <description>&lt;P&gt;I am looking to deepen my understanding of the type of storage used for data nodes. Outside of the single point of failure (the SAN box goes down), are there any other reasons not to use SAN storage on data nodes? Can spindles be dedicated on a SAN? Is that even possible? How does performance compare between SAN and DAS (direct-attached storage)? Any insights you can share would be appreciated.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 02:15:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131497#M94181</guid>
      <dc:creator>sunile_manjee</dc:creator>
      <dc:date>2016-02-09T02:15:18Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131498#M94182</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt;&lt;/P&gt;&lt;P&gt; SAN is a poor fit for Hadoop; go with direct-attached storage or Isilon NAS. Because a SAN is a shared pool of storage, it suffers from the busy-neighbor problem. It is also outside the NameNode's control, so blocks can move around and lose data locality, and latency can be an issue. Finally, direct-attached storage gets its redundancy from many disks, so you can tolerate failures by having more disks. A quick search led to this: &lt;A href="http://hortonworks.com/blog/thinking-about-the-hdfs-vs-other-storage-technologies/" target="_blank"&gt;http://hortonworks.com/blog/thinking-about-the-hdfs-vs-other-storage-technologies/&lt;/A&gt; and this: &lt;A href="http://www.infoworld.com/article/2609694/application-development/never--ever-do-this-to-hadoop.html" target="_blank"&gt;http://www.infoworld.com/article/2609694/application-development/never--ever-do-this-to-hadoop.html&lt;/A&gt;. Here's our official doc: &lt;A href="http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_cluster-planning-guide/content/hardware-for-slave.1.html" target="_blank"&gt;http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_cluster-planning-guide/content/hardware-for-slave.1.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;One more thing to mention is that the busy-neighbor problem also works in reverse: you will affect the other applications running on your SAN.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 02:21:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131498#M94182</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-09T02:21:50Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131499#M94183</link>
      <description>&lt;P&gt;Hadoop is a shared-nothing architecture. SAN storage usually goes against the grain for distributed storage in a distributed compute environment.
 
The only central storage we support so far is Isilon, because we did some joint engineering with them. Even then, DAS has its advantages (as well as disadvantages, mainly because of 3x replication).&lt;/P&gt;&lt;P&gt;The main issue is with the compute nodes where YARN spins up containers: if the data sits on separate SAN disks, every query or access has to go over the network, and I/O is no longer distributed across the spindles on the storage nodes. That not only increases access time, it also introduces more points of failure through switches and creates additional potential for bottlenecks.&lt;/P&gt;&lt;P&gt;
Normally I would have compromised a bit on master nodes, but I just came from a client who ran the master nodes on VMs backed by SAN. Performance started out great, but once multiple users came on board and the master nodes needed to handle more blocks, performance tanked. We wasted a week and a half moving the master components to physical nodes on a cluster with data. Painful.&lt;/P&gt;&lt;P&gt;
See a good discussion here: &lt;A href="http://searchstorage.techtarget.com/video/Understanding-storage-in-the-Hadoop-cluster" target="_blank"&gt;http://searchstorage.techtarget.com/video/Understanding-storage-in-the-Hadoop-cluster&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 02:25:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131499#M94183</guid>
      <dc:creator>amcbarnett</dc:creator>
      <dc:date>2016-02-09T02:25:16Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131500#M94184</link>
      <description>&lt;P&gt;While it is possible (and makes sense in some cases) to use SAN for Master Nodes I would strongly encourage you not to do this with Datanodes. Use bare metal machines with directly attached storage for Datanodes to optimize throughput and performance. &lt;/P&gt;&lt;P&gt;We have seen some very poor performance in environments where the Datanodes used SAN.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 02:25:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131500#M94184</guid>
      <dc:creator>jstraub</dc:creator>
      <dc:date>2016-02-09T02:25:48Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131501#M94185</link>
      <description>&lt;P&gt;Compared with typical disk, SAN-backed disk would be cost prohibitive. &lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 02:32:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131501#M94185</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-09T02:32:33Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131502#M94186</link>
      <description>&lt;P&gt;That said, I have heard the argument that, over time, the cost of replacing and managing DAS disks under 3x replication makes SAN cheaper from a TCO perspective.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 02:46:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131502#M94186</guid>
      <dc:creator>amcbarnett</dc:creator>
      <dc:date>2016-02-09T02:46:45Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131503#M94187</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/15332/san-vs-dasjbod-on-data-node.html#"&gt;@Sunile Manjee&lt;/A&gt; Though I have no personal experience with them, there are companies like &lt;A href="http://www.bluedata.com/"&gt;BlueData&lt;/A&gt; that abstract the storage component and provide an interesting private cloud experience based on containers. An interesting read on this subject is a book by Google called &lt;A href="http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y201306CAC024"&gt;Datacenter as a Computer&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 05:05:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131503#M94187</guid>
      <dc:creator>SQLShaw</dc:creator>
      <dc:date>2016-02-09T05:05:13Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131504#M94188</link>
      <description>&lt;P&gt;This answerhub thread is an example of how AWESOME answerhub is.  Thanks all for great great great info.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 11:08:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131504#M94188</guid>
      <dc:creator>sunile_manjee</dc:creator>
      <dc:date>2016-02-09T11:08:19Z</dc:date>
    </item>
    <item>
      <title>Re: SAN vs DAS(JBOD) on data node</title>
      <link>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131505#M94189</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt; I think Ancil's answer is the best one &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt; You are the judge. &lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 11:41:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/SAN-vs-DAS-JBOD-on-data-node/m-p/131505#M94189</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-09T11:41:58Z</dc:date>
    </item>
  </channel>
</rss>

