<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Sanity Check / Cluster Validation documents? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94941#M8199</link>
    <description>&lt;P&gt;With the latest version of the tests jar, it is better to specify the path for the test result file explicitly with the -resFile parameter, as shown below. Please note that this is a path on the local filesystem (not in HDFS).&lt;/P&gt;&lt;P&gt;hadoop jar hadoop-mapreduce-client-jobclient-2.7.1.2.3.4.0-3485-tests.jar TestDFSIO -read -nrFiles 5 -fileSize 1000 -resFile /tmp/TestDFSIO_results.log&lt;/P&gt;</description>
    <pubDate>Mon, 22 Feb 2016 11:44:10 GMT</pubDate>
    <dc:creator>hduraiswamy</dc:creator>
    <dc:date>2016-02-22T11:44:10Z</dc:date>
    <item>
      <title>Sanity Check / Cluster Validation documents?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94938#M8196</link>
      <description>&lt;P&gt;Do we have any publicly consumable documents for "Sanity Checking" a cluster? Aside from running the service checks and ensuring all services start and stop properly, are there any other tests that are run in the field to validate that the cluster is running acceptably?&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 06 Oct 2015 05:27:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94938#M8196</guid>
      <dc:creator>kbaxley</dc:creator>
      <dc:date>2015-10-06T05:27:20Z</dc:date>
    </item>
    <item>
      <title>Re: Sanity Check / Cluster Validation documents?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94939#M8197</link>
      <description>&lt;P&gt;Hey Kent,&lt;/P&gt;&lt;P&gt;I've always been a fan of running TestDFSIO (to stress the spindles) like the following:&lt;/P&gt;&lt;P&gt;hadoop jar /usr/lib/hadoop/hadoop-*test*.jar TestDFSIO -write -nrFiles 64 -fileSize 16GB&lt;/P&gt;&lt;P&gt;Follow that up with a TeraGen/TeraSort job: first create the data using TeraGen, then run TeraSort (a MapReduce job) on the generated data set.&lt;/P&gt;&lt;P&gt;hadoop jar hadoop-*examples*.jar teragen 10000000000 /user/hduser/terasort-input&lt;/P&gt;&lt;P&gt;hadoop jar hadoop-*examples*.jar terasort /user/hduser/terasort-input /user/hduser/terasort-output&lt;/P&gt;&lt;P&gt;Good documentation on this topic is located here:&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/#terasort-benchmark-suite" target="_blank"&gt;http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/#terasort-benchmark-suite&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 06 Oct 2015 06:32:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94939#M8197</guid>
      <dc:creator>drice1</dc:creator>
      <dc:date>2015-10-06T06:32:50Z</dc:date>
    </item>
    <item>
      <title>Re: Sanity Check / Cluster Validation documents?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94940#M8198</link>
      <description>&lt;P&gt;*** NameNode Exercise ***&lt;/P&gt;&lt;P&gt;Log in as the hdfs user.&lt;/P&gt;&lt;P&gt;*** TestDFSIO Write Test ***&lt;/P&gt;&lt;P&gt;# The -fileSize argument is, by default, in units of MB. This should write about 10 GB of files (100 files of 100 MB each).&lt;/P&gt;&lt;P&gt;hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-*tests.jar TestDFSIO -write -nrFiles 100 -fileSize 100&lt;/P&gt;&lt;P&gt;*** TestDFSIO Read Test ***&lt;/P&gt;&lt;P&gt;hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-*tests.jar TestDFSIO -read -nrFiles 100 -fileSize 100&lt;/P&gt;&lt;P&gt;*** TestDFSIO Cleanup ***&lt;/P&gt;&lt;P&gt;hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-*tests.jar TestDFSIO -clean&lt;/P&gt;&lt;P&gt;*** Read/Write Data ***&lt;/P&gt;&lt;P&gt;*** TeraGen ***&lt;/P&gt;&lt;P&gt;hdfs dfs -mkdir /benchmarks&lt;/P&gt;&lt;P&gt;hdfs dfs -mkdir /benchmarks/terasort&lt;/P&gt;&lt;P&gt;# This will generate 1,000,000 100-byte records (about 100 MB) as input for TeraSort.&lt;/P&gt;&lt;P&gt;hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen 1000000 /benchmarks/terasort/terasort-input&lt;/P&gt;&lt;P&gt;*** TeraSort ***&lt;/P&gt;&lt;P&gt;# Sort the 1,000,000 records generated by TeraGen.&lt;/P&gt;&lt;P&gt;hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort /benchmarks/terasort/terasort-input /benchmarks/terasort/terasort-output&lt;/P&gt;&lt;P&gt;*** TeraValidate ***&lt;/P&gt;&lt;P&gt;# Validate that the sort was successful and correct.&lt;/P&gt;&lt;P&gt;hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teravalidate /benchmarks/terasort/terasort-output /benchmarks/terasort/teravalidate-output&lt;/P&gt;</description>
      <pubDate>Tue, 06 Oct 2015 19:49:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94940#M8198</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2015-10-06T19:49:38Z</dc:date>
    </item>
    <item>
      <title>Re: Sanity Check / Cluster Validation documents?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94941#M8199</link>
      <description>&lt;P&gt;With the latest version of the tests jar, it is better to specify the path for the test result file explicitly with the -resFile parameter, as shown below. Please note that this is a path on the local filesystem (not in HDFS).&lt;/P&gt;&lt;P&gt;hadoop jar hadoop-mapreduce-client-jobclient-2.7.1.2.3.4.0-3485-tests.jar TestDFSIO -read -nrFiles 5 -fileSize 1000 -resFile /tmp/TestDFSIO_results.log&lt;/P&gt;</description>
      <pubDate>Mon, 22 Feb 2016 11:44:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94941#M8199</guid>
      <dc:creator>hduraiswamy</dc:creator>
      <dc:date>2016-02-22T11:44:10Z</dc:date>
    </item>
    <item>
      <title>Re: Sanity Check / Cluster Validation documents?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94942#M8200</link>
      <description>&lt;P&gt;I haven't seen a full document that covers sanity checking the entire cluster. This is often performed by the PS team at customer engagements. Side note: the individual component test I rely on most to smoke test a cluster is &lt;A href="https://github.com/hortonworks/hive-testbench"&gt;Hive-TestBench&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Wed, 25 May 2016 17:40:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sanity-Check-Cluster-Validation-documents/m-p/94942#M8200</guid>
      <dc:creator>wfloyd</dc:creator>
      <dc:date>2016-05-25T17:40:38Z</dc:date>
    </item>
  </channel>
</rss>

