Sanity Check / Cluster Validation documents?

kbaxley — Tue, 06 Oct 2015 05:27:20 GMT

Do we have any publicly consumable documents for "Sanity Checking" a cluster? Aside from running the service checks and ensuring all services start and stop properly, are there any other tests that are run in the field to validate that the cluster is running acceptably?

Thanks!

Re: Sanity Check / Cluster Validation documents?

drice1 — Tue, 06 Oct 2015 06:32:50 GMT

Hey Kent,

I've always been a fan of running TestDFSIO (to stress the spindles), like the following:

hadoop jar /usr/lib/hadoop/hadoop-*test*.jar TestDFSIO -write -nrFiles 64 -fileSize 16GB

Follow that up with a TeraGen/TeraSort job: first create the data using TeraGen, then run TeraSort (a MapReduce job) on the generated data set.

hadoop jar hadoop-*examples*.jar teragen 10000000000 /user/hduser/terasort-input

hadoop jar hadoop-*examples*.jar terasort /user/hduser/terasort-input /user/hduser/terasort-output
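For sizing the run: TeraGen rows are 100 bytes each, so the 10000000000 rows above come to roughly 1 TB of input. A minimal shell sketch of that arithmetic follows; the function names are hypothetical helpers, not part of Hadoop.

```shell
# Hypothetical helpers (not part of Hadoop) for sizing a TeraGen run.

# TeraGen writes fixed-size 100-byte rows, so the input size is rows * 100.
teragen_bytes() {
  local rows="$1"
  echo $(( rows * 100 ))
}

# Rows needed to generate the requested number of gigabytes (1 GB = 10^9 bytes here).
teragen_rows_for_gb() {
  local gb="$1"
  echo $(( gb * 1000000000 / 100 ))
}
```

For a quick smoke test on a small cluster, something like `teragen_rows_for_gb 10` gives a row count that is easier to fit on disk than the full 1 TB.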

Good documentation on this topic is located here:

http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/#terasort-benchmark-suite

Re: Sanity Check / Cluster Validation documents?

nsabharwal — Tue, 06 Oct 2015 19:49:38 GMT

*** NameNode Exercise ***

*** Log in as the hdfs user ***

*** TestDFSIO Write Test ***

# The -fileSize argument is, by default, in units of MB. This should write 10 GB of files

hadoop jar /usr/hdp/current/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-*tests.jar TestDFSIO -write -nrFiles 100 -fileSize 100

*** TestDFSIO Read Test ***

hadoop jar /usr/hdp/current/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-*tests.jar TestDFSIO -read -nrFiles 100 -fileSize 100

*** TestDFSIO Cleanup ***

hadoop jar /usr/hdp/current/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-*tests.jar TestDFSIO -clean

*** Read/Write Data ***

*** TeraGen ***

hdfs dfs -mkdir /benchmarks

hdfs dfs -mkdir /benchmarks/terasort

# This will generate 1,000,000 100-byte records as input for TeraSort

hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen 1000000 /benchmarks/terasort/terasort-input

*** TeraSort ***

# Sort the 1,000,000 records generated by TeraGen

hadoop jar /usr/hdp/current/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /benchmarks/terasort/terasort-input /benchmarks/terasort/terasort-output

*** TeraValidate ***

# Validate that the sort was successful and correct

hadoop jar /usr/hdp/current/hadoop-mapreduce/hadoop-mapreduce-examples.jar teravalidate /benchmarks/terasort/terasort-output /benchmarks/terasort/teravalidate-output
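If you rerun these steps, note that MapReduce jobs fail when their output directory already exists, so old directories have to be removed first. A rough sketch of chaining the three steps in one shell function is below. The function name and the HADOOP, HDFS and EXAMPLES_JAR variables are my own assumptions for illustration; point EXAMPLES_JAR at wherever the examples jar lives on your install.

```shell
# Hypothetical wrapper: run TeraGen, TeraSort and TeraValidate in order,
# stopping at the first step that fails.
# HADOOP, HDFS and EXAMPLES_JAR are assumptions; override them to match your install.
HADOOP="${HADOOP:-hadoop}"
HDFS="${HDFS:-hdfs}"
EXAMPLES_JAR="${EXAMPLES_JAR:-/usr/hdp/current/hadoop-mapreduce/hadoop-mapreduce-examples.jar}"

run_terasuite() {
  local rows="$1" base="$2"
  # Clear directories left by a previous run (-f: no error if they are missing).
  $HDFS dfs -rm -r -f "$base/terasort-input" "$base/terasort-output" "$base/teravalidate-output" || return 1
  # Generate the input, sort it, then check that the output is globally sorted.
  $HADOOP jar "$EXAMPLES_JAR" teragen "$rows" "$base/terasort-input" || return 1
  $HADOOP jar "$EXAMPLES_JAR" terasort "$base/terasort-input" "$base/terasort-output" || return 1
  $HADOOP jar "$EXAMPLES_JAR" teravalidate "$base/terasort-output" "$base/teravalidate-output" || return 1
}
```

Usage would be something like `run_terasuite 1000000 /benchmarks/terasort`. Setting `HADOOP=echo HDFS=echo` first prints the commands instead of running them, which is handy for checking the paths before touching the cluster.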

Re: Sanity Check / Cluster Validation documents?

hduraiswamy — Mon, 22 Feb 2016 11:44:10 GMT

With the latest version of the tests jar, it is better to specify the path of the test result file explicitly with the -resFile parameter, as shown below. Please note that this is a path on the local filesystem (not in HDFS).

hadoop jar hadoop-mapreduce-client-jobclient-2.7.1.2.3.4.0-3485-tests.jar TestDFSIO -read -nrFiles 5 -fileSize 1000 -resFile /tmp/TestDFSIO_results.log
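Once the results are in a local file, pulling out the headline number is easy to script. This is only a sketch: it assumes each run writes a line of the form "Throughput mb/sec: <number>" into the result file, which is what I have seen but may differ between versions, and the function name is a hypothetical helper.

```shell
# Hypothetical helper: print every throughput value recorded in a TestDFSIO result file.
# Assumes each run writes a line like "Throughput mb/sec: 4.989" (format may vary by version).
dfsio_throughput() {
  local resfile="$1"
  # Split on the colon and any spaces after it, then print the value field.
  awk -F': *' '/Throughput mb\/sec/ { print $2 }' "$resfile"
}
```

For the command above that would be `dfsio_throughput /tmp/TestDFSIO_results.log`, printing one value per run found in the file.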

Re: Sanity Check / Cluster Validation documents?

wfloyd — Wed, 25 May 2016 17:40:38 GMT

I haven't seen a full document that covers sanity checking the entire cluster. This is often performed by the PS team during customer engagements. Side note: the most important individual component test I use to smoke test a cluster is Hive-TestBench.