I am interested in concepts for benchmark/performance testing of a new CDP 7.x environment.
Do teragen, terasort, and similar tools still work with CDP? For example, I would like to stress-test my HDFS.
I also looked for a "Testing the installation" chapter in the CDP DC documentation, without success.
It would be great to get some hints from you.
For a detailed check of disk performance, you can find appropriate testing procedures here:
Other good benchmarks are the TPC-DS tests for Impala:
and TPC-DS and TPC-H for Hive:
I also recommend checking network performance in Cloudera Manager with the function
Inspect Cluster Network Performance
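Regarding teragen and terasort: as far as I know they are still part of the Hadoop MapReduce examples jar, so they should be usable for stressing HDFS. Below is a minimal sketch in Python that only builds the three command lines (teragen, terasort, teravalidate) for a target data size, relying on the fact that each teragen row is 100 bytes. The jar path, the base directory and the helper names are assumptions for illustration, so please check the actual jar location on your cluster before running anything.

```python
import shlex

# Assumption: typical parcel location of the MapReduce examples jar.
# Verify the real path on your own cluster.
EXAMPLES_JAR = "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar"

ROW_BYTES = 100  # teragen writes rows of 100 bytes each


def teragen_rows(target_bytes):
    """Number of teragen rows needed to produce roughly target_bytes of data."""
    return target_bytes // ROW_BYTES


def terasuite_commands(target_bytes, base_dir, jar=EXAMPLES_JAR):
    """Build the teragen, terasort and teravalidate command lines (hypothetical helper)."""
    gen_dir = f"{base_dir}/teragen"
    sort_dir = f"{base_dir}/terasort"
    report_dir = f"{base_dir}/teravalidate"
    rows = teragen_rows(target_bytes)
    return [
        ["hadoop", "jar", jar, "teragen", str(rows), gen_dir],
        ["hadoop", "jar", jar, "terasort", gen_dir, sort_dir],
        ["hadoop", "jar", jar, "teravalidate", sort_dir, report_dir],
    ]


if __name__ == "__main__":
    # Example: about 10 GiB of input data under a hypothetical HDFS directory.
    for cmd in terasuite_commands(10 * 1024**3, "/tmp/benchmarks"):
        print(shlex.join(cmd))
```

The script only prints the commands, so you can review them and run them one by one as a user with write access to the chosen HDFS directory. Start with a small size and scale up once the jobs complete.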
Hope this helps,