
How to run TPC-DS tests/workloads on Impala, Hive, and Spark?

Contributor

Hi Team,

How do I run TPC-DS tests/workloads on Hive, Impala, and Spark, and how should I analyze the results?

Thanks,

Harish

1 ACCEPTED SOLUTION

Expert Contributor

Hi @Harish19, the best starting point for running TPC-DS tests on Impala is the following repository (follow its README.md):

https://github.com/cloudera/impala-tpcds-kit

 

Once the data is populated in HDFS and the tables are created, you can likely run most of the same queries from tree/master/queries/ on Hive and/or Hive on Spark to test.
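As a rough illustration of running the same query files on each engine and comparing timings, here is a minimal Python sketch. It is not part of the kit. The host names, the JDBC URL, the database name `tpcds`, and the local `queries` directory are all placeholders I made up, so substitute your own values. It assumes `impala-shell`, `beeline`, and `spark-sql` are on the PATH.

```python
import statistics
import subprocess
import time
from pathlib import Path

# Hypothetical connection details; replace with your own cluster's values.
IMPALAD = "impalad-host:21000"
HS2_URL = "jdbc:hive2://hs2-host:10000/tpcds"
DATABASE = "tpcds"


def build_command(engine, query_file):
    """Return the CLI invocation that runs one query file on the given engine."""
    if engine == "impala":
        return ["impala-shell", "-i", IMPALAD, "-d", DATABASE, "-f", query_file]
    if engine == "hive":
        return ["beeline", "-u", HS2_URL, "-f", query_file]
    if engine == "spark":
        return ["spark-sql", "--database", DATABASE, "-f", query_file]
    raise ValueError("unknown engine: " + engine)


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to min / median / max for comparison."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


if __name__ == "__main__":
    # Assumes the kit's query files were copied to ./queries (placeholder path).
    for query_file in sorted(Path("queries").glob("*.sql")):
        for engine in ("impala", "hive", "spark"):
            cmd = build_command(engine, str(query_file))
            print(engine, query_file.name, summarize(time_command(cmd)))
```

Note that these wall-clock numbers include client start-up (for example, JVM launch for beeline and spark-sql), so treat them as a coarse comparison only. Comparing the median per query across engines is one simple way to start analyzing the results. Some queries may need dialect adjustments before they run on every engine.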

 

IBM and Databricks have GitHub repositories with some Spark SQL tests, which you can search for, but I have not personally evaluated them and do not know whether they work.

Thanks,



Robert Justice, Technical Resolution Manager


