Created on 05-09-2019 02:29 AM - edited 09-16-2022 07:22 AM
Hi Team,
How do I run TPC-DS tests/workloads on Hive, Impala, and Spark, and how do I analyze the results?
Thanks,
Harish
Created on 07-15-2019 09:02 AM - edited 07-15-2019 09:02 AM
Hi @Harish19, the best place for information on running TPC-DS tests on Impala is the following repository (follow its README.md):
https://github.com/cloudera/impala-tpcds-kit
Once the data is populated in HDFS and the tables are created, you can likely run most of the same queries from tree/master/queries/ on Hive and/or Hive on Spark to test.
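For analyzing the results, one simple approach is to time each query file several times per engine and compare the minimum, median, and maximum. The following is only a minimal sketch of that idea, not something taken from the kit: the function names (time_command, run_benchmark, summarize) are made up for illustration, and the host name, JDBC URL, database name, and query file names in the usage comments are placeholders you would need to replace with your own.

```python
import shlex
import statistics
import subprocess
import time


def time_command(cmd):
    """Run one command (list of arguments) and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, result.returncode


def run_benchmark(command_template, query_files, runs=3):
    """Time every query file `runs` times and return {query_file: [seconds, ...]}.

    `command_template` is a shell-style command line containing `{query}`,
    which is replaced by the path of each query file.
    Runs that exit with a non-zero code are skipped so they do not distort timings.
    """
    timings = {}
    for query_file in query_files:
        # Split first, then substitute, so paths with spaces stay one argument.
        cmd = [part.format(query=query_file) for part in shlex.split(command_template)]
        samples = []
        for _ in range(runs):
            elapsed, code = time_command(cmd)
            if code == 0:
                samples.append(elapsed)
        timings[query_file] = samples
    return timings


def summarize(timings):
    """Return {query_file: (min, median, max)} for queries with at least one successful run."""
    return {q: (min(s), statistics.median(s), max(s)) for q, s in timings.items() if s}


# Hypothetical usage; every host, URL, database, and file name below is a placeholder.
# impala = run_benchmark("impala-shell -i impalad-host -d tpcds_db -f {query}", ["q19.sql"])
# hive = run_benchmark("beeline -u jdbc:hive2://hs2-host:10000/tpcds_db -f {query}", ["q19.sql"])
# for name, stats in summarize(impala).items():
#     print(name, stats)
```

Note that this measures wall-clock time including client startup, so it is best suited to comparing engines on the same queries rather than producing absolute numbers. Running each query more than once also helps separate cold-cache from warm-cache behavior.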
IBM and Databricks have GitHub repositories with some SparkSQL tests, which you can search for, but I have not personally evaluated them and do not know whether they work.
Thanks,
Robert Justice, Technical Resolution Manager