Using different SQL-on-Hadoop engines to query TPC-H and TPC-DS


New Contributor

I would like to become more familiar with the HDP platform. Could you point me to the best documentation, examples, or demos to start with?

Mainly, I would like to run Apache Hive, Cloudera Impala, Spark SQL, and Spark/Shark on HDP, load TPC-H, TPC-DS, or any other workloads, and then query these datasets with those SQL-on-Hadoop engines.

I tried one of the samples that uploads CSV files and queries them with Hive. I would like more examples along the lines of the above.

Many thanks, I appreciate the help.

Regards,

Mohammed

3 REPLIES

Re: Using different SQL-on-Hadoop engines to query TPC-H and TPC-DS

Mentor

@Mohammed Syam

You can run all of those on HDP except Impala, which, as you know, is a Cloudera product. Hortonworks Data Platform (HDP) provides a Sandbox (VM) with most of the components installed; this is a quick and easy way to start a deep dive. To avoid frustration, in my experience you should have at least 12 GB of RAM, though 8 GB is the recommended minimum.

You can choose between VMware, Docker, or VirtualBox, whichever you find most convenient.

The link below will take you through all the steps: learning_ropes_of_HDP_sandbox

Hope that helps.

Re: Using different SQL-on-Hadoop engines to query TPC-H and TPC-DS

New Contributor

Thanks @Geoffrey Shelton Okot

I already did this. I want to know whether there is any documentation on how to load the TPC-H or TPC-DS datasets and start querying them.

Re: Using different SQL-on-Hadoop engines to query TPC-H and TPC-DS

Contributor

Why don't you refer to the following link?

https://github.com/hortonworks/hive-testbench
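
For anyone following along, here is a rough sketch of the workflow that repository appears to support: clone it, build the data generator, then generate and load the data at a chosen scale factor. The script names (`tpcds-build.sh`, `tpch-build.sh`, `tpcds-setup.sh`, `tpch-setup.sh`) and the scale-factor argument are as I recall them from the repository's README, so please check them against the current README before relying on them. The helper `testbench_commands` and its parameters are hypothetical names made up for this illustration, and the script only prints the commands (a dry run) rather than executing anything.

```python
import shlex


def testbench_commands(benchmark="tpcds", scale_gb=2, repo_dir="hive-testbench"):
    """Return, in order, the commands to fetch the repo, build the generator, and load data.

    Hypothetical helper for illustration only. The script names are assumed
    from the hive-testbench README; the build and setup scripts are expected
    to be run from inside `repo_dir` on a node with Hive and HDFS access.
    """
    if benchmark not in ("tpcds", "tpch"):
        raise ValueError("benchmark must be 'tpcds' or 'tpch'")
    return [
        # 1. Fetch the benchmark kit.
        ["git", "clone", "https://github.com/hortonworks/hive-testbench.git", repo_dir],
        # 2. Build the data generator for the chosen benchmark.
        [f"./{benchmark}-build.sh"],
        # 3. Generate the data and load it into Hive tables at the given scale.
        [f"./{benchmark}-setup.sh", str(scale_gb)],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in testbench_commands("tpch", scale_gb=2):
        print(shlex.join(cmd))
```

Once the data is loaded, the resulting tables are ordinary Hive tables, so the other engines that can read the Hive metastore (Spark SQL, for example) should be able to query the same datasets. The repository also ships sample queries that can be used as a starting point.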
