Can we use Tez as the execution engine in the HDPCD Exam?

When running a Pig script or Hive query in the HDPCD exam, can we use Tez rather than MapReduce as the execution engine? Given that the exam is hosted on a single-node cluster on a rather weak machine, and there is a 2-hour time limit to complete 7-10 tasks, this could save time when executing scripts and queries, especially considering that the exam results are based exclusively on the output of our scripts.

Accepted Solutions

Re: Can we use Tez as the execution engine in the HDPCD Exam?

Guru

Yes, you can run the scripts however you like. Keep in mind that a task may specifically require you to use Tez. But if the task instructions do not mention an execution engine, you can run your Pig and Hive scripts with or without Tez. I would still recommend using Tez for every task where it applies. Like you said, why waste precious exam time?

I will take offense to the "weak machine" comment, though. The instances we use, c3.4xlarge EC2 instances, are extremely large for the small amount of processing that happens on them. The datasets on the exam are purposely small so that time is not wasted processing a lot of data. The longest queries you run will take less than 90 seconds, and that is without using Tez.
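
For anyone unsure how to switch engines, here is a minimal sketch of the usual command-line options. It assumes a Pig release with Tez support and the standard Hive CLI. The function names and the script file names (task.pig, task.hql) are hypothetical, not taken from the exam environment.

```shell
# Run a Pig script with Tez as the execution engine instead of MapReduce.
# Assumes a Pig version that supports the "tez" exec type.
run_pig_on_tez() {
  pig -x tez "$1"
}

# Run a Hive script with hive.execution.engine overridden for this one invocation.
run_hive_on_tez() {
  hive --hiveconf hive.execution.engine=tez -f "$1"
}

# Example usage (hypothetical file names):
#   run_pig_on_tez task.pig
#   run_hive_on_tez task.hql
```

Inside an interactive Hive session, the same property can be set with `set hive.execution.engine=tez;` before running the query. If a task names a specific engine, follow the task instructions instead.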

