Created on 08-18-2017 03:18 AM - edited 09-16-2022 05:06 AM
Hi there,
I am currently trying to assess whether I am ready to take the HDPCD exam. I have used the AWS instance with the practice exam and have worked through its tasks. I have also installed the standard VM locally to familiarize myself with the Hortonworks components over the past few months, and I have read other questions on the topic to try to make up my mind. I have some experience with Hadoop, Hive, and Pig and have used various RDBMSs for years, so most of my issues revolve around the intricacies of each tool's syntax (e.g., chararray vs. string) rather than what is being achieved.
I am wondering whether the tasks in the sample exam represent (1) the difficulty and (2) the scope of actual exam questions. I know that the exam questions are not the ones in the VM (duh!), but if I am able to complete the tasks within the 2-hour time limit, can I take it as a sign that it may be time to give it a shot? I'm just trying to avoid a situation where I think I am ready and arrive at the exam only to freak out because I realize that it is waaaay over the level of difficulty of the practice questions.
Thanks in advance,
Greg
Created 08-18-2017 06:46 PM
When I took the certification two years ago, it was almost the exact same as the AWS practice exam, so there should be no surprises. The questions aren't graded on the tool you use, just the validity of the end data. If you would like additional practice, then see if you can address a problem using a completely different set of tools. For example, you could write Spark or even MapReduce instead of Pig or HiveQL to solve certain issues. For others, you may find that there is only one available tool that provides the features you need.
The important thing that I would suggest is getting a good "mental map" of each of the documentation pages, since you won't have access to the internet or a search engine. Know the keywords to actually "Ctrl+F" for when you are stuck, and have a good grasp of the commonly used functions/syntax of the HDFS CLI, Pig, Hive, Sqoop, etc. In my opinion, the Flume documentation is very searchable because it is all on a single page, but for Hive and Pig, it takes a few clicks to get where you need to go.
Good luck!
Created 08-19-2017 05:06 PM
Thanks for the reply! That is exactly my feeling. Because everything is on one page, a search in the Flume docs is easy. Hive is the worst for this... But once you know where you want to go, it's relatively easy to find what you need. One question I had: when they say "write a Hive query", does the query have to be a single statement? I would be able to do it in two or three statements, but I'm not sure how to do it in only one. Of course, when you save the query file, it's all executed in one batch, and if they only care about the result, I guess it would work either way.
Thanks for the info, I guess I'll practice a bit more and sign up for the exam.
Cheers