Member since 08-12-2015
10 Posts
10 Kudos Received
0 Solutions
10-17-2015
05:20 PM
I think you could have tried to use Java if that was an option. I would actually prefer that to attempting it in Python. I have generally had better luck doing SerDes and UDFs in Java.
08-15-2015
07:14 AM
1 Kudo
Hi Devon. Thank you for your response; that makes sense. I had assumed the beta period was over, since the price has gone back up from $300.00 to $400.00 (see https://university.cloudera.com/content/DE575). Do you have an update as to when the beta period will end? There is no indication on the sign-up page that the exam is still in beta.
08-14-2015
06:51 AM
The website states that the Exam Question Format is as follows:
You are given 5 to 8 customer problems each with a unique, large data set, a 7-node high performance CDH5 cluster, and 4 hours (240 minutes).
For each problem, you must implement a technical solution with a high degree of precision that meets all the requirements.
You may use any tool or combination of tools on the cluster (see list below) -- you get to pick the tool(s) that are right for the job.
You must possess enough industry knowledge to analyze the problem and arrive at an optimal approach given the time allowed.
You need to know what you should do and then do it on a live cluster under rigorous conditions, including a time limit and while being watched by a proctor.
I have a couple of questions about this:
Question 1 - As we complete each section, do we get immediate feedback on whether that portion is correct and complete, or do we get the feedback at the end of the 4 hours?
Question 2 - If everything is completed and submitted, but some of it is incorrect and we still have sufficient time left, do we get to fix it and resubmit before the 240 minutes are up?
Question 3 - Based on your responses to questions 1 and 2, how many times can the solutions be resubmitted within the 240-minute time limit?
Labels: Certification, Training
08-14-2015
06:26 AM
1 Kudo
The Workflow portion of the exam has the following expectations:
The ability to create and execute various jobs and actions that move data towards greater value and use in a system.
This includes the following skills:
Create and execute a linear workflow with actions that include Hadoop jobs, Hive jobs, Pig jobs, custom actions, etc.
Create and execute a branching workflow with actions that include Hadoop jobs, Hive jobs, Pig jobs, custom action, etc.
Orchestrate a workflow to execute regularly at predefined times, including workflows that have data dependencies.
Would it be acceptable to use a combination of bash scripts and cron jobs for this portion?
Labels: Certification, Training
08-14-2015
06:18 AM
For my prep and practice, I am currently using JDK 1.8.0_45 on 64-bit Ubuntu Trusty (14.04), a Long-Term Support (LTS) release.
However, it is not specified which operating system we will be using for the exam.
CDH 5.3.2 supports Red Hat Enterprise Linux 6.5, CentOS 6.5, and Ubuntu 14.04 LTS.
Do we get to pick which operating system we want to use?
I would prefer to use Ubuntu Trusty (14.04) for the exam.
Also, JDK 7 has reached its end of life. Oracle now recommends JDK 8, and I am using JDK 8 in my practice environment.
Can my 7-node cluster be set up with JDK 8 as well to minimize surprises during the test?
A prompt response would be highly appreciated.
Thank you for your help.
References
http://www.oracle.com/technetwork/java/eol-135779.html
http://www.cloudera.com/content/cloudera/en/downloads/cdh/cdh-5-3-2.html#SystemRequirements
http://cloudera.com/content/cloudera/en/training/certification/ccp-data-engineer.html
Labels: Certification, Training
08-13-2015
03:05 AM
I was reviewing the list of tools available for the CCP: Data Engineer Exam (DE575):
http://www.cloudera.com/content/cloudera/en/documentation/core/v5-3-x/topics/cdh_vd_cdh_package_tarball.html
It is not obvious if Maven can be used during the exam to compile the JAR files.
I typically use Maven to build my JARs, and I use some open source dependencies in my MapReduce logic.
http://www.cloudera.com/content/cloudera/en/training/certification/ccp-data-engineer.html
I would like to know if Maven is available during this test and if I can point my POM file to
https://repository.cloudera.com/artifactory/cloudera-repos so that I can use open source libraries such as Google Gson, Joda-Time, and the Apache Commons utilities in my MapReduce code during the exam.
Please let me know at your earliest convenience.
Thanks.
Labels: Certification, Training