08-18-2015 04:27 PM
First off, I'm a big fan of this new CCP: Data Engineer exam approach. I know Cloudera is working things out right now and I have a few questions perhaps someone can provide some input on:
10-17-2015 05:51 AM
You can set up a single node running Hadoop in pseudo-distributed mode using VMs or Amazon EC2 instances with the Cloudera software and other tools installed on them.
Then you can come up with projects yourself to practice.
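For instance, once the daemons are running, a quick sanity check of a pseudo-distributed setup might look like this (a sketch assuming the standard Hadoop CLI tools are on your PATH; the `/user/cloudera` path is a placeholder):

```shell
# List the running Java daemons (you should see NameNode, DataNode, etc.)
jps

# Ask the NameNode for a cluster summary to confirm HDFS is healthy
hdfs dfsadmin -report

# Verify you can write to and list HDFS
hdfs dfs -mkdir -p /user/cloudera/sanity
hdfs dfs -ls /user/cloudera
```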
Once you set up the tools, try to practice solving hypothetical problems that are called out here.
They are not really hard to come up with.
The Data Ingestion portion covers Flume, HDFS shell commands, and Sqoop.
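As a sketch of the kind of ingestion exercise you could set up yourself (the database name, table, credentials, and paths below are made-up placeholders, not from the exam):

```shell
# Stage a local file into HDFS with the basic shell commands
hdfs dfs -mkdir -p /user/cloudera/ingest
hdfs dfs -put access_log.txt /user/cloudera/ingest/

# Pull a relational table into HDFS with Sqoop
# (host, database, table, and user are placeholders)
sqoop import \
  --connect jdbc:mysql://localhost/retail_db \
  --username retail_user -P \
  --table orders \
  --target-dir /user/cloudera/ingest/orders \
  --num-mappers 2
```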
The Transform, Stage, and Store section covers Pig, Hive, MapReduce, and Spark skills.
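For the transformation side, even a small self-invented task exercises the basics, e.g. a word count in Pig run from the shell (input and output paths are placeholders):

```shell
# Word count in Pig: load lines, split into words, group, and count
pig -e "
  lines = LOAD '/user/cloudera/ingest/access_log.txt' AS (line:chararray);
  words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
  grouped = GROUP words BY word;
  counts = FOREACH grouped GENERATE group, COUNT(words);
  STORE counts INTO '/user/cloudera/output/wordcount';
"
```

Re-implementing the same task in Hive and in Spark is a good way to build the tool flexibility the exam expects.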
For the Data Analysis part, you need to know how to create tables in Hive that use a SerDe and other custom settings.
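A minimal sketch of such a table, assuming CSV data and Hive's built-in OpenCSVSerde (the table name, columns, and HDFS location are placeholders):

```shell
# Create an external Hive table whose rows are parsed by a CSV SerDe
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS orders_csv (
  order_id INT,
  order_date STRING,
  status STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' = '\"')
LOCATION '/user/cloudera/ingest/orders';
"
```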
The Workflow portion covers skills you need from Apache Oozie.
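For Oozie, practicing the command-line workflow lifecycle is a good start; a sketch assuming a workflow.xml and job.properties you have written yourself (the Oozie server URL and HDFS paths are placeholders):

```shell
# Check the workflow definition for schema errors before deploying
oozie validate workflow.xml

# Deploy the workflow application to HDFS
hdfs dfs -mkdir -p /user/cloudera/apps/wordcount
hdfs dfs -put -f workflow.xml /user/cloudera/apps/wordcount/

# Submit and start the job, then check on it with the returned job id
oozie job -oozie http://localhost:11000/oozie -config job.properties -run
oozie job -oozie http://localhost:11000/oozie -info <job-id>
```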
02-08-2016 04:43 PM
02-09-2016 08:26 AM
Step 1: Download the Quickstart VM http://www.cloudera.com/downloads/quickstart_vms.html
Step 2: Go to the certification page to learn which testing objectives to practice.
06-23-2016 04:53 AM - edited 06-23-2016 04:54 AM
Will the data transformation part have to be done in Pig, Hive, and Spark (all three), or is it our choice?
I mean, do we have flexibility in choosing tools, or is a specific tool mandated by the exam?
07-08-2016 07:07 AM
All the skills you need to be prepared to use in the hands-on exam are listed on the CCP Data Engineer page. The page describes the exam delivery environment and cluster, the documentation available during the exam, and even includes a sample exam question to give you a feel for the format.