CCP: DS - details?

Re: CCP: DS - details?

New Contributor

Hi,

 

The link you shared appears to be broken. I found the link below, but I can't find a study guide reference there. It would be really helpful to have references or pointers for preparation. The syllabus covers a wide variety of topics, so it would be good to have a study guide similar to the one for DS200.

 

http://www.cloudera.com/content/www/en-us/training/certification/ccp-ds/exams.html

 

Regards

P

Re: CCP: DS - details?

Contributor
Hi P,

Here is the link to all of the students who have passed the Data Science Exams: http://www.cloudera.com/content/www/en-us/training/certification/ccp-ds/class.html.

Please write in to certification@cloudera.com with any other questions.

Thanks,

Re: CCP: DS - details?

New Contributor

Thank you.

 

The link you provided is not working; however, I have sent an email as suggested.

 

Regards

Pragnesh

Re: CCP: DS - details?

New Contributor

I totally agree. I've gone through the latest reference list. It is actually two years of work if you treat every entry seriously.

 

Is there any more precise guidance?

 

By the way, the URL for the blogs of people who have passed the new exams doesn't work.

 

BRs

xiaojing

Re: CCP: DS - details?

Expert Contributor

The main page:  http://cloudera.com/training/certification/ccp-ds.html

Certified data scientists:  http://cloudera.com/training/certification/ccp-ds/class.html

 

It is the most elite certification in the field, with a few dozen in the world passing.  I would think that knowing everything you need to know in order to be a Data Scientist is more than a two year endeavor.

Re: CCP: DS - details?

New Contributor
Hi Juddimal,

It's rather weird that the most elite certification in the world can't even
provide sample problems or any meaningful preparation guide. Seems rather
like an undercooked product.

Re: CCP: DS - details?

Super Collaborator
There are over 100 pages of examples, plus a complete solution kit and
problem set, right there on the page.

Re: CCP: DS - details?

New Contributor
Thanks! Do you mean this?
http://certification.cloudera.com/prep/dsc1sk/intro.html

Isn't it supposed to be for the old exam, not for the new ones?

Re: CCP: DS - details?

Super Collaborator
There is no old or new. The exams are the same; the delivery format
changed but the content didn't.

Re: CCP: DS - details?

New Contributor

Hi,

 

I have been going through the DS200 Solution Kit and this forum to understand the objectives of the new CCP-DS. I am a little disheartened to learn that the content of the new exams is the same as that of the old exam.

 

Since then, I have been curious to find out which tools are used for data science and machine learning in the CCP-DS exam.

 

As for the DS200 Solution Kit, the emphasis is on:

 

Data exploration of JSON datasets using MapReduce Streaming (Python)
Data cleaning of JSON datasets using MapReduce Streaming (Python)
Classification using the SimRank algorithm
Clustering using Cloudera ML
A recommender system built using Mahout
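
For anyone who has not seen the streaming style, here is a minimal sketch of what exploring a JSON dataset with a streaming-style mapper and reducer could look like in Python. It is not taken from the Solution Kit: the function names (`mapper`, `reducer`, `run_local`), the `__malformed__` marker, and the one-JSON-object-per-line input layout are all assumptions made for illustration.

```python
import json
from itertools import groupby


def mapper(lines):
    """Emit (field_name, 1) for every top-level key of each JSON record.

    Assumes one JSON object per input line (hypothetical layout).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except ValueError:
            # Count unparseable rows instead of failing the whole job.
            yield "__malformed__", 1
            continue
        if not isinstance(record, dict):
            yield "__malformed__", 1
            continue
        for key in record:
            yield key, 1


def reducer(pairs):
    """Sum the counts per key; expects pairs already sorted by key."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> sort -> reduce in a single process for quick checks."""
    return dict(reducer(sorted(mapper(lines))))
```

In an actual streaming job the mapper and reducer would typically be separate scripts that read `sys.stdin` and print tab-separated key/value pairs, with the framework doing the sort in between; `run_local` only imitates that flow so the logic can be tried on a few sample lines first.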

 

But I have observed that many organizations, including mine, have increasingly started using Spark RDDs and DataFrames for data exploration, cleaning, and transformation through Spark's Scala or Python APIs.

Most common machine learning techniques, such as classification, clustering, and collaborative filtering, can be implemented using Spark MLlib or H2O on Spark.

 

Also the CCP-DS certification page at http://www.cloudera.com/training/certification/ccp-ds/exams.html

lists "Data Science at Scale Using Spark and Hadoop" as one of the study resources. This makes me more curious, so I wanted to ask:

 

Can I use Spark RDDs or DataFrames (Scala / Python API) and Spark MLlib in the actual CCP-DS exam?

Is the Spark Scala API supported, or should I use only the Spark Python API?

Also, can I use Python with scikit-learn and pandas to solve some inferential and descriptive statistics problems in the exam?

 

Can the resource guide be updated to include alternative technologies such as Spark instead of MapReduce and Mahout? If not, it would be quite a setback to start learning Mahout from scratch after learning and implementing ML problems in Python scikit-learn or R all these years and lately moving on to Spark MLlib. I would bet there are several data science / ML folks in the same boat as me.

 

Any help or answer in this regard would be highly appreciated, and would be welcomed by many data science folks looking for a similar answer in their endeavour to achieve CCP-DS certification.

 
