About CCP Data Scientist Exam - Tools

New Contributor
Hello,

I need advice on the tools for the CCP Data Scientist certification. Here are my questions, based on the website description below:

- Although R is in the list of tools, I couldn't find any mention of RStudio. Personally, I prefer to code in RStudio rather than base R. I downloaded the Cloudera Solution Kit and installed VirtualBox, but the VM is unable to connect to the internet, so I can't install RStudio myself. How do I solve this?
- Will it be the same in the actual exam? The description states that the cluster is open to the internet and that other software can be used, so I assume I'm not restricted to the R packages on the cluster and can use the ones that are part of my regular workflow.

############## Website Description #################

All CCP: Data Scientist exams are remote-proctored and available anywhere, anytime. See the FAQ for more information and system requirements.

Exams are hands-on, practical exams using data science tools on Cloudera technologies. Each user is given their own 7-node, high-performance CDH5 (currently 5.3.2) cluster pre-loaded with Spark, Impala, Crunch, Hive, Pig, Sqoop, Kafka, Flume, Kite, Hue, Oozie, DataFu, and many others (See a full list).

In addition, the cluster also comes with Python (2.6 and 3.4), Perl 5.10, Elephant Bird, Cascading 2.6, Brickhouse, Hive Swarm, Scala 2.11, Scalding, IDEA, Sublime, Eclipse, NetBeans, scikit-learn, octave, NumPy, SciPy, Anaconda, R, plyr, dplyrimpaladb, SparkML, vowpal wabbit, clouderML, oryx, impyla, CoreNLP, The Stanford Parser: A statistical parser, Stanford Log-linear Part-Of-Speech Tagger, Stanford Named Entity Recognizer (NER), Stanford Word Segmenter, opennlp, H2O, java-ml, RapidMiner, caffe, Weka, NLTK, matplotlib, ggplot, d3py, SparkingPandas, randomforest, R: ggplot2, Sparkling water.

Currently, the cluster is open to the internet and there are no restrictions on tools you can install or websites or resources you may use.

Re: About CCP Data Scientist Exam - Tools

Expert Contributor

The Data Scientist exam is a remote, proctored exam. It does not use VirtualBox or any other virtual machine; it runs in a completely different environment.

During the exam you will have access to the internet and can install any tool that you would like to use.