CCA 175 Revised Syllabus


Hi,

I am a CCA Spark and Hadoop Certification aspirant. As per the revised syllabus for CCA 175, do we need to prepare Spark with both Scala and Python, or is Spark with Scala enough for the certification?

The previous syllabus mandated Spark with both Scala and Python, but the new syllabus does not provide clarity on this point.

 

1 ACCEPTED SOLUTION

Community Manager

@ashishdiwan03 wrote:

@saranvisa

 

During the exam, it is possible to get templates so that not all coding is done from scratch.
In order to speed up development time of Spark questions, a template is often provided that contains a skeleton of the solution, asking the candidate to fill in the missing lines with functional code. This template is written in either Scala or Python.

According to the certification team, knowledge of both Scala and Python is a must for the examination.

To clarify a bit on the Scala and Python questions, here is a snippet from our community knowledge article Cloudera Certifications FAQ:

 

Q - Do I need to know both Scala and Python for the CCA Spark and Hadoop Developer (CCA 175) Certification?  

 

A - The answer is yes; there are questions using both languages.

 

However, please remember that the goal of the exam is to test your Spark knowledge, not your Scala and Python knowledge. The development questions typically provide you with some code and ask you to fill in TO-DO sections. So the key is to understand the Spark API. You must have some knowledge of programming, as you will need to be able to read the existing code and understand how to store and retrieve the results you get back from calling the API, but the focus will be on you adding the Spark calls.


Cy Jervis, Manager, Community Program
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.


17 REPLIES

Champion

@gsjunior86

 

You need to install the Scala plug-in in Eclipse (Scala Perspective); Scala is not available by default.

New Contributor

@saranvisa, yeah sure, no problem with that.

 

My question is whether the plugin will come pre-installed on the cluster available for the test, because I think external access will not be available, will it?

Champion

@gsjunior86

 

You don't need to use IDEs like Eclipse, IntelliJ, etc., and there is no need to create a JAR as part of the exam. You can launch spark-shell (or pyspark) and execute your commands one by one. All they need is your result, and this will save you time.

Contributor

I understand this is an old thread; hopefully my reply here will still get attention.

 

My questions are:

 

1. The above discussion is confusing: the accepted answer says a template will be provided (though its use is not mandated), while another user says multiple channels confirmed to him/her that no template was provided in the actual exam. Which is true?

 

2. Is Flume excluded from the exam?

 

3. Is Internet browsing, like Google, allowed in the exam? I ask because we often need to check the documentation or user guide for each component, such as Spark, Hive, etc.

 

Thank you.

 

 

Community Manager

Hi @axie,

 

If you look at the CCA175 certification page, you will see the following wording on templates:

 

Exam Question Format

Each CCA question requires you to solve a particular scenario. In some cases, a tool such as Impala or Hive may be used. In other cases, coding is required. In order to speed up development time of Spark questions, a template is often provided that contains a skeleton of the solution, asking the candidate to fill in the missing lines with functional code. This template is written in either Scala or Python.

You are not required to use the template and may solve the scenario using a language you prefer. Be aware, however, that coding every problem from scratch may take more time than is allocated for the exam.

 

As for Flume and the availability of documentation, see the bottom of that page (the Flume-related items appear in the lists below):

 

Exam delivery and cluster information

CCA175 is a remote-proctored exam available anywhere, anytime. See the FAQ for more information and system requirements.

CCA175 is a hands-on, practical exam using Cloudera technologies. Each user is given their own CDH5 (currently 5.10.0) cluster pre-loaded with Spark 1.6, Impala, Crunch, Hive, Pig, Sqoop, Kafka, Flume, Kite, Hue, Oozie, DataFu, and many others (see a full list). In addition, the cluster also comes with Python (2.6, 2.7, and 3.4), Perl 5.10, Elephant Bird, Cascading 2.6, Brickhouse, Hive Swarm, Scala 2.11, Scalding, IDEA, Sublime, Eclipse, and NetBeans.

Documentation Available online during the exam

Cloudera Product Documentation 
Apache Hadoop 
Apache Hive 
Apache Impala (Incubating) 
Apache Sqoop 
Spark 
Apache Crunch 
Apache Pig 
Kite SDK 
Apache Avro 
Apache Parquet 
Cloudera HUE 
Apache Oozie 
Apache Flume 
DataFu 
JDK 7 API Docs 
Python 2.7 Documentation 
Python 3.4 Documentation 
Scala Documentation 

Only the documentation, links, and resources listed above are accessible during the exam. All other websites, including Google/search functionality, are disabled. You may not use notes or other exam aids.

 



Contributor

Thanks Mr. Jervis,

 

There have been a lot of complaints about the cluster/VM performance. Do you have any update from the certification department? I know you have been talking to them about those complaints.

Community Manager

The certification team advised me that they have reviewed each of the exams they were contacted about and didn't find any performance issues. 



Contributor

I am doing some practice exercises. For some questions there can be multiple solutions; for example, I can use RDD operations to do some filtering, sorting, and grouping, while with DataFrames and Spark SQL it is even easier for me to get the same result.

 

My question is: will there be a requirement in the exam that some questions must be solved using RDDs rather than DataFrames/Spark SQL, or vice versa?

 

Thank you.