@saranvisa, yeah sure, no problem with that.
My question is whether the plugin will come installed on the cluster provided for the test,
because I think external access will not be available, will it?
You don't need to use IDEs like Eclipse, IntelliJ, etc., and there is no need to create a JAR as part of the exam. You can log in to spark-shell or pyspark and execute your commands one by one. All they need is your result, and this will save you time.
I understand this is an old thread; hopefully my reply here will still get attention.
My questions are:
1. The discussion above is confusing: the "marked" answer says a template will be provided (though using it is not mandatory), while another user says multiple channels confirmed to him/her that no template was provided in the actual exam. Which is true?
2. Is Flume excluded from the exam?
3. Is Internet browsing (e.g. Google) allowed in the exam? I ask because we often need to check the documentation or user guide for components such as Spark, Hive, etc.
If you look at the CCA175 certification page you will see the following wording on templates:
Each CCA question requires you to solve a particular scenario. In some cases, a tool such as Impala or Hive may be used. In other cases, coding is required. In order to speed up development time of Spark questions, a template is often provided that contains a skeleton of the solution, asking the candidate to fill in the missing lines with functional code. This template is written in either Scala or Python.
You are not required to use the template and may solve the scenario using a language you prefer. Be aware, however, that coding every problem from scratch may take more time than is allocated for the exam.
As for Flume and the availability of documentation, see the bottom of that page (I put Flume items in red):
CCA175 is a remote-proctored exam available anywhere, anytime. See the FAQ for more information and system requirements.
CCA175 is a hands-on, practical exam using Cloudera technologies. Each user is given their own CDH5 (currently 5.10.0) cluster pre-loaded with Spark 1.6, Impala, Crunch, Hive, Pig, Sqoop, Kafka, Flume, Kite, Hue, Oozie, DataFu, and many others (See a full list). In addition the cluster also comes with Python (2.6, 2.7, and 3.4), Perl 5.10, Elephant Bird, Cascading 2.6, Brickhouse, Hive Swarm, Scala 2.11, Scalding, IDEA, Sublime, Eclipse, and NetBeans.
Cloudera Product Documentation
Apache Impala (Incubating)
JDK 7 API Docs
Python 2.7 Documentation
Python 3.4 Documentation
Only the documentation, links, and resources listed above are accessible during the exam. All other websites, including Google/search functionality is disabled. You may not use notes or other exam aids.
Thanks Mr. Jervis,
There have been a lot of complaints about the cluster/VM performance. Do you have any update from the certification department? I know you have been talking to them about those complaints.
The certification team advised me that they have reviewed each of the exams they were contacted about and didn't find any performance issues.
I am doing some practice. For some questions there can be several solutions: for example, I can use RDD operations to do filtering, sorting, and grouping, but with DataFrames and Spark SQL it is even easier for me to get the same result.
My question is: will the exam require that some questions be solved using RDDs rather than DataFrames and Spark SQL, or vice versa?
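To make the comparison concrete, here is a rough sketch of the same filter/group/sort task written both ways for a Spark 1.6 spark-shell. It assumes the shell's predefined `sc` and `sqlContext`; the `orders` data, the `category`/`amount` column names, and the threshold of 10 are made up for illustration and are not taken from any exam question.

```scala
// Sketch for spark-shell on Spark 1.6 (sc and sqlContext are predefined there).
// Enter :paste first so the multi-line chains are read as single expressions.
import org.apache.spark.sql.functions.sum
import sqlContext.implicits._

// Hypothetical sample data: (category, amount)
val orders = Seq(("books", 12.0), ("games", 30.0), ("books", 8.0), ("games", 5.0))

// Style 1: plain RDD operations
val rddResult = sc.parallelize(orders)
  .filter { case (_, amount) => amount >= 10.0 } // keep amounts of at least 10
  .reduceByKey(_ + _)                            // total amount per category
  .sortByKey()                                   // order by category name
  .collect()
rddResult.foreach(println)

// Style 2a: DataFrame API on the same data
val df = sc.parallelize(orders).toDF("category", "amount")
val dfResult = df
  .filter($"amount" >= 10.0)
  .groupBy("category")
  .agg(sum("amount").as("total"))
  .orderBy("category")
dfResult.show()

// Style 2b: Spark SQL over a temporary table
// (registerTempTable is the Spark 1.6 call; Spark 2.x replaces it
// with createOrReplaceTempView)
df.registerTempTable("orders")
val sqlResult = sqlContext.sql(
  """SELECT category, SUM(amount) AS total
    |FROM orders
    |WHERE amount >= 10
    |GROUP BY category
    |ORDER BY category""".stripMargin)
sqlResult.show()
```

All three variants should produce the same per-category totals, so which one is faster to write mostly depends on what you are more comfortable with.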