About stefan_frankenh

stefan_frankenh · ‎07-10-2017

No, I don't think so. You also need some RDD knowledge, for example to read a CSV file and transform it into a DataFrame.
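A minimal sketch of reading a CSV file through the RDD API and turning it into a DataFrame, assuming the Spark 1.6 shell (where `sc` and `sqlContext` already exist) and no CSV library on the classpath. The `Person` layout, the `parsePerson` helper and the HDFS path are hypothetical and only serve as illustration; the naive `split` does not handle quoted fields.

```scala
// Run inside spark-shell (Spark 1.6); `sc` and `sqlContext` are provided by the shell.
import sqlContext.implicits._

// Hypothetical record layout: id,name,age
case class Person(id: Int, name: String, age: Int)

// Parse one comma-separated line into a Person (no support for quoted fields).
def parsePerson(line: String): Person = {
  val f = line.split(",", -1).map(_.trim)
  Person(f(0).toInt, f(1), f(2).toInt)
}

// Hypothetical HDFS path; replace it with the location given in the task.
val lines = sc.textFile("/user/exam/people.csv")

// Skip blank lines, parse each line, then convert the RDD to a DataFrame.
val people = lines.filter(_.trim.nonEmpty).map(parsePerson).toDF()

people.printSchema()
people.show(5)
```

Going through a case class gives the DataFrame named, typed columns, so it can be queried with the DataFrame API or registered as a temporary table afterwards.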

stefan_frankenh · ‎05-26-2017

You write your code in the Spark shell. SparkContext and SQLContext are already available there (as `sc` and `sqlContext`).

stefan_frankenh · ‎12-20-2016

I had my exam three days ago. Let me answer my own questions.

I do not know which HDP version it was. The default version when running `spark-shell` in the terminal was Spark 1.6; I did not try to change it. Yes, I solved the tasks with Scala in the Spark shell. However, you have to save all your commands in the provided text files. It was not necessary to build a JAR manually in order to submit it. There could be a task to submit a provided JAR to YARN, but I do not know. You can use `hadoop fs` commands in the terminal to browse HDFS. Zeppelin: I do not think so. You do not have to change the keyboard layout: since the VM runs in your browser, it automatically uses your local one.

Further information: I think there were some links to the documentation on the desktop, but I did not use them. You do not have to write CSV files with a header. Read carefully: the delimiter does not have to be the same in all tasks. The general question pattern is: read this from here, do something with it, write the results to there. Because only the output counts, you have to read the tasks carefully (ordering of columns, sorting of the data, delimiter in CSV files, ...). It is up to you how to solve the tasks; you can use the RDD or the Spark SQL API. The exam is not really difficult if you work through the exam objectives.
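To illustrate the "read, do something, write" pattern, here is a hedged sketch using only the RDD API in the Spark 1.6 shell. The pipe-delimited input layout, the tab-delimited output, the paths and the helpers `parseOrder` and `formatRow` are all assumptions made up for this example, not actual exam content.

```scala
// Run inside spark-shell (Spark 1.6); `sc` is provided by the shell.

// Hypothetical input layout: orderId|customer|amount (pipe-delimited).
def parseOrder(line: String): (String, Double) = {
  val f = line.split("\\|", -1)
  (f(1), f(2).toDouble)
}

// Format one result row with the delimiter the task asks for (a tab here).
def formatRow(customer: String, total: Double): String =
  Seq(customer, total.toString).mkString("\t")

val totals = sc.textFile("/user/exam/orders")   // hypothetical input path
  .map(parseOrder)
  .reduceByKey(_ + _)                            // total amount per customer
  .sortBy(_._2, ascending = false)               // largest total first

totals
  .map { case (customer, total) => formatRow(customer, total) }
  .saveAsTextFile("/user/exam/result")           // hypothetical output path; must not exist yet
```

Keeping the output formatting in one small function makes it easy to double-check column order and delimiter against the task text before writing. The result can then be inspected from the terminal with `hadoop fs -cat` on the part files in the output directory.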

stefan_frankenh · ‎12-20-2016

I had my exam three days ago. Since the VM is running in your browser, you do not have to change any keyboard layout. The VM automatically uses your local one.

stefan_frankenh · ‎12-13-2016

Thank you @William Gonzalez. I assume this works for every exam, even the HDPCD Spark exam, right?

stefan_frankenh · ‎12-12-2016

I assume the keyboard layout of the HDPCD exam environment is US. Since I am used to working with another keyboard layout (DE), I would like to change it in the exam. Is it possible to change it, for example in the terminal with "loadkeys de"?

stefan_frankenh · ‎12-07-2016

@rich You have answered other questions regarding the Spark exam. We would be very grateful if you could answer some questions here.

stefan_frankenh · ‎12-07-2016

Thank you very much @Don Jernigan. Your answer helps me a lot. However, I have further questions. Using Python, it is simple to submit a job to YARN, because you do not need more than a .py file. But when I want to use Scala, it is necessary to build a .jar file with Maven, sbt or something like that. I am not sure if we have these build tools available in the exam. Did someone use Scala in the exam? Do I have to write CSV files with a header line describing the column names? If yes, I think it is not that easy in a distributed environment. Is the general question pattern "Read this file(s), do something with it and write the result to here"? At the end only the results will be checked.

stefan_frankenh · ‎12-06-2016

I have some additional questions about the Spark exam that have not been answered by other questions here. The current sandbox is HDP 2.5. Is this the version used in the exam? HDP 2.5 comes with Spark 1.6 and 2.0. Can I choose which version I would like to use to solve the tasks? (2.0 supports writing and reading of CSV files out of the box.) Do I only have to use the Spark shell? If yes, why is there the exam objective "Initialize a Spark application"? When using the Spark shell, I do not have to do that manually. Furthermore, there is "Run a Spark job on YARN". How should this be tested? Do I have something like Ambari to look at Hive tables or the files in HDFS? Is there Zeppelin and can I use it? Can I change the keyboard layout? I do have project experience with Spark but feel quite uncomfortable not knowing what to expect in the exam.

stefan_frankenh · ‎12-06-2016

@Qi Wang I think we do not have the Databricks CSV library available in the exam. Your approach with mkString() works well if there is no header required in the output CSV file. Can I assume that in the exam tasks?
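For reference, a minimal sketch of the mkString() approach in the Spark 1.6 shell, without any CSV library and without a header line. The sample data, the `rowToLine` helper and the output path are hypothetical. Note that null values would be written as the text "null" and fields containing the delimiter are not quoted.

```scala
// Run inside spark-shell (Spark 1.6); `sc` and `sqlContext` are provided by the shell.
import org.apache.spark.sql.Row
import sqlContext.implicits._

// Join the fields of one Row with the given delimiter.
def rowToLine(row: Row, delimiter: String): String =
  row.toSeq.mkString(delimiter)

// Hypothetical sample data standing in for the DataFrame of a task.
val df = sc.parallelize(Seq((1, "Ann", 30), (2, "Bob", 25))).toDF("id", "name", "age")

df.select("name", "age")                    // fix the column order required in the output
  .rdd                                      // DataFrame -> RDD[Row]
  .map(rowToLine(_, ","))
  .saveAsTextFile("/user/exam/people_out")  // hypothetical output path; must not exist yet
```

Selecting the columns explicitly before calling mkString() keeps the column order in the output file under control.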

Last Visited	‎07-10-2017 08:55 AM

Member Since	‎11-25-2016 12:28 PM
Posts	10
Kudos received	4

Cloudera Community

Re: HDPCD Spark Exam

Re: HDPCD Spark Exam

Re: HDPCD Spark Exam

Re: HDPCD Spark Exam

Re: Is it possible to change keyboard layout in HD...

Re: Is it possible to change keyboard layout in HD...

Is it possible to change keyboard layout in HDPCD ...

Re: HDPCD Spark Exam

Re: HDPCD Spark Exam

HDPCD Spark Exam

Re: How to save dataframe as text file