Member since: 09-23-2015
Posts: 151
Kudos Received: 110
Solutions: 50
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 645 | 08-12-2016 04:05 AM |
 | 1286 | 08-07-2016 03:58 AM |
 | 618 | 07-27-2016 06:24 PM |
 | 1096 | 07-20-2016 03:14 PM |
 | 726 | 07-18-2016 12:54 PM |
04-17-2016
11:10 AM
I don't know, but try a different region and see if you have any success.
04-16-2016
03:26 PM
Which Amazon region are you getting the AMI from? Perhaps there is an AMI that is not working. Try starting up one in a different region and see if that works.
04-16-2016
11:18 AM
1. Restart ambari-agent.
2. Run the start_all_services.sh script.
3. Wait 5 minutes and see if the cluster is up and running.
If those steps fail, then terminate your EC2 instance and build a new one. That will be quicker than trying to debug the issue.
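A minimal sketch of steps 1 and 2 from the shell - run as root, and note the script path below is a placeholder since it depends on where the image puts it:
ambari-agent restart        # step 1: restart the Ambari agent
./start_all_services.sh     # step 2: run from whatever folder holds the script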
03-31-2016
06:58 PM
On the exam you should always use Ambari when possible, especially for tasks like enabling NameNode HA.
03-30-2016
08:04 PM
Try using your hostname instead of the IP address. From the machine running the Spark master process, run:
$ hostname
Then use that value instead of 127.0.0.1.
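For example, if hostname returns ip-172-30-0-5 (a made-up value for illustration), you would point at the master with that name instead - this sketch assumes a standalone master on the default port 7077:
$ hostname
ip-172-30-0-5
$ spark-shell --master spark://ip-172-30-0-5:7077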
03-30-2016
12:54 PM
2 Kudos
@John Cod - Try adding the following environment variables to spark-env.sh (found in the conf folder of your Spark install), using the appropriate IP address of course if Spark is running on a machine other than localhost:
export SPARK_MASTER_IP=127.0.0.1
export SPARK_LOCAL_IP=127.0.0.1
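After editing spark-env.sh, restart the master and worker so the new values are picked up. A sketch, assuming a standalone install using the stock sbin scripts:
cd /path/to/spark      # placeholder for your Spark install directory
./sbin/stop-all.sh     # stop the master and all workers
./sbin/start-all.sh    # start them again, re-reading conf/spark-env.sh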
03-29-2016
07:21 PM
@Anshu Kumar - this is not something you need to worry about on the exam. In order for a task on the exam to be marked as correct, you must have already executed your code and the output must exist in HDFS. If the output folder specified by the task instructions is not in HDFS, then that task is immediately marked as wrong. Do not worry about how the evaluator grades the exam. Focus on generating the desired output by carefully following the task instructions, making sure that you run your code, and making sure your code generates the desired output.
03-29-2016
06:50 PM
You are correct - the output, location, and framework are the grading criteria. We don't care "how" you get the answer as long as you get the correct answer using the correct framework.
03-29-2016
06:48 PM
@Anshu Kumar - this is a good question, but it's buried in your current one. Can you ask it in a new, separate question on the forum? (Put "HDPCD" in the subject and I will get notified of your question immediately.)
03-21-2016
12:40 AM
There's a discussion here that answers your question. http://stackoverflow.com/questions/11130145/hadoop-multipleinputs-fails-with-classcastexception
03-21-2016
12:21 AM
An easier solution would be to read every row and then check whether it's a header row by looking at a column value that you know only appears in header rows.
03-21-2016
12:03 AM
I'm not sure why you are getting that exception, but I do know that you do not need to get the path of any input files on the exam. What are you trying to do? Showing more of your code might help me to provide more insight.
03-18-2016
04:05 AM
1 Kudo
There is some good information in this thread, but I worry that the discussion about the MultiStorage class in the piggybank is going to make it seem like it's needed on the HDPCD exam. The MultiStorage class is not a part of the exam objectives. For the exam, you need to know how to use the PARALLEL operator, which if used at the right time in a Pig script can determine the number of output files. So to summarize: the HDPCD exam does not require the use of MultiStorage, but may require the use of PARALLEL.
03-16-2016
02:42 PM
3 Kudos
Sorting in MR applies to two areas:
1. Sort output by keys: this is done "naturally" in the sense that the keys are sorted as they come into the reducer. The compareTo method in the key class determines this natural sorting.
2. Secondary sort: both the keys and the values are sorted. That involves writing a group comparator class and then registering that class with the MR Job using the setGroupingComparator method.
The exam objective you listed above refers to both. The first one is fairly straightforward - you implement the compareTo method in your key class. The secondary sort involves a bit more work. There is a nice blog post here with an example of how to implement a secondary sort: https://vangjee.wordpress.com/2012/03/20/secondary-sorting-aka-sorting-values-in-hadoops-mapreduce-programming-paradigm/
03-16-2016
12:07 PM
1 Kudo
@Sanjay Ungarala - thank you for pointing this out. We decided a while back to remove this objective from the HDPCD:Java exam, but the website was not changed accordingly. I just updated the website and removed that objective.
LocalResource is used heavily in YARN, but it is beyond the scope of our Java exam.
03-15-2016
11:45 AM
1 Kudo
You are given access to docs.hortonworks.com on the real exam, and that is the only website you can access. The practice exam is meant to give you an idea of what the real exam tasks are like and to familiarize you with the environment of the real exam. On its own, though, it is not enough preparation to pass. Like I said, to be fully prepared, you should be able to perform all of the tasks listed in the exam objectives on our website.
03-12-2016
02:52 PM
1 Kudo
@Mahesh Deshmukh I'm not sure what your need is, but null values should be filtered first. The general rule of thumb in Pig is to "filter early and often" to minimize the amount of data that gets shuffled and sorted, so filter before the foreach. For example:
a = "some Pig relation"
b = filter a by $1 is not null; -- filter out tuples where the $1 field is null
c = foreach b generate ... -- no need to worry about $1 being null
The term "empty" typically refers to bags, and in particular you can use the IsEmpty function to check if a bag is empty. You normally do this after a GROUP command:
a = "some Pig relation"
b = group a by $3;
c = filter b by not IsEmpty(a); -- keep only groups whose bag of tuples is non-empty
What are you trying to accomplish?
03-11-2016
07:56 PM
3 Kudos
Sure - the following simple script uses 3 reducers on the last operation, so there will be 3 output files:
a = load 'something';
b = order a by $1 parallel 3;
store b into 'somewhere';
PARALLEL is not an option on STORE, but it is an option on a lot of other Pig operations.
03-11-2016
12:13 PM
3 Kudos
In case someone is searching for this with regard to the Hortonworks Certified Developer exam, the question was also asked here: https://community.hortonworks.com/questions/22439/where-do-i-get-references-for-piggybank.html
Many of the Pig operators have a PARALLEL option for specifying the number of reducers, which also determines the number of output files. For the purposes of the certification exam, using PARALLEL is all you need to accomplish this task, and it is much simpler than trying to register the piggybank and use a special output class.
03-11-2016
12:09 PM
2 Kudos
Many of the Pig operators have a PARALLEL option for specifying the number of reducers, which also determines the number of output files. For the purposes of the practice exam and the real exam, using PARALLEL is all you need to accomplish this task.
03-08-2016
01:48 PM
1 Kudo
@Gurjinder Singh - to be fully prepared for the exam, you should be able to perform all of the exam objectives listed here: http://hortonworks.com/training/class/hdp-certified-developer-hdpcd-exam/
I highly recommend also taking the practice exam. The details are on that webpage above. It uses an environment similar to the real exam and contains tasks similar to the real exam tasks, and it's a great way to test your knowledge and see how ready you are. Keep in mind the practice exam does not cover every exam objective, so use it only as an aid to prepare. Like I said first - to be fully prepared, make sure you can perform all of the exam objectives listed on our website.
If you need coding experience and are looking for ideas, check out the tutorials on our website: http://hortonworks.com/tutorials/
03-05-2016
09:21 PM
2 Kudos
You ask a good question, so I tested it out. I just ran through the wizard to enable NN HA, and then I ran the following command from Step 15:
hive --config /etc/hive/conf.server --service metatool -listFSRoot
The FS root was not the old value (which the documentation says it will be); instead it was set to the correct value of my new Nameservice ID. I tested this using Ambari 1.7, so if you are using at least Ambari 1.7 you can ignore that last step. I have a hunch that the documentation page is out of date and just needs to be updated.
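For anyone on an older Ambari where the FS root does come back wrong, the same metatool can fix it - a sketch, where the nameservice ID and old NameNode URI below are placeholders for your own values:
# check what the metastore thinks the FS root is
hive --config /etc/hive/conf.server --service metatool -listFSRoot
# point it at the new nameservice (new location first, old location second)
hive --config /etc/hive/conf.server --service metatool -updateLocation hdfs://mycluster hdfs://oldnamenode.example.com:8020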
03-03-2016
01:16 PM
1 Kudo
@Gurjinder Singh Make sure you have a table in Hive named 'emp1' in the 'default' database.
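A quick way to verify from the shell - a sketch that assumes the hive CLI is on your PATH:
hive -e "USE default; SHOW TABLES LIKE 'emp1';"    # should print emp1 if the table exists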
03-03-2016
01:13 PM
2 Kudos
@Saurabh Singh
1. The real environment looks the same as the practice environment.
2. The tasks are displayed differently though - on the real exam, the tasks appear in a separate pop-up window that you can move around.
3. Each task will tell you where to save your answer. For example, Task 1 might tell you to save your script in a file named /home/horton/solutions/task1.
4. The answers are evaluated by the output. There are always different ways to solve problems in programming. It is not the code we are interested in - it's the result. You need to execute your code on the real exam, and it needs to generate the desired output described in the task instructions. (If a task has no output, it is immediately marked as wrong.)
5. Each task is worth 1 point and is either correct or wrong - there is no partial credit. The current HDPCD exam has 7 tasks, and you need to get 5 correct to pass. (This is subject to change at any time.)
03-02-2016
02:02 PM
1 Kudo
To fully prepare for an exam, I recommend two areas of focus:
1. You should be able to perform all of the exam objectives listed on our website.
2. You should attempt the corresponding practice exam.
Taking the practice exam alone is not enough - it is only meant to give you an idea of what the exam environment is like and what the tasks are like. To be fully prepared, you should be able to perform every task listed in the exam objectives.
Good idea! I like the idea of collaborating and sharing ideas for exam prep, as long as everyone is careful not to share actual exam details - which would be a violation of the candidate agreement. The best way to collaborate would be here on the Hortonworks Community Connection, tagging all posts with "HDPCA". I will start tagging exam posts as they come in.
03-01-2016
04:06 AM
3 Kudos
Good question. If you look at the current exam description, it uses Ambari 1.7, which does not have Views for Pig and Hive: http://hortonworks.com/training/class/hdp-certified-developer-hdpcd-exam/
However, the HDPCD exam is about to be updated to HDP 2.3 with Ambari 2.2, so anyone preparing for our exams - make sure you know which version of Ambari and HDP your exam will be on. This information will always be available on our website: http://hortonworks.com/training/certification/
That being said, your exam is graded by the output of your code, so it is irrelevant how you write the code. You can use any text editor and run the scripts in any way you choose.
02-24-2016
02:16 PM
1 Kudo
Hi @Raghunadh Chellinki - sorry about the confusion. The option for "EC2-Classic" is only available if your Amazon account is more than 2 years old. If you had to create a new AWS account, then you will have to launch your instance into a new VPC. Just make sure you enable "Auto-assign Public IP address" - then everything will work fine from there. Let me know if you have any other questions.
Thanks, Rich
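If you prefer the AWS CLI over the console wizard, the equivalent flag is --associate-public-ip-address - a sketch where the AMI, instance type, and subnet IDs are placeholders:
aws ec2 run-instances \
    --image-id ami-12345678 \
    --instance-type m4.large \
    --subnet-id subnet-abcd1234 \
    --associate-public-ip-address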
02-22-2016
06:25 PM
The tasks for the practice exam are part of the Amazon AMI. The details are here: http://hortonworks.com/wp-content/uploads/2015/02/HDPCD-PracticeExamGuide1.pdf
02-22-2016
06:03 PM
Please share your Sqoop command. Have you tried comparing it to the solution in the /home/horton/solutions folder?
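For reference, a generic Sqoop import has this shape - the connection string, credentials, table, and target directory below are placeholders, not the practice-exam values:
sqoop import \
    --connect jdbc:mysql://dbhost/dbname \
    --username someuser --password somepass \
    --table sometable \
    --target-dir /user/horton/sometable_output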
02-22-2016
05:51 PM
You do not run the Sqoop command as root on the namenode. Run the Sqoop command as the "horton" user on the Ubuntu client.
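A quick sketch, assuming you are currently logged in as root on the Ubuntu client:
su - horton          # switch to the horton user
sqoop version        # sanity-check that sqoop is on the PATH before running your import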