Member since: 01-16-2014
Posts: 336
Kudos Received: 43
Solutions: 31
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3277 | 12-20-2017 08:26 PM |
| | 3292 | 03-09-2017 03:47 PM |
| | 2749 | 11-18-2016 09:00 AM |
| | 4754 | 05-18-2016 08:29 PM |
| | 3680 | 02-29-2016 01:14 AM |
03-04-2019 06:53 PM
vmem checks have been disabled in CDH almost since their introduction. The vmem check is not stable and is highly dependent on the Linux version and distro. If you run CDH you are already running with it disabled. Wilfred
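For reference, this is the property that controls the check in yarn-site.xml (a minimal sketch; CDH already ships with it turned off, so you normally do not need to touch it):

```xml
<!-- yarn-site.xml: NodeManager virtual-memory check.
     Upstream Hadoop defaults this to true; CDH runs with it disabled. -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```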
07-24-2018 08:08 PM
Hi Prav, These types of errors are network / connection related. It might have been a slow response from the service on the remote side rather than a congested network. It could be a lot of work if you want to track it down. Looking at the DN logs on the remote side might give you some insight. It is not really a YARN issue; the HDFS community will be in a better position to help you. Wilfred
12-20-2017 08:26 PM
1 Kudo
There is no alternative for this issue at the moment. Some people have tried to work around it by hacking the Oozie shared libs, but that has not really been successful. For now I would recommend that you stick with Spark 1 in Oozie and Spark 2 from the command line. Wilfred
12-20-2017 04:55 AM
I think you wanted to ask: can we run Spark 1 and Spark 2 jobs in the same cluster?
Simple answer: yes, you can have them both installed, see the docs. You cannot have different minor versions of the same major version in one cluster (e.g. 1.5 and 1.6, or 2.1 and 2.2).
Wilfred
12-20-2017 04:48 AM
This looks like a duplicate of what I already answered in: this thread
12-19-2017 11:29 PM
The placement rules are executed as the original user. That means the job will be added to the correct pool. The end user cannot override that because the mapred.job.queuename property should be blacklisted. The hive user should never be accessible to any end user; it is a service principal, and allowing it to be used by end users will give you far bigger issues. I thus do not see how adding hive as a user to the ACL breaks it. Wilfred
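As a rough sketch (not your exact config), a Fair Scheduler allocation file that places each job in a pool named after the submitting user could look like this; the rule names are standard Fair Scheduler placement rules, everything else is illustrative:

```xml
<!-- fair-scheduler.xml sketch: placement rules run as the original submitting user. -->
<allocations>
  <queuePlacementPolicy>
    <!-- place the job in a pool named after the submitting user -->
    <rule name="user" create="true"/>
    <!-- anything that does not match falls back to the default pool -->
    <rule name="default"/>
  </queuePlacementPolicy>
</allocations>
```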
12-08-2017 05:46 AM
What they have done is turn on partial (rolling) log aggregation via the setting yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds. That will allow you to grab some of the logs using the command line. We do not support this in CDH, even though CDH contains the exact same code that is available upstream. We have tested the setting and found that it breaks log access via the different UIs in multiple ways. You get a working command line in 99% of the cases, but when you try to use the RM or AM UIs it breaks almost every time, and the way it breaks changes over time for the same application. That is not a feature we can support in its current state. Wilfred
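For reference, the setting in question would go in yarn-site.xml roughly like this (the value is just an example; as said above, we do not support it in CDH):

```xml
<!-- yarn-site.xml: how often the NodeManager wakes up to upload logs for
     running applications. -1 (the default) disables rolling aggregation. -->
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```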
12-08-2017 05:29 AM
You will need to shade the Guava that you use in your application. There is no way to replace the Guava that is part of CDH with a later release; it will break a number of things. From the previous message it looks like it was not shaded correctly. Wilfred
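A rough sketch of what shading could look like with the maven-shade-plugin; the shaded package prefix myapp.shaded is just an example, adjust it to your own build:

```xml
<!-- pom.xml fragment: relocate the Guava classes bundled in your job jar so they
     cannot clash with the Guava version that ships with CDH. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <!-- "myapp.shaded" is an illustrative prefix -->
            <shadedPattern>myapp.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```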
12-08-2017 03:50 AM
You must be on 5.8.0 or later for both CDH and CM. Wilfred
12-08-2017 03:44 AM
Setting the memory to 0 means that you are not scheduling on memory any more, and it also turns off the container size checks. This is not the right way to fix the issue; it could cause all kinds of problems on the NMs. Your AM is using more than the container allows, so increase yarn.app.mapreduce.am.resource.mb from 1 GB to 1.5 GB or 2 GB. When you increase the container size, use increments that match the scheduler increment you have configured, and run the application again. Wilfred
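For illustration, assuming a 512 MB scheduler increment, raising the AM container could look like this (the 2048 value is just an example; pick a multiple of your own increment):

```xml
<!-- mapred-site.xml or per-job configuration: raise the MapReduce AM container
     size above the default so the AM no longer exceeds its container limit. -->
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>2048</value>
</property>
```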