Member since: 01-16-2014
Posts: 336
Kudos Received: 43
Solutions: 31
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3277 | 12-20-2017 08:26 PM |
| | 3292 | 03-09-2017 03:47 PM |
| | 2749 | 11-18-2016 09:00 AM |
| | 4754 | 05-18-2016 08:29 PM |
| | 3680 | 02-29-2016 01:14 AM |
03-04-2019 06:53 PM
vmem checks have been disabled in CDH almost since their introduction. The vmem check is not stable and is highly dependent on the Linux version and distro. If you run CDH you are already running with it disabled. Wilfred
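For reference, this is the property that controls the check in yarn-site.xml (a minimal sketch; CDH already ships with it turned off, so you normally do not need to touch it):

```xml
<!-- yarn-site.xml: NodeManager virtual-memory check.
     Upstream Hadoop defaults this to true; CDH runs with it disabled. -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```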
07-24-2018 08:08 PM
Hi Prav, These types of errors are network / connection related. It might have been a slow response from the service on the remote side rather than a congested network. It could be a lot of work if you want to track it down. Looking at the DN logs on the remote side might give you some insight. It is not really a YARN issue; the HDFS community will be in a better position to help you. Wilfred
12-20-2017 08:26 PM
1 Kudo
There is no alternative for this issue at the moment. Some people have tried to work around it by hacking the Oozie shared libs, but that has not really been successful. For now I would recommend that you stick with Spark 1 in Oozie and Spark 2 from the command line. Wilfred
12-20-2017 04:55 AM
I think you wanted to ask: can we run Spark 1 and Spark 2 jobs in the same cluster?
Simple answer: yes, you can have them both installed, see the docs. You cannot have different minor versions of the same major version in one cluster (e.g. 1.5 and 1.6, or 2.1 and 2.2).
Wilfred
12-20-2017 04:48 AM
This looks like a duplicate of what I already answered in: this thread
12-19-2017 11:29 PM
The placement rules are executed as the original user. That means the job will be added to the correct pool. The end user cannot override that because the mapred.job.queuename property should be blacklisted. The hive user should never be accessible to any end user; it is a service principal, and allowing it to be used by end users will give you far bigger issues. I thus do not see how adding hive as a user to the ACL breaks it. Wilfred
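As a rough sketch (not your exact config), a Fair Scheduler allocation file that places each job in a pool named after the submitting user could look like this; the rule names are standard Fair Scheduler placement rules, everything else is illustrative:

```xml
<!-- fair-scheduler.xml sketch: placement rules run as the original submitting user. -->
<allocations>
  <queuePlacementPolicy>
    <!-- place the job in a pool named after the submitting user -->
    <rule name="user" create="true"/>
    <!-- anything that does not match falls back to the default pool -->
    <rule name="default"/>
  </queuePlacementPolicy>
</allocations>
```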
12-08-2017 05:46 AM
What they have done is turn on partial (rolling) log aggregation via the setting yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds. That will allow you to grab some of the logs using the command line. We do not support this in CDH, even though CDH contains the exact same code that is available upstream. We have tested the setting and found that it breaks log access via the different UIs in multiple ways. You get a working command line in 99% of the cases, but when you try to use the RM or AM UIs it breaks almost every time, and the way it breaks changes over time for the same application. That is not a feature we can support in its current state. Wilfred
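For reference, the setting in question would go in yarn-site.xml roughly like this (the value is just an example; as said above, we do not support it in CDH):

```xml
<!-- yarn-site.xml: how often the NodeManager wakes up to upload logs for
     running applications. -1 (the default) disables rolling aggregation. -->
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```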
12-08-2017 05:29 AM
You will need to shade the Guava that you use in your application. There is no way to replace the Guava that is part of CDH with a later release; it will break a number of things. From the previous message it looks like it was not shaded correctly. Wilfred
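A rough sketch of what shading could look like with the maven-shade-plugin; the shaded package prefix myapp.shaded is just an example, adjust it to your own build:

```xml
<!-- pom.xml fragment: relocate the Guava classes bundled in your job jar so they
     cannot clash with the Guava version that ships with CDH. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <!-- "myapp.shaded" is an illustrative prefix -->
            <shadedPattern>myapp.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```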
12-08-2017 03:50 AM
You must be on 5.8.0 or later for both CDH and CM. Wilfred
12-08-2017 03:44 AM
Setting the memory to 0 means that you are not scheduling on memory any more, and it also turns off the container size checks. This is not the right way to fix the issue; it could cause all kinds of problems on the NMs. Your AM is using more than the container allows, so increase yarn.app.mapreduce.am.resource.mb from 1 GB to 1.5 GB or 2 GB. When you increase the container size, use increments that match the scheduler increment you have configured, and run the application again. Wilfred
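For illustration, assuming a 512 MB scheduler increment, raising the AM container could look like this (the 2048 value is just an example; pick a multiple of your own increment):

```xml
<!-- mapred-site.xml or per-job configuration: raise the MapReduce AM container
     size above the default so the AM no longer exceeds its container limit. -->
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>2048</value>
</property>
```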