difference between 'mapreduce.application.classpath' and 'yarn.application.classpath'
Labels: Apache YARN, MapReduce
Created on ‎02-12-2019 05:40 AM - edited ‎09-16-2022 07:09 AM
Hi All, this may be a trivial question for many, but could you explain the difference or relation between the classpaths defined in yarn.application.classpath and mapreduce.application.classpath? Does the latter override the former for MapReduce applications? There is also a variable, MR2_CLASSPATH, that is included by default in mapreduce.application.classpath. Where is it taken from? Is mapreduce.application.classpath relevant only for gateways from where applications are submitted to YARN?
Created ‎02-12-2019 07:44 AM
No. At least as of CDH 5.x, the two are additive. The
yarn.application.classpath value goes on first (adding Common, HDFS and
YARN), followed by mapreduce.application.classpath (adding just MR2).
The reason they are separate is tied to another feature (available in CM
6.x) that lets you supply all framework jars as an archive along with the
job, rather than relying on local, pre-installed locations on all worker
hosts, which can change at any time outside of a container's runtime.
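To make the "additive" ordering concrete, here is a minimal Python sketch. It is not the actual Hadoop code. The two property values are illustrative assumptions modelled on typical CDH-style settings, and the helper names are hypothetical; check your own yarn-site.xml and mapred-site.xml for the real values.

```python
# Illustrative values only (assumed, not taken from this thread);
# real defaults vary by distribution and version.
YARN_APPLICATION_CLASSPATH = (
    "$HADOOP_CLIENT_CONF_DIR,$HADOOP_COMMON_HOME/*,"
    "$HADOOP_HDFS_HOME/*,$HADOOP_YARN_HOME/*"
)
MAPREDUCE_APPLICATION_CLASSPATH = (
    "$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$MR2_CLASSPATH"
)


def split_entries(value):
    """Split a comma-separated property value into trimmed, non-empty entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def combined_classpath(yarn_value, mapreduce_value):
    """Hypothetical helper: YARN entries first, then the MapReduce entries."""
    return split_entries(yarn_value) + split_entries(mapreduce_value)


combined = combined_classpath(
    YARN_APPLICATION_CLASSPATH, MAPREDUCE_APPLICATION_CLASSPATH
)
for entry in combined:
    print(entry)
```

Neither list replaces the other. The MapReduce entries are simply appended after the Common, HDFS and YARN entries.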
> There is also a variable, MR2_CLASSPATH, that is included by default in
mapreduce.application.classpath. Where is it taken from?
This is exclusive to Cloudera Manager managed environments. It is a
reserved env-var name used to assist Parcels that may choose to supply some
jars as 'plugins' to an app or a service. All such env-vars are listed
here:
https://github.com/cloudera/cm_ext/wiki/Plugin-parcel-environment-variables.
In most cases you can ignore this env-var, as it is usually empty.
> Is mapreduce.application.classpath relevant only for gateways from
where applications are submitted to YARN?
No, the values are just variable names, and are not substituted at the
gateway. They are substituted only on the NodeManager when the prepared
container command/script actually executes. This lets you manage different
install paths on different worker hosts, where local environments point to
actual locations of jars.
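As a rough illustration of this late substitution, the Python sketch below expands the same unexpanded entries against two different host environments. The `expand` helper, the host names and the install paths are all hypothetical; on a real cluster the expansion is done by the shell that runs the container launch script on each NodeManager, not by code like this.

```python
import re


def expand(entry, env):
    """Replace each $NAME with env[NAME]; unset names become empty, as in a shell."""
    return re.sub(
        r"\$([A-Za-z_][A-Za-z0-9_]*)",
        lambda match: env.get(match.group(1), ""),
        entry,
    )


# The job carries only the unexpanded variable names.
entries = ["$HADOOP_MAPRED_HOME/*", "$HADOOP_MAPRED_HOME/lib/*", "$MR2_CLASSPATH"]

# Assumed per-host environments; MR2_CLASSPATH is left unset on both.
host_a_env = {"HADOOP_MAPRED_HOME": "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce"}
host_b_env = {"HADOOP_MAPRED_HOME": "/usr/lib/hadoop-mapreduce"}

for name, env in (("host A", host_a_env), ("host B", host_b_env)):
    print(name, [expand(entry, env) for entry in entries])
```

The same submitted values resolve to different jar locations on each host, and an unset MR2_CLASSPATH contributes nothing.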
Created ‎02-13-2019 03:55 AM
Hello @Harsh J,
thank you very much for your explanations. I asked the question starting from some assumptions that turned out to be false. Thanks for giving the right answers. One last question from my side: could you point me to the documentation that describes the CM 6.x feature you mentioned for supplying framework jars? It sounds interesting.
Created ‎02-13-2019 08:12 PM
upgrades (when the job jars are part of the job exclusively, changes to
locally installed binaries will not affect it during upgrades). A release
note item is documented here for this:
https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cm_600_new_features.html...
Created ‎02-14-2019 12:19 AM
Great! Thank you very much!
