How to put several options in mapreduce.map.java.opts

Explorer

I am trying to add more than one option to mapreduce.map.java.opts without success. E.g.:

yarn jar My.jar Myclass.class -Dmapreduce.map.java.opts=-Xss5M -agentlib:jdwp=transport=dt_socket,server=y,address=8787 -Dmapreduce.map.memory.mb=6000 ...

followed by more options and maybe more than 2 arguments for mapreduce.map.java.opts.

How do I have to write this so that -Xss and -agentlib in my example are both treated as arguments for mapreduce.map.java.opts and not for the yarn jar job?

I tried everything I could think of - single quotes, double quotes, ...

Thanks for every hint,

Eddie

1 ACCEPTED SOLUTION

Expert Contributor
@Eddie

Generally, enclosing the value of mapreduce.map.java.opts in quotes works for all the example jobs. The following command, which runs the pi job, worked:

yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi -Dmapreduce.map.java.opts="-XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCDateStamps -XX:SurvivorRatio=8" 1 1

I see that your command uses your own class, Myclass.class. The example pi job works because it parses the command-line options using org.apache.hadoop.util.GenericOptionsParser, which is what recognizes generic options such as -D property=value.

Your Myclass.class should also use org.apache.hadoop.util.GenericOptionsParser to parse the command-line options for this to work properly.
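
The effect of the quotes can be checked in the shell alone, without Hadoop. The sketch below uses a hypothetical helper, show_args, that prints each argument it receives on its own line; it is not part of Hadoop or YARN, and the option values are simply the ones from the question.

```shell
# show_args is a hypothetical helper (not a Hadoop or YARN command):
# it prints every argument it receives on its own line, in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Unquoted: the shell splits at the space, so -agentlib:... becomes a
# separate argument instead of part of the mapreduce.map.java.opts value.
unquoted=$(show_args -Dmapreduce.map.java.opts=-Xss5M -agentlib:jdwp=transport=dt_socket,server=y,address=8787 -Dmapreduce.map.memory.mb=6000)

# Quoted: the whole value, including the space, stays in one argument.
quoted=$(show_args -Dmapreduce.map.java.opts="-Xss5M -agentlib:jdwp=transport=dt_socket,server=y,address=8787" -Dmapreduce.map.memory.mb=6000)

printf 'unquoted:\n%s\n\nquoted:\n%s\n' "$unquoted" "$quoted"
```

Quoting only guarantees that the value reaches the program as a single argument. The job still has to interpret that argument as a generic option, which is the part GenericOptionsParser takes care of.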

REPLIES

Hi @Eddie

Try it like this:

yarn jar My.jar Myclass.class -Dmapreduce.map.java.opts="-Xss5M" \
-Dmapreduce.map.memory.mb=6000 \
etc

I think the agentlib setting should be set through YARN_OPTS, so you could append it in Ambari -> Yarn -> Config -> Advanced Yarn env. Near the bottom of the yarn-env template you'll notice YARN_OPTS being set in several places. You can add this:

YARN_OPTS="$YARN_OPTS -agentlib:jdwp=transport=dt_socket,server=y,address=8787"

I'm not sure whether the agentlib can be set at submit time on the CLI; it didn't work for me, so I had to add the line above to the yarn-env template. I hope this helps.

Explorer

Putting the agentlib in YARN_OPTS instead of mapreduce.map.java.opts didn't work for me; debugging is not possible that way.

I also have other use cases where it would be nice to be able to add more properties to mapreduce.map.java.opts.

Thanks,

Eddie

Explorer

The accepted solution does indeed work.

With your example I realized that my problem was not how to pass the parameters to mapreduce.map.java.opts. I was supplying the complete opts string through a variable:

export MAPPER_OPTS="-Dmapreduce.map.java.opts='-Xss5M -agentlib:jdwp=transport=dt_socket,server=y,address=8787'" (I tried different combinations of single quotes, double quotes, and escaping with backslashes)

Expanding this variable in my yarn command caused the problems. (Though having this in a variable would be easier, because it changes often. 🙂 )
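
One way to keep the frequently changing options in a variable is to store only the value and build the -D argument at the point of use, inside double quotes. The sketch below is an assumption about the failing setup, not a reproduction of it: count_args is a hypothetical helper that reports how many arguments it received, and MAP_JAVA_OPTS is an illustrative variable name.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() {
  echo "$#"
}

# The variable from the post: the single quotes are stored as literal
# characters; the shell does not re-parse them when the variable is expanded.
MAPPER_OPTS="-Dmapreduce.map.java.opts='-Xss5M -agentlib:jdwp=transport=dt_socket,server=y,address=8787'"

# Unquoted expansion splits at the space: two arguments, each carrying a stray quote.
n_split=$(count_args $MAPPER_OPTS)

# Alternative: keep only the value in the variable (MAP_JAVA_OPTS is an
# illustrative name) and double-quote the argument where it is used.
MAP_JAVA_OPTS="-Xss5M -agentlib:jdwp=transport=dt_socket,server=y,address=8787"
n_joined=$(count_args "-Dmapreduce.map.java.opts=$MAP_JAVA_OPTS")

echo "unquoted variable: $n_split arguments"
echo "quoted at use: $n_joined argument"
```

With this approach the yarn command line would carry "-Dmapreduce.map.java.opts=$MAP_JAVA_OPTS" where $MAPPER_OPTS used to be. Double-quoting the original variable as "$MAPPER_OPTS" would also yield a single argument, but the embedded single quotes would then be passed along as part of the value.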

Thanks for your help,

Eddie

Expert Contributor

Good to know you got it resolved. You can accept the answer if it helped. One more thing to note is that Java remote debugging doesn't work if more than one map container is launched on the same node. This is because each map container process will try to listen on the debug port 8787, and all but the first might fail to bind it.