Expert Contributor
Posts: 252
Registered: ‎01-25-2017
Accepted Solution

mapred.submit.replication for jobs that has the Jobtracker as the submitted host

Hi,

 

I see some jobs in my cluster that were submitted via the JobTracker node.

 

On all DataNodes, mapred.submit.replication is set to 2. The JobTracker's mapred-site.xml had no mapred.submit.replication property, so I added it manually and restarted the JobTracker. However, in the job file of running jobs that have the JobTracker as the Submit Host, mapred.submit.replication is still 10 and not 2.

Posts: 1,524
Kudos: 265
Solutions: 232
Registered: ‎07-31-2013

Re: mapred.submit.replication for jobs that has the Jobtracker as the submitted host

That property is job-applied, not server controlled. Wherever you are
submitting your job from, the local or in-code configuration isn't loading
your custom value, so the default value would get used instead. An
application can usually discover your configs if the directory carrying the
config XML files is on the application's classpath. Read more at
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html
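
As a rough sketch of what that means in practice: the host that submits the job would need a mapred-site.xml on its classpath carrying the value you want. The value 2 is taken from your question; where the config directory lives on the submitting host is an assumption about your setup, so check what is actually on the application's classpath.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml in the config directory on the SUBMITTING host's classpath.
     Illustrative fragment only; merge it into the existing file if one is present. -->
<configuration>
  <property>
    <!-- Replication factor for job submission files (job jar, splits, etc.) -->
    <name>mapred.submit.replication</name>
    <value>2</value>
  </property>
</configuration>
```

If the submitting application does not load this file, it falls back to the built-in default, which would explain the 10 you are seeing.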

BTW you shouldn't be using MRv1 anymore; it's deprecated. Use YARN with its
MRv2 (although the above still holds there).
Backline Customer Operations Engineer
Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Re: mapred.submit.replication for jobs that has the Jobtracker as the submitted host

Hi,

 

Can I enforce this at the cluster level?

 

This is the coordinator job configuration for the running job; vlpr-mha01 acts as both JT and NN.

 

<configuration>
<property>
<name>jobType</name>
<value>rm</value>
</property>
<property>
<name>dwhType</name>
<value>da</value>
</property>
<property>
<name>oozie.coord.application.path</name>
<value>hdfs://vlpr-mha01:54310/liveperson/code/server_dataaccess_retention/lp-dataaccess-retention-1.0.0.1/sched/</value>
</property>
<property>
<name>recycleBinDir</name>
<value>hdfs://vlpr-mha01:54310/liveperson/data/server_dataaccess_retention/recycle_bin/</value>
</property>
<property>
<name>freq</name>
<value>1440</value>
</property>
<property>
<name>workflowAppUri</name>
<value>hdfs://vlpr-mha01:54310/liveperson/code/server_dataaccess_retention/lp-dataaccess-retention-1.0.0.1/sched/</value>
</property>
<property>
<name>start</name>
<value>2014-03-02T10:24Z</value>
</property>
<property>
<name>user.name</name>
<value>dataaccess</value>
</property>
<property>
<name>jobRoot</name>
<value>hdfs://vlpr-mha01:54310/liveperson/code/server_dataaccess_retention/lp-dataaccess-retention-1.0.0.1</value>
</property>
<property>
<name>workingOnDir</name>
<value>hdfs://vlpr-mha01:54310/liveperson/data/server_dataaccess_retention/recycle_bin/</value>
</property>
<property>
<name>oozie.libpath</name>
<value>hdfs://vlpr-mha01:54310/liveperson/code/server_dataaccess_retention/lp-dataaccess-retention-1.0.0.1/lib</value>
</property>
<property>
<name>nameNode</name>
<value>hdfs://vlpr-mha01:54310</value>
</property>
<property>
<name>end</name>
<value>2020-01-01T00:00Z</value>
</property>
<property>
<name>jobTracker</name>
<value>vlpr-mha01:54311</value>
</property>
</configuration>

 

 

This is an old cluster that should be dead in 6 months, so I'm trying not to make changes at the job level.

Posts: 1,524
Kudos: 265
Solutions: 232
Registered: ‎07-31-2013

Re: mapred.submit.replication for jobs that has the Jobtracker as the submitted host

You can pass a custom <configuration> section into your WF actions; most
action types load it automatically. If you use <java> actions within
the workflow, however, Oozie prepares a file that you need to load manually
in your code. This is described at
http://archive.cloudera.com/cdh5/cdh/5/oozie/WorkflowFunctionalSpec.html#a3.2.7_Java_Action

Replication for MR submit files (jars, etc.) is a client-side action; it
cannot be controlled by a central server.
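
For illustration, a per-action override in a workflow.xml might look roughly like the fragment below. The action name and the transition targets are hypothetical, ${jobTracker} and ${nameNode} are the parameters already present in your coordinator configuration, and the mapper/reducer settings are omitted. Whether the override takes effect depends on the action type and your Oozie version, so treat this as a sketch rather than a verified fix.

```xml
<!-- Hypothetical map-reduce action inside an existing workflow.xml -->
<action name="retention-mr">
  <map-reduce>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <property>
        <!-- Client-side setting applied when this action's job is submitted -->
        <name>mapred.submit.replication</name>
        <value>2</value>
      </property>
    </configuration>
  </map-reduce>
  <!-- "end" and "fail" are placeholder node names -->
  <ok to="end"/>
  <error to="fail"/>
</action>
```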
Backline Customer Operations Engineer
Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Re: mapred.submit.replication for jobs that has the Jobtracker as the submitted host

I want to find where I can stop jobs from being submitted through a specific host.

 

I see that the Oozie launcher for the job is submitted from slpr-mha01, which is the JT, NN and Oozie node, but the job itself is submitted through a DN.

 

The jobs are scheduled using Oozie.

Expert Contributor
Posts: 252
Registered: ‎01-25-2017

Re: mapred.submit.replication for jobs that has the Jobtracker as the submitted host

I managed to solve this by adding a mapred-site.xml on the Oozie server under /etc/hadoop/conf and overriding the submit replication there.
