MapReduce timeout

Contributor

2016-02-14 22:40:05,909 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf

2016-02-14 22:40:06,618 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2016-02-14 22:40:06,760 - HdfsResource['/user/ambari-qa/mapredsmokeoutput'] {'security_enabled': True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab', 'default_fs': 'hdfs://HDPCA', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs-HDPCA@EXAMPLE.COM', 'user': 'hdfs', 'action': ['delete_on_execute'], 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory'}
2016-02-14 22:40:06,860 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-HDPCA@EXAMPLE.COM'] {'user': 'hdfs'}
2016-02-14 22:40:11,788 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx0.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpZmSrzL 2>/tmp/tmpxqBP9F''] {'quiet': False}
2016-02-14 22:40:15,601 - call returned (0, '')
2016-02-14 22:40:15,603 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx1.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpWcmtSE 2>/tmp/tmp8ueZHF''] {'quiet': False}
2016-02-14 22:40:19,015 - call returned (0, '')
2016-02-14 22:40:19,017 - NameNode HA states: active_namenodes = [(u'nn1', 'lnx0.localdomain.com:50070')], standby_namenodes = [(u'nn2', 'lnx1.localdomain.com:50070')], unknown_namenodes = []
2016-02-14 22:40:19,018 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx0.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpEDIPa1 2>/tmp/tmp1Xt3Yx''] {'quiet': False}
2016-02-14 22:40:22,856 - call returned (0, '')
2016-02-14 22:40:22,858 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx1.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpKQmmZw 2>/tmp/tmpJ6YFEL''] {'quiet': False}
2016-02-14 22:40:26,162 - call returned (0, '')
2016-02-14 22:40:26,164 - NameNode HA states: active_namenodes = [(u'nn1', 'lnx0.localdomain.com:50070')], standby_namenodes = [(u'nn2', 'lnx1.localdomain.com:50070')], unknown_namenodes = []
2016-02-14 22:40:26,167 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET --negotiate -u : '"'"'http://lnx0.localdomain.com:50070/webhdfs/v1/user/ambari-qa/mapredsmokeoutput?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmpC3nS44 2>/tmp/tmpYhq8nH''] {'logoutput': None, 'quiet': False}
2016-02-14 22:40:29,885 - call returned (0, '')
2016-02-14 22:40:30,159 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X DELETE --negotiate -u : '"'"'http://lnx0.localdomain.com:50070/webhdfs/v1/user/ambari-qa/mapredsmokeoutput?op=DELETE&user.name=hdfs&recursive=True'"'"' 1>/tmp/tmpKxdOxa 2>/tmp/tmpNflsmo''] {'logoutput': None, 'quiet': False}
2016-02-14 22:40:34,695 - call returned (0, '')
2016-02-14 22:40:34,697 - HdfsResource['/user/ambari-qa/mapredsmokeinput'] {'security_enabled': True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab', 'source': '/etc/passwd', 'default_fs': 'hdfs://HDPCA', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs-HDPCA@EXAMPLE.COM', 'user': 'hdfs', 'action': ['create_on_execute'], 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'file'}
2016-02-14 22:40:34,699 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-HDPCA@EXAMPLE.COM'] {'user': 'hdfs'}
2016-02-14 22:40:36,030 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx0.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpWi075d 2>/tmp/tmpfGORFu''] {'quiet': False}
2016-02-14 22:40:39,946 - call returned (0, '')
2016-02-14 22:40:39,948 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx1.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpVeHNPf 2>/tmp/tmpXbZTBA''] {'quiet': False}
2016-02-14 22:40:41,300 - call returned (0, '')
2016-02-14 22:40:41,302 - NameNode HA states: active_namenodes = [(u'nn1', 'lnx0.localdomain.com:50070')], standby_namenodes = [(u'nn2', 'lnx1.localdomain.com:50070')], unknown_namenodes = []
2016-02-14 22:40:41,303 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx0.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmp_2srqJ 2>/tmp/tmpaBMZT6''] {'quiet': False}
2016-02-14 22:40:41,588 - call returned (0, '')
2016-02-14 22:40:41,591 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://lnx1.localdomain.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpzAgUPP 2>/tmp/tmp_JVFtg''] {'quiet': False}
2016-02-14 22:40:41,859 - call returned (0, '')
2016-02-14 22:40:41,861 - NameNode HA states: active_namenodes = [(u'nn1', 'lnx0.localdomain.com:50070')], standby_namenodes = [(u'nn2', 'lnx1.localdomain.com:50070')], unknown_namenodes = []
2016-02-14 22:40:41,867 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET --negotiate -u : '"'"'http://lnx0.localdomain.com:50070/webhdfs/v1/user/ambari-qa/mapredsmokeinput?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmp1NN7xj 2>/tmp/tmpdtdIhc''] {'logoutput': None, 'quiet': False}
2016-02-14 22:40:42,493 - call returned (0, '')
2016-02-14 22:40:42,494 - DFS file /user/ambari-qa/mapredsmokeinput is identical to /etc/passwd, skipping the copying
2016-02-14 22:40:42,495 - HdfsResource[None] {'security_enabled': True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab', 'default_fs': 'hdfs://HDPCA', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs-HDPCA@EXAMPLE.COM', 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf'}
2016-02-14 22:40:42,495 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/smokeuser.headless.keytab ambari-qa-HDPCA@EXAMPLE.COM;'] {'user': 'ambari-qa'}
2016-02-14 22:40:42,778 - ExecuteHadoop['jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar wordcount /user/ambari-qa/mapredsmokeinput /user/ambari-qa/mapredsmokeoutput'] {'bin_dir': '/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin:/usr/hdp/current/hadoop-yarn-client/bin', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'logoutput': True, 'try_sleep': 5, 'tries': 1, 'user': 'ambari-qa'}
2016-02-14 22:40:42,881 - Execute['hadoop --config /usr/hdp/current/hadoop-client/conf jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar wordcount /user/ambari-qa/mapredsmokeinput /user/ambari-qa/mapredsmokeoutput'] {'logoutput': True, 'try_sleep': 5, 'environment': {}, 'tries': 1, 'user': 'ambari-qa', 'path': ['/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin:/usr/hdp/current/hadoop-yarn-client/bin']}
WARNING: Use "yarn jar" to launch YARN applications.
16/02/14 22:42:50 INFO impl.TimelineClientImpl: Timeline service address: http://lnx1.localdomain.com:8188/ws/v1/timeline/
16/02/14 22:42:51 INFO client.RMProxy: Connecting to ResourceManager at Lnx1.localdomain.com/192.168.122.40:8050
16/02/14 22:42:53 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 599 for ambari-qa on ha-hdfs:HDPCA
16/02/14 22:42:54 INFO security.TokenCache: Got dt for hdfs://HDPCA; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:HDPCA, Ident: (HDFS_DELEGATION_TOKEN token 599 for ambari-qa)
16/02/14 22:42:54 WARN token.Token: Cannot find class for token kind kms-dt
16/02/14 22:42:54 INFO security.TokenCache: Got dt for hdfs://HDPCA; Kind: kms-dt, Service: 192.168.0.102:9292, Ident: 00 0f 61 6d 62 61 72 69 2d 71 61 2d 48 44 50 43 41 02 72 6d 00 8a 01 52 e3 06 19 54 8a 01 53 07 12 9d 54 04 02
16/02/14 22:43:03 INFO input.FileInputFormat: Total input paths to process : 1
16/02/14 22:43:06 INFO mapreduce.JobSubmitter: number of splits:1
16/02/14 22:43:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1455503320604_0004
16/02/14 22:43:10 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:HDPCA, Ident: (HDFS_DELEGATION_TOKEN token 599 for ambari-qa)
16/02/14 22:43:10 WARN token.Token: Cannot find class for token kind kms-dt
16/02/14 22:43:10 WARN token.Token: Cannot find class for token kind kms-dt
Kind: kms-dt, Service: 192.168.0.102:9292, Ident: 00 0f 61 6d 62 61 72 69 2d 71 61 2d 48 44 50 43 41 02 72 6d 00 8a 01 52 e3 06 19 54 8a 01 53 07 12 9d 54 04 02
16/02/14 22:43:19 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1455503320604_0004 is still in NEW
16/02/14 22:43:20 INFO impl.YarnClientImpl: Submitted application application_1455503320604_0004
16/02/14 22:43:21 INFO mapreduce.Job: The url to track the job: http://Lnx1.localdomain.com:8088/proxy/application_1455503320604_0004/
16/02/14 22:43:21 INFO mapreduce.Job: Running job: job_1455503320604_0004
1 ACCEPTED SOLUTION

Contributor

I finally got the solution.

A few failed MR jobs were eating up the cluster's resources. After they were killed, subsequent jobs ran smoothly.
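
For anyone hitting the same symptom, here is a rough sketch of how applications that are holding or waiting for resources might be listed before killing them. `yarn application -list -appStates` and `yarn application -kill` are standard YARN CLI commands; the helper names `extract_app_ids` and `list_app_ids` are hypothetical and only there for illustration, and the default state filter is an assumption about what "stuck" means on your cluster.

```shell
# Keep only the first column of rows that start with an application ID,
# dropping the summary and header lines printed by "yarn application -list".
extract_app_ids() {
  awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Print the IDs of applications in the given YARN states
# (defaults to ACCEPTED,RUNNING, which is an assumption, adjust as needed).
list_app_ids() {
  yarn application -list -appStates "${1:-ACCEPTED,RUNNING}" 2>/dev/null | extract_app_ids
}

# Review the list first, then kill a stuck application explicitly:
#   list_app_ids
#   yarn application -kill <application ID from the list>
```

Killing is left as a manual step on purpose, so that a healthy job is not removed by accident.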


12 REPLIES

Master Mentor
@wei yang

I don't see anything in the logs.

yarn logs -applicationId application_1455503320604_0004

What do you see in the output?

Contributor

Please check the attached output.

thx,

wei

output.txt

Master Mentor

@wei yang Are you exporting data from an encrypted zone?

yarn logs -applicationId application_1455551404320_0001

Contributor

No, I was exporting data from secured HDFS to MySQL. The cluster has NameNode HA enabled.

I killed the job myself. Please check the attached log.

application-1455503320604-0004.txt

Master Mentor

@wei yang I wonder if the job lost connectivity to the database given in --connect jdbc:mysql://Lnx0.localdomain.com/test

Why did you kill the application? Was it because of the following?

16/02/15 11:27:41 INFO mapreduce.Job: Task Id : attempt_1455551404320_0001_m_000002_0, Status : FAILED
AttemptID:attempt_1455551404320_0001_m_000002_0 Timed out after 300 secs
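
When the job output is long, a small filter can pull out just the attempts that hit the timeout. This is only a sketch: the function name `timed_out_attempts` and the file name `job-output.txt` are hypothetical, and the pattern assumes the line format shown in the excerpt above.

```shell
# Read saved job output on stdin and print "<attempt ID> <seconds>" for every
# attempt reported as timed out.
timed_out_attempts() {
  sed -n 's/^AttemptID:\(attempt_[0-9_a-z]*\) Timed out after \([0-9]*\) secs.*/\1 \2/p'
}

# Example usage (job-output.txt is a placeholder for your saved console output):
#   timed_out_attempts < job-output.txt
```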

Contributor

I re-ran the job this morning; it failed and terminated by itself. Here is the full log:

appattempt-1455551404320-0001-000001.txt

Thank you !

wei

Master Mentor

@wei yang Do you have access to support? In the meantime, what's the value of mapreduce.task.timeout?

Contributor

It's my test environment, so I don't have support access. mapreduce.task.timeout=300000

thx,
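
For reference, mapreduce.task.timeout is given in milliseconds, so the value above matches the "Timed out after 300 secs" message in the failed attempt. The snippet below does the conversion and shows, as a commented-out sketch, how the property could be overridden for a single run of the wordcount example from the smoke test. The value 600000 is an arbitrary illustration, not a recommendation, and a longer timeout would not by itself fix a job that is starved of resources.

```shell
# mapreduce.task.timeout is expressed in milliseconds.
timeout_ms=300000
timeout_secs=$((timeout_ms / 1000))
echo "mapreduce.task.timeout=${timeout_ms} ms = ${timeout_secs} secs"

# Per-job override via a generic -D option (illustrative value only):
#   yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar \
#     wordcount -D mapreduce.task.timeout=600000 \
#     /user/ambari-qa/mapredsmokeinput /user/ambari-qa/mapredsmokeoutput
```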

Contributor

It is HDP-2.3.4.0-3485