Created on 11-20-2017 05:57 AM - edited 09-16-2022 05:32 AM
Hi,
I have an 8-node cluster. When I submit a job from the edge node (the Pi example program), it runs locally with the LocalJobRunner instead of on the cluster:
hadoop jar /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/jars/hadoop-examples.jar pi 10 10
Number of Maps = 10
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
17/11/20 08:47:57 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
17/11/20 08:47:57 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
17/11/20 08:47:58 INFO input.FileInputFormat: Total input paths to process : 10
17/11/20 08:47:58 INFO mapreduce.JobSubmitter: number of splits:10
17/11/20 08:47:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local635221628_0001
17/11/20 08:47:58 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/11/20 08:47:58 INFO mapreduce.Job: Running job: job_local635221628_0001
17/11/20 08:47:58 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/11/20 08:47:58 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/11/20 08:47:58 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/11/20 08:47:58 INFO mapred.LocalJobRunner: Waiting for map tasks
17/11/20 08:47:58 INFO mapred.LocalJobRunner: Starting task: attempt_local635221628_0001_m_000000_0
17/11/20 08:47:58 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/11/20 08:47:58 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/11/20 08:47:58 INFO mapred.MapTask: Processing split: hdfs://nameservice-ha/user/hduser/QuasiMonteCarlo_1511185676373_1845096796/in/part0:0+118
The job executes successfully, but the job id (job_localxxx) cannot be tracked in the ResourceManager web UI.
When I run the same job on any other node (NameNode or worker node), a proper job id is created, which is then visible in the ResourceManager web UI.
I also noticed that running mapred job -list on the edge node throws the error below:
mapred job -list
17/11/20 08:52:30 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
17/11/20 08:52:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" java.lang.NullPointerException
at org.apache.hadoop.mapreduce.tools.CLI.listJobs(CLI.java:604)
at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:382)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1269)
And when I run the following, it keeps retrying the ResourceManager connection:
yarn application -list
17/11/20 08:52:59 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/11/20 08:53:00 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
17/11/20 08:53:01 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
17/11/20 08:53:02 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
17/11/20 08:53:03 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
These commands work fine on the other nodes. I have the Oozie service installed, and the ResourceManager address is set to port 8032.
Can someone tell me what went wrong, and how I can fix this issue?
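For reference, the job_local id together with the client falling back to 0.0.0.0:8032 usually means the edge node is missing its YARN client configuration, so MapReduce defaults to local mode. A minimal sketch of the properties involved, assuming the ResourceManager runs on a host named rmhost (a placeholder, substitute your actual hostname; on CDH these files normally come from the deployed client configuration rather than being hand-edited):

```xml
<!-- mapred-site.xml on the edge node: submit jobs to YARN instead of the LocalJobRunner -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- yarn-site.xml on the edge node: point clients at the real ResourceManager -->
<!-- rmhost is a hypothetical hostname for illustration -->
<property>
  <name>yarn.resourcemanager.address</name>
  <value>rmhost:8032</value>
</property>
```

With these absent, the client uses the defaults (framework "local", ResourceManager 0.0.0.0:8032), which matches both symptoms above.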
Created 11-20-2017 05:18 PM
Created on 11-20-2017 09:27 PM - edited 11-20-2017 10:16 PM
Thanks a lot. This resolved the issue. :)
I have one more question. If I hit a Java heap size issue like
Caused by: java.lang.OutOfMemoryError: Java heap space when running a MapReduce job, how do I increase the Java heap size at runtime? Does "-Dmapreduce.map.java.opts=-Xmx2048m" actually do anything here? I didn't see any change. Could you please advise the best way to increase the Java heap size? Thanks in advance.
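For context, -D generic options only take effect when the driver parses them via ToolRunner/GenericOptionsParser, and they must appear before the program's own arguments (the pi example does support this). The JVM heap (-Xmx) also has to fit inside the YARN container size, so the corresponding memory.mb settings often need raising together with it. A sketch of the invocation, with example values that are not tuned for any particular cluster:

```shell
# Generic -D options must come between the program name and its arguments.
# Heap (java.opts) should stay comfortably below the container size (memory.mb).
hadoop jar /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/jars/hadoop-examples.jar pi \
  -Dmapreduce.map.java.opts=-Xmx2048m \
  -Dmapreduce.reduce.java.opts=-Xmx2048m \
  -Dmapreduce.map.memory.mb=2560 \
  -Dmapreduce.reduce.memory.mb=2560 \
  10 10
```

If the -D flags are placed after the application arguments, or the driver does not use ToolRunner, they are silently ignored, which would explain seeing no change.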