
CDH 5.01 Integration with Datameer Version - Datameer-4.1.0-cdh-5.0.0-mr2


Contributor

Hello All,

 

I am having trouble getting CDH 5.0.1 and Datameer 4.1.0 working with MRv2. I was able to configure the integration and can see the nodes in the Datameer web UI, but when I run the "Health Check Test" from the Datameer web UI, it fails with the error below. I suspect a Java version mismatch between CDH 5.0.1 and Datameer. Could you please have a look and suggest how to troubleshoot this issue?

 

 

1.) Setting up a new cluster on Amazon AWS using Cloudera CDH 5.0.1 and MRv2

2.) Using Datameer version Datameer-4.1.0-cdh-5.0.0-mr2

3.) Using Kerberos authentication on the Hadoop cluster

4.) I can see all the Hadoop nodes from the Datameer admin page

5.) When I run the “Cluster Health Test”, it fails with the message shown below

6.) Logged in directly on the Hadoop Unix box as user datameer, I can run Pig and sample MRv2 jobs successfully

bash-4.1$

[anonymous] INFO [2014-06-04 19:48:42.585] [MrPlanRunnerThread-0] (JobSubmitter.java:479) - Submitting tokens for job: job_1401918159543_0003

[anonymous] INFO [2014-06-04 19:48:42.586] [MrPlanRunnerThread-0] (JobSubmitter.java:481) - Kind: HDFS_DELEGATION_TOKEN, Service: 10.22.10.113:8020, Ident: (HDFS_DELEGATION_TOKEN token 88 for datameer)

[anonymous] INFO [2014-06-04 19:48:42.830] [MrPlanRunnerThread-0] (YARNRunner.java:369) - Job jar is not present. Not adding any jar to the list of resources.

[anonymous] INFO [2014-06-04 19:48:43.296] [MrPlanRunnerThread-0] (YarnClientImpl.java:167) - Submitted application application_1401918159543_0003

[anonymous] INFO [2014-06-04 19:48:43.329] [MrPlanRunnerThread-0] (Job.java:1299) - The url to track the job: <URL>

 

[anonymous] INFO [2014-06-04 19:48:43.331] [MrPlanRunnerThread-0] (HadoopMrJobClient.java:57) - Submitted mr-job with name 'System job (16): check_system_job#check_system_job' and id 'job_1401918159543_0003'

[anonymous] INFO [2014-06-04 19:48:43.331] [MrPlanRunnerThread-0] (HadoopMrJobClient.java:89) - Waiting on completion of job: job_1401918159543_0003

[anonymous] INFO [2014-06-04 19:48:43.833] [MrPlanRunnerThread-0] (HadoopMrJobClient.java:105) - job_1401918159543_0003: map 0% reduce n/a (total 0%)

[anonymous] INFO [2014-06-04 19:48:44.355] [pool-6-thread-1] (HdfsUploader.java:42) - Push job-log to hdfs://10.22.10.113:8020/user/datameer/joblogs/16/001-job.log

[anonymous] INFO [2014-06-04 19:49:00.949] [MrPlanRunnerThread-0] (DelegateInputFormat.java:106) - Releasing splits (UUID: afabf0e9-ac5a-4b86-8b39-683d1af2d368) from cache, still cached split-arrays: 0

[anonymous] INFO [2014-06-04 19:49:01.129] [MrPlanRunnerThread-0] (MrPlanRunner.java:216) - Stopped MrPlanRunnerThread-0 with exception

[system] INFO [2014-06-04 19:49:01.129] [JobScheduler worker1-thread-1] (HadoopMrJobClient.java:250) - Setting cancel flag to true.

[system] INFO [2014-06-04 19:49:01.130] [JobScheduler worker1-thread-1] (MrPlanRunner.java:257) - -------------------------------------------

[system] INFO [2014-06-04 19:49:01.130] [JobScheduler worker1-thread-1] (MrPlanRunner.java:258) - executing postprocessing task for failed job

[system] INFO [2014-06-04 19:49:01.130] [JobScheduler worker1-thread-1] (MrPlanRunner.java:259) - -------------------------------------------

[system] INFO [2014-06-04 19:49:01.353] [JobScheduler worker1-thread-1] (MrPlanRunner.java:265) - Completed postprocessing: [0 sec], progress at 100

[system] INFO [2014-06-04 19:49:01.353] [JobScheduler worker1-thread-1] (MrPlanRunner.java:266) - -------------------------------------------

[system] INFO [2014-06-04 19:49:01.357] [JobScheduler worker1-thread-1] (DapJobCounter.java:192) - Job FAILURE with '0' mr-jobs and following counters:

[system] ERROR [2014-06-04 19:49:01.358] [JobScheduler worker1-thread-1] (DasJobCallable.java:139) - Job failed! Execution plan: digraph G {

1 [label = "MrInputNode{check_system_job-fakeInput} - 0 Bytes"];

2 [label = "MrMapNode{datameer.dap.common.job.system.ClusterValidationMapper@2af92c45}"];

3 [label = "MrOutputNode{check_system_job}"];

1 -> 2 [label = "REQUIRED_AS_MAPPER_INPUT"];

2 -> 3 [label = "PRODUCED_BY_MAPPER"];

}

java.util.concurrent.ExecutionException: java.lang.RuntimeException: failed to execute: MrJobNode{check_system_job, includes=[], m/r/m=1/0/0}

at datameer.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)

at datameer.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)

at datameer.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)

at datameer.dap.common.job.mr.plan.MrPlanRunner.internalGet(MrPlanRunner.java:135)

at datameer.dap.common.job.mr.plan.MrPlanRunner.get(MrPlanRunner.java:166)

at datameer.dap.common.job.mr.plan.MrPlanRunner.get(MrPlanRunner.java:42)

at datameer.dap.common.job.DasJobCallable.call(DasJobCallable.java:108)

at datameer.dap.common.job.DasJobCallable.call(DasJobCallable.java:51)

at datameer.dap.conductor.job.JobSchedulerJob$2.call(JobSchedulerJob.java:121)

at datameer.dap.conductor.job.JobSchedulerJob$2.call(JobSchedulerJob.java:103)

at datameer.dap.conductor.webapp.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:99)

at datameer.dap.conductor.job.JobSchedulerJob.call(JobSchedulerJob.java:103)

at datameer.dap.conductor.job.JobSchedulerJob.call(JobSchedulerJob.java:38)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: java.lang.RuntimeException: failed to execute: MrJobNode{check_system_job, includes=[], m/r/m=1/0/0}

at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:38)

at datameer.dap.common.job.mr.plan.execution.NodeExecutor.execute(NodeExecutor.java:34)

at datameer.dap.common.job.mr.plan.MrPlanRunner$ExecuteMrJobRunnable.run(MrPlanRunner.java:204)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

... 4 more

Caused by: java.io.IOException: Job job_1401918159543_0003 failed! Failure info: Application application_1401918159543_0003 failed 2 times due to AM Container for appattempt_1401918159543_0003_000002 exited with exitCode: 1 due to: Exception from container-launch:

org.apache.hadoop.util.Shell$ExitCodeException:

at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)

at org.apache.hadoop.util.Shell.run(Shell.java:418)

at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)

at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:279)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

main : command provided 1

main : user is datameer

main : requested yarn user is datameer

Container exited with a non-zero exit code 1

.Failing this attempt.. Failing the application.

at datameer.dap.common.job.mr.HadoopMrJobClient.waitUntilJobCompletion(HadoopMrJobClient.java:141)

at datameer.dap.common.job.mr.HadoopMrJobClient.runJobImpl(HadoopMrJobClient.java:58)

at datameer.dap.common.job.mr.MrJobClient.runJob(MrJobClient.java:32)

at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.runJob(MrJobExecutor.java:44)

at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.doExecute(MrJobExecutor.java:35)

at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.doExecute(MrJobExecutor.java:20)

at datameer.dap.common.job.mr.plan.execution.NodeExecutor.execute(NodeExecutor.java:30)

... 6 more

[system] ERROR [2014-06-04 19:49:01.634] [JobScheduler thread-1] (JobScheduler.java:783) - Job 16 failed with exception.

java.util.concurrent.ExecutionException: java.lang.RuntimeException: failed to execute: MrJobNode{check_system_job, includes=[], m/r/m=1/0/0}

at datameer.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)

at datameer.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)

at datameer.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)

at datameer.dap.common.job.mr.plan.MrPlanRunner.internalGet(MrPlanRunner.java:135)

at datameer.dap.common.job.mr.plan.MrPlanRunner.get(MrPlanRunner.java:166)

at datameer.dap.common.job.mr.plan.MrPlanRunner.get(MrPlanRunner.java:42)

at datameer.dap.common.job.DasJobCallable.call(DasJobCallable.java:108)

at datameer.dap.common.job.DasJobCallable.call(DasJobCallable.java:51)

at datameer.dap.conductor.job.JobSchedulerJob$2.call(JobSchedulerJob.java:121)

at datameer.dap.conductor.job.JobSchedulerJob$2.call(JobSchedulerJob.java:103)

at datameer.dap.conductor.webapp.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:99)

at datameer.dap.conductor.job.JobSchedulerJob.call(JobSchedulerJob.java:103)

at datameer.dap.conductor.job.JobSchedulerJob.call(JobSchedulerJob.java:38)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: java.lang.RuntimeException: failed to execute: MrJobNode{check_system_job, includes=[], m/r/m=1/0/0}

at datameer.dap.sdk.util.ExceptionUtil.convertToRuntimeException(ExceptionUtil.java:38)

at datameer.dap.common.job.mr.plan.execution.NodeExecutor.execute(NodeExecutor.java:34)

at datameer.dap.common.job.mr.plan.MrPlanRunner$ExecuteMrJobRunnable.run(MrPlanRunner.java:204)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

... 4 more

Caused by: java.io.IOException: Job job_1401918159543_0003 failed! Failure info: Application application_1401918159543_0003 failed 2 times due to AM Container for appattempt_1401918159543_0003_000002 exited with exitCode: 1 due to: Exception from container-launch:

org.apache.hadoop.util.Shell$ExitCodeException:

at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)

at org.apache.hadoop.util.Shell.run(Shell.java:418)

at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)

at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:279)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

main : command provided 1

main : user is datameer

main : requested yarn user is datameer

Container exited with a non-zero exit code 1

.Failing this attempt.. Failing the application.

at datameer.dap.common.job.mr.HadoopMrJobClient.waitUntilJobCompletion(HadoopMrJobClient.java:141)

at datameer.dap.common.job.mr.HadoopMrJobClient.runJobImpl(HadoopMrJobClient.java:58)

at datameer.dap.common.job.mr.MrJobClient.runJob(MrJobClient.java:32)

at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.runJob(MrJobExecutor.java:44)

at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.doExecute(MrJobExecutor.java:35)

at datameer.dap.common.job.mr.plan.execution.MrJobExecutor.doExecute(MrJobExecutor.java:20)

at datameer.dap.common.job.mr.plan.execution.NodeExecutor.execute(NodeExecutor.java:30)

... 6 more

[system] INFO [2014-06-04 19:49:01.658] [JobScheduler thread-1] (JobScheduler.java:853) - Computing after job completion operations for execution 16 (type=NORMAL)

[system] INFO [2014-06-04 19:49:01.667] [JobScheduler thread-1] (JobScheduler.java:857) - Finished computing after job completion operations for execution 16 (type=NORMAL) [0 sec]

[system] WARN [2014-06-04 19:49:01.668] [JobScheduler thread-1] (JobScheduler.java:709) - Job DapJobExecution{id=16, type=NORMAL, status=ERROR} completed with status ERROR.

[system] INFO [2014-06-04 19:49:01.669] [JobScheduler thread-1] (Configuration.java:1009) - fs.default.name is deprecated. Instead, use fs.defaultFS

[system] INFO [2014-06-04 19:49:01.670] [JobScheduler thread-1] (HdfsUploader.java:42) - Push job-log to hdfs://10.22.10.113:8020/user/datameer/joblogs/16/job.log

[system] INFO [2014-06-04 19:49:01.699] [JobScheduler thread-1] (HdfsUploader.java:46) - Deleting temporary log files


Re: CDH 5.01 Integration with Datameer Version - Datameer-4.1.0-cdh-5.0.0-mr2

Master Guru
Check the syslog/stderr/stdout of the failed app's AM containers (application_1401918159543_0003 in the case above) via the RM Web UI. They should give further detail on why the containers failed to run.
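If the RM Web UI is awkward to reach, the same container logs can usually be fetched from the command line with `yarn logs -applicationId <id>`. This is a minimal sketch, assuming log aggregation is enabled on the cluster and the calling user holds a valid Kerberos ticket; the helper names `failed_app_ids` and `yarn_logs_command` are made up for illustration. It pulls the failed application ID out of a Datameer job log and builds the matching command:

```python
import re

# Matches YARN application IDs such as application_1401918159543_0003.
APP_ID = re.compile(r"application_\d+_\d+")


def failed_app_ids(log_text):
    """Return application IDs found on 'Failure info:' lines, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        if "Failure info:" in line:
            for app_id in APP_ID.findall(line):
                if app_id not in seen:
                    seen.append(app_id)
    return seen


def yarn_logs_command(app_id):
    """Build the `yarn logs` invocation for one application (not executed here)."""
    return ["yarn", "logs", "-applicationId", app_id]


# Sample line taken from the job log posted above.
sample = (
    "Caused by: java.io.IOException: Job job_1401918159543_0003 failed! "
    "Failure info: Application application_1401918159543_0003 failed 2 times "
    "due to AM Container for appattempt_1401918159543_0003_000002 "
    "exited with exitCode: 1 due to: Exception from container-launch:"
)

for app_id in failed_app_ids(sample):
    print(" ".join(yarn_logs_command(app_id)))
```

The script only prints the command. Run it by hand on a cluster node as the user that submitted the job (datameer in this thread), then look at the stderr and syslog sections of the AM container.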

Re: CDH 5.01 Integration with Datameer Version - Datameer-4.1.0-cdh-5.0.0-mr2

Contributor

Hello - below is some additional information. Please give this issue priority.

 

I have a case open with Cloudera support (# 38222). I switched back to MR1 with Kerberos as I am on a tight timeline for deliverables.

 

1.) When I run the job from the Unix command line, I do not see any of the errors or warnings below in the JobHistory server logs under /var/log/Hadoop-mapreduce, and the job runs without any trouble.

2.) When I run the job from the Datameer front end, I see the following in the JobHistory server logs under /var/log/Hadoop-mapreduce:

 

 

[root@<ID> ram]# cat SecurityAuth-mapred.audit

2014-07-19 11:37:15,477 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for <URL> (auth:SIMPLE)

2014-07-19 11:37:15,480 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for <URL> (auth:KERBEROS) for protocol=interface org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB

[root@<ID> ram]#

 

 

[root@<ID> ram]# cat hadoop-cmf-yarn-JOBHISTORY-<ID>

2014-07-19 11:36:13,161 INFO org.apache.hadoop.mapreduce.v2.hs.JobHistory: Starting scan to move intermediate done files

2014-07-19 11:37:15,507 WARN org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10020, call org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB.getTaskAttemptCompletionEvents from 10.131.108.48:52031 Call#8158 Retry#0: error: java.lang.NullPointerException java.lang.NullPointerException

        at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:272)

        at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:173)

        at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:283)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)

2014-07-19 11:37:15,623 WARN org.apache.hadoop.ipc.Server: IPC Server handler 4 on 10020, call org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB.getTaskAttemptCompletionEvents from 10.131.108.48:52031 Call#8160 Retry#0: error: java.lang.NullPointerException java.lang.NullPointerException

        at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:272)

        at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:173)

        at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:283)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)

2014-07-19 11:37:15,739 WARN org.apache.hadoop.ipc.Server: IPC Server handler 9 on 10020, call org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB.getTaskAttemptCompletionEvents from 10.131.108.48:52031 Call#8162 Retry#0: error: java.lang.NullPointerException java.lang.NullPointerException
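To compare a command-line run with a Datameer run, it can help to count these IPC warnings by RPC call and client address instead of reading them one by one. This is a rough sketch that assumes the WARN line layout shown above; the function name `summarize_ipc_errors` is hypothetical and not part of Hadoop:

```python
import re
from collections import Counter

# Matches the IPC warning lines above: server port, RPC call name,
# client IP address, and the first error class reported.
IPC_WARN = re.compile(
    r"WARN org\.apache\.hadoop\.ipc\.Server: IPC Server handler \d+ on (?P<port>\d+), "
    r"call (?P<call>\S+) from (?P<client>[\d.]+):\d+ .*?error: (?P<error>\S+)"
)


def summarize_ipc_errors(log_text):
    """Count IPC warnings keyed by (RPC call, client IP, error class)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IPC_WARN.search(line)
        if match:
            key = (match.group("call"), match.group("client"), match.group("error"))
            counts[key] += 1
    return counts


# Two of the warning lines from the JobHistory log above.
sample_log = "\n".join([
    "2014-07-19 11:37:15,507 WARN org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10020, "
    "call org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB.getTaskAttemptCompletionEvents "
    "from 10.131.108.48:52031 Call#8158 Retry#0: error: java.lang.NullPointerException "
    "java.lang.NullPointerException",
    "2014-07-19 11:37:15,623 WARN org.apache.hadoop.ipc.Server: IPC Server handler 4 on 10020, "
    "call org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB.getTaskAttemptCompletionEvents "
    "from 10.131.108.48:52031 Call#8160 Retry#0: error: java.lang.NullPointerException "
    "java.lang.NullPointerException",
])

for (call, client, error), count in summarize_ipc_errors(sample_log).items():
    print(count, client, call, error)
```

If every warning comes from the Datameer host and none appear during command-line runs, that points at how the Datameer client talks to the JobHistory server, not at the cluster as a whole.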

 

 
