YARN containers failed when started from Polybase

Environment:

HDP 2.6 with Kerberos enabled

OS RedHat 7.3

MS SQL Server 2017 with Polybase

Polybase works fine when loading directly from HDFS.

However, if I use the FORCE EXTERNALPUSHDOWN option (which means that Polybase starts an MR job on the cluster), all containers fail with the following error:

2017-10-05 16:29:25,066 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_e50_1507209892753_0004_01_000029
2017-10-05 16:29:25,067 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_e50_1507209892753_0004_01_000016
2017-10-05 16:29:25,068 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:27 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:29 ContRel:0 HostLocal:25 RackLocal:4
2017-10-05 16:29:25,075 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1507209892753_0004_m_000016_0 TaskAttempt Transitioned from RUNNING to FAIL_CONTAINER_CLEANUP
2017-10-05 16:29:25,075 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1507209892753_0004_m_000016_0: Exception from container-launch.
Container id: container_e50_1507209892753_0004_01_000029
Exit code: 1
Stack trace: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:109)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Where can I find the root cause?

5 REPLIES

Re: YARN containers failed when started from Polybase

Cloudera Employee
@Nikita Kiselev

You may have to look at the complete YARN logs for this application ID, which should give more detail about the cause of the failure. The logs can be gathered with the command below:

yarn logs -applicationId application_1507209892753_0004
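The aggregated output of that command can be long, so it may help to first pull out which containers actually reported a failure. The following is only a rough Python sketch: it assumes the log text has been saved and loaded into a string, and the `failed_containers` helper and its regular expression are hypothetical, modelled on the diagnostics format shown in the question rather than on any Hadoop tooling.

```python
import re

# Matches the "Container id: ... Exit code: N" pair from the MapReduce AM
# diagnostics report. \s* tolerates missing or extra line breaks.
# Assumption: the format follows the diagnostics quoted in the question.
DIAG = re.compile(
    r"Container id:\s*(container_(?:e\d+_)?\d+_\d+_\d+_\d{6})\s*Exit code:\s*(-?\d+)"
)


def failed_containers(log_text):
    """Return sorted unique (container_id, exit_code) pairs with a non-zero exit code."""
    pairs = {(cid, int(code)) for cid, code in DIAG.findall(log_text)}
    return sorted(p for p in pairs if p[1] != 0)


# Sample text copied from the diagnostics in the question (stack trace elided).
sample = (
    "Diagnostics report from attempt_1507209892753_0004_m_000016_0: "
    "Exception from container-launch.Container id: "
    "container_e50_1507209892753_0004_01_000029Exit code: 1Stack trace: ..."
)

for cid, code in failed_containers(sample):
    print(cid, code)
```

The container IDs it prints can then be searched for in the full log to find the stdout, stderr, and syslog sections that belong to each failed container.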

Re: YARN containers failed when started from Polybase

Hello!

I can find the same logs in the web UI. There is nothing in them about why the job failed. The containers simply show:

Task Transitioned from RUNNING to KILL_WAIT

Re: YARN containers failed when started from Polybase

I found the same logs in the YARN RM console.

There is one more suspicious line, logged before the container starts:

2017-10-05 18:20:26,264 WARN [main] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
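This warning generally means the process had neither a usable delegation token nor Kerberos credentials when it contacted the server. To check how often it occurs across the task logs, a minimal Python sketch along these lines could be used. The `auth_failures` helper is a hypothetical name, and the pattern is assumed from the single warning line quoted above.

```python
import re

# Assumption: the message format matches the WARN line quoted in this thread.
AUTH_FAIL = re.compile(r"Client cannot authenticate via:\s*\[([^\]]*)\]")


def auth_failures(log_text):
    """Return, for each occurrence, the list of auth methods named in the warning."""
    return [
        [m.strip() for m in match.group(1).split(",") if m.strip()]
        for match in AUTH_FAIL.finditer(log_text)
    ]


# The warning line as it appears in the thread.
line = (
    "2017-10-05 18:20:26,264 WARN [main] org.apache.hadoop.ipc.Client: "
    "Exception encountered while connecting to the server : "
    "org.apache.hadoop.security.AccessControlException: "
    "Client cannot authenticate via:[TOKEN, KERBEROS]"
)

print(auth_failures(line))
```

If every failed container shows this warning, the problem is more likely in how credentials reach the submitted job than in the containers themselves.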

Re: YARN containers failed when started from Polybase

@Nikita Kiselev

Did you have a valid Kerberos ticket before submitting the YARN job?
Could you share the output of the 'klist' command?

Re: YARN containers failed when started from Polybase

I can't run kinit because this job is initiated by the MS SQL Polybase connector.