
Pig script is hanging


Rising Star

I'm trying to run a Pig script from the Hadoop tutorial. It runs forever...

YARN shows that the script created two jobs:

1. Name - TempletonControllerJob, application type - MAPREDUCE

2. Name - PigLatin:script.pig, application type - Tez

The first job (TempletonControllerJob) runs forever, and the second job never starts (it stays in the ACCEPTED state).

The log for the TempletonControllerJob ends with the following:

2016-12-29 15:40:55,930 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1483024248361_0011_m_000000_0] using containerId: [container_e05_1483024248361_0011_01_000002 on NM: [lenu.dom.hdp:45454]
2016-12-29 15:40:55,936 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1483024248361_0011_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2016-12-29 15:40:55,937 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1483024248361_0011_m_000000 Task Transitioned from SCHEDULED to RUNNING
2016-12-29 15:40:56,473 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1483024248361_0011: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:5120, vCores:1> knownNMs=1
2016-12-29 15:40:59,088 INFO [Socket Reader #1 for port 38857] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1483024248361_0011 (auth:SIMPLE)
2016-12-29 15:40:59,134 INFO [IPC Server handler 1 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1483024248361_0011_m_5497558138882 asked for a task
2016-12-29 15:40:59,138 INFO [IPC Server handler 1 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1483024248361_0011_m_5497558138882 given task: attempt_1483024248361_0011_m_000000_0
2016-12-29 15:41:06,916 INFO [IPC Server handler 1 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:42:07,072 INFO [IPC Server handler 21 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:43:07,233 INFO [IPC Server handler 14 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:44:07,397 INFO [IPC Server handler 1 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:45:07,535 INFO [IPC Server handler 21 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:46:07,666 INFO [IPC Server handler 14 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:47:07,806 INFO [IPC Server handler 29 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:48:07,943 INFO [IPC Server handler 17 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:49:08,076 INFO [IPC Server handler 14 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:50:08,177 INFO [IPC Server handler 6 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:51:08,305 INFO [IPC Server handler 17 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:52:08,440 INFO [IPC Server handler 14 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:53:08,555 INFO [IPC Server handler 1 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:54:08,681 INFO [IPC Server handler 21 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:55:08,810 INFO [IPC Server handler 12 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:56:08,952 INFO [IPC Server handler 4 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:57:09,074 INFO [IPC Server handler 21 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:58:09,180 INFO [IPC Server handler 12 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 15:59:09,296 INFO [IPC Server handler 4 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 16:00:06,400 INFO [IPC Server handler 19 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 16:01:06,479 INFO [IPC Server handler 7 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 16:02:06,589 INFO [IPC Server handler 1 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 16:03:06,701 INFO [IPC Server handler 17 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 16:04:06,813 INFO [IPC Server handler 7 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0
2016-12-29 16:05:06,943 INFO [IPC Server handler 1 on 38857] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1483024248361_0011_m_000000_0 is : 0.0

So it looks like progress is reported every minute and is always 0.

7 REPLIES

Re: Pig script is hanging

Expert Contributor

@Dmitry Otblesk

What's the capacity of the cluster? Can you check whether you are hitting resource allocation limits here? It looks like you may be hitting the cluster's limits in terms of resources.

The second job is most probably waiting for an AM container to be assigned, which is restricted by the value of "yarn.scheduler.capacity.maximum-am-resource-percent". It defaults to "0.2", i.e. 20% of the overall cluster capacity. So if your cluster can allocate only 10 containers (overall capacity), then only 2 AM containers can be running at a time. Are you seeing similar behavior in your cluster?
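
Just to illustrate that arithmetic (the numbers below are hypothetical, not read from your cluster), a rough sketch in Python:

# Hypothetical illustration of the AM limit described above.
queue_capacity_containers = 10   # overall queue capacity (hypothetical)
max_am_resource_percent = 0.2    # default yarn.scheduler.capacity.maximum-am-resource-percent

max_concurrent_ams = int(queue_capacity_containers * max_am_resource_percent)
print(max_concurrent_ams)        # 2 -> only two ApplicationMasters can run at once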


Re: Pig script is hanging

Rising Star

@Sumesh

What exactly should I check, and where? The Ambari dashboard is not currently showing any problems with resources.


Re: Pig script is hanging

Expert Contributor

Please look at the ResourceManager UI while the job is running. When you log in to the RM UI, there is a "Scheduler" link on the left side. Click on it; it takes you to a page showing the queue setup, how many resources are allocated to each queue, and which queue your job is assigned to when you submit it. Example below:

(screenshot: 10944-screen-shot-2016-12-29-at-115418-pm.png)
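
If the command line is easier than the UI, the same queue usage can also be pulled from the ResourceManager REST API. A rough sketch in Python, assuming the RM is reachable at rm-host:8088 on an unsecured cluster (field names can vary between Hadoop versions):

# Sketch: list Capacity Scheduler queues and their usage via the RM REST API.
# Assumes http://rm-host:8088 is your ResourceManager and security is off.
import json
import urllib.request

with urllib.request.urlopen("http://rm-host:8088/ws/v1/cluster/scheduler") as resp:
    info = json.load(resp)

# The exact field layout may differ slightly depending on the Hadoop version.
for q in info["scheduler"]["schedulerInfo"]["queues"]["queue"]:
    print(q.get("queueName"), "- used capacity:", q.get("usedCapacity"), "%")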


Re: Pig script is hanging

Rising Star

@Sumesh

I checked this, and it does show that a big portion of the queue is allocated to the first job. The question is: how do I make Pig allocate fewer memory resources?


Re: Pig script is hanging

Rising Star

@Sumesh

>The second job is most probably waiting for an AM container to be assigned

The second job is indeed waiting. But why is the first job (the one that is running) not getting anywhere? It reports 0 progress the whole time.


Re: Pig script is hanging

New Contributor

The second job (PigLatin) was spawned by the TempletonControllerJob. The first job (TempletonControllerJob) is waiting for the completion of the second (PigLatin), but you have a deadlock because (probably) "yarn.scheduler.capacity.maximum-am-resource-percent" is not allowing the second ApplicationMaster (PigLatin) to transition to RUNNING.

At least this is what was happening to me (see my comment below).

Test clusters with low resources are more prone to this type of lock-up because of yarn.scheduler.capacity.maximum-am-resource-percent. It happened to me many times when using Oozie on Cloudera, and I never knew why. A huge thanks to @Sumesh


Re: Pig script is hanging

New Contributor

You don't know how much frustration this gave me, even when using Cloudera. "yarn.scheduler.capacity.maximum-am-resource-percent" fixed the problem. Thanks!

In my case I have 6 vCores and 9 GB of RAM. I am running a Pig script without checking "Execute on Tez", but it is executing on Tez anyway (I will ask about that in another thread, because I want to run on MapReduce but the option in pig.properties does not seem to work).

The "TempletonControllerJob MAPREDUCE" requested 2 containers and 2VCores.

The "PigLatin:script.pigTEZ" requested 1 container and 1 VCore (later will use 2 container and 2 VCores), and I guess it was being denied (and stuck) because 1 VCore > 0.2*4 free.

Since I am just testing, I set "yarn.scheduler.capacity.maximum-am-resource-percent=1" and the Tez ApplicationMaster started running :)

Thanks!!!
