The Pig Script Hangs While Running

Explorer

Hi all,

I have created a 2-node cluster (2 cores and 8 GB RAM on each node) and started to follow the tutorial on this page. Whenever I try to execute the example Pig script (Step 3.4: Execute Pig Script on Tez), the script hangs and its status stays at "Running". As can be seen from the dashboard screenshot, all my services seem to be up and running.

I have also attached screenshots of the basic YARN config screen and of the details of the specific MapReduce job.

What might be the problem in my case?

Thank you.


yarn-basic.png, dashboard.png, mapreduce-job.png
1 ACCEPTED SOLUTION

avatar

@Muhammed Yetginbal

Considering your last comment and the information provided, I had a look at:

https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore#HCatalogLoadStore-HCatStorer

Can you confirm that the prerequisites are met in your case? Does the table 'riskFactor' exist (with the correct schema)?

Also, is Hive up and running?

Are you running your script where Hive is installed? Are you in a clustered environment?

18 REPLIES

Explorer

Thank you for your quick feedback. I can confirm that the table 'riskfactor' exists with the correct schema (I used the HiveQL statement in the tutorial). Hive is up and running. I am in a 2-node cluster environment.

The only thing I am not sure about is whether I am running my script where Hive is installed. I can confirm that services like Hive Metastore and HiveServer2 are installed on the second node of my cluster, and the MapReduce job that I see in the ResourceManager UI shows that its node is the second one (screenshot also attached).


riskfactor-hive.png, hive-status.png, mapreduce-job-detail.png

avatar

Are you running the Pig script from the Pig View in Ambari, as described in the tutorial? If so, can you get the logs from the Pig View?

Explorer

Yes, I do run the Pig script from the Pig View in Ambari. However, since it constantly stays in the "Running" status, no further log is displayed in the "Logs" panel of the view.

By the way, to test the scenario you mentioned, I changed the table name in the store operation to a non-existing table, and it immediately threw an error saying that the table was not found. So we can assume that the script sees the "riskfactor" table but somehow gets stuck while performing the store operation.


pig-view.png

Explorer

I am also adding a screenshot of the scenario where I do not use the "Store" operation.


riskfactor-no-store.png

avatar

@Muhammed Yetginbal One option would be to test launching the Pig script from the console (https://wiki.apache.org/pig/RunPig). Using the client (grunt) you could try to execute the same commands. Maybe it will give you more details about what is happening when "running".
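For reference, launching the script from the console could look roughly like the sketch below. It assumes that the `pig` client is on the PATH of the node you are logged in to and that the script has been saved as `riskfactor.pig` (a hypothetical file name, used only for illustration). The `-x tez` option selects the Tez execution engine, and `-useHCatalog` makes the HCatalog classes available so that `HCatStorer` can be resolved. Leaving out the script path starts the interactive grunt shell instead.

```python
import shutil
import subprocess


def build_pig_command(script_path=None, exec_type="tez", use_hcatalog=True):
    """Build the argument list for the Pig command-line client.

    With script_path=None the client starts the interactive grunt shell.
    """
    command = ["pig", "-x", exec_type]
    if use_hcatalog:
        # Needed when the script loads or stores through HCatLoader/HCatStorer.
        command.append("-useHCatalog")
    if script_path is not None:
        command.append(script_path)
    return command


def run_pig(script_path, exec_type="tez"):
    """Run a Pig script in the foreground; its output streams to the console."""
    if shutil.which("pig") is None:
        raise RuntimeError("pig client not found on PATH")
    # check=False: return Pig's exit code instead of raising on failure.
    completed = subprocess.run(build_pig_command(script_path, exec_type), check=False)
    return completed.returncode


if __name__ == "__main__":
    # "riskfactor.pig" is a hypothetical file name used for illustration.
    print(" ".join(build_pig_command("riskfactor.pig")))
```

Running from the console keeps Pig's own log output in front of you, which is the main point of trying it outside the view.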

Explorer

Finally! The script worked via grunt; I have attached the full grunt execution logs. First of all, thank you Pierre, I am happy to see that the script worked 🙂 But in this case, what might be the problem with the Pig View?

avatar

Honestly... I don't know 🙂 I'd suggest digging into the logs (ambari, mapreduce, yarn, hive, etc) to find an explanation. But maybe someone else on HCC will have an idea 🙂
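As a starting point for the YARN side, here is a minimal sketch of how the relevant commands could be scripted. It assumes the `yarn` client is on the PATH; the application ID shown is a made-up placeholder, and `yarn logs` generally depends on log aggregation being enabled on the cluster.

```python
import subprocess


def list_apps_command(states=("ACCEPTED", "RUNNING")):
    """Command that lists YARN applications in the given states."""
    return ["yarn", "application", "-list", "-appStates", ",".join(states)]


def app_logs_command(application_id):
    """Command that prints the aggregated container logs of one application."""
    return ["yarn", "logs", "-applicationId", application_id]


def run(command):
    """Run a command and return its standard output as text."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout


if __name__ == "__main__":
    # The application ID below is a made-up placeholder; take the real one
    # from the ResourceManager UI or from the output of the list command.
    print(" ".join(list_apps_command()))
    print(" ".join(app_logs_command("application_0000000000000_0001")))
```

Listing applications in the ACCEPTED and RUNNING states shows which jobs are queued or active while the view reports "Running", and the per-application logs are where to look next.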

Explorer

Thank you Pierre 🙂 You've helped me a lot, much appreciated 🙂 I'll accept your answer since it was the only way I was able to execute the script.

avatar

@Muhammed Yetginbal

Hello, I'm facing the same issue while following the same tutorial:

https://hortonworks.com/tutorial/hadoop-tutorial-getting-started-with-hdp/section/4/

Did you find a solution for this issue? I'd be grateful if you could help me.