Unable to get the final result after processing a Pig script

Rising Star

Hi @slachterman

I am running the Pig script below and not getting the desired results.

5860-pig-script.png

I also added the -useHCatalog argument and ran a syntax check before executing the script.

Below is the schema information for the Hive tables I used in the Pig script:

5861-gelocation.png

5862-driver-mileage.png

5863-risk-factor.png

When I run select * from riskfactor, I don't get any data from the riskfactor table in Hive.

Please find the attached result and log file details.

result.txt

The log file shows the following messages:

WARNING: Use "yarn jar" to launch YARN applications.
16/07/19 10:59:47 INFO pig.Main: Pig script completed in 4 seconds and 983 milliseconds (4983 ms)

1 ACCEPTED SOLUTION

Hi @Ravikumar Kumashi, when you run the script with the -check flag, Pig will just parse and run semantic checks on the script but not actually execute it. Please try re-running the script without the -check flag and share the results.
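For readers using the command-line client rather than the Ambari Pig view, the difference between a syntax check and a real run can be sketched roughly as follows. This is only an illustration: it assumes a pig client is on the PATH, and the script name riskfactor.pig is a hypothetical placeholder, not a file confirmed by this thread.

```shell
# Hypothetical file name (an assumption); substitute the path to your own script.
SCRIPT="riskfactor.pig"

# Syntax check only: Pig parses and validates the script, then exits.
# No job is launched, so nothing is written to the target table.
CHECK_CMD="pig -useHCatalog -check $SCRIPT"

# Actual run: the same invocation without -check.
RUN_CMD="pig -useHCatalog $SCRIPT"

# Print both commands so the difference is visible.
echo "$CHECK_CMD"
echo "$RUN_CMD"

# Run them only where a pig client and the script file are actually present.
if command -v pig >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
    $CHECK_CMD && $RUN_CMD
fi
```

If the check passes but the target table is still empty, the script has most likely not been executed yet.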

Rising Star

@slachterman Actually, I got the same results when I executed the script. My mistake, I uploaded the wrong screenshot.

@Ravikumar Kumashi I see. Please delete the -x tez argument as well, so that only -useHCatalog is specified. The correct way to execute on Tez is to check the 'Execute on Tez' checkbox in the upper right corner (see screenshot). You are seeing those results because Pig returns its help text when the argument syntax is not correct.

5865-screen-shot-2016-07-19-at-91507-am.png

Rising Star

@slachterman As you suggested, I removed the -x tez argument and ran the Pig script, and it works.

But I see it gives WARNING: Use "yarn jar" to launch YARN applications. How do we avoid this warning?

Also, the log file says the job took 30+ minutes, but it actually took more than 4 hours. Why is that?

Please find the attached log file: job-1468940847411-0003-logs.txt

@Ravikumar Kumashi Glad to hear it worked. You can ignore that warning (see this HCC post).

The long runtime could be due to insufficient resources to allocate YARN containers. I tested against the HDP 2.4 Sandbox on VirtualBox with 8 GB of RAM and 4 CPUs, and the Pig script executed in about 76 seconds.

Rising Star

@slachterman Thanks! Yes, I allocated only 4 GB of RAM and 2 CPUs.

I would also like to know how to create Pig scripts (.pig) and HQL queries (.hql) and schedule/run them in the cluster.

@Ravikumar Kumashi If you are interested in scheduling Pig or Hive scripts, please look into Oozie. You may want to post a separate question about that if you need more details or encounter any issues.

Rising Star
@slachterman

Thanks, will do.