Created on 07-19-2016 11:42 AM - edited 08-19-2019 01:55 AM
Hi @slachterman
I am executing the Pig scripts below and not getting the desired results.
I also added the -useHCatalog argument and ran a syntax check before executing the query.
Below is the schema information for the Hive tables I used in the Pig scripts.
When I run select * from riskfactor, I don't get any data from the riskfactor table in Hive.
Please find the attached result and log file details.
The log file contains the following messages:
WARNING: Use "yarn jar" to launch YARN applications.
16/07/19 10:59:47 INFO pig.Main: Pig script completed in 4 seconds and 983 milliseconds (4983 ms)
Created 07-19-2016 01:27 PM
Hi @Ravikumar Kumashi, when you run the script with the -check flag, Pig will just parse and run semantic checks on the script but not actually execute it. Please try re-running the script without the -check flag and share the results.
Created 07-19-2016 01:37 PM
@slachterman Actually, I got the same results when I executed the script; uploading that screenshot was my mistake.
Created on 07-19-2016 02:17 PM - edited 08-19-2019 01:55 AM
@Ravikumar Kumashi I see. Please delete the -x tez argument as well, so that only -useHCatalog is specified. The correct way to execute on Tez is to check the 'Execute on Tez' checkbox in the upper right corner (see screenshot). You are seeing those results because Pig returns its help text when the syntax is not correct.
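For anyone running from a terminal rather than the Pig view, the difference between a syntax check and a real run can be sketched roughly as below. This is only an illustration: the script name riskfactor.pig is a placeholder that does not come from this thread, and the commands are printed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical script name (not from the thread); replace with your own file.
SCRIPT="riskfactor.pig"

# Syntax check only: Pig parses the script and runs semantic checks,
# but does not execute it, so no data is written to Hive.
CHECK_CMD="pig -useHCatalog -check $SCRIPT"

# Real run: the same arguments without -check, so the script executes.
RUN_CMD="pig -useHCatalog $SCRIPT"

# Print both commands for review; run them on a node where Pig is installed.
echo "$CHECK_CMD"
echo "$RUN_CMD"
```

If the table is still empty after the second form, the check-only run is ruled out as the cause.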
Created 07-20-2016 11:27 AM
@slachterman As you suggested, I removed the -x tez argument and ran the Pig script, and it works.
However, I still see WARNING: Use "yarn jar" to launch YARN applications. How do we avoid this warning?
Also, the log file says the job took 30+ minutes, but it actually took more than 4 hours. Why is that?
Please find the attached log file: job-1468940847411-0003-logs.txt
Created 07-20-2016 12:58 PM
@Ravikumar Kumashi Glad to hear it worked. You can ignore that warning (see this HCC post).
The long run time could be due to insufficient resources to allocate YARN containers. I tested against the HDP 2.4 Sandbox on VirtualBox with 8 GB of RAM and 4 CPUs, and the Pig script executed in about 76 seconds.
Created 07-20-2016 01:12 PM
@slachterman Thanks! Yes, I allocated only 4 GB of RAM and 2 CPUs.
I would also like to know how to create these Pig scripts (.pig) and HQL queries (.hql) and schedule or run them in the cluster.
Created 07-20-2016 01:16 PM
@Ravikumar Kumashi if you are interested in scheduling Pig or Hive scripts, please look into Oozie. You may want to post a separate question about that if you need more details or encounter any issues.
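As a rough starting point, submitting a job through the Oozie command-line client looks something like the sketch below. The Oozie URL and the job.properties file name are placeholders, not values from this thread, and the sketch assumes a workflow application (a workflow.xml containing a Pig or Hive action) has already been deployed to HDFS and is referenced from that properties file. The command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical values; replace with your cluster's Oozie URL and your own properties file.
OOZIE_URL="http://localhost:11000/oozie"
JOB_PROPERTIES="job.properties"

# Submit and start the job described by the properties file.
SUBMIT_CMD="oozie job -oozie $OOZIE_URL -config $JOB_PROPERTIES -run"

# Print the command for review; run it on a node with the Oozie client installed.
echo "$SUBMIT_CMD"
```

Running on a recurring schedule additionally needs an Oozie coordinator definition, which is not covered here.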
Created 07-20-2016 01:22 PM
Thanks, will do.