execution of pig script using oozie fails, but succeeds when submitted directly

New Contributor

My Pig script succeeds when I submit it from the command line, but when I create an Oozie workflow and run the same job, it fails with a "table not found" error. Any idea what could cause the Pig script to not find the table when run through Oozie, while it works fine when submitted directly from the console?

Run pig script using PigRunner.run() for Pig version 0.8+
2017-03-10 18:17:44,830 [main] INFO  org.apache.pig.Main  - Apache Pig version 0.15.0.2.4.2.0-258 (rexported) compiled Apr 25 2016, 07:10:53
2017-03-10 18:17:44,831 [main] INFO  org.apache.pig.Main  - Logging error messages to: /dfs/hadoop/yarn/local/usercache/mdurisheti/appcache/application_1489112872497_0765/container_1489112872497_0765_01_000002/pig-job_1489112872497_0765.log
2017-03-10 18:17:44,870 [main] INFO  org.apache.pig.impl.util.Utils  - Default bootup file /home/yarn/.pigbootup not found
2017-03-10 18:17:44,901 [main] INFO  org.apache.pig.tools.parameters.PreprocessorContext  - Executing command : whoami
2017-03-10 18:17:44,942 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation  - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-03-10 18:17:44,942 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation  - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-10 18:17:44,943 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine  - Connecting to hadoop file system at: hdfs://h2ms01lax01us.prod.auction.local:8020
2017-03-10 18:17:44,948 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine  - Connecting to map-reduce job tracker at: h2ms01lax01us.prod.auction.local:8050
2017-03-10 18:17:44,954 [main] INFO  org.apache.pig.PigServer  - Pig Script ID for the session: PIG-clean_cm_contract.pig-ea439cc8-b8d2-421c-a899-9070def5cf72
2017-03-10 18:17:45,380 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl  - Timeline service address: http://0.0.0.0:8188/ws/v1/timeline/
2017-03-10 18:17:45,381 [main] INFO  org.apache.pig.backend.hadoop.ATSService  - Created ATS Hook
2017-03-10 18:17:45,396 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation  - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-10 18:17:45,404 [ATS Logger 0] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl  - Exception caught by TimelineClientConnectionRetry, will try 30 more time(s).
Message: java.net.ConnectException: Connection refused
2017-03-10 18:17:45,785 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation  - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-03-10 18:17:45,939 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
2017-03-10 18:17:45,940 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.local does not exist
2017-03-10 18:17:45,940 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.semantic.analyzer.factory.impl does not exist
2017-03-10 18:17:45,940 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
2017-03-10 18:17:45,973 [main] INFO  hive.metastore  - Trying to connect to metastore with URI thrift://hmas02lax01us.prod.auction.local:9083
2017-03-10 18:17:46,026 [main] INFO  hive.metastore  - Connected to metastore.
2017-03-10 18:17:46,140 [main] ERROR org.apache.pig.PigServer  - exception during parsing: Error during parsing. Table not found : raw.cm_contract table not found
Failed to parse: Can not retrieve schema from loader org.apache.hive.hcatalog.pig.HCatLoader@64d16158
	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
	at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1776)
	at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1484)
	at org.apache.pig.PigServer.parseAndBuild(PigServer.java:428)
	at org.apache.pig.PigServer.executeBatch(PigServer.java:453)
	at org.apache.pig.PigServer.executeBatch(PigServer.java:439)
	at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
	at org.apache.pig.tools.grunt.GruntParser.processFsCommand(GruntParser.java:1140)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:129)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
	at org.apache.pig.Main.run(Main.java:502)
	at org.apache.pig.PigRunner.run(PigRunner.java:49)
	at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:288)
	at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:231)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
	at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.RuntimeException: Can not retrieve schema from loader org.apache.hive.hcatalog.pig.HCatLoader@64d16158
	at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:91)
	at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
	at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
	at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
	at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
	at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
	at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
	... 30 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader
	at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:179)
	at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
	... 37 more
Caused by: org.apache.pig.PigException: ERROR 1115: Table not found : raw.cm_contract table not found
	at org.apache.hive.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:209)
	at org.apache.hive.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:218)
	at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
	... 38 more
2017-03-10 18:17:46,144 [main] ERROR org.apache.pig.tools.grunt.Grunt  - ERROR 1115: Table not found : raw.cm_contract table not found
2017-03-10 18:17:46,164 [main] INFO  org.apache.pig.Main  - Pig script completed in 1 second and 348 milliseconds (1348 ms)
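
The workflow.xml itself is not posted in the thread, but for reference a minimal pig action for the setup described would look roughly like the sketch below; the script name comes from the log above, and the remaining names, paths, and transitions are only illustrative assumptions:

<workflow-app name="pig-hcat-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="pig-node"/>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Pig script stored in the workflow application directory on HDFS -->
            <script>clean_cm_contract.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Pig action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

Nothing in an action like this tells HCatLoader where the Hive metastore lives, which is what the replies below address.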

Re: execution of pig script using oozie fails, but succeeds when submitted directly

Mentor

@Karthik Karuppaiya

You need the following property:

hive.metastore.uris

The easiest way is to include a copy of hive-site.xml in the lib folder of the workflow.
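
For illustration, a minimal hive-site.xml carrying that property could look like the sketch below; the thrift URI is a placeholder and must point at your cluster's metastore:

<configuration>
    <property>
        <name>hive.metastore.uris</name>
        <!-- placeholder: replace with your Hive metastore host and port -->
        <value>thrift://your-metastore-host:9083</value>
    </property>
</configuration>

Placing this file in the workflow's lib/ folder, as suggested above, is usually enough for HCatLoader to pick it up; alternatively it can be referenced explicitly from the action, as sketched in the last reply below.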

Please take a look at my example here: https://github.com/dbist/oozie/tree/master/apps/hcatalog

I also have a working tutorial here: https://community.hortonworks.com/articles/83051/apache-ambari-workflow-designer-view-for-apache-oo-...

Re: execution of pig script using oozie fails, but succeeds when submitted directly

New Contributor

@Artem Ervits

I am already setting it. Here is my properties file:

nameNode=hdfs://nn-hostname:8020
jobTracker=jt-hostname:8050
user.name=oozie
oozie.launcher.mapreduce.job.queuename=prod
#
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.use.system.libpath=true
oozie.wf.rerun.failnodes=true
oozie.action.sharelib.for.pig=pig,hcatalog,hive
#
oozieProjectRoot=${nameNode}/auction/scripts/oozie/homes
oozie.coord.application.path=${oozieProjectRoot}/coord_clean.xml
appPath=${oozieProjectRoot}/sub_workflows
#
timeout=1440
concurrency_level=4
execution_order=FIFO

Re: execution of pig script using oozie fails, but succeeds when submitted directly

Mentor

@Karthik Karuppaiya, I updated my answer; I didn't read your message correctly. You are missing:

hive.metastore.uris
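
For illustration, assuming a hive-site.xml sits next to workflow.xml in the application directory, one way to make the property visible to the action is to reference the file and set hive.metastore.uris explicitly; the host and port are placeholders, and depending on the Pig/HCatalog versions shipping hive-site.xml may still be needed even when the property is set:

<action name="pig-node">
    <pig>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- merge hive-site.xml into the action configuration -->
        <job-xml>hive-site.xml</job-xml>
        <configuration>
            <property>
                <name>hive.metastore.uris</name>
                <!-- placeholder: replace with your Hive metastore host and port -->
                <value>thrift://your-metastore-host:9083</value>
            </property>
        </configuration>
        <script>clean_cm_contract.pig</script>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
</action>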
