
CDH Quickstart 5.4 Pig and HCatalog



New Contributor

I have written a query in the Pig Editor, and it gives an error on execution. I am not sure how to resolve this. I am using Hue 3.7.0.

The query is:-


stock_a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
DESCRIBE stock_a


The table nyse_stocks has been created in the Metastore. I have also added the hive-site.xml file to the properties of the script.



The log that I have is:-



Apache Pig version 0.12.0-cdh5.4.2 (rexported) 
compiled May 19 2015, 17:03:41

Run pig script using for Pig version 0.8+
2015-08-11 12:23:47,368 [uber-SubtaskRunner] INFO org.apache.pig.Main - Apache Pig version 0.12.0-cdh5.4.2 (rexported) compiled May 19 2015, 17:03:41
2015-08-11 12:23:47,390 [uber-SubtaskRunner] INFO org.apache.pig.Main - Logging error messages to: /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/cloudera/appcache/application_1438833065513_0...
2015-08-11 12:23:47,738 [uber-SubtaskRunner] INFO org.apache.pig.impl.util.Utils - Default bootup file /var/lib/hadoop-yarn/.pigbootup not found
2015-08-11 12:23:48,938 [uber-SubtaskRunner] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-08-11 12:23:48,947 [uber-SubtaskRunner] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-08-11 12:23:48,947 [uber-SubtaskRunner] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://quickstart.cloudera:8020
2015-08-11 12:23:48,990 [uber-SubtaskRunner] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:8032
2015-08-11 12:23:49,016 [uber-SubtaskRunner] WARN org.apache.pig.PigServer - Empty string specified for jar path


After I visit the error link above I see:-



Cannot access: /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/cloudera/appcache/application_1438833065513_0010/container_1438833065513_0010_01_000001/pig-job_1438833065513_0010.log. Note: You are a Hue admin but not a HDFS superuser (which is "hdfs").

[Errno 2] File /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/cloudera/appcache/application_1438833065513_0010/container_1438833065513_0010_01_000001/pig-job_1438833065513_0010.log not found

Also I see:-


<file script.pig, line 1, column 35> pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]



Can anyone guide me in solving this issue?


Any help would be appreciated.






Re: CDH Quickstart 5.4 Pig and HCatalog

New Contributor

When I use:-

 stock_a = LOAD '/user/hive/warehouse/nyse_stocks/NYSE-2000-2001.tsv.gz' USING PigStorage() as (exchange:chararray, stock_symbol:chararray, date:chararray, stock_price_open:float,stock_price_high:float, stock_price_low:float,stock_price_close:float,stock_volume:double, stock_price_adj_close:float);


instead of

stock_a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();



the query executes successfully.


I'm not sure what the problem is when using HCatalog.

Re: CDH Quickstart 5.4 Pig and HCatalog

Cloudera Employee

Hi G_Arti,


Could you try using org.apache.hive.hcatalog.pig.HCatLoader() instead of org.apache.hcatalog.pig.HCatLoader()?


As part of HCatalog moving to the Hive project, all client-facing classes were moved from org.apache.hcatalog to org.apache.hive.hcatalog, and the org.apache.hcatalog package has been deprecated starting with CDH 5.3.
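For illustration, the script from the question with only the loader's package name changed might look like the sketch below. It assumes the nyse_stocks table from the question is visible to HCatalog and that the HCatalog jars are on Pig's classpath. Neither has been confirmed in this thread.

```pig
-- Load the Hive table through HCatalog, using the relocated package name
-- (org.apache.hive.hcatalog instead of the deprecated org.apache.hcatalog).
stock_a = LOAD 'nyse_stocks' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- Print the schema that HCatalog reports for the table.
DESCRIBE stock_a;
```

Unlike the PigStorage() variant, no AS (...) schema clause is needed here, because HCatLoader takes the column names and types from the Metastore.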


This is documented here:



