New Contributor
Posts: 1
Registered: ‎04-28-2014

Cannot load data in Pig using HCatalog

Hi,

 

In Cloudera Live I'm trying to run a simple Pig script (from a sample video):

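-- Load the Hive table through HCatalog
sample_08 = LOAD 'sample_08' USING org.apache.hcatalog.pig.HCatLoader();
-- Group all rows into a single bag so AVG runs over the whole table
sal = GROUP sample_08 ALL;
-- After GROUP ... ALL the bag is named after the grouped relation (sample_08), not sal
out = FOREACH sal GENERATE AVG(sample_08.salary);
DUMP out;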

 

but the job exits, apparently while trying to access the table schema from HCatalog, with the following error:

2014-05-03 15:17:56,171 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
Failed to parse: Can not retrieve schema from loader org.apache.hcatalog.pig.HCatLoader@638e719c

 

Do I need hive-site.xml, and if so, where can I find it on Cloudera Live? Or is something else missing?

 

Thanks

Cloudera Employee
Posts: 723
Registered: ‎07-30-2013

Re: Cannot load data in Pig using HCatalog

Yes, hive-site.xml is currently not included in Cloudera Live. This has been
reported, and we will add it in the next version:
http://gethue.uservoice.com/forums/247008-general/suggestions/5876065-fix-another-bug


And for more technical info:

http://gethue.com/hadoop-tutorial-how-to-access-hive-in-pig-with/

"As HCatalog needs to access the metastore, we need to specify the
hive-site.xml. Go in ‘Properties’, ‘Resources’ and add a ‘File’ pointing to
the hive-site.xml uploaded on HDFS."
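
For reference, the setting HCatalog mainly needs from hive-site.xml is the location of the metastore. A minimal sketch of such a file, assuming a remote metastore: the host name `metastore-host` is a placeholder (it does not come from this thread), and 9083 is only the usual default Thrift port, so check both against your cluster.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumption: remote Hive metastore; replace the placeholder host and port with your own -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```

A real hive-site.xml from a cluster usually carries more properties than this, so copying the cluster's own file is safer than writing one by hand.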


Romain