Cannot load data in Pig using HCatalog

Hi,

In Cloudera Live, I'm trying to run a simple Pig script (from a sample video):

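-- Load the Hive table through HCatalog
sample_08 = LOAD 'sample_08' USING org.apache.hcatalog.pig.HCatLoader();
-- Group all rows into a single bag so one average can be computed
sal = GROUP sample_08 ALL;
-- After GROUP ... ALL, the bag is named after the source relation (sample_08)
out = FOREACH sal GENERATE AVG(sample_08.salary);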
DUMP out;

but the job exits, apparently while trying to access the table schema from HCatalog, with the following error:

2014-05-03 15:17:56,171 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Cannot get schema from loadFunc org.apache.hcatalog.pig.HCatLoader
Failed to parse: Can not retrieve schema from loader org.apache.hcatalog.pig.HCatLoader@638e719c

Do I need hive-site.xml, and if so, where can I find it on Cloudera Live? Or is something else missing?

Thanks

Re: Cannot load data in Pig using HCatalog

Yes, hive-site.xml is currently not included in Cloudera Live. This has
been reported, and we will add it in the next version:
http://gethue.uservoice.com/forums/247008-general/suggestions/5876065-fix-another-bug


And for more technical info:

http://gethue.com/hadoop-tutorial-how-to-access-hive-in-pig-with/

"As HCatalog needs to access the metastore, we need to specify the
hive-site.xml. Go in ‘Properties’, ‘Resources’ and add a ‘File’ pointing to
the hive-site.xml uploaded on HDFS."
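
On a cluster where you do have a hive-site.xml, the upload step mentioned in the quote could look roughly like the sketch below. This is only an illustration under assumptions: the function names `run` and `upload_hive_site` are hypothetical helpers, and the paths in the example call (a client config under /etc/hive/conf and a per-user HDFS directory) are guesses that may not match your environment. Setting DRY_RUN prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: copy the Hive client configuration to HDFS so a Pig job can reference it.
# Helper names and all paths are assumptions, not part of Hue or Cloudera Live.

# Run a command, or only print it when DRY_RUN is set (useful without a cluster).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# upload_hive_site LOCAL_FILE HDFS_DIR
# Creates the HDFS directory if needed, then copies the file into it.
upload_hive_site() {
    local_file="$1"
    hdfs_dir="$2"
    run hdfs dfs -mkdir -p "$hdfs_dir" &&
    run hdfs dfs -put -f "$local_file" "$hdfs_dir/"
}

# Example call (both paths are assumptions; adjust them):
# upload_hive_site /etc/hive/conf/hive-site.xml "/user/$(whoami)/pig"
```

Once the file is on HDFS, it can be attached to the script in the Pig editor under 'Properties', 'Resources' as a 'File', as the quoted tutorial describes.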


Romain