AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Expert Contributor

That's ok ;-)

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Expert Contributor

That's strange... it works for me.

grunt> sensitiveSet = load '/t-spool-dir/Test-20160129-1401822-ttp.avro' USING AvroStorage();

grunt> nonSensSet = load '/d-spool-dir/Test-20160129-1401822-lake.avro' USING AvroStorage();

grunt> outputSet = join sensitiveSet by Row_ID, nonSensSet by Row_ID;

grunt> outputSet = distinct outputSet;

grunt> outputSet = foreach outputSet generate nonSensSet::name,nonSensSet::customerId,sensitiveSet::VIN,sensitiveSet::Birthdate,nonSensSet::Mileage,nonSensSet::Fuel_Consumption;

grunt> dump outputSet;

("Kina Buttars",12452346,"WBA32649710927373","1968-08-14",68,10.551)

("Caren Rodman",18853438,"WBA56064572124841","1987-01-24",96,6.779)

("Tierra Bork",89673290,"WBA69315467645466","1958-11-22",52,10.109)

("Thelma Steve",97170856,"WBA73739033913927","1985-12-03",98,5.081)

.....

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Mentor
@John Smith

Your issue is with some reserved word in the Avro schema. Here's what I'm getting:

grunt> nonSensSet = load '/user/root/Test-20160129-1401822-lake.avro' USING AvroStorage();
grunt> sensitiveSet = load '/user/root/Test-20160129-1401822-ttp.avro' using AvroStorage();
grunt> outputSet = join sensitiveSet by Row_ID, nonSensSet by Row_ID;
grunt> outputSet = distinct outputSet;
grunt> outputSet = foreach outputSet generate nonSensSet::name,nonSensSet::customerId,nonSensSet::Mileage,nonSensSet::Fuel_Consumption,sensitiveSet::VIN,sensitiveSet::Birthdate;
grunt> store outputSet into 'avrostorage' using AvroStorage();
2016-01-29 15:27:00,682 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2116:
<line 6, column 0> Output Location Validation Failed for: 'hdfs://sandbox.hortonworks.com:8020/user/root/avrostorage More info to follow:
Pig Schema contains a name that is not allowed in Avro
Details at logfile: /root/pig-upload/pig_1454081182813.log

I saved outputSet successfully with PigStorage(','), so I can't say what the issue is. Something intricate about Avro.

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Mentor

@John Smith I just read the AvroStorage wiki. It says AvroStorage has limited support for union schemas and record types, so the only thing I can say is that AvroStorage is limited in its functionality. Perhaps you'd want to look at other storage formats.

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Expert Contributor

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Mentor

@John Smith Like I said, I tried with PigStorage and it worked fine. Take a look at OrcStorage, which is a pretty good columnar format for Pig, Hive and Spark (meaning you can query the same table natively from any of these tools). There are many formats, and I can't recommend one without knowing your use case. I do like Avro, but sometimes it drives me insane :). Try looking at the schemas; you can probably still get it working, I just don't have time to look at it. If you do find a solution, post it here so we can all learn!

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Expert Contributor

Is there anything important in the log file it mentions?

Details at logfile: /root/pig-upload/pig_1454081182813.log

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Mentor

Same error as I pasted, @John Smith.

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Expert Contributor
/**
 * Translates a name in a pig schema to an acceptable Avro name, or
 * throws an error if the name can't be translated.
 * @param name The variable name to translate.
 * @param doubleColonsToDoubleUnderscores Indicates whether to translate
 * double colons to underscores or throw an error if they are encountered.
 * @return A name usable by Avro.
 * @throws IOException If the name is not compatible with Avro.
 */
private static String toAvroName(String name,
    final Boolean doubleColonsToDoubleUnderscores) throws IOException {
  if (name == null) {
    return null;
  }
  if (doubleColonsToDoubleUnderscores) {
    name = name.replace("::", "__");
  }
  if (name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
    return name;
  } else {
    throw new IOException(
        "Pig Schema contains a name that is not allowed in Avro");
  }
}

This is the check, and I don't have any characters outside of [A-Za-z_][A-Za-z0-9_] defined as part of the schema in Pig.
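
For anyone who wants to try the check outside of Pig, here is a minimal standalone sketch. It copies the logic of the toAvroName method quoted above into a small class; the class name AvroNameCheck and the helper isAllowed are made up for illustration and are not part of Pig. The sample names are taken from the foreach statement in this thread, on the assumption that the join prefixes (for example nonSensSet::name) are still part of the field names when the store runs. Running describe outputSet in grunt should show whether that is really the case.

```java
import java.io.IOException;

/** Standalone copy of the quoted name check, for experimentation only. */
final class AvroNameCheck {

    // Same pattern as in the quoted toAvroName method.
    private static final String AVRO_NAME = "[A-Za-z_][A-Za-z0-9_]*";

    static String toAvroName(String name, boolean doubleColonsToDoubleUnderscores)
            throws IOException {
        if (name == null) {
            return null;
        }
        if (doubleColonsToDoubleUnderscores) {
            name = name.replace("::", "__");
        }
        if (name.matches(AVRO_NAME)) {
            return name;
        }
        throw new IOException("Pig Schema contains a name that is not allowed in Avro");
    }

    /** Hypothetical helper: true when the name passes the check. */
    static boolean isAllowed(String name, boolean translateDoubleColons) {
        try {
            toAvroName(name, translateDoubleColons);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Field names as they appear in the script in this thread.
        String[] fields = {"Row_ID", "nonSensSet::name", "sensitiveSet::VIN"};
        for (String field : fields) {
            System.out.println(field
                    + " -> without translation: " + isAllowed(field, false)
                    + ", with translation: " + isAllowed(field, true));
        }
    }
}
```

In this sketch a name containing "::" fails the check unless the double colons are first replaced with double underscores, because ':' is not in either character class. The doubleColonsToDoubleUnderscores parameter in the quoted source suggests AvroStorage can do that translation; how to switch it on should be checked against the AvroStorage documentation for your Pig version.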

By the way, I don't know why, but whenever I paste code here and click to format it as code, it gets completely messed up: all the newlines are removed.

Re: AvroStorage with mapreduce and java.lang.RuntimeException: could not instantiate

Mentor

@John Smith Excellent, you went to the source code. The pattern is actually [A-Za-z_][A-Za-z0-9_]*, with a trailing asterisk. If you checked your field names against it and found nothing that violates it, perhaps you have discovered a bug? Once you're 100% sure, I suggest you file a Jira with the Pig project.
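
To illustrate why the asterisk matters, here is a small sketch using plain java.util String.matches, which only succeeds when the whole string matches the pattern. The class and method names are invented for this example; only the two patterns come from the thread.

```java
/** Compares the pattern with and without the trailing asterisk. */
final class PatternAsterisk {

    // Pattern as it appears in the Pig source quoted in this thread.
    static boolean matchesWithStar(String name) {
        return name.matches("[A-Za-z_][A-Za-z0-9_]*");
    }

    // Pattern as it was retyped without the asterisk.
    static boolean matchesWithoutStar(String name) {
        return name.matches("[A-Za-z_][A-Za-z0-9_]");
    }

    public static void main(String[] args) {
        String[] names = {"x", "x1", "Mileage", "Fuel_Consumption", "1abc"};
        for (String name : names) {
            System.out.println(name
                    + " -> with *: " + matchesWithStar(name)
                    + ", without *: " + matchesWithoutStar(name));
        }
    }
}
```

Without the asterisk the pattern accepts only names that are exactly two characters long, so ordinary names such as Mileage would be rejected. With the asterisk, any name that starts with a letter or underscore and continues with letters, digits or underscores is accepted, which is the rule to check the schema's field names against.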
