
Getting Exception on storing Null value through AvroStorage using PIG


New Contributor

ENVIRONMENT:

Hadoop 0.20.2-cdh3u5
Apache Pig version 0.8.1-cdh3u5
java version "1.6.0_27"

 

PROBLEM STATEMENT :

 

Getting an exception when storing a null-valued record/tuple as Avro.
The input file has one column of long values (one of them is null, i.e. empty), and when I try to store the data in Avro format, it throws an error.

 

input file: /home/hadoop/work/sudhir/AvroAnalysis/input/TSV_uncompressed/part*
content (the sixth record is null):
2037179309
2037179338
2037179367
2037179433
2037179437


2037179449
2037179547
2037179631


Please suggest whether I am missing anything in the code below, or else please provide a patch.
****** My code base:
REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/snappy-java-1.0.4.1.jar;
REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/avro-1.7.5.jar;
REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/json-simple-1.1.jar;
REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/piggybank.jar;
REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/jackson-core-asl-1.5.5.jar;
REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/jackson-mapper-asl-1.5.5.jar;
-- The input file has only one column (plain text, i.e. TSV format), and it contains a null value, i.e. an empty line
A = load '/home/hadoop/work/sudhir/AvroAnalysis/input/TSV_uncompressed/part*' using PigStorage('\t') as (USER_ID:long);
-- The output is to be stored in Avro data format
STORE A INTO '/home/hadoop/work/sudhir/AvroAnalysis/output/AvroStore/' USING org.apache.pig.piggybank.storage.avro.AvroStorage('schema','{"namespace":"com.sudhir.schema.users.avro","type":"long","name":"users_avro","doc":"Avro storing with schema using Pig.","fields":[
{"name":"USER_ID","type":["null","long"],"default":null}
]}');
******* Getting an error like:
INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backed error: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: null of long
ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!

 

Thanks in advance.

 

2 REPLIES

Re: Getting Exception on storing Null value through AvroStorage using PIG

Contributor
I'm not sure the issue is just with NULLs. Your schema type is defined as "long", whereas it should be "record" since it contains fields.

So replace:
'schema','{"namespace":"com.sudhir.schema.users.avro","type":"long","name":"users_avro"....

with:
'schema','{"namespace":"com.sudhir.schema.users.avro","type":"record","name":"users_avro"....

Re: Getting Exception on storing Null value through AvroStorage using PIG

New Contributor
Thanks, Shapira, for the reply.
I have already experimented with the schema type as record, but it throws an error like "record type can't be cast to long".
The error occurs because the schema has only one field, which is a long.
So the schema type long works properly.
It also works smoothly if the input data doesn't have any null values.
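
For what it's worth, the behaviour described in this thread is consistent with how Avro schemas are read, as far as the Avro specification goes: when the top-level type is "long", the "fields" list (and the nullable union inside it) is not part of the type, so every value written must be a non-null long. The sketch below is a simplified, hand-written Python model of that rule, not the Avro library and not AvroStorage itself. The function `accepts`, the constant names, and the bare union schema `["null","long"]` are illustrative assumptions; that AvroStorage hands the writer a bare value rather than a record for a single-column tuple is also an assumption, inferred only from the "record type can't be cast to long" error quoted above.

```python
import json

# The schema string from the STORE statement in the question, unchanged.
QUESTION_SCHEMA = json.loads(
    '{"namespace":"com.sudhir.schema.users.avro","type":"long",'
    '"name":"users_avro","doc":"Avro storing with schema using Pig.",'
    '"fields":[{"name":"USER_ID","type":["null","long"],"default":null}]}'
)

# The variant suggested in the first reply: same schema, top-level type "record".
RECORD_SCHEMA = dict(QUESTION_SCHEMA, type="record")

# Hypothetical alternative (not from the thread): a bare union,
# so that the top-level value itself is allowed to be null.
UNION_SCHEMA = json.loads('["null","long"]')

LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def accepts(schema, value):
    """Simplified check of whether value fits schema.

    Covers only null, long, unions and records; this is a model for
    illustration, not a replacement for real Avro validation.
    """
    if isinstance(schema, list):
        # A JSON array is a union: the value may match any branch.
        return any(accepts(branch, value) for branch in schema)
    if isinstance(schema, dict):
        kind = schema["type"]
        if kind == "record":
            # A record value is modelled as a dict of field name -> value.
            return isinstance(value, dict) and all(
                accepts(field["type"], value.get(field["name"]))
                for field in schema["fields"]
            )
        # For any other type only "type" is consulted; extra keys such
        # as "fields" do not change what values are allowed.
        return accepts(kind, value)
    if schema == "null":
        return value is None
    if schema == "long":
        return (
            isinstance(value, int)
            and not isinstance(value, bool)
            and LONG_MIN <= value <= LONG_MAX
        )
    raise ValueError(f"type not covered by this sketch: {schema!r}")


print(accepts(QUESTION_SCHEMA, 2037179309))       # True: a plain long fits "long"
print(accepts(QUESTION_SCHEMA, None))             # False: "long" alone is not nullable
print(accepts(RECORD_SCHEMA, 2037179309))         # False: a bare long is not a record
print(accepts(RECORD_SCHEMA, {"USER_ID": None}))  # True: the field union allows null
print(accepts(UNION_SCHEMA, None))                # True: the union itself allows null
```

If the assumption about bare values holds, passing a bare union such as `["null","long"]` as the schema in the STORE statement might be worth trying, but that has not been tested on Pig 0.8.1-cdh3u5.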