Error reading data from Parquet file (I am pretty new to Pig)

Rising Star
Pig Stack Trace
---------------
ERROR 1200: can't convert optional int96 uploadTime
Failed to parse: can't convert optional int96 uploadTime
  at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
  at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
  at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
  at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
  at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1061)
  at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
  at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
  at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
  at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
  at org.apache.pig.Main.run(Main.java:560)
  at org.apache.pig.Main.main(Main.java:170)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: parquet.pig.SchemaConversionException: can't convert optional int96 uploadTime
  at parquet.pig.PigSchemaConverter.convertFields(PigSchemaConverter.java:108)
  at parquet.pig.PigSchemaConverter.convert(PigSchemaConverter.java:84)
  at parquet.pig.TupleReadSupport.getPigSchemaFromMultipleFiles(TupleReadSupport.java:70)
  at parquet.pig.ParquetLoader.initSchema(ParquetLoader.java:204)
  at parquet.pig.ParquetLoader.setInput(ParquetLoader.java:108)
  at parquet.pig.ParquetLoader.getSchema(ParquetLoader.java:188)
  at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
  at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
  at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
  at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
  at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
  at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
  at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
  at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
  at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
  ... 16 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: NYI
  at parquet.pig.PigSchemaConverter$1.convertINT96(PigSchemaConverter.java:148)
  at parquet.pig.PigSchemaConverter$1.convertINT96(PigSchemaConverter.java:120)
  at parquet.schema.PrimitiveType$PrimitiveTypeName$7.convert(PrimitiveType.java:219)
  at parquet.pig.PigSchemaConverter.getSimpleFieldSchema(PigSchemaConverter.java:119)
  at parquet.pig.PigSchemaConverter.getFieldSchema(PigSchemaConverter.java:222)
  at parquet.pig.PigSchemaConverter.convertFields(PigSchemaConverter.java:99)
  ... 30 more
================================================================================
1 ACCEPTED SOLUTION

Rising Star

Actually, they gave me a corrupted file, which was causing the issue. By the way, I have another question: can I store the output from Parquet as CSV?


10 REPLIES

Master Mentor

@Pradeep Allu

It's complaining about timestamps. What version of the Parquet jars are you using?

Rising Star

I am using parquet-pig-bundle-1.5.0.jar.

Master Mentor

@Pradeep Allu

There's a discussion in this thread about this. Can you cast the field to a string, or is it absolutely necessary for you to use a timestamp? Please paste the Pig script.

Rising Star

I am OK with casting the field to a string. Below is my script:

A = load 's3://XXXX-analytics-etl/event_store/Count_2015-06-15' USING parquet.pig.ParquetLoader() as (class: chararray,localTime: chararray,updated: chararray,tzone: double,rid: chararray,value: double,time: chararray,user: chararray,patchId: long);

Master Mentor
@Pradeep Allu

Out of curiosity, can you remove time: chararray and value: double and try again? If it works, add value back next, and if that works, try again with time. If that fails, use a different field name for it.

Rising Star

I tried it and got the same error.

Rising Star

A = load 's3://xxs-analytics-etl/event_store/xxxx_2015-06-15' USING parquet.pig.ParquetLoader() as (user: chararray);

I tried reading just one column but got the same error.

Master Mentor

@Pradeep Allu are you sure the dataset is in Parquet format?

Rising Star

Actually, they gave me a corrupted file, which was causing the issue. By the way, I have another question: can I store the output from Parquet as CSV?
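
On the CSV follow-up: a minimal sketch of one way to do it in Pig, assuming the Parquet data loads cleanly. It reads the data with the same ParquetLoader used in the scripts above and writes it back out with Pig's built-in PigStorage and a comma delimiter. Both paths are hypothetical placeholders, not paths from this thread.

```pig
-- Hypothetical input path (assumption); substitute your own Parquet location.
A = LOAD 's3://your-bucket/path/to/parquet_data' USING parquet.pig.ParquetLoader();

-- PigStorage(',') writes one comma-separated line per tuple.
-- Hypothetical output directory (assumption); it must not already exist.
STORE A INTO 's3://your-bucket/path/to/csv_output' USING PigStorage(',');
```

Note that PigStorage does not quote or escape fields, so values that themselves contain commas will break the column layout. Piggybank's CSVExcelStorage is one alternative worth looking at if that matters for your data.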