Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Error reading data from Parquet file (I am pretty new to Pig)

Rising Star
Pig Stack Trace
---------------
ERROR 1200: can't convert optional int96 uploadTime
Failed to parse: can't convert optional int96 uploadTime
  at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
  at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
  at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
  at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
  at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1061)
  at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
  at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
  at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
  at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
  at org.apache.pig.Main.run(Main.java:560)
  at org.apache.pig.Main.main(Main.java:170)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: parquet.pig.SchemaConversionException: can't convert optional int96 uploadTime
  at parquet.pig.PigSchemaConverter.convertFields(PigSchemaConverter.java:108)
  at parquet.pig.PigSchemaConverter.convert(PigSchemaConverter.java:84)
  at parquet.pig.TupleReadSupport.getPigSchemaFromMultipleFiles(TupleReadSupport.java:70)
  at parquet.pig.ParquetLoader.initSchema(ParquetLoader.java:204)
  at parquet.pig.ParquetLoader.setInput(ParquetLoader.java:108)
  at parquet.pig.ParquetLoader.getSchema(ParquetLoader.java:188)
  at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
  at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
  at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
  at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
  at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
  at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
  at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
  at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
  at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
  ... 16 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: NYI
  at parquet.pig.PigSchemaConverter$1.convertINT96(PigSchemaConverter.java:148)
  at parquet.pig.PigSchemaConverter$1.convertINT96(PigSchemaConverter.java:120)
  at parquet.schema.PrimitiveType$PrimitiveTypeName$7.convert(PrimitiveType.java:219)
  at parquet.pig.PigSchemaConverter.getSimpleFieldSchema(PigSchemaConverter.java:119)
  at parquet.pig.PigSchemaConverter.getFieldSchema(PigSchemaConverter.java:222)
  at parquet.pig.PigSchemaConverter.convertFields(PigSchemaConverter.java:99)
  ... 30 more
================================================================================
1 ACCEPTED SOLUTION

Rising Star

Actually, they gave me a corrupted file, which was causing the issue. By the way, I have another question: can I store the output from Parquet as CSV?


10 REPLIES

Master Mentor

@Pradeep Allu

It's complaining about timestamps. What version of the Parquet jars are you using?

Rising Star

I am using parquet-pig-bundle-1.5.0.jar.

Master Mentor

@Pradeep Allu

There's a discussion about this in this thread. Can you cast the field to a string, or is it absolutely necessary for you to use a timestamp? Can you please paste the Pig script?

Rising Star

I am OK with casting the field to a string. Below is my script:

A = load 's3://XXXX-analytics-etl/event_store/Count_2015-06-15' USING parquet.pig.ParquetLoader() as (class: chararray,localTime: chararray,updated: chararray,tzone: double,rid: chararray,value: double,time: chararray,user: chararray,patchId: long);

Master Mentor
@Pradeep Allu

Out of curiosity, can you remove time: chararray and value: double and try again? If it works, add value back and, if that works, try again with time. If that fails, use a different field name for it.

Rising Star

I tried it and got the same error.

Rising Star

A = load 's3://xxs-analytics-etl/event_store/xxxx_2015-06-15' USING parquet.pig.ParquetLoader() as (user: chararray);

I tried reading just one column but got the same error.

Master Mentor

@Pradeep Allu are you sure the dataset is in Parquet format?
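
One rough way to check this, sketched here in Python rather than Pig: a Parquet file with a plaintext footer starts and ends with the 4-byte marker PAR1, so a quick look at the first and last bytes can rule out files that are obviously not Parquet or are truncated. This is only a sketch under assumptions: it expects a local copy of a single data file (the dataset in this thread lives on S3, and the path may be a directory of several files), and the function name looks_like_parquet is made up for illustration. It does not read the schema or the data, so a file can pass this check and still be corrupted.

```python
import os

# 4-byte marker found at both the start and the end of a Parquet file
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes.

    Hypothetical helper: it inspects only the first and last 4 bytes and
    does not validate the footer metadata or any column data.
    """
    # Lower bound on size: leading magic + 4-byte footer length + trailing magic
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)          # first 4 bytes
        f.seek(-4, os.SEEK_END)   # jump to the last 4 bytes
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

For example, looks_like_parquet("part-00000.parquet") (a made-up file name) returns False for a plain text or CSV file, or for a download that was cut off before the footer was written.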

Rising Star

Actually, they gave me a corrupted file, which was causing the issue. By the way, I have another question: can I store the output from Parquet as CSV?