Error reading data from Parquet file (I am pretty new to Pig)

Contributor
Pig Stack Trace
---------------
ERROR 1200: can't convert optional int96 uploadTime
Failed to parse: can't convert optional int96 uploadTime
  at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
  at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
  at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
  at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
  at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1061)
  at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
  at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
  at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
  at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
  at org.apache.pig.Main.run(Main.java:560)
  at org.apache.pig.Main.main(Main.java:170)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: parquet.pig.SchemaConversionException: can't convert optional int96 uploadTime
  at parquet.pig.PigSchemaConverter.convertFields(PigSchemaConverter.java:108)
  at parquet.pig.PigSchemaConverter.convert(PigSchemaConverter.java:84)
  at parquet.pig.TupleReadSupport.getPigSchemaFromMultipleFiles(TupleReadSupport.java:70)
  at parquet.pig.ParquetLoader.initSchema(ParquetLoader.java:204)
  at parquet.pig.ParquetLoader.setInput(ParquetLoader.java:108)
  at parquet.pig.ParquetLoader.getSchema(ParquetLoader.java:188)
  at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
  at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
  at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
  at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
  at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
  at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
  at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
  at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
  at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
  ... 16 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: NYI
  at parquet.pig.PigSchemaConverter$1.convertINT96(PigSchemaConverter.java:148)
  at parquet.pig.PigSchemaConverter$1.convertINT96(PigSchemaConverter.java:120)
  at parquet.schema.PrimitiveType$PrimitiveTypeName$7.convert(PrimitiveType.java:219)
  at parquet.pig.PigSchemaConverter.getSimpleFieldSchema(PigSchemaConverter.java:119)
  at parquet.pig.PigSchemaConverter.getFieldSchema(PigSchemaConverter.java:222)
  at parquet.pig.PigSchemaConverter.convertFields(PigSchemaConverter.java:99)
  ... 30 more
================================================================================
1 ACCEPTED SOLUTION

Contributor

Actually, they gave me a corrupted file, which was causing the issue. By the way, I have another question: can I store the output from Parquet as CSV?

10 REPLIES

Mentor

@Pradeep Allu

It's complaining about timestamps. What version of the Parquet jars are you using?
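For context on why the message points at timestamps: writers such as Hive and Impala commonly store timestamps in Parquet as the 12-byte INT96 physical type, and the stack trace shows the Pig schema converter raising "NYI" for that type. Below is a minimal Python sketch of how such a value is usually laid out, assuming the common convention of 8 little-endian bytes of nanoseconds within the day followed by 4 little-endian bytes of Julian day number. The function name `decode_int96_timestamp` is hypothetical and not part of any Parquet library.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number that corresponds to 1970-01-01 (the Unix epoch).
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def decode_int96_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 timestamp (assumed Hive/Impala layout) to UTC.

    Assumed layout: 8 bytes of nanoseconds since midnight, then 4 bytes of
    Julian day number, both little-endian. Sub-microsecond precision is
    dropped because datetime only stores microseconds.
    """
    if len(raw) != 12:
        raise ValueError("an INT96 value must be exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(
        days=julian_day - JULIAN_DAY_OF_UNIX_EPOCH,
        microseconds=nanos_of_day // 1000,
    )
```

This only illustrates the encoding; it does not make the loader accept the column.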

Contributor

I am using parquet-pig-bundle-1.5.0.jar.

Mentor

@Pradeep Allu

There's a discussion in this thread about this. Can you cast the field to string, perhaps, or is it absolutely necessary for you to use a timestamp? Can you please paste the Pig script?

Contributor

I am OK with casting the field to string. Below is my script:

A = load 's3://XXXX-analytics-etl/event_store/Count_2015-06-15' USING parquet.pig.ParquetLoader() as (class: chararray,localTime: chararray,updated: chararray,tzone: double,rid: chararray,value: double,time: chararray,user: chararray,patchId: long);

Mentor
@Pradeep Allu

Out of curiosity, can you remove time: chararray and value: double and try again? If it works, add value back, and if that works, try again with time. If that fails, use a different field name for it.

Contributor

Tried it; got the same error.

Contributor

A = load 's3://xxs-analytics-etl/event_store/xxxx_2015-06-15' USING parquet.pig.ParquetLoader() as (user: chararray);

I tried reading just one column but got the same error.

Mentor

@Pradeep Allu are you sure the dataset is in parquet format?
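One quick way to sanity-check that question is to look for the Parquet magic bytes: an unencrypted Parquet file begins and ends with the 4-byte marker `PAR1`. The sketch below is a rough check in Python, assuming the file has first been copied from S3 to a local path. The helper name `looks_like_parquet` is hypothetical, and passing this check does not prove the file is intact; it only catches files that are truncated or not Parquet at all.

```python
import os

PARQUET_MAGIC = b"PAR1"  # marker at both ends of an unencrypted Parquet file


def looks_like_parquet(path: str) -> bool:
    """Return True if the file starts and ends with the Parquet magic bytes."""
    # Lower bound on size: leading magic (4) + footer length (4) + trailing magic (4).
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A file that fails this check was likely cut off during upload or written in a different format.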

Contributor

Actually, they gave me a corrupted file, which was causing the issue. By the way, I have another question: can I store the output from Parquet as CSV?

Hi @Pradeep Allu, please post your other question as a new question.
