Writing complex types using ParquetOutputFormat

Explorer

We have a MapReduce job that uses ParquetOutputFormat to write Parquet files that are subsequently queried from Impala. We would like to add a column that is a complex type:

widgets ARRAY<STRUCT<c1:STRING,c2:STRING,c3:INT,c4:INT>>  

I'm wondering if anyone has done that and if there is some example code that I can see.

I tried googling, but was unsuccessful. I know it can be done, because Hive writes such columns to Parquet. Thanks in advance.


Re: Writing complex types using ParquetOutputFormat

Master Collaborator

Please correct me if I misunderstood your question.

The Impala docs explain how to create and query a table with complex types in Impala. Please be aware that Impala can read complex types only from Parquet files, and that Impala cannot write complex types.

Documentation:

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_complex_types.html

Examples:

https://blog.cloudera.com/blog/2015/11/new-in-cloudera-enterprise-5-5-support-for-complex-types-in-i...

Re: Writing complex types using ParquetOutputFormat

Master Collaborator

Or is your question how to write a MapReduce job that produces a complex type using ParquetOutputFormat?

Re: Writing complex types using ParquetOutputFormat

Explorer

Yes. That is my question.

We currently use parquet.hadoop.ParquetOutputFormat with parquet.example.data.Group (a somewhat suspicious package name, given the "example" in it). If another mechanism is better, I'd be happy to hear about it.
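
For reference, here is a minimal sketch of the Parquet schema that the widgets column would likely need, assuming the standard three-level LIST layout: an optional group annotated with LIST, a repeated group named list inside it, and an optional group named element holding the struct fields. The class name WidgetsSchema, the message name widget_row, and the helper methods are hypothetical; only the column and field names come from the question above. The code uses only the JDK and prints the schema text, which is in the form that Parquet's MessageTypeParser.parseMessageType accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds Parquet message-type text for one ARRAY<STRUCT<...>> column. */
final class WidgetsSchema {

    // Maps the two Impala types used in the question to Parquet primitive declarations.
    static String parquetType(String impalaType) {
        switch (impalaType) {
            case "STRING": return "binary %s (UTF8)";
            case "INT":    return "int32 %s";
            default: throw new IllegalArgumentException("unhandled type: " + impalaType);
        }
    }

    // Emits the assumed three-level list layout: column (LIST) -> repeated list -> element.
    static String listOfStruct(String messageName, String column, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("message ").append(messageName).append(" {\n");
        sb.append("  optional group ").append(column).append(" (LIST) {\n");
        sb.append("    repeated group list {\n");
        sb.append("      optional group element {\n");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            sb.append("        optional ")
              .append(String.format(parquetType(f.getValue()), f.getKey()))
              .append(";\n");
        }
        sb.append("      }\n");
        sb.append("    }\n");
        sb.append("  }\n");
        sb.append("}\n");
        return sb.toString();
    }

    // Schema for: widgets ARRAY<STRUCT<c1:STRING,c2:STRING,c3:INT,c4:INT>>
    static String widgets() {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps field order
        fields.put("c1", "STRING");
        fields.put("c2", "STRING");
        fields.put("c3", "INT");
        fields.put("c4", "INT");
        return listOfStruct("widget_row", "widgets", fields); // "widget_row" is a made-up name
    }

    public static void main(String[] args) {
        System.out.print(widgets());
    }
}
```

With a schema like this, each array entry would be one nested group on the row Group: addGroup("widgets") once per row, then addGroup("list") once per widget, then addGroup("element") followed by append calls for c1 through c4. Hive has historically written different inner group names for lists, so it is worth inspecting the footer of a Hive-written file and checking how Impala resolves arrays before relying on this layout.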

Re: Writing complex types using ParquetOutputFormat

Master Collaborator

Sorry, I don't know how to do that with MapReduce.

Hopefully somebody more knowledgeable can chime in and help you.
