Support Questions
Find answers, ask questions, and share your expertise
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.

Convert Hive Column type String to Hive Column type struct

New Contributor

Dear community I have executed a SELECT * to fetch all the tables from a hive table using Nifi 1.6 SelectHiveQL processor.

The problem I have is the source table has a column (satellite_metadata) which is type struct<record_source:string,load_time:timestamp,checksum:string,device_hash:string>).

However in the flowfile returned by SelectHiveQL the type of column satellite_metadata is string.

After fetching the data, I am converting AVRO to ORC, storing the file in HDFS, extracting the HIVE DDL from the flowfile and creating the hive table.

The content of column satellite_metadata for a single record looks like this:

{"record_source":"RAB","load_time":"2019-01-18 03:16:26.93","checksum":"11396be4b6cfe13542d3d6708546a4a4","device_hash":"2eac97fce07480194301e482680fe05e"}

I tried to define the correct structure at CREATE TABLE and also afterwards using ALTER TABLE But I get the following error when I try to do SELECT *

ORC does not support type conversion from file type string (14) to reader type struct

Any ideas how can I set the proper type for this struct column?

I've tried also to store the avro file in HDFS, but later I don't know how to create an external hive table that can read the schema from the .avro file.


Super Guru

The hard part here is that Hive returns STRUCT columns as JSON strings, so even if we can parse the JSON, we've lost the type information. It's possible we can retrieve it from the metadata and (if so) create a nested record from the results. Please feel free to file a Jira for this enhancement.