Created on 10-09-2019 05:30 PM - last edited on 10-09-2019 10:14 PM by ask_bill_brooks
I want to create a table in Impala with complex data types and insert data into it. To load this table I need to generate Parquet files that contain complex data types. Any help is really appreciated. Thanks.
Basically, I want to create a table something like this:
DESCRIBE struct_demo;
+-------------------+--------------------------+
| name              | type                     |
+-------------------+--------------------------+
| id                | bigint                   |
| name              | string                   |
| employee_info     | struct<                  |
|                   |   employer:string,       |
|                   |   id:bigint,             |
|                   |   address:string         |
|                   | >                        |
| places_lived      | array<struct<            |
|                   |   street:string,         |
|                   |   city:string,           |
|                   |   country:string         |
|                   | >>                       |
| memorable_moments | map<string,struct<       |
|                   |   year:int,              |
|                   |   place:string,          |
|                   |   details:string         |
|                   | >>                       |
| current_address   | struct<                  |
|                   |   street_address:struct< |
|                   |     street_number:int,   |
|                   |     street_name:string,  |
|                   |     street_type:string   |
|                   |   >,                     |
|                   |   country:string,        |
|                   |   postal_code:string     |
|                   | >                        |
+-------------------+--------------------------+
As you can see, a few columns are structs and maps. How can we generate a Parquet file with this kind of data?
Created 10-11-2019 06:32 AM
Hi @Nisha2019,
This example looks like a snippet from our documentation here. Just above the DESCRIBE statement in that example there is a sample CREATE TABLE statement that generates this table schema; please see below.
As for ingesting data into these tables, Impala does not currently support creating data with complex type columns; Loading Data Containing Complex Types describes this in more detail. Some more information can be found in the Complex type considerations chapter.
Hive does not support inserting values into a Parquet complex type column one by one either, but there are two solutions:
The sample CREATE TABLE statement from the documentation:

CREATE TABLE struct_demo
(
  id BIGINT,
  name STRING,

  -- A STRUCT as a top-level column. Demonstrates how the table ID column
  -- and the ID field within the STRUCT can coexist without a name conflict.
  employee_info STRUCT < employer: STRING, id: BIGINT, address: STRING >,

  -- A STRUCT as the element type of an ARRAY.
  places_lived ARRAY < STRUCT <street: STRING, city: STRING, country: STRING >>,

  -- A STRUCT as the value portion of the key-value pairs in a MAP.
  memorable_moments MAP < STRING, STRUCT < year: INT, place: STRING, details: STRING >>,

  -- A STRUCT where one of the fields is another STRUCT.
  current_address STRUCT <
    street_address: STRUCT <street_number: INT, street_name: STRING, street_type: STRING>,
    country: STRING,
    postal_code: STRING >
)
STORED AS PARQUET;
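As a rough illustration of how such a table could be populated, here is a minimal sketch that uses a Hive INSERT ... SELECT with Hive's built-in named_struct(), array(), and map() constructors. It is not necessarily one of the two solutions referred to above. The sample values are made up, and the sketch assumes a Hive version that accepts a SELECT without a FROM clause (on older versions you may need to select from an existing one-row table instead) and that the table metadata is shared between Hive and Impala.

```sql
-- Run in Hive, not Impala. All literal values below are invented for illustration.
INSERT INTO TABLE struct_demo
SELECT
  CAST(1 AS BIGINT),
  'Alice',
  -- employee_info: the CAST keeps the struct field type identical to the column definition.
  named_struct(
    'employer', 'Example Corp',
    'id',       CAST(1001 AS BIGINT),
    'address',  '1 Example Way'),
  -- places_lived: an ARRAY whose elements are STRUCTs.
  array(
    named_struct('street', '1 Main St', 'city', 'Toronto',  'country', 'Canada'),
    named_struct('street', '2 High St', 'city', 'Montreal', 'country', 'Canada')),
  -- memorable_moments: a MAP from STRING keys to STRUCT values.
  map(
    'graduation',
    named_struct('year', 2005, 'place', 'Toronto', 'details', 'Finished university')),
  -- current_address: a STRUCT that contains another STRUCT.
  named_struct(
    'street_address',
    named_struct('street_number', 123, 'street_name', 'Main', 'street_type', 'Street'),
    'country',     'Canada',
    'postal_code', 'M5V 2T6');

-- Run in Impala afterwards so it picks up the data files written by Hive.
REFRESH struct_demo;

-- STRUCT fields are read with dot notation; ARRAY elements are read by
-- joining the table with its own array column.
SELECT id, name, employee_info.employer FROM struct_demo;
SELECT t.id, p.city, p.country FROM struct_demo t, t.places_lived p;
```

The field names and types passed to named_struct() need to line up with the column definition, which is why the nested id is cast to BIGINT explicitly. For larger data sets the same constructors can be applied to rows selected from a flat staging table rather than to literals.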