Support Questions

Find answers, ask questions, and share your expertise

Complex type HIVE: select key/value of a map in a struct and array

avatar
Explorer

Hi,

I created this:

 CREATE TABLE schoolparquet (
tipo STRING,
paese STRING,
per_paese ARRAY<STRUCT<sesso:STRING,anni:MAP<STRING,STRING>>>
) STORED AS
PARQUET;

 

>>describe schoolparquet.per_paese.item
| name | type
| sesso | string
| anni | map<string,string>

 

I'm trying to write a query (Impala) that extract key/values from map too, starting from this:

 

select tipo, paese, per_paese.item.sesso
from schoolparquet,
schoolparquet.per_paese;

This works fine.

 

I want to do something like this:

 

select tipo, paese, per_paese.item.sesso,
per_paese.item.anni.key, per_paese.item.anni.value
from schoolparquet,
schoolparquet.per_paese,
schoolparquet.per_paese.anni

 

But I have this error :

ERROR: AnalysisException: Illegal column/field reference 'per_paese.item.anni.value' with intermediate collection 'per_paese' of type 'ARRAY<STRUCT<sesso:STRING,anni:MAP<STRING,STRING>>>'

 

Help please!

 

Thank you

Maria

 

 

 

1 REPLY 1

avatar

You will need to use both dot notation and join notation:

 

https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_complex_types.html

 

When complex types are nested inside each other, you use a combination of joins, pseudocolumn names, and dot notation to refer to specific fields at the appropriate level. This is the most frequent form of query syntax for complex columns, because the typical use case involves two levels of complex types, such as an ARRAY of STRUCT elements.

SELECT id, phone_numbers.area_code FROM contact_info_many_structs INNER JOIN contact_info_many_structs.phone_numbers phone_numbers LIMIT 3;

You can express relationships between ARRAY and MAP columns at different levels as joins. You include comparison operators between fields at the top level and within the nested type columns so that Impala can do the appropriate join operation.