Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Joining Two Tables - One with Complex Types

Joining Two Tables - One with Complex Types

Explorer

I am attempting to join two tables - one of which contains complex types. I am not trying to join on the complex type field.

 

For example, table t1 has fields (id int, first_name string, last_name string) and table t2 has fields (idNumber int, scores array<int>). I would like a table with the columns t1.id, t1.first_name, t1.last_name, t2.scores. Is it possible to join t1 to t2 on t1.id=t2.idNumber and get the resulting table with those four columns?

 

The only way I know how is using a CREATE TABLE AS (SELECT ...), but I can't include complex types in the select list.

 

2 REPLIES 2
Highlighted

Re: Joining Two Tables - One with Complex Types

Master Collaborator

It sounds like you are trying to CREATE a new table with complex types and then INSERT into that new table (or use a shorthand like CREATE TABLE AS SELECT). I'm afraid Impala currently cannot INSERT into tables with complex types.

 

See "Limitations and Restrictions for Complex Types" here:

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_complex_types.html

 

If you really don't want to INSERT but want to perform a query against the join result, then I think Impala should be able to do that in most cases.

 

Re: Joining Two Tables - One with Complex Types

Explorer

That's exactly what I'm trying to do. Really, I just want to perform a join of these two datasets and write them to a parquet file for later processing.

 

I currently have a Spark job running to perform this, but was hoping to compare performance between the join in Spark and the join in Impala.