Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Joining Two Tables - One with Complex Types

Highlighted

Joining Two Tables - One with Complex Types

Explorer

I am attempting to join two tables - one of which contains complex types. I am not trying to join on the complex type field.

 

For example, table t1 has fields (id int, first_name string, last_name string) and table t2 has fields (idNumber int, scores array<int>). I would like a table with the columns t1.id, t1.first_name, t1.last_name, t2.scores. Is it possible to join t1 to t2 on t1.id=t2.idNumber and get the resulting table with those four columns?

 

The only way I know how is using a CREATE TABLE AS (SELECT ...), but I can't include complex types in the select list.

 

2 REPLIES 2
Highlighted

Re: Joining Two Tables - One with Complex Types

Master Collaborator

It sounds like you are trying to CREATE a new table with complex types and then INSERT into that new table (or use a shorthand like CREATE TABLE AS SELECT). I'm afraid Impala currently cannot INSERT into tables with complex types.

 

See "Limitations and Restrictions for Complex Types" here:

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_complex_types.html

 

If you really don't want to INSERT but want to perform a query against the join result, then I think Impala should be able to do that in most cases.

 

Re: Joining Two Tables - One with Complex Types

Explorer

That's exactly what I'm trying to do. Really, I just want to perform a join of these two datasets and write them to a parquet file for later processing.

 

I currently have a Spark job running to perform this, but was hoping to compare performance between the join in Spark and the join in Impala.

Don't have an account?
Coming from Hortonworks? Activate your account here