
Hive ARRAY column INSERT INTO SELECT FROM exception

New Contributor

Is there a limit to the size of an array that can be copied using INSERT INTO SELECT FROM?


I have several source Hive EXTERNAL tables, stored as ORC files, that have ARRAY&lt;DOUBLE&gt; columns. One such table has arrays with up to 8192 values; another has exactly 1024 values in every row.

SELECTing from either table gives the expected results.

INSERT INTO another_table SELECT * FROM source_table succeeds for the table with 1024 values, but fails with an "ArrayIndexOutOfBoundsException: 1024" for the table whose arrays can hold up to 8192 values.

I'm on HDP 3.1.4.0-315 with Hive 3.1.0.

Stack trace snippet:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storePrimitiveRowColumn(VectorDeserializeRow.java:568)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeComplexFieldRowColumn(VectorDeserializeRow.java:778)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeListRowColumn(VectorDeserializeRow.java:822)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(VectorDeserializeRow.java:938)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:1360)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:480)
... 19 more
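For reference, a minimal sketch of the setup (table, column, and location names are placeholders, not my actual schema):

```sql
-- Hypothetical ORC-backed EXTERNAL source table with an ARRAY<DOUBLE> column
CREATE EXTERNAL TABLE source_table (
  id       BIGINT,
  readings ARRAY<DOUBLE>   -- up to 8192 elements in some rows
)
STORED AS ORC
LOCATION '/data/source_table';

-- Destination table with the same layout
CREATE TABLE another_table (
  id       BIGINT,
  readings ARRAY<DOUBLE>
)
STORED AS ORC;

-- Works when every array has 1024 elements; throws
-- ArrayIndexOutOfBoundsException: 1024 when rows exceed 1024 elements
INSERT INTO another_table SELECT * FROM source_table;
```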


Thanks!

1 ACCEPTED SOLUTION

Hi @ToddP 

Can you try after setting the property below?

set hive.vectorized.execution.enabled=false;
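Since hive.vectorized.execution.enabled is a session-level property, one option (my suggestion, not tested on your cluster) is to disable vectorization only for the session running the copy and restore it afterwards:

```sql
-- Disable vectorized execution just for this session, run the copy,
-- then re-enable the default
set hive.vectorized.execution.enabled=false;
INSERT INTO another_table SELECT * FROM source_table;
set hive.vectorized.execution.enabled=true;
```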


2 REPLIES


New Contributor

That worked! Thanks!

Cloudera suggested a patch to Hive found here: https://issues.apache.org/jira/browse/HIVE-23501

But this is much easier! Thanks again!
