Created on 04-21-2022 01:21 PM - edited 04-21-2022 01:24 PM
Is there a limit to the size of an array that can be copied using INSERT INTO SELECT FROM?
I have several source Hive EXTERNAL tables, stored as ORC files, that have ARRAY<DOUBLE> columns. In one such table the array holds up to 8192 values per row; in another it holds exactly 1024 values in every row.
SELECTing from either table gives the expected results.
INSERT INTO another_table SELECT * FROM source_table succeeds for the table whose arrays always have 1024 values, but fails with "ArrayIndexOutOfBoundsException: 1024" for the table whose arrays can have up to 8192 values.
I'm on HDP 3.1.4.0-315 with Hive 3.1.0.
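For concreteness, a minimal sketch of the failing pattern (the table names, column names, and DDL below are stand-ins for the real tables, not the actual definitions):

-- Hypothetical stand-ins: an ORC-backed external source with an ARRAY<DOUBLE>
-- column holding up to 8192 values per row, and a copy target with the same schema.
CREATE EXTERNAL TABLE source_table (id BIGINT, vals ARRAY<DOUBLE>)
STORED AS ORC
LOCATION '/data/source_table';

CREATE TABLE another_table (id BIGINT, vals ARRAY<DOUBLE>)
STORED AS ORC;

-- Plain SELECTs work; the copy below throws ArrayIndexOutOfBoundsException: 1024
-- whenever a row's array is longer than 1024 elements.
INSERT INTO another_table SELECT id, vals FROM source_table;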
Stack trace snippet:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storePrimitiveRowColumn(VectorDeserializeRow.java:568)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeComplexFieldRowColumn(VectorDeserializeRow.java:778)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeListRowColumn(VectorDeserializeRow.java:822)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(VectorDeserializeRow.java:938)
at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:1360)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:480)
... 19 more
Thanks!
Created 04-22-2022 11:31 AM
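The 1024 in the exception matches Hive's default vectorized row-batch size (VectorizedRowBatch.DEFAULT_SIZE = 1024), and every frame in the trace is in the vectorized deserializer, so one way around the bug is to skip vectorization for this statement. A minimal sketch of such a workaround (whether this exact setting was the suggestion here is an assumption):

-- Take the non-vectorized (row-mode) execution path for this session only,
-- so the INSERT never goes through VectorDeserializeRow.
set hive.vectorized.execution.enabled=false;

INSERT INTO another_table SELECT * FROM source_table;

The setting is session-scoped, so other sessions keep vectorized execution; it can be flipped back with set hive.vectorized.execution.enabled=true; once the copy finishes.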
Created 04-25-2022 06:04 AM
That worked! Thanks!
Cloudera had suggested a patch to Hive, found here: https://issues.apache.org/jira/browse/HIVE-23501
But this is much easier! Thanks again!