SYMPTOM:

The same Hive insert query fails intermittently, each time with one of several different exceptions:

grep 'Failed to move' hive.log 
ERROR [main]: metadata.Hive (Hive.java:copyFiles(2652)) - Failed to move: java.util.ConcurrentModificationException 
ERROR [main]: metadata.Hive (Hive.java:copyFiles(2652)) - Failed to move: java.util.ConcurrentModificationException 
ERROR [main]: metadata.Hive (Hive.java:copyFiles(2652)) - Failed to move: java.util.ConcurrentModificationException 
ERROR [main]: metadata.Hive (Hive.java:copyFiles(2652)) - Failed to move: java.util.NoSuchElementException 
ERROR [main]: metadata.Hive (Hive.java:copyFiles(2652)) - Failed to move: java.util.NoSuchElementException 
ERROR [main]: metadata.Hive (Hive.java:copyFiles(2652)) - Failed to move: java.lang.ArrayIndexOutOfBoundsException: -3 
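
To see at a glance which exceptions the move task is hitting and how often, the grep above can be extended into a small tally. This is only a sketch: the function name summarize_move_failures is made up for illustration, and it assumes the log is named hive.log and uses the same "Failed to move:" message format shown above.

```shell
# Tally the distinct exceptions reported by failed move tasks.
# summarize_move_failures is a hypothetical helper name, not part of Hive.
summarize_move_failures() {
  # Keep only the text after "Failed to move: ", trim trailing whitespace,
  # then count identical messages, most frequent first.
  grep 'Failed to move' "$1" \
    | sed -e 's/.*Failed to move: //' -e 's/[[:space:]]*$//' \
    | sort | uniq -c | sort -rn
}

# Assumes hive.log is in the current directory; skipped if it is not.
if [ -f hive.log ]; then
  summarize_move_failures hive.log
fi
```

For the log excerpt above, this should report three ConcurrentModificationException failures, two NoSuchElementException failures, and one ArrayIndexOutOfBoundsException failure. Several unrelated-looking exceptions coming from the same copyFiles call is the pattern to look for.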

ROOT CAUSE:

Earlier, it was observed that even when query execution itself was fast, the Move Task, which copies the part files to the destination directory, could take too long to complete if the destination directory had too many partitions. To address this, the Hive community introduced move task parallelism in HDP 2.5, with a default of 15 concurrent threads. During the copy phase, these threads can hit a race condition at the metastore, which fails the query with varying exceptions.

WORKAROUND:

Disable move task parallelism by setting hive.mv.files.thread=0.
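
One way to apply the workaround to a single run, without changing the cluster-wide configuration, is to pass the property on the command line. The following is a minimal sketch, assuming the Hive CLI is on the PATH; the wrapper name run_without_parallel_move and the script name insert_query.hql are placeholders, not names from this article.

```shell
# Run a HiveQL script with move task parallelism disabled for this invocation only.
# run_without_parallel_move is a hypothetical wrapper name.
run_without_parallel_move() {
  # --hiveconf sets a configuration property for this session;
  # -f runs the statements in the given script file.
  hive --hiveconf hive.mv.files.thread=0 -f "$1"
}
```

It would be called as run_without_parallel_move insert_query.hql. Inside an existing Hive session, the equivalent is to run SET hive.mv.files.thread=0; before the insert statement.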

RESOLUTION:

To fix the issue permanently, obtain a patch for https://issues.apache.org/jira/browse/HIVE-15355

Comments

Does setting hive.mv.files.thread=0 reduce the performance of the insert query? Can you explain what this configuration has to do with the HDP 2.5 upgrade? Was move task parallelism not present in the previous version?