Created 09-07-2021 08:39 AM
Hello all,
I am getting the error below when our Java application executes an 'ADD PARTITION' after a 'DROP PARTITION IF EXISTS' command in Hive:
"""
Caused by: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Partition already exists: Partition(values:[xxxx, yyyy, zzzz-zz-zz, tttttttt], dbName:<db_name>, tableName:<tbl_name>
"""
Sequence of commands executed:-
Thread A: USE <db_name>
Thread B: ALTER TABLE <tbl_name> DROP IF EXISTS PARTITION(`i_id` ='xxxx', `c_id` ='yyyy', `dt` ='zzzz-zz-zz', `time` ='tttttttt') PURGE
Thread C: ALTER TABLE <tbl_name> ADD PARTITION(`i_id` ='xxxx', `c_id` ='yyyy', `dt` ='zzzz-zz-zz', `time` ='tttttttt')
Note:-
Cluster - 5 Mgr nodes (Hive deployed on 3 of them), 3 Utils and 30 DNs
There are no signs of latency issues in the ambari-server alerts/logs within ±30 minutes of the time the above error/exception occurs.
It is an EXTERNAL hive table
This is a random occurrence (about twice a week) and it affects different tables (not the same table every time).
I would appreciate any help in understanding what might be causing this issue (Partition already exists), and whether there are other logs I should look into to find the reason behind it.
Created 09-08-2021 10:24 PM
This one is tough to diagnose without access to the Java source code, or some other indication that the application was designed with full regard for how concurrency works with databases. Based only on the information in this post, I would first eliminate the most obvious possibility: a race condition, in which Thread C starts executing before the code in Thread B has fully completed.
I'd recommend you rewrite the Java code so that the DDL commands operate sequentially and from a single thread as a first step and see if the "random occurrence" stops happening.
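As a rough illustration of that first step, here is a minimal sketch that funnels all DDL statements through one worker thread so they cannot overlap. The class name `SequentialDdlRunner` and the `SqlExecutor` interface are hypothetical, made up for this example; they are not part of Hive or JDBC, and I am guessing at how your application issues its statements.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: runs every DDL statement on one worker thread, in order. */
class SequentialDdlRunner implements AutoCloseable {

    /** Stand-in for whatever executes SQL (assumption: with JDBC it would wrap Statement.execute). */
    @FunctionalInterface
    interface SqlExecutor {
        void execute(String sql) throws Exception;
    }

    // A single-thread executor guarantees that submitted tasks never run concurrently.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final SqlExecutor sqlExecutor;

    SequentialDdlRunner(SqlExecutor sqlExecutor) {
        this.sqlExecutor = sqlExecutor;
    }

    /** Executes the statements one after another and blocks until the last one has finished. */
    void runInOrder(List<String> statements) throws InterruptedException, ExecutionException {
        Future<?> done = worker.submit(() -> {
            for (String sql : statements) {
                // The next statement starts only after this call returns.
                sqlExecutor.execute(sql);
            }
            return null;
        });
        // Rethrows any failure from the worker wrapped in an ExecutionException.
        done.get();
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

In your case the list would hold the USE, DROP PARTITION and ADD PARTITION statements in that order, and the `SqlExecutor` would presumably run each one on the same JDBC connection, since USE only applies to the session it was issued in. If the error still shows up with this in place, a race inside the application becomes much less likely.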
Created on 09-13-2021 03:32 AM - edited 09-13-2021 03:32 AM
Thanks for your reply Bill!!
Although DROP partition and ADD partition run on separate threads, I didn't find any sign of a race condition in the hive-server2 logs at the time of this error.
The DROP partition command had finished executing before the ADD partition command started processing. Also, DROP partition is only a precautionary step in our application (useful for reruns or duplicate processing): we receive one new file per day, and a NEW partition is created for that file. Hence, I am fairly sure a race condition is not the actual cause.
My assumption is that Hive internally retries the ADD partition and that one of the retries fails, but I have no proof of this theory (nothing in the hive-server2 logs indicates that this is the reason).