02-01-2018 11:47 AM - edited 02-01-2018 01:39 PM
As part of a testing effort I had some tables that needed to be renamed.
These tables are parquetfiles created within Impala but are stored in S3. I realize that when a table is renamed and is stored on S3, the data is moved from the original tablename to the new tablename under the covers. This seems to work just fine for tables of a certain size.
However, for my larger tables I receive the following when the SQL for rename is running:
Query: alter table comp_web_sales rename to web_sales
ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
CAUSED BY: InvalidOperationException: tpc10_s3.comp_web_sales table not found
Could not execute command: alter table comp_web_sales rename to web_sales
What I noticed is that in S3, the new folder is created and all of the files continue to move until completed and then the "old" folder is removed as one would expect.
At that point, all of the required files reside in the folder dedicated to the new table name that was being assigned in the alter statement. However, because that statement timed out, the table name inside HIVE does not change. For example: (most entries removed for brevity)
[**] > show tables;
Query: show tables
| name |
...| catalog_returns |
| comp_catalog_sales |
| comp_store_returns |
| comp_store_sales |
| comp_web_sales |
| warehouse |
| web_page |
| web_returns |
| web_site |
When i try to query the old table name, as expected, it cannot find the data:
[**] > select count(*) from comp_web_sales;
Query: select count(*) from comp_web_sales
Query submitted at: 2018-02-01 19:37:27 (...)
Query progress can be monitored at:...
Failed to open HDFS file s3a://<LOCATION>/comp_web_sales/4c4f7290ec687341-6d01caf800000007_460147606_data.63.parq
Error(2): No such file or directory
Likewise, and obviously, it cannot find the new table either:
[**] > select count(*) from web_sales;
Query: select count(*) from web_sales
Query submitted at: 2018-02-01 19:38:57 (..)
ERROR: AnalysisException: Could not resolve table reference: 'web_sales'
How do I get them changed within Impala so it recognizes the new location now that the files are moved???
-= Tara =-
Total table size (compressed) of any of the ones I'm having issues with is in the 2+TB range.