Support Questions

Find answers, ask questions, and share your expertise
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

RESOLVED: Issue when renaming Large table stored in S3

New Contributor

As part of a testing effort I had some tables that needed to be renamed.

These tables are parquetfiles created within Impala but are stored in S3. I realize that when a table is renamed and is stored on S3, the data is moved from the original tablename to the new tablename under the covers. This seems to work just fine for tables of a certain size.


However, for my larger tables I receive the following when the SQL for rename is running:


Query: alter table comp_web_sales rename to web_sales
ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
CAUSED BY: InvalidOperationException: tpc10_s3.comp_web_sales table not found

Could not execute command: alter table comp_web_sales rename to web_sales


What I noticed is that in S3, the new folder is created and all of the files continue to move until completed and then the "old" folder is removed as one would expect.


At that point, all of the required files reside in the folder dedicated to the new table name that was being assigned in the alter statement. However, because that statement timed out, the table name inside HIVE does not change. For example: (most entries removed for brevity)

[**] > show tables;
Query: show tables
| name |
...| catalog_returns |
| comp_catalog_sales |
| comp_store_returns |
| comp_store_sales |
| comp_web_sales |
| warehouse |
| web_page |
| web_returns |
| web_site |


When i try to query the old table name, as expected, it cannot find the data:

[**] > select count(*) from comp_web_sales;
Query: select count(*) from comp_web_sales
Query submitted at: 2018-02-01 19:37:27 (...)
Query progress can be monitored at:...
Failed to open HDFS file s3a://<LOCATION>/comp_web_sales/4c4f7290ec687341-6d01caf800000007_460147606_data.63.parq
Error(2): No such file or directory


Likewise, and obviously, it cannot find the new table either:

[**] > select count(*) from web_sales;
Query: select count(*) from web_sales
Query submitted at: 2018-02-01 19:38:57 (..)
ERROR: AnalysisException: Could not resolve table reference: 'web_sales'


How do I get them changed within Impala so it recognizes the new location now that the files are moved???




-= Tara =-


Total table size (compressed) of any of the ones I'm having issues with is in the 2+TB range.




New Contributor

This was corrected by issuing and "invalidate metadata" within Impala once the S3 transfer was complete.



Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.