Member since
02-01-2018
2
Posts
0
Kudos Received
0
Solutions
02-01-2018
01:39 PM
This was corrected by issuing and "invalidate metadata" within Impala once the S3 transfer was complete.
... View more
02-01-2018
11:47 AM
As part of a testing effort I had some tables that needed to be renamed. These tables are parquetfiles created within Impala but are stored in S3. I realize that when a table is renamed and is stored on S3, the data is moved from the original tablename to the new tablename under the covers. This seems to work just fine for tables of a certain size. However, for my larger tables I receive the following when the SQL for rename is running: Query: alter table comp_web_sales rename to web_sales ERROR: ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore: CAUSED BY: InvalidOperationException: tpc10_s3.comp_web_sales table not found Could not execute command: alter table comp_web_sales rename to web_sales What I noticed is that in S3, the new folder is created and all of the files continue to move until completed and then the "old" folder is removed as one would expect. At that point, all of the required files reside in the folder dedicated to the new table name that was being assigned in the alter statement. However, because that statement timed out, the table name inside HIVE does not change. For example: (most entries removed for brevity) [**] > show tables; Query: show tables +---------------------------+ | name | +---------------------------+ ...| catalog_returns | | comp_catalog_sales | | comp_store_returns | | comp_store_sales | | comp_web_sales | ... | warehouse | | web_page | | web_returns | | web_site | +---------------------------+ When i try to query the old table name, as expected, it cannot find the data: [**] > select count(*) from comp_web_sales; Query: select count(*) from comp_web_sales Query submitted at: 2018-02-01 19:37:27 (...) Query progress can be monitored at:... WARNINGS: Failed to open HDFS file s3a://<LOCATION>/comp_web_sales/4c4f7290ec687341-6d01caf800000007_460147606_data.63.parq Error(2): No such file or directory Likewise, and obviously, it cannot find the new table either: [**] > select count(*) from web_sales; Query: select count(*) from web_sales Query submitted at: 2018-02-01 19:38:57 (..) ERROR: AnalysisException: Could not resolve table reference: 'web_sales' How do I get them changed within Impala so it recognizes the new location now that the files are moved??? Thanks, -= Tara =- Total table size (compressed) of any of the ones I'm having issues with is in the 2+TB range.
... View more
Labels:
- Labels:
-
Apache Hive
-
Apache Impala
-
HDFS