File has an invalid version number. This could be due to stale metadata

I am running a CDH distribution (version 5.6.0) with Impala (version 2.4.0).

I have some Parquet files stored in HDFS and have created an Impala external table over them. The following query lists all the files successfully:

[cloudera-impala-dn0.eastus.cloudapp.azure.com:21000] > show files in parquettable;

Also, the metadata is correct (checked by executing describe parquettable).

The stats of the table are:

[cloudera-impala-dn0.eastus.cloudapp.azure.com:21000] > show table stats parquettable;

 

Rows | Files | Size     | Bytes Cached | Cache Replication | Format  | Incremental stats | Location
-1   | 838   | 249.64GB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://cloudera-impala-mn0.eastus.cloudapp.azure.com:8020/user/root/big_data

Executing the following query:

[cloudera-impala-dn0.eastus.cloudapp.azure.com:21000] > select count(*) from parquettable;

returns the following warning, with no result rows and no error:

File 'hdfs://cloudera-impala-mn0.eastus.cloudapp.azure.com:8020/user/root/big_data/part-r-00001-7c29b85c-bd1f-420e-8834-96300076a92d.gz.parquet' has an invalid version number: ▒.F/ This could be due to stale metadata. Try running "refresh default.parquettable".

Running refresh default.parquettable did not have any effect.
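
For anyone trying to narrow this down: as far as I understand, a valid Parquet file begins and ends with the 4-byte magic string PAR1, and the characters printed after "invalid version number" look like whatever bytes were found in that position instead. The sketch below is only a rough diagnostic under that assumption. It expects the file to have been copied to local disk first (for example with hdfs dfs -get), and the function name check_parquet_magic is purely illustrative, not part of Impala or any Parquet library.

```python
import os
import sys

# Magic bytes that the Parquet format places at the start and end of a file.
PARQUET_MAGIC = b"PAR1"

# Smallest plausible file: leading magic + 4-byte footer length + trailing magic.
MIN_PARQUET_SIZE = 12


def check_parquet_magic(path):
    """Return (head, tail, ok) for a local file.

    head and tail are the first and last 4 bytes of the file; ok is True
    only when both equal the Parquet magic string.
    """
    if os.path.getsize(path) < MIN_PARQUET_SIZE:
        return b"", b"", False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last 4 bytes
        tail = f.read(4)
    return head, tail, head == PARQUET_MAGIC and tail == PARQUET_MAGIC


if __name__ == "__main__":
    # Usage: python check_magic.py part-r-00001-....gz.parquet [more files]
    for name in sys.argv[1:]:
        head, tail, ok = check_parquet_magic(name)
        print(f"{name}: head={head!r} tail={tail!r} ok={ok}")
```

If the trailing bytes are not PAR1, the file itself is probably truncated or was not written as Parquet, which would explain why refreshing the table metadata makes no difference.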
