File has an invalid version number. This could be due to stale metadata

I am running a CDH distribution (version 5.6.0) with Impala (version 2.4.0).

I have some Parquet files stored in HDFS, and I have created an Impala external table over them. The following query lists all the files successfully:

[cloudera-impala-dn0.eastus.cloudapp.azure.com:21000] > show files in parquettable;

Also, the metadata is correct (checked by executing describe parquettable).

The stats of the table are:

[cloudera-impala-dn0.eastus.cloudapp.azure.com:21000] > show table stats parquettable;

 

Rows | Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location
-1 | 838 | 249.64GB | NOT CACHED | NOT CACHED | PARQUET | false | hdfs://cloudera-impala-mn0.eastus.cloudapp.azure.com:8020/user/root/big_data

Executing the following query:

[cloudera-impala-dn0.eastus.cloudapp.azure.com:21000] > select count(*) from parquettable;

returns only the following warning, with no result and no error:

File 'hdfs://cloudera-impala-mn0.eastus.cloudapp.azure.com:8020/user/root/big_data/part-r-00001-7c29b85c-bd1f-420e-8834-96300076a92d.gz.parquet' has an invalid version number: ▒.F/ This could be due to stale metadata. Try running "refresh default.parquettable".

Running refresh default.parquettable did not have any effect.
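
One way to narrow this down: as far as I understand, the "version number" in this message is the 4 bytes Impala found at the end of the file where it expected the Parquet magic bytes `PAR1`. A valid Parquet file starts and ends with `PAR1`. The sketch below checks both ends of a local copy of the file (fetched with, for example, `hdfs dfs -get`). The function names and the local path are hypothetical, and this is only a diagnostic aid, not Impala's actual check.

```python
import os

# Every valid (unencrypted) Parquet file begins and ends with these 4 bytes.
PARQUET_MAGIC = b"PAR1"


def read_edge_bytes(path, n=4):
    """Return the first and last n bytes of a local file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(n)
        # Seek to n bytes before the end (or the start for tiny files).
        f.seek(max(size - n, 0))
        tail = f.read(n)
    return head, tail


def looks_like_parquet(path):
    """True if the file carries the Parquet magic at both ends."""
    head, tail = read_edge_bytes(path)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


if __name__ == "__main__":
    # Hypothetical local copy of the file named in the warning.
    local_copy = "part-r-00001.gz.parquet"
    if os.path.exists(local_copy):
        print(read_edge_bytes(local_copy), looks_like_parquet(local_copy))
```

If the trailing bytes are not `PAR1`, the file is either not Parquet at all (for example a text file sitting in the table directory) or was truncated while being written, and `refresh` cannot fix either case.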

Re: File has an invalid version number. This could be due to stale metadata

Hi,

 

I am getting the same issue with the versions below.

CDH: 6.2
Hive: 2.1.1-cdh6.2.1
Impala: 3.2.0-cdh6.2.1

I run compute stats <db_name.table_name> after running invalidate metadata <db_name.table_name> and refresh <db_name.table_name>, but I still get the same error:

ERROR: File 'hdfs://name_node/abc/xyz/000001_0' has an invalid version number: 2-11
This could be due to stale metadata. Try running "refresh <db_name.table_name>".

 

My process runs against more than 100 tables. The error occurs only intermittently, and only for 5-6 seemingly random tables.
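
With this many tables, it may help to find out which files are affected rather than retrying refresh. The following is a rough sketch that walks a local copy of a table directory and reports every file whose last 4 bytes are not the Parquet magic `PAR1`. The function name and the directory path are hypothetical, and it assumes the files have already been copied out of HDFS.

```python
import os


def find_non_parquet(root):
    """Return sorted (path, last_4_bytes) pairs for files under root
    that do not end with the Parquet magic bytes b"PAR1"."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                # Find the file size, then read the final 4 bytes.
                f.seek(0, os.SEEK_END)
                size = f.tell()
                f.seek(max(size - 4, 0))
                tail = f.read(4)
            if tail != b"PAR1":
                suspects.append((path, tail))
    return sorted(suspects)


if __name__ == "__main__":
    # Hypothetical local copy of one table's directory.
    table_dir = "./table_copy"
    if os.path.isdir(table_dir):
        for path, tail in find_non_parquet(table_dir):
            print(f"{path}: ends with {tail!r}")
```

A file that ends with readable text such as `2-11` would suggest that a non-Parquet file (for instance text output from a Hive job) was written into a table declared as Parquet, or that the file was read while it was still being written.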

 

 

 
