Environment : CDH 5.15
Impala version : impalad version 2.12.0-cdh5.15.0 RELEASE
OS: CentOS 6.10
Table size : 88TB
Partitions : 7K
Type : Parquet, files compacted to 256MB
We ingest data into the table's partitions every minute and run REFRESH on the table to load the new data. A separate compaction process runs every hour and merges the smaller files into larger ones. This setup worked fine for months, but recently we started running into inconsistent behavior between a few nodes. At random, some nodes appear to have inconsistent metadata: even though the REFRESH command ran successfully, those nodes still did not have the correct file list, so they referred to older files for those partitions.
We tried INVALIDATE METADATA (followed by DESCRIBE on the table to reload the metadata), but it didn't help. Even re-running REFRESH doesn't help every time. We need some help or pointers to figure out the issue.
* Is there a way to check whether any Impala nodes have stale metadata?
* How can we fix the metadata on an individual node? Is there a command for that?
* Has anyone faced a similar issue? Can you share your experience and the fix?
We do have a load balancer in front of Impala. In our case, the issue turned out to be a cross-referenced VIP in a different data center, which was causing load on the metadata servers. Still, some capabilities for checking metadata status are either missing, undocumented, or maybe I'm just unaware of them. SYNC_DDL makes the DDL query extremely slow: it goes from something that runs in 3 seconds to 5 minutes. We have a 30-node cluster, and I am hoping SYNC_DDL doesn't mean sequential execution of the DDL (and even that doesn't add up).
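For anyone who wants to reproduce the SYNC_DDL slowdown, here is a rough sketch of how it could be timed. The `execute` argument is an assumption: any callable that runs one statement on an open Impala session (for example a cursor's `execute` method). The table name in the usage comment is hypothetical.

```python
import time
from typing import Callable


def time_refresh(execute: Callable[[str], object], table: str, sync_ddl: bool) -> float:
    """Run REFRESH on `table` and return the elapsed wall-clock seconds.

    `execute` is assumed to run a single statement on an existing session,
    so the SET and the REFRESH share the same query options.
    """
    # Toggle the query option for this session before timing the REFRESH.
    execute(f"SET SYNC_DDL={1 if sync_ddl else 0}")
    start = time.monotonic()
    execute(f"REFRESH {table}")
    return time.monotonic() - start


# Hypothetical usage with an already-open cursor:
#   fast = time_refresh(cursor.execute, "mydb.events", sync_ddl=False)
#   slow = time_refresh(cursor.execute, "mydb.events", sync_ddl=True)
```

Running both variants against the same coordinator, a few times each, should show whether the extra time is consistently tied to SYNC_DDL or varies with cluster load.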
Is there a way to identify which node needs a metadata refresh (that is, which impalad has invalid metadata, or the time when its last metadata refresh occurred)?
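In the absence of a built-in status command, one workaround I can think of is to bypass the load balancer, run SHOW FILES against each impalad directly, and compare the results. The sketch below is only that, a sketch. It assumes the third-party impyla package for connections, assumes that SHOW FILES reflects each coordinator's cached view of the table, and treats the most common file list as the "correct" one, which may not hold if most nodes are stale. Host names and the partition spec in the usage comment are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional


def show_files_sql(table: str, partition_spec: Optional[str] = None) -> str:
    """Build a SHOW FILES statement, optionally limited to one partition."""
    sql = f"SHOW FILES IN {table}"
    if partition_spec:
        sql += f" PARTITION ({partition_spec})"
    return sql


def fetch_file_paths(host: str, sql: str, port: int = 21050) -> List[str]:
    """Run `sql` directly against one impalad and return the Path column.

    Assumes impyla is installed; it is imported here so the comparison
    helpers in this block can be used without it.
    """
    from impala.dbapi import connect

    conn = connect(host=host, port=port)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        # SHOW FILES returns (Path, Size, Partition); keep only the path.
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


def find_outlier_nodes(files_by_node: Dict[str, Iterable[str]]) -> List[str]:
    """Return nodes whose file set differs from the most common file set."""
    sets = {node: frozenset(paths) for node, paths in files_by_node.items()}
    if not sets:
        return []
    # Heuristic: the file set seen by the most nodes is taken as the reference.
    majority, _ = Counter(sets.values()).most_common(1)[0]
    return sorted(node for node, s in sets.items() if s != majority)


# Hypothetical usage:
#   sql = show_files_sql("mydb.events", "dt='2019-01-01'")
#   hosts = ["impalad01", "impalad02", "impalad03"]
#   stale = find_outlier_nodes({h: fetch_file_paths(h, sql) for h in hosts})
```

Comparing a single recently ingested partition keeps the SHOW FILES output small. A stricter check would compare each node's list against the actual HDFS directory listing instead of the majority.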