Inconsistent metadata between Impala daemons in CDH 5.15


Environment : CDH 5.15

Impala version : impalad version 2.12.0-cdh5.15.0 RELEASE

OS: CentOS 6.10

Table size : 88TB

Partitions : 7K

Type : Parquet, file size compacted 256MB


We ingest data into the table's partitions every minute and run REFRESH on the table to load the new data. A separate compaction process runs every hour and merges the smaller files into larger ones. This setup worked fine for months, but recently we started running into a strange issue: inconsistent behavior between a few nodes. At random, some nodes appear to have inconsistent metadata, i.e. even though the REFRESH command ran successfully, some nodes still did not have the correct file list, so they kept referring to older files for those partitions.


We tried INVALIDATE METADATA (followed by DESCRIBE on the table to reload the metadata), but it didn't help. Even re-running REFRESH doesn't help every time. We need some help/pointers to figure out the issue.


* Is there a way to check all Impala nodes for stale metadata?

* How can we fix the metadata on an individual node? Is there a command?

* Has anyone faced a similar issue? Can you share your experience and the fix?



Hi @sunilosunil,

The first question I have is: do you have a load balancer in front of Impala? If you do and you are not using the SYNC_DDL query option, there is a chance that you switch to another Impala daemon too quickly and read stale metadata, because an update made on one host might not yet have been synced to the other Impala daemons.
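
To make the SYNC_DDL suggestion concrete, here is a minimal sketch of the statements an ingest job could send in a single session. The helper name `refresh_statements`, the table name `events_db.events`, and the partition column `dt` are made up for illustration; only `SET SYNC_DDL=1` and `REFRESH ... PARTITION (...)` are Impala syntax. It only builds the statement text, so it assumes you run the statements through whatever client you already use (impala-shell, JDBC, etc.).

```python
def refresh_statements(table, partition=None, sync_ddl=True):
    """Build the statements for one refresh, to be run in the same session.

    table     -- fully qualified table name, e.g. "events_db.events" (hypothetical)
    partition -- optional dict of partition column -> value
    sync_ddl  -- if True, enable the SYNC_DDL query option first
    """
    def literal(value):
        # Quote strings as SQL literals; leave numbers as they are.
        if isinstance(value, str):
            return "'" + value.replace("'", "\\'") + "'"
        return str(value)

    statements = []
    if sync_ddl:
        # Query options are per session, so this must precede the REFRESH
        # in the same session.
        statements.append("SET SYNC_DDL=1")

    refresh = "REFRESH " + table
    if partition:
        spec = ", ".join(f"{col}={literal(val)}" for col, val in partition.items())
        refresh += f" PARTITION ({spec})"
    statements.append(refresh)
    return statements


if __name__ == "__main__":
    for stmt in refresh_statements("events_db.events", {"dt": "2019-01-01"}):
        print(stmt + ";")
```

Refreshing only the partition that received new files, rather than the whole table, keeps the amount of metadata that has to be reloaded and propagated small, which may matter on a table with thousands of partitions.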



We do have a load balancer in front of Impala. In our case the issue turned out to be a cross-referenced VIP in a different data center, which was causing load on the metadata servers. Still, some capabilities around metadata status are either missing, undocumented, or maybe I'm just unaware of them. SYNC_DDL makes the DDL query extremely slow: the run time goes from about 3 seconds to 5 minutes. We have a 30-node cluster, and I am hoping SYNC_DDL doesn't mean sequential execution of the DDL (and even that doesn't add up).


Is there a way to identify which node needs a metadata refresh (or which impalad has invalid metadata, e.g. the time when its last metadata refresh occurred)?


With only 30 nodes, it does not make sense for SYNC_DDL to take a query from 3 seconds to 5 minutes. It looks like there is a bottleneck somewhere in the cluster.

Is it possible to shut down a few Impala daemons as a test to see which host(s) might be contributing to the slow sync? Or maybe it is the catalogd server itself?

Currently I do not think there is a way to identify which node needs a metadata refresh, as this is internal to Impala and such information is not exposed.
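
As a rough workaround rather than a built-in feature, one could connect to each impalad directly (bypassing the load balancer), run `SHOW FILES IN <table> PARTITION (...)` on each, and compare the file lists. The sketch below covers only the comparison step. The function name `find_stale_nodes` and the host names are hypothetical, and how the listings are collected is left to your client of choice. If no reference listing is given (for example one taken from HDFS), it assumes the most common listing among the nodes is the correct one, which may not hold if most nodes are stale.

```python
from collections import Counter


def find_stale_nodes(listings, reference=None):
    """Return nodes whose file list differs from the reference.

    listings  -- dict of node name -> iterable of file paths, e.g. the Path
                 column of SHOW FILES collected from each impalad
    reference -- optional iterable of expected file paths; if omitted, the
                 most common listing among the nodes is used (an assumption)
    """
    file_sets = {node: frozenset(paths) for node, paths in listings.items()}

    if reference is None:
        # Majority vote: treat the listing shared by most nodes as correct.
        reference = Counter(file_sets.values()).most_common(1)[0][0]
    else:
        reference = frozenset(reference)

    stale = {}
    for node, files in file_sets.items():
        if files != reference:
            stale[node] = {
                "missing": sorted(reference - files),  # files the node does not see yet
                "extra": sorted(files - reference),    # old files the node still references
            }
    return stale


if __name__ == "__main__":
    # Hypothetical listings: node3 still points at pre-compaction files.
    listings = {
        "node1": ["/data/t/dt=1/compacted_0.parq"],
        "node2": ["/data/t/dt=1/compacted_0.parq"],
        "node3": ["/data/t/dt=1/small_0.parq", "/data/t/dt=1/small_1.parq"],
    }
    print(find_stale_nodes(listings))
```

Because ingestion runs every minute, listings taken a few seconds apart can legitimately differ, so a node should probably be flagged only if it stays out of sync across repeated checks.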
