At least once a day we hit a situation where a REFRESH on a table is running in Impala, takes about an hour to finish, and appears to be blocked on something.
Even simple queries like DESCRIBE on another table can take an hour to finish while the REFRESH is running.
Do you know of any locking issues I should be aware of? How can I investigate this further? I haven't found anything about it and don't know where to start.
Thanks for the help
We had this issue too (on different versions, most recently with 5.10) and tried many things over a six-month period:
- Fiddled with catalog server config, such as -num_metadata_loading_threads=32 in the advanced config snippet.
- Increased catalog server memory to 24GB (I think), but since it didn't help we went back to 16GB.
- Refactored many of our jobs to drastically reduce the number of REFRESH and INVALIDATE METADATA calls.
- Went from running COMPUTE STATS on our tables daily to weekly on Saturdays.
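To illustrate that third point: a partition-scoped REFRESH is much cheaper than a table-wide one, since only the named partition's file metadata is reloaded. The table name and partition values below are placeholders, and partition-scoped REFRESH requires Impala 2.7 / CDH 5.9 or later:

```sql
-- Reload metadata for just the partition a load job wrote,
-- instead of rescanning every partition of the table:
refresh sometable partition (event_date_yearmonth = '201712');

-- Avoid this after routine loads; it discards all cached
-- metadata for the table, which is then reloaded on next access:
invalidate metadata sometable;
```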
All pretty much to no avail. Then we noticed some tables were not being defragmented by our maintenance jobs and had upwards of 100K files (each!). We fixed that and started compacting other tables more aggressively, so we went from over 2 million total files to about 1.4 million.
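For anyone wanting to check how fragmented their own tables are, Impala can report per-partition file counts directly (`sometable` is a placeholder):

```sql
-- The #Files and Size columns show how fragmented each partition is;
-- the final row gives totals for the whole table:
show table stats sometable;
```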
That did the trick. No more long catalog metadata operations. Hope that helps.
Very good and interesting direction to check :)
From what I've seen of our tables, the problematic ones have 2,000-20,000 partitions and 1,000-10,000 files each, so maybe it's too many partitions.
Can you elaborate on how you defragment/compact your tables? What process do your maintenance jobs run?
Thanks a lot!
For us it didn't appear to be any particular table having too many files or partitions, but rather the catalog tracking too many of them overall. So definitely compact the most fragmented tables to start with, but the goal is to lower the total file count.
We use Impala itself, doing an INSERT OVERWRITE in place. This does cause a short outage: queries against that table will fail for a few seconds (if they read the same partitions being overwritten), so we schedule it late at night.
For a typical table partitioned by event_date_yearmonth and account_id_mod (i.e. account_id % 10), we typically compact the current month (which has data coming in throughout the day, hence many new small files) with:
insert overwrite sometable partition(event_date_yearmonth, account_id_mod)
select * from sometable where event_date_yearmonth = '201712';
This results in one file per partition (or more if a partition's data exceeds the block size). All account_id_mod partitions for event_date_yearmonth 201712 are rewritten, while other months are untouched.
Notice I didn't specify partition values in the partition clause, so it's fully dynamic, and that's why the * works: * returns all the schema columns AND the partition columns, with the partition columns last, which is exactly what a dynamic-partition insert expects.
Caution though: having one file per partition decreases parallelism and query performance if your common use case is reading a single partition at a time. If so, you can set PARQUET_FILE_SIZE before the insert to create files in each partition smaller than the default 128m.
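For example, reusing the same compaction statement (the 64m target is illustrative, not a recommendation; tune it to your partition sizes):

```sql
-- Cap Parquet files at roughly 64 MB so a single-partition scan
-- still spreads across multiple files and gets some parallelism:
set PARQUET_FILE_SIZE=64m;

insert overwrite sometable partition(event_date_yearmonth, account_id_mod)
select * from sometable where event_date_yearmonth = '201712';
```

The query option only affects the session that sets it, so it won't shrink files written by your other jobs.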