Catalog server publishing update with a delay after refresh

When a REFRESH command is run in Impala on a table, catalogd logs the refresh as follows:

 

Refreshing table metadata: XXXXXX

I0517 09:11:17.238642 13238 HdfsTable.java:1194] Incrementally loading table metadata for: XXXXXX
I0517 09:11:17.251282 13238 HdfsTable.java:835] Loading file and block metadata for 1050 paths for table XXXXXX using a thread pool of size 5
I0517 09:11:17.345129 13238 HdfsTable.java:875] Loaded file and block metadata for XXXXXX
I0517 09:11:17.345355 13238 HdfsTable.java:1204] Incrementally loaded table metadata for: XXXXXX
I0517 09:11:17.345435 13238 CatalogServiceCatalog.java:1019] Refreshed table metadata: XXXXXX
.
.
other metadata activities on other tables
.
.
.
I0517 09:15:53.448909 106386 catalog-server.cc:324] Publishing update: TABLE:XXXXX@104793

 

Notice the significant delay (from 09:11:17 to 09:15:53, about four and a half minutes) before the refreshed table is published to the Impala daemons. Queries that access the table in the meantime, before the update is published, fail with the following error:

 

File 'hdfs://XXXXXXXXXXXXX.parquet' has an invalid version number:  This could be due to stale metadata. Try running "refresh XXXXX".

 

Is there a reason for this delay, or am I misunderstanding it?
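
To put a number on the gap, here is a minimal sketch that pairs each "Refreshed table metadata" line with the next "Publishing update: TABLE:" line for the same table and reports the elapsed seconds. It assumes the glog-style prefix seen in the logs above. The table name `db.tbl` and the function names are hypothetical, used only for illustration; the timestamps are the ones from the log excerpt.

```python
import re
from datetime import datetime

# glog-style prefix: severity letter, MMDD, HH:MM:SS.ffffff, thread id, file:line]
LINE_RE = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+:\d+\]\s*(.*)$"
)
REFRESHED = "Refreshed table metadata: "
PUBLISHED = "Publishing update: TABLE:"


def parse_line(line):
    """Return (timestamp, message), or None if the line has no glog prefix."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    mmdd, clock, message = m.groups()
    # The log prefix carries no year; a fixed one is fine for computing differences.
    ts = datetime.strptime(f"2000{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
    return ts, message


def refresh_to_publish_delays(lines):
    """Return a list of (table, catalog_version, delay_in_seconds)."""
    refreshed = {}  # table name -> time its refresh finished
    delays = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, message = parsed
        if message.startswith(REFRESHED):
            refreshed[message[len(REFRESHED):].strip()] = ts
        elif message.startswith(PUBLISHED):
            # Message body looks like "<table>@<catalog version>".
            table, _, version = message[len(PUBLISHED):].strip().rpartition("@")
            if table in refreshed:
                delta = ts - refreshed.pop(table)
                delays.append((table, version, delta.total_seconds()))
    return delays


# Hypothetical table name; timestamps copied from the excerpt above.
SAMPLE_LOG = [
    "I0517 09:11:17.345435 13238 CatalogServiceCatalog.java:1019] Refreshed table metadata: db.tbl",
    "I0517 09:15:53.448909 106386 catalog-server.cc:324] Publishing update: TABLE:db.tbl@104793",
]

if __name__ == "__main__":
    for table, version, seconds in refresh_to_publish_delays(SAMPLE_LOG):
        print(f"{table} (version {version}): published {seconds:.3f}s after refresh")
```

Running this over the full catalogd log would show whether the delay is specific to one table or affects every refresh in that window.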

 

 

I also notice that the publish statements are usually grouped together, as follows:

 

I0517 09:11:17.548977 106386 catalog-server.cc:324] Publishing update: TABLE:1@104780
I0517 09:11:17.582988 106386 catalog-server.cc:324] Publishing update: TABLE:2@104779
I0517 09:11:17.586165 106386 catalog-server.cc:324] Publishing update: TABLE:3@104781
I0517 09:11:17.586362 106386 catalog-server.cc:324] Publishing update: TABLE:4@104778
I0517 09:11:17.592391 106386 catalog-server.cc:324] Publishing update: TABLE:5@104776
I0517 09:11:17.602597 106386 catalog-server.cc:324] Publishing update: CATALOG:a122eeb2b432428e:80c5f5d3f78ee968@104781
I0517 09:11:17.602612 106386 catalog-server.cc:351] Publishing deletion: TABLE:6

 

Is that the reason why changes are pushed with a delay?
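
To check how the publish lines cluster, the sketch below groups "Publishing update" and "Publishing deletion" lines into bursts, starting a new burst whenever the gap to the previous publish line exceeds a threshold. The one-second threshold and the function name are assumptions chosen for illustration, not anything catalogd defines; the sample lines are the ones quoted in this post.

```python
import re
from datetime import datetime, timedelta

# glog-style prefix followed by a publish message; captures MMDD, time, and topic entry.
PUBLISH_RE = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+:\d+\]\s*"
    r"Publishing (?:update|deletion): (\S+)"
)


def publish_bursts(lines, max_gap=timedelta(seconds=1)):
    """Group publish lines into bursts separated by more than max_gap."""
    bursts = []
    last_ts = None
    for line in lines:
        m = PUBLISH_RE.match(line.strip())
        if m is None:
            continue
        mmdd, clock, entry = m.groups()
        # The log prefix carries no year; a fixed one is fine for computing gaps.
        ts = datetime.strptime(f"2000{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
        if last_ts is None or ts - last_ts > max_gap:
            bursts.append([])  # gap too large: start a new burst
        bursts[-1].append((ts, entry))
        last_ts = ts
    return bursts


PUBLISH_SAMPLE = [
    "I0517 09:11:17.548977 106386 catalog-server.cc:324] Publishing update: TABLE:1@104780",
    "I0517 09:11:17.582988 106386 catalog-server.cc:324] Publishing update: TABLE:2@104779",
    "I0517 09:11:17.586165 106386 catalog-server.cc:324] Publishing update: TABLE:3@104781",
    "I0517 09:11:17.586362 106386 catalog-server.cc:324] Publishing update: TABLE:4@104778",
    "I0517 09:11:17.592391 106386 catalog-server.cc:324] Publishing update: TABLE:5@104776",
    "I0517 09:11:17.602597 106386 catalog-server.cc:324] Publishing update: CATALOG:a122eeb2b432428e:80c5f5d3f78ee968@104781",
    "I0517 09:11:17.602612 106386 catalog-server.cc:351] Publishing deletion: TABLE:6",
    "I0517 09:15:53.448909 106386 catalog-server.cc:324] Publishing update: TABLE:XXXXX@104793",
]

if __name__ == "__main__":
    for burst in publish_bursts(PUBLISH_SAMPLE):
        start, end = burst[0][0], burst[-1][0]
        print(f"{start.time()} to {end.time()}: {len(burst)} entries")
```

Comparing the start time of each burst with the refresh completion times would show whether a refreshed table simply waits for the next burst or is held back for longer.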

 

Let me know if you need more information.

 

1 REPLY

Re: Catalog server publishing update with a delay after refresh

Hi,

 

Can someone please help with the issue/behaviour described above?

 

Thanks.