Welcome to the Cloudera Community

Impala Metadata size - sane values

Hello all,

I know this question has been asked several times over the years, but given the recent progress in Impala performance and the introduction of split coordinator/executor roles, can you give some fresh recommendations on the following maximum/sane values per cluster:

  • number of databases;
  • number of tables per database;
  • number of columns per table;
  • number of partitions per database;
  • number of partitions per table.

The latest recommendations I have found so far are from the 2017 "Latest Impala Cookbook", but even there the information is sparse (for example, I'm not sure whether the 100k recommendation is still valid, or whether that value applies per table or per database).

Any fresh info will be helpful.

Thanks in advance.
