05-09-2018
06:14 PM
Kudu does not use HDFS at all; it requires its own storage space. With the default 3x replication and no compression, Kudu will take 3x the amount of space you ingest. However, Kudu tends to encode and compress data efficiently, so you will have to evaluate how much space it actually takes based on your schema and ingestion patterns.

The more RAM you give Kudu, the better it will perform... treat Kudu like a database (think MySQL or Vertica).

Right now there is no way to specify a quota. The only available settings related to that are --fs_wal_dir_reserved_bytes ( https://kudu.apache.org/docs/configuration_reference.html#kudu-tserver_fs_wal_dir_reserved_bytes ) and --fs_data_dirs_reserved_bytes ( https://kudu.apache.org/docs/configuration_reference.html#kudu-tserver_fs_data_dirs_reserved_bytes ).

If you need to closely control the amount of space Kudu uses, consider putting it on its own partitions or machines. That said, it is possible to run Kudu on the same machines that have HDFS running on them if you want to.

Hope that helps!
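As a sketch of how those two flags might be used, here is a tablet server flags fragment that reserves headroom on each volume for other processes. The byte values are illustrative only, not recommendations; the flags themselves are the real ones linked above, typically set in the tserver's gflagfile (or the corresponding safety valve in Cloudera Manager):

```
# Keep ~10 GiB free on the WAL volume (value is illustrative)
--fs_wal_dir_reserved_bytes=10737418240
# Keep ~10 GiB free on each configured data directory's volume
--fs_data_dirs_reserved_bytes=10737418240
```

Note these flags reserve space *for other users of the disk* rather than capping Kudu's own usage, which is why they are not a true quota.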
02-27-2018
08:06 AM
Kudu can evaluate simple filters natively, e.g. those against a table's primary index, so Impala pushes such filters directly down to Kudu. More complex filters (e.g. those involving UDFs) are evaluated by Impala after it receives rows from Kudu. The explain plan clearly distinguishes the filters evaluated by Kudu from those evaluated by Impala.
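As a sketch (the table and column names here are hypothetical: `metrics` with primary-key column `id` and a user-defined function `my_udf`), you can see the distinction by comparing explain plans:

```sql
-- A simple comparison on the primary-key column can be pushed to Kudu;
-- it shows up under the scan node's "kudu predicates:" in the plan.
EXPLAIN SELECT * FROM metrics WHERE id < 1000;

-- A filter involving a UDF cannot be pushed down; Impala evaluates it
-- after the scan, and it is listed under "predicates:" instead.
EXPLAIN SELECT * FROM metrics WHERE my_udf(payload) = 'x';
```

Checking which line a filter appears under is a quick way to confirm whether pushdown happened for your query.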