Member since: 06-29-2017
Posts: 11
Kudos Received: 0
Solutions: 0
05-09-2018
12:27 AM
Hi, we are evaluating Kudu for real-time ingestion in production, but we are not sure how much disk/RAM it needs. I understand that Kudu storage does not use HDFS, which means we need additional resources, and we don't know how much disk/RAM is required per 1 GB of data ingested from Kafka. How can I control the quota (%) for each process if Kudu does not run under YARN? Do you recommend a separate cluster for Kudu in order to control memory usage? Is there any formula to calculate the additional hardware for Kudu? disk = replication factor * data? And memory? If Kudu does not need HDFS for storage, then HDFS is probably not necessary. Isn't that right? Thanks in advance
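The "disk = replication factor * data" rule of thumb from the question can be sketched as a small calculation. This is a back-of-envelope sketch only, not official Kudu sizing guidance: the replication factor of 3 is Kudu's common default, while the compression ratio and per-GB memory overhead below are hypothetical placeholders — real numbers depend on your schema, column encodings, and compaction overhead.

```python
# Back-of-envelope Kudu sizing sketch (assumptions, not official guidance).
REPLICATION_FACTOR = 3     # common Kudu default
COMPRESSION_RATIO = 0.5    # hypothetical columnar compression; measure on your data
MEM_GB_PER_DATA_GB = 0.01  # hypothetical tablet-server overhead per GB on disk

def kudu_disk_estimate_gb(ingested_gb,
                          replication=REPLICATION_FACTOR,
                          compression=COMPRESSION_RATIO):
    """Estimated raw disk: ingested data, after compression, times replication."""
    return ingested_gb * compression * replication

def kudu_memory_estimate_gb(ingested_gb,
                            mem_per_gb=MEM_GB_PER_DATA_GB):
    """Estimated extra tablet-server memory for the compressed, replicated data."""
    return kudu_disk_estimate_gb(ingested_gb) * mem_per_gb

# Example: 100 GB ingested from Kafka
print(kudu_disk_estimate_gb(100))    # 100 * 0.5 * 3 = 150.0 GB on disk
print(kudu_memory_estimate_gb(100))  # 150 * 0.01 = 1.5 GB of memory
```

In practice you would replace the two placeholder constants with measured values from a test ingest of representative data before sizing hardware.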
Labels: Apache Kudu
02-27-2018
05:32 AM
OK, thanks @AcharkiMed, I understand that Impala not only shows Kudu tables (as external tables) but also processes the data. If you create a UDF ('validateCard' as an Impala function) I guess you can use it, so Kudu is just a storage engine and does not process anything. Then, if some data is stored in Kudu format, it does not use HDFS. Am I right?
02-26-2018
07:35 AM
I am using an Impala+Kudu table, and I don't know whether Impala is just an interface to view those tables in Hue/shell, or whether it acts as the processing engine when I launch a query (SELECT, UPDATE). Is Impala doing the processing when I launch a query? Does Impala+Kudu allow UDFs, or do UDFs only work with Impala alone (without Kudu storage)? Thanks in advance
Labels: Apache Impala, Apache Kudu