Member since 04-08-2014
70 Posts
20 Kudos Received
12 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6838 | 07-16-2018 04:12 PM |
| | 6953 | 07-13-2018 03:17 PM |
| | 7454 | 07-10-2018 03:00 PM |
| | 7127 | 07-10-2018 02:54 PM |
| | 7709 | 07-05-2018 03:35 PM |
05-15-2018 07:16 PM
I looked into this a little more today, out of interest, and added some additional info about how to make it work on EL6 here: https://issues.apache.org/jira/browse/KUDU-2442

Regardless, if you are using a parcel install, you must first symlink the include files and the libraries into /usr/ or /usr/local, something like this:

sudo ln -s /opt/cloudera/parcels/CDH/include/kudu /usr/local/include/
sudo ln -s /opt/cloudera/parcels/CDH/lib64/libkudu_client.so /usr/local/lib64/
sudo ln -s /opt/cloudera/parcels/CDH/lib64/libkudu_client.so.0 /usr/local/lib64/
sudo ln -s /opt/cloudera/parcels/CDH/lib64/libkudu_client.so.0.1.0 /usr/local/lib64/
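The symlink step above could also be wrapped in a small function so it can be tried against a scratch directory before touching /usr/local. This is only a sketch: `link_kudu_client` is a made-up helper name, and it assumes the parcel keeps the headers under include/kudu and the client libraries under lib64, as in the commands in this post.

```shell
#!/bin/sh
# Sketch only. link_kudu_client is a hypothetical helper, not part of Kudu or CDH.
# $1: parcel root (assumed here: /opt/cloudera/parcels/CDH)
# $2: destination prefix (assumed here: /usr/local)
link_kudu_client() {
    parcel_root="$1"
    prefix="$2"
    mkdir -p "$prefix/include" "$prefix/lib64"
    # Headers: link the whole kudu include directory.
    ln -sfn "$parcel_root/include/kudu" "$prefix/include/kudu"
    # Libraries: link libkudu_client.so and every versioned name next to it.
    for lib in "$parcel_root"/lib64/libkudu_client.so*; do
        [ -e "$lib" ] || continue   # skip the literal pattern when nothing matches
        ln -sf "$lib" "$prefix/lib64/"
    done
}
```

Run as root, `link_kudu_client /opt/cloudera/parcels/CDH /usr/local` should create the same links as the four commands above. Pointing the second argument at a temporary directory first is a cheap way to check what it would link.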
05-15-2018 01:18 PM
I looked at this a little more and I think the kudu-python install on RHEL 6 / CentOS 6 is broken because newer versions of pip lack Python 2.6 support. There are a couple of related JIRAs on this issue: https://issues.apache.org/jira/browse/KUDU-1705 and https://issues.apache.org/jira/browse/KUDU-2442

Another issue is that the parcel does not link the include files and libraries into a place where pip can find them. You may be able to work around that by symlinking /opt/cloudera/parcels/CDH/include/kudu to /usr/include/kudu and /opt/cloudera/parcels/CDH/lib64/*kudu* into /usr/lib64/, to see if that helps solve the problem on a newer OS version.
05-15-2018 11:53 AM
Hi elkarel,

This is a very old thread and has some outdated information and expired links. Please describe the problems you are facing, including any error messages you are seeing, your version, platform, etc.

Regarding your question about the documentation, there is no longer any special procedure to install Kudu. With the latest versions of CDH, Kudu is included in the parcel: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/kudu_install_cm.html

Thanks,
Mike
05-09-2018 06:14 PM
1 Kudo
Kudu does not use HDFS at all. It requires its own storage space. If you use 3x replication (the default) and no compression, then Kudu will take 3x the amount of space that you ingest. However, Kudu tends to encode and compress data efficiently, so you will have to evaluate how much space Kudu takes based on your schema and data ingestion patterns.

The more RAM you give Kudu, the better it will perform... treat Kudu like a database (think MySQL or Vertica).

Right now there is no way to specify a quota; the only available settings related to that are --fs_wal_dir_reserved_bytes ( https://kudu.apache.org/docs/configuration_reference.html#kudu-tserver_fs_wal_dir_reserved_bytes ) and --fs_data_dirs_reserved_bytes ( https://kudu.apache.org/docs/configuration_reference.html#kudu-tserver_fs_data_dirs_reserved_bytes )

If you need to closely control the amount of space Kudu uses, then you can consider putting it on its own partitions or machines. However, it is possible to put Kudu on the same machines that have HDFS running on them if you want to do that. Hope that helps!
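To put rough numbers on the sizing rule above, here is a minimal sketch. Both function names are hypothetical helpers, not Kudu tools. The first gives only an upper bound, because it ignores encoding and compression; the second just formats the two reservation flags named above with an illustrative byte count.

```shell
# Upper-bound estimate of Kudu disk usage in GB, ignoring encoding and compression.
# estimate_kudu_gb is a hypothetical helper.
# $1: ingested data in GB, $2: replication factor (defaults to 3)
estimate_kudu_gb() {
    ingested_gb="$1"
    replication="${2:-3}"
    echo $(( ingested_gb * replication ))
}

# Print the two reservation flags with the same byte count for each.
# The flag names come from the Kudu configuration reference; the value is illustrative.
# $1: space to reserve, in GiB
reserved_bytes_flags() {
    gib="$1"
    bytes=$(( gib * 1024 * 1024 * 1024 ))
    echo "--fs_wal_dir_reserved_bytes=$bytes"
    echo "--fs_data_dirs_reserved_bytes=$bytes"
}
```

For example, `estimate_kudu_gb 500` prints 1500, the worst case for 500 GB ingested at 3x replication. Note that the reservation flags set aside space on each filesystem for non-Kudu use; they do not cap how much Kudu itself can grow.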
05-07-2018 05:57 PM
Unfortunately, there is no support for storing data on ADLS at this time. Regarding backup and restore, we are in the process of designing a backup solution for Kudu; however, there is no ETA for when it will be ready to use. You can follow the progress at https://issues.apache.org/jira/browse/KUDU-1575
06-28-2017 09:38 AM
1 Kudo
Impala relies heavily on parallelism for throughput, so if you have 60 partitions for Kudu and 1800 partitions for Parquet, then Impala's current single-thread-per-partition limitation builds a huge disadvantage for Kudu into this comparison. Please let us know if you re-run your comparison test.
06-27-2017 09:30 PM
If you are under the scale limits, consider increasing the number of partitions. Impala tends to use one thread per partition when scanning.
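As a rough illustration of where the partition count is set, the sketch below prints an Impala CREATE TABLE statement for a hash-partitioned Kudu table. Everything in it is an assumption for the example: `kudu_hash_table_ddl` is a made-up helper, the table and column names are placeholders, and the PARTITION BY HASH ... PARTITIONS syntax assumes a reasonably recent Impala version.

```shell
# kudu_hash_table_ddl is a hypothetical helper; table and column names are placeholders.
# $1: table name, $2: number of hash partitions
kudu_hash_table_ddl() {
    cat <<EOF
CREATE TABLE $1 (
  id BIGINT,
  payload STRING,
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS $2
STORED AS KUDU;
EOF
}
```

Calling `kudu_hash_table_ddl my_table 120` prints the statement so it can be reviewed before running it in impala-shell. As far as I know, the hash partition count is fixed when the table is created, so changing it means creating a new table and reloading the data.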
06-27-2017 09:29 PM
Could you check whether you are under the current scale recommendations for Kudu? We are working hard on increasing these limits and will try to do so for each coming release. Current scale limits for CDH 5.11 (Kudu 1.3): https://www.cloudera.com/documentation/kudu/latest/topics/kudu_known_issues.html#concept_cws_n4n_5z
06-27-2017 03:06 PM
1 Kudo
Make sure you run COMPUTE STATS after loading the data so that Impala knows how to join the Kudu tables. What is the total size of your data set? I am surprised at the difference in your numbers and I think they should be closer if tuned correctly. Regardless, if you don't need to be able to do online inserts and updates, then Kudu won't buy you much over the raw scan speed of an immutable on-disk format like Impala + Parquet on HDFS.
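For the COMPUTE STATS step, a small sketch that generates one statement per table may help when several tables were loaded. `compute_stats_sql` is a hypothetical helper and the table names you pass it are your own; only the COMPUTE STATS statement itself comes from Impala.

```shell
# compute_stats_sql is a hypothetical helper; it prints one COMPUTE STATS
# statement for each table name given as an argument.
compute_stats_sql() {
    for table in "$@"; do
        printf 'COMPUTE STATS %s;\n' "$table"
    done
}
```

The output can be pasted into impala-shell, or each table can be handled directly with something like `impala-shell -q "COMPUTE STATS my_db.my_table"`, where my_db.my_table is a placeholder.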