Member since: 02-27-2019
Posts: 11
Kudos Received: 0
Solutions: 0
02-03-2021 08:22 AM
Apologies for disturbing this old thread, but it seems people are still landing on it via search: the discussion here is outdated, especially the exclusions around gateway nodes. Please contact your Cloudera representative for the latest terms and conditions.
03-29-2019 01:48 PM
1 Kudo
This is a known issue with the code that auto-detects whether replicas of non-replicated tablets can be moved without issues (see KUDU-2443). The code relies on std::regex. When the tool is built with g++/libstdc++ versions < 4.9, std::regex unexpectedly fails to compile a regular expression containing a bracket, throwing a std::regex_error exception (see [1]). Starting from version 4.9.1, libstdc++ has proper support for C++11 regular expressions (see [2]). This makes the kudu CLI crash when running 'kudu cluster rebalance' on the following platforms:
* RHEL/CentOS 7
* Ubuntu 14.04 LTS (Trusty)
* SLES 12
You should be able to work around the problem by setting the --move_single_replicas flag to either 'enabled' or 'disabled', as you require, instead of the default 'auto'. Unfortunately, there is no release in the CDH 5 line in which this issue is fixed (yet).
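For illustration, the workaround might look like the following invocation; the master address 'master-host:7051' is a placeholder, not something from this thread:

```shell
# Work around KUDU-2443 on platforms shipping libstdc++ < 4.9 by avoiding
# the default --move_single_replicas=auto, whose auto-detection exercises
# the std::regex code path that crashes there.
# 'master-host:7051' is a placeholder for your cluster's master address(es).
kudu cluster rebalance master-host:7051 --move_single_replicas=disabled
```

Use 'disabled' to skip moving replicas of non-replicated tablets entirely, or 'enabled' if you know such moves are safe on your cluster's version.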
03-29-2019 09:11 AM
1 Kudo
If I understand correctly, you are talking about the logs in the configured --log_dir. By default, Kudu keeps 10 log files per severity level. There is a flag to change that value, but it is currently marked as "experimental". It has been in Kudu for some time, so not promoting it to stable is probably a bit of an oversight. I opened an Apache Kudu JIRA (KUDU-2754) to change it to a stable config. In the meantime, you can use the --max_log_files configuration after unlocking experimental configurations via --unlock_experimental_flags.
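As a sketch only, the two flags could be passed to a tablet server like this; the log directory path and the value 20 are assumptions for illustration, and any other flags your deployment needs are omitted:

```shell
# Raise the per-severity log-file cap from the default 10 to 20.
# --unlock_experimental_flags is required because --max_log_files is
# still marked experimental (see KUDU-2754).
kudu-tserver \
  --log_dir=/var/log/kudu \
  --unlock_experimental_flags \
  --max_log_files=20
```

If you deploy via Cloudera Manager, the same flags would go into the daemon's safety-valve/advanced configuration snippet rather than the command line.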
03-20-2019 09:01 AM
1 Kudo
Indeed, a significant amount of memory is going to be consumed just as overhead to support that number of tablets. You should either reduce the number of tablets per tablet server, or increase the amount of RAM available to Kudu on those heavily loaded machines.
03-01-2019 12:45 AM
You have mentioned that NTP is not related to the problem. Let's consider this scenario:
1. An Impala daemon is working with the READ_AT_SNAPSHOT setting enabled. The Impala daemon makes a read operation in Kudu, setting the read timestamp T1 immediately after the preceding write operation.
2. Kudu dispatches the read request to some replica R1. This replica R1 is running on a machine with poorly configured NTP, so the local time on this machine is 1 minute behind.
3. The replica R1 waits for the timeout specified by --safe_time_max_lag_ms: 30 seconds. After the timeout, the local time is still 30 seconds behind T1 (ideally).
Does this lead to the problem under discussion: 'Tablet is lagging too much to be able to serve snapshot scan'?
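To make the timeline arithmetic in the scenario above concrete, a minimal sketch using only the numbers from the scenario (this is illustrative arithmetic, not real Kudu internals):

```shell
# Replica R1's clock is 60 s behind the writer's clock when T1 is assigned.
clock_skew_s=60
# R1 waits at most --safe_time_max_lag_ms (30 s here) before giving up.
safe_time_max_lag_s=30
# Even after waiting the full timeout, R1's safe time still trails T1 by:
echo $((clock_skew_s - safe_time_max_lag_s))
```

With 30 seconds of lag still remaining after the maximum wait, the replica would indeed be unable to serve the snapshot scan at T1.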