Kudu rebalance crash

The Kudu rebalance tool crashes both when I run it from the command line and when I use the Cloudera Manager UI.

Here is the stderr output displayed in Cloudera Manager:

+ exec kudu cluster rebalance master1.domain.com,master2.domain.com,master3.domain.com --max_moves_per_server=5 --max_run_time_sec=0 --max_staleness_interval_sec=300
terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
*** Aborted at 1553857242 (unix time) try "date -d @1553857242" if you are using GNU date ***
PC: @     0x7fbfc3637207 __GI_raise
*** SIGABRT (@0x3ec00005c50) received by PID 23632 (TID 0x7fbfc5c83a00) from PID 23632; stack trace: ***
    @     0x7fbfc5642680 (unknown)
    @     0x7fbfc3637207 __GI_raise
    @     0x7fbfc36388f8 __GI_abort
    @     0x7fbfc3f467d5 __gnu_cxx::__verbose_terminate_handler()
    @     0x7fbfc3f44746 (unknown)
    @     0x7fbfc3f44773 std::terminate()
    @     0x7fbfc3f44993 __cxa_throw
    @     0x7fbfc3f99dd5 std::__throw_regex_error()
    @           0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
    @           0x931e3a std::__detail::_Compiler<>::_M_atom()
    @           0x932469 std::__detail::_Compiler<>::_M_alternative()
    @           0x9324c4 std::__detail::_Compiler<>::_M_alternative()
    @           0x932649 std::__detail::_Compiler<>::_M_disjunction()
    @           0x93297b std::__detail::_Compiler<>::_Compiler()
    @           0x932cb7 std::__detail::__compile<>()
    @           0x92bfc6 (unknown)
    @           0x92c664 std::_Function_handler<>::_M_invoke()
    @           0xde6672 kudu::tools::Action::Run()
    @           0x9957d7 kudu::tools::DispatchCommand()
    @           0x99619b kudu::tools::RunTool()
    @           0x8dee4d main
    @     0x7fbfc36233d5 __libc_start_main
    @           0x9284b5 (unknown)

I have already created an issue: https://issues.apache.org/jira/browse/KUDU-2753.

Strangely, there was no such issue in Jira before. Has anybody else run into this?

1 ACCEPTED SOLUTION

Re: Kudu rebalance crash

Cloudera Employee

This is a known issue in the code that auto-detects whether replicas of non-replicated tablets can be moved without issues (see KUDU-2443). That code relies on std::regex. The tool was built with g++/libstdc++ older than 4.9, where std::regex unexpectedly fails to compile a regular expression containing a bracket expression and throws a std::regex_error exception (see [1]). libstdc++ has proper support for C++11 regular expressions starting from version 4.9.1 (see [2]). As a result, the kudu CLI crashes when running 'kudu cluster rebalance' on the following platforms:
* RHEL/CentOS 7
* Ubuntu 14.04 LTS (Trusty)
* SLES12

You should be able to work around the problem by setting the --move_single_replicas flag to either 'enabled' or 'disabled', as you require, instead of leaving it at the default 'auto'.
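
For illustration, here is a minimal sketch of how the workaround could be applied to the command from the question. The master addresses and the other flags are copied from the original stderr output; the variable names are only for this example, and 'disabled' is an arbitrary choice here, so pick 'enabled' if that is what your cluster needs. The sketch only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Master addresses as they appear in the original command line.
MASTERS="master1.domain.com,master2.domain.com,master3.domain.com"

# Either 'enabled' or 'disabled'. Leaving the flag at its default 'auto'
# is what triggers the auto-detection code that crashes.
MOVE_SINGLE_REPLICAS="disabled"

# Same invocation as in the question, with the workaround flag added.
REBALANCE_CMD="kudu cluster rebalance $MASTERS --move_single_replicas=$MOVE_SINGLE_REPLICAS --max_moves_per_server=5 --max_run_time_sec=0 --max_staleness_interval_sec=300"

# Print the command for review instead of executing it.
echo "$REBALANCE_CMD"
```

Once the printed command looks right, it can be pasted into a shell on a host where the kudu CLI is installed.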

Unfortunately there's no release in the CDH 5 line in which this issue is fixed (yet).
