Kudu rebalance crash

The Kudu rebalance tool crashes, both when I run it from the command line and when I start it from the Cloudera Manager UI.

Here is the stderr displayed in the Cloudera Manager:

+ exec kudu cluster rebalance master1.domain.com,master2.domain.com,master3.domain.com --max_moves_per_server=5 --max_run_time_sec=0 --max_staleness_interval_sec=300
terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
*** Aborted at 1553857242 (unix time) try "date -d @1553857242" if you are using GNU date ***
PC: @     0x7fbfc3637207 __GI_raise
*** SIGABRT (@0x3ec00005c50) received by PID 23632 (TID 0x7fbfc5c83a00) from PID 23632; stack trace: ***
    @     0x7fbfc5642680 (unknown)
    @     0x7fbfc3637207 __GI_raise
    @     0x7fbfc36388f8 __GI_abort
    @     0x7fbfc3f467d5 __gnu_cxx::__verbose_terminate_handler()
    @     0x7fbfc3f44746 (unknown)
    @     0x7fbfc3f44773 std::terminate()
    @     0x7fbfc3f44993 __cxa_throw
    @     0x7fbfc3f99dd5 std::__throw_regex_error()
    @           0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
    @           0x931e3a std::__detail::_Compiler<>::_M_atom()
    @           0x932469 std::__detail::_Compiler<>::_M_alternative()
    @           0x9324c4 std::__detail::_Compiler<>::_M_alternative()
    @           0x932649 std::__detail::_Compiler<>::_M_disjunction()
    @           0x93297b std::__detail::_Compiler<>::_Compiler()
    @           0x932cb7 std::__detail::__compile<>()
    @           0x92bfc6 (unknown)
    @           0x92c664 std::_Function_handler<>::_M_invoke()
    @           0xde6672 kudu::tools::Action::Run()
    @           0x9957d7 kudu::tools::DispatchCommand()
    @           0x99619b kudu::tools::RunTool()
    @           0x8dee4d main
    @     0x7fbfc36233d5 __libc_start_main
    @           0x9284b5 (unknown)

 

I have already created an issue: https://issues.apache.org/jira/browse/KUDU-2753.

It is strange that there was no such issue in Jira yet. Has anybody faced this problem before?

1 ACCEPTED SOLUTION

This is a known issue in the code that auto-detects whether replicas of non-replicated tablets can be moved without issues (see KUDU-2443). That code relies on std::regex, and the tool was built with g++/libstdc++ of a version older than 4.9, where std::regex unexpectedly fails to compile a regular expression containing a bracket and throws a std::regex_error exception (see [1]). libstdc++ has proper support for C++11 regular expressions starting from version 4.9.1 (see [2]). As a result, the kudu CLI crashes when running 'kudu cluster rebalance' on the following platforms:
* RHEL/CentOS 7
* Ubuntu 14.04 LTS (Trusty)
* SLES12
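
To illustrate the failure mode, here is a minimal C++ sketch. It is not Kudu's actual code: the helper name regex_compiles and the sample pattern are hypothetical, chosen only because the pattern contains a bracket expression. With a libstdc++ older than 4.9, constructing such a std::regex is expected to throw std::regex_error, as described above. If nothing catches the exception, std::terminate() aborts the process, which matches the stack trace in the question.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if std::regex can compile the
// pattern, false if the library throws std::regex_error instead.
bool regex_compiles(const std::string& pattern) {
  try {
    std::regex re(pattern);  // compilation happens in the constructor
    (void)re;                // the compiled regex itself is not needed
    return true;
  } catch (const std::regex_error&) {
    // Catching the exception here is what prevents the
    // std::terminate() / SIGABRT path seen in the crash.
    return false;
  }
}
```

Calling regex_compiles("[0-9]+\\.[0-9]+") from a small test program built with the same toolchain as the tool should show whether that toolchain is affected. A pattern that is invalid on any implementation, such as the unterminated bracket "[0-9", should always yield false.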

 

You should be able to work around the problem by explicitly setting the --move_single_replicas flag to either 'enabled' or 'disabled', as you require, instead of leaving it at the default 'auto' (for example, --move_single_replicas=enabled).

Unfortunately, there is no release in the CDH 5 line yet in which this issue is fixed.
