10-21-2022
07:33 AM
We ran into a problem on our production Kudu cluster. The hard disk holding the WAL directory failed on one tablet server. We installed a new disk and cleared the data directory according to the Kudu documentation: https://kudu.apache.org/docs/administration.html#rebuilding_kudu . After starting the failed tablet server again, kudu cluster ksck showed two tablet server entries for the same server, each with a different UUID. One of them had the status "WRONG SERVER_UUID".

Why might this error occur? Are there any ways to avoid it? Is there a way to solve the problem without restarting the master server?

We also found the command "kudu tserver unregister" for removing the tablet server with the wrong UUID, but we could not find this step in the documentation.

Steps to reproduce a similar problem:

1. Install the Apache Kudu Quickstart. Instructions: https://kudu.apache.org/docs/quickstart.html#_bring_up_the_cluster

Clone the Apache Kudu repository using Git and change to the kudu directory:
$ git clone https://github.com/apache/kudu
$ cd kudu

Set the KUDU_QUICKSTART_IP environment variable to your IP address:
$ export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1)

Bring up the cluster:
$ docker-compose -f docker/quickstart.yml up -d

Check the cluster health:
$ docker exec -it $(docker ps -aqf "name=kudu-master-1") /bin/bash
$ kudu cluster ksck kudu-master-1:7051,kudu-master-2:7151,kudu-master-3:7251

2. Remove the --fs_wal_dir directory on one of the tablet servers. After this, the tablet server container starts crashing.
$ docker exec -it $(docker ps -aqf "name=kudu-tserver-1") /bin/bash
$ rm -rf /var/lib/kudu/tserver/wals

3. Start the container again and delete the tablet server's --fs_metadata_dir and --fs_data_dirs directories (default location shown):
$ docker start docker-kudu-tserver-1-1
$ docker exec -it $(docker ps -aqf "name=kudu-tserver-1") /bin/bash
$ rm -rf /var/lib/kudu/tserver/

4. Restart the tablet server.
$ docker stop docker-kudu-tserver-1-1
$ docker start docker-kudu-tserver-1-1

5. Run ksck again. It now lists two tablet server UUIDs for the same server, and one of them is in the status "WRONG SERVER_UUID".
$ docker exec -it $(docker ps -aqf "name=kudu-master-1") /bin/bash
$ kudu cluster ksck kudu-master-1:7051,kudu-master-2:7151,kudu-master-3:7251
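
To make the duplicate entry from step 5 easier to spot, here is a minimal sketch of a filter for the ksck output. It assumes the tablet server summary is a '|'-separated table with the UUID in the first column and the status in the third, and that the status is printed as WRONG_SERVER_UUID or "WRONG SERVER_UUID". The function name stale_tserver_uuids is hypothetical and not part of Kudu.

```shell
#!/bin/sh
# Hypothetical helper: reads ksck output on stdin and prints the UUID of
# every tablet server row whose status column reports a wrong server UUID.
# Assumption: rows look like " <uuid> | <address> | <status> | ...".
stale_tserver_uuids() {
  awk -F'|' '
    $3 ~ /WRONG[ _]SERVER_UUID/ {
      gsub(/[ \t]/, "", $1)   # strip the padding around the UUID column
      print $1
    }'
}

# Possible usage inside the master container (addresses as in step 5):
#   kudu cluster ksck kudu-master-1:7051,kudu-master-2:7151,kudu-master-3:7251 | stale_tserver_uuids
```

The printed UUIDs could then be reviewed by hand before being used with the "kudu tserver unregister" command mentioned above. Its exact arguments are not shown here and should be checked against the tool's built-in help for the installed Kudu version.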
Labels: Apache Kudu