Kudu T-server data distribution

Expert Contributor

Hello,
I would like some guidance on how data is distributed across Kudu tablet servers (T-Servers).
We have a Kudu cluster with 3 Masters and 9 T-Servers (each T-Server has 1 TB of storage). Space on some T-Servers is being consumed rapidly, while on others much less is used. I would like to know why this happens and whether there is a way to distribute data evenly across all 9 T-Servers.

Kudu 1.7.0-cdh5.16.2 / CM 5.16.2

 

Appreciate any assistance in this regard.

 

Thanks

Wert

 

1 ACCEPTED SOLUTION

Expert Contributor

Hi @wert_1311 

 

That's right, the balancer only evens out the number of tablet replicas across the Kudu cluster. If one host is consuming more space, it could be that the tablets on it are very large.

Unlike the HDFS balancer, Kudu cannot rebalance based on disk usage.

One workaround you can try:

- Stop the Kudu TS role on that host.

- Run ksck repeatedly until it reports the cluster as healthy.

- Once ksck is healthy, rebuild that Kudu TS (rebuilding = wiping all of its data and WAL directories).

https://kudu.apache.org/docs/administration.html#rebuilding_kudu

- Start that TS again.

- Run the rebalance again.
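
The CLI parts of the steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the master addresses are placeholders, the `run`, `wait_until_healthy` and `rebalance` helpers and the `DRY_RUN` / `SLEEP_SECS` variables are names made up for this example, and it assumes `kudu cluster ksck` exits non-zero while the cluster is unhealthy. Stopping, wiping and starting the tablet server are left as manual steps (for example through Cloudera Manager and the rebuilding procedure linked above).

```shell
# Hypothetical master addresses; replace with your own.
MASTERS="${MASTERS:-master1:7051,master2:7051,master3:7051}"

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Poll ksck until it exits 0. SLEEP_SECS is an illustrative polling interval.
wait_until_healthy() {
  until run sudo -u kudu kudu cluster ksck "$MASTERS"; do
    sleep "${SLEEP_SECS:-60}"
  done
}

# Ask the rebalancer to even out replica counts across tablet servers.
rebalance() {
  run sudo -u kudu kudu cluster rebalance "$MASTERS"
}

# Suggested order, mirroring the steps above:
#   1. Stop the affected tablet server role (manual).
#   2. wait_until_healthy
#   3. Wipe that server's data and WAL directories (manual, see the docs link).
#   4. Start the tablet server role again (manual).
#   5. rebalance
```

With `DRY_RUN=1` the helpers only print the commands they would run, which is a safe way to check the master addresses before setting `DRY_RUN=0`.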

 

That should help. Let me know how it goes.

 

Cheers,

 

~ If that answers your question, please give it a thumbs up and mark the post as the accepted solution.

4 REPLIES

Expert Contributor

Hi @wert_1311 ,

 

Check the tablet distribution across tablet servers. If a tablet server goes down or becomes unavailable for some reason, its data is re-replicated to the other tablet servers.

 

You can get the number of tablets per tablet server using this command:

sudo -u kudu kudu table list <csv of master addresses>  -list_tablets | grep "^    " | cut -d' ' -f6,7 | sort | uniq -c
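
To make it clearer what that pipeline counts, here is the same idea as a small function that reads the listing on stdin. It is a sketch, not part of the original answer: the function name `count_replicas_per_tserver` is made up, and it assumes replica lines are indented by four spaces and look like `<role> <ts-uuid> <host:port>`, which is what the `grep`/`cut` fields above imply.

```shell
# Count tablet replicas per tablet server from a
# `kudu table list ... -list_tablets` listing read on stdin.
# Assumed replica line format: "    <role> <ts-uuid> <host:port>".
count_replicas_per_tserver() {
  awk '/^    [^ ]/ { count[$2 " " $3]++ }
       END { for (ts in count) print count[ts], ts }' |
    sort -k1,1nr -k2
}

# Usage (same arguments as the command above):
#   sudo -u kudu kudu table list MASTER_ADDRESSES -list_tablets | count_replicas_per_tserver
```

Each output line is a replica count followed by the tablet server's UUID and address, highest count first. Note that this only shows how many replicas each server holds, not how much disk space they occupy.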

 

If you find that the tablet distribution is uneven, you can use the kudu rebalance tool to balance your cluster.

https://docs.cloudera.com/runtime/7.2.2/administering-kudu/topics/kudu-running-tablet-rebalancing-to...

 

Let me know how it goes.

 

If that answers your question, please mark this post as "accept as solution".

 

Regards,

Expert Contributor

Hi @kingpin 

I ran the script, ran the rebalance report, and did a rebalance too; however, the result I was looking for was not achieved (space is still over-consumed on one TS). I think rebalance just distributes tablets evenly across all TS. What I am looking for is something like the HDFS rebalancer, and I don't think that exists in Kudu. Correct me if I am wrong.

 

Thanks 

Wert

Community Manager

@wert_1311, has @kingpin's reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. 



Regards,

Vidya Sargur,
Community Manager

