Member since
09-11-2018
76
Posts
7
Kudos Received
5
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 339 | 07-15-2021 06:18 AM |
| | 666 | 06-21-2021 01:36 AM |
| | 763 | 06-21-2021 01:29 AM |
| | 833 | 04-19-2021 10:17 AM |
| | 1857 | 02-09-2020 10:24 PM |
09-17-2021
03:09 AM
Hi @Chetankumar,

Given that you have heterogeneous storage (and that HDFS follows rack topology to balance blocks across DataNodes): the DataNode volume choosing policy currently defaults to Round Robin; changing it to Available Space means the DataNode picks a volume based on free space, so new data is written to the less-used disks. This can help in your case.

You can change the setting in HDFS: CM -> HDFS -> Configuration -> DataNode Volume Choosing Policy -> change to Available Space. Save the changes and restart the DataNodes.

If that helps, please feel free to mark the post as an accepted solution.

Regards,
Vipin
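A quick way to see whether the disks are unevenly used (the symptom the Available Space policy addresses) is to compare usage across the DataNode's data mounts. This is a generic sketch; run it on the DataNode host and look at the Use% column for the dfs.data.dir mount points:

```shell
# List mounted filesystems sorted by usage percentage; uneven Use%
# across the DataNode data disks is what the Available Space volume
# choosing policy helps smooth out for new writes.
df -h | sort -k 5 -h
```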
07-15-2021
06:18 AM
1 Kudo
Hi @Amn_468,

In Kudu, a table is divided into multiple tablets, and those tablets are distributed across the cluster, so the table data is stored across multiple tablet servers (Kudu nodes).

You can get that info from the Kudu master WebUI: CM -> Kudu -> WebUI -> Tables -> select the table

curl -i -k --negotiate -u : "http://Abcde-host:8051/tables"

You can also run the ksck command to get that info: https://kudu.apache.org/docs/command_line_tools_reference.html#table-list

Does that answer your question? If yes, please feel free to mark the post as an accepted solution and give a thumbs up.

Regards,
06-21-2021
03:49 AM
Hi @FEIDAI,

Check in the HDFS trash whether the deleted folder is there (assuming you didn't use -skipTrash). If you manage to find the folder under trash, copy it back to your destination path:

hdfs dfs -cp /user/hdfs/.Trash/Current/<your file> <destination>

Otherwise, the best option is probably a data recovery tool or a backup.

Regards,
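The recovery steps above can be sketched as follows. All paths here are hypothetical examples, and the commands are printed rather than executed (a dry run); remove the leading `echo` to run them against a real cluster:

```shell
# Dry-run sketch of recovering a deleted folder from HDFS trash.
# TRASH is the standard per-user trash root; DELETED and DEST are
# hypothetical paths -- substitute your own.
TRASH="/user/hdfs/.Trash/Current"
DELETED="$TRASH/data/reports"   # hypothetical folder found in trash
DEST="/data/reports"            # hypothetical restore destination
echo hdfs dfs -ls "$TRASH"            # 1. confirm the folder is still in trash
echo hdfs dfs -cp "$DELETED" "$DEST"  # 2. copy it back out
```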
06-21-2021
01:36 AM
Hi @sakitha,

This seems to be a known issue. Is the topic whitelist set to "*"? Can you please try it with a dot, i.e. ".*"?

Let us know if that works for you.

Regards,
~ If the above answers your question, please give a thumbs up and mark the post as an accepted solution.
06-21-2021
01:29 AM
1 Kudo
Hi @wert_1311,

That's right, the balancer only balances tablets across the Kudu cluster. If one host is consuming more space, it could be that its tablets are simply large. And that's right, Kudu can't rebalance based on disk usage the way HDFS does.

One workaround you can try:
- Stop that specific Kudu TS role.
- Run ksck until the cluster reports healthy.
- Once ksck is healthy, rebuild that particular Kudu TS (rebuilding = wiping the data and WAL dirs): https://kudu.apache.org/docs/administration.html#rebuilding_kudu
- Start that specific TS.
- Run the rebalance again.

That should help. Let me know how it goes.

Cheers,
~ If that answers your question, please give a thumbs up and mark the post as an accepted solution.
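The workaround above can be sketched as a dry run. The master addresses, service names, and data/WAL path are hypothetical; step 3 is destructive, so only run it on the stopped tablet server after ksck reports healthy:

```shell
# Dry-run sketch of the TS rebuild workflow (remove 'echo' to execute).
MASTERS="master1:7051,master2:7051,master3:7051"    # hypothetical masters
echo sudo systemctl stop kudu-tserver               # 1. stop the TS role (or stop it via CM)
echo sudo -u kudu kudu cluster ksck "$MASTERS"      # 2. repeat until the cluster is healthy
echo rm -rf /data/kudu/tserver                      # 3. wipe the TS data and WAL dirs (hypothetical path)
echo sudo systemctl start kudu-tserver              # 4. start the TS again
echo sudo -u kudu kudu cluster rebalance "$MASTERS" # 5. run the rebalance again
```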
06-17-2021
12:39 PM
Hi @wert_1311,

Check the tablet distribution across tablet servers. If for some reason one tablet server goes down or becomes unavailable, its data is replicated to the other tablet servers. You can get the number of tablets per tablet server using this command:

sudo -u kudu kudu table list <csv of master addresses> -list_tablets | grep "^ " | cut -d' ' -f6,7 | sort | uniq -c

If you find the tablet distribution is uneven, you can go ahead with the kudu rebalance tool to balance your cluster: https://docs.cloudera.com/runtime/7.2.2/administering-kudu/topics/kudu-running-tablet-rebalancing-tool.html

Let me know how it goes. If that answers your question, please mark this post as an accepted solution.

Regards,
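If the distribution turns out uneven, the rebalancer from the linked doc can be invoked as below. The master addresses are hypothetical and the commands are printed as a dry run; check `kudu cluster rebalance --help` on your version for the supported flags:

```shell
# Dry-run sketch of running the Kudu rebalancer (remove 'echo' to execute).
MASTERS="master1:7051,master2:7051,master3:7051"   # hypothetical masters
echo sudo -u kudu kudu cluster rebalance "$MASTERS"
# Report-only mode shows planned moves without touching any replicas:
echo sudo -u kudu kudu cluster rebalance "$MASTERS" -report_only
```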
04-20-2021
11:08 PM
Hi @sipocootap2,

Answered it here: https://community.cloudera.com/t5/Support-Questions/How-can-I-get-fsimage-with-curl-command/m-p/314859/highlight/false#M226223

Cheers,
04-20-2021
10:11 PM
Hi @sipocootap2,

AFAIK "/getimage" is deprecated in CDH and we suggest not using it. Instead you can use the command "hdfs dfsadmin -fetchImage <dir>" to download and save the latest fsimage.

In earlier versions of CDH the getImage method was available; later the need for a proper command/utility to download the fsimage was recognized, and "hdfs dfsadmin -fetchImage" was born. Once that was in place, getImage was removed.

Does that answer your question? If yes, feel free to mark this post as an accepted solution.

Regards,
04-20-2021
11:43 AM
Hi @ROACH,

Ideally we recommend 1 GB of heap per 1 million blocks. How much memory you actually need also depends on your workload, especially on the number of files, directories, and blocks generated in each namespace; the type of hardware (VM or bare metal, etc.) is also taken into account.

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_nn_memory_config.html

Also have a look at these examples of estimating NameNode heap memory:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/hdfs-overview/topics/hdfs-examples-namenode-heap-memory.html

If write-intensive or snapshot operations are performed on the cluster frequently, then 6-9 GB sounds fine. I would suggest grepping for GC in the NameNode logs; if you see long pauses, say more than 3-5 seconds, that's a good starting point for increasing the heap size.

Does that answer your question? Do let us know.

Regards,
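The rule of thumb above is easy to turn into arithmetic. The block count here is a hypothetical example; take yours from the NameNode WebUI or CM:

```shell
# Estimate NameNode heap from block count using the rough rule of
# 1 GB of heap per 1 million blocks (rounded up to a whole GB).
BLOCKS=6500000                                 # hypothetical block count
HEAP_GB=$(( (BLOCKS + 999999) / 1000000 ))     # ceiling division
echo "~${HEAP_GB} GB heap suggested for ${BLOCKS} blocks"
```

With 6.5 million blocks this suggests about 7 GB, which lines up with the 6-9 GB range mentioned above.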
04-20-2021
11:16 AM
Hi @Chetankumar,

I think we answered this in this thread: https://community.cloudera.com/t5/Support-Questions/How-to-move-block-from-one-mount-point-to-other-and-remove/td-p/314861

If that answers all your questions, feel free to mark the post as an accepted solution.

Regards,
04-20-2021
11:05 AM
Hi @rubysimmons63,

Also, Falcon is explained in detail, in contrast with Atlas, here; do check it out for a better understanding:
https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-Apache-atlas-and-Apache/m-p/122450
https://www.cloudera.com/products/open-source/apache-hadoop/apache-falcon.html

Regards,
04-20-2021
10:53 AM
Hi @ryu,

We can trigger a manual GC in the DataNode JVM, AFAIK, but the best way to deal with long GC pauses is to allocate the right amount of heap memory. We recommend the formula of 1 GB of heap per 1 million blocks. You can get the block count from the NameNode WebUI -> DataNodes (through Ambari or CM). Increase the heap and that should fix your issue.

Do check for "No GCs detected" in the DataNode logs; if you see those messages, it could be a hardware problem triggering the pauses rather than GC.

Does that answer your question? Let me know.

Regards,
Vipin
04-20-2021
10:43 AM
Hi @Seeker90,

The ERROR message that you see is because you are running ZK in standalone mode; it is more of a warning than an error:

Invalid configuration, only one server specified (ignoring)

Further, I see that ZK started properly; however, it throws an exception while reading snapshots.

Probable causes of canary test failure on a ZooKeeper quorum:
1. Max client connections set too low
2. Long fsyncs (disk writes)
3. Insufficient heap (long GCs)

Try the following:
1. Increase the ZK heap size (if the heap is undersized or the snapshots are huge, this is a good starting point).
2. Increase the maximum number of connections to 300.
3. grep for "fsync" in the ZK logs, and check whether the ZK disk is independent.

Does that answer your question? Do let us know.

Regards,
04-19-2021
10:17 AM
Hi @Chetankumar,

You can perform a disk hot swap on the DN: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_dn_swap.html

If the replication factor is set to 3 for all files, then taking down one disk shouldn't be a problem, as the NameNode will auto-replicate the under-replicated blocks. As a small test, first stop the DataNode and wait for some time (while the NN copies the blocks to other available DataNodes). Run fsck to confirm that the HDFS file system is healthy. Once it is healthy, you can easily play around with that stopped DataNode. The idea is to ensure the replication factor is 3 so that you don't incur any data loss.

If the replication factor is set to 1 for some files and those blocks are hosted on that /data01 disk, then it could be a potential loss. As long as you have RF=3 you will be good.

Does that answer your questions? Let us know.

Regards,
03-31-2021
03:28 AM
1 Kudo
Hi @rocky_tian,

Although it appears to be https://issues.apache.org/jira/browse/KUDU-2412, you are on el7, so I'm not sure that JIRA applies to your case.

Ensure that all the prerequisite Kudu libraries are installed, as mentioned in the doc: https://kudu.apache.org/docs/installation.html#_install_on_rhel_or_centos_hosts

Related external link: https://stackoverflow.com/questions/52526013/how-to-read-from-kudu-to-python

Regards,
Vipin
03-16-2021
01:38 AM
Hi @rOckChew,

It's a consensus error; we need to validate that all masters are voting for the leader. In a multi-master Kudu environment, if a master is restarted or goes offline for a few minutes, it can occasionally have trouble joining the cluster on startup. For example, if this happens with three Kudu masters, and one of the other two masters is stopped or dies during this time, then the overall Kudu cluster is down because the majority of the masters are not running. This issue is resolved by the KUDU-2748 upstream JIRA.

https://my.cloudera.com/knowledge/TSB-2020-442-Kudu-Masters-unable-to-join-back-after-a?id=304920

Run "sudo -u kudu kudu cluster ksck <master1,master2,master3>" and check whether the '*' is missing from one of the masters' consensus.

>> A quick workaround is to restart all the Kudu masters together.
>> Otherwise, rewrite the Kudu master consensus.

Thanks,
Vipin
03-02-2021
09:15 PM
Hi @JeromeAlbin,

Looks like https://issues.apache.org/jira/browse/IMPALA-9486

The error pops up because you are connecting to Impala anonymously (no user, no password). You can specify a user (even if it's not declared in Kudu), and then it should work.

Please read page 12 of the following document: https://docs.cloudera.com/documentation/other/connectors/impala-jdbc/2-6-15/Cloudera-JDBC-Driver-for-Impala-Install-Guide.pdf

Using User Name
-----------------------
This authentication mechanism requires a user name but does not require a password. The user name labels the session, facilitating database tracking.

Does that answer your question? If yes, then feel free to mark this post as an accepted solution.

Regards,
vipin
02-04-2021
02:24 AM
Hi @Smashedcat32,

To give some background on the ZooKeeper canary: the Service Monitor regularly checks the health of the ZooKeeper service by
1. connecting to the ZooKeeper quorum and locating the leader,
2. creating a znode,
3. reading the znode, and
4. deleting the znode.

If any of these steps fail, the Service Monitor reports that ZOOKEEPER_CANARY_HEALTH has become bad. In the health report above, the reason was "Canary test failed to establish a connection or a client session to the ZooKeeper service", which means it failed on step 1. The problem could lie in three locations:
1. The ZooKeeper quorum: long fsyncs, long GCs, low max client connections
2. The Service Monitor: false reports
3. Network connectivity between the Service Monitor and the ZooKeepers

Now, coming to your query regarding canary test commands: I don't think we have them available in the docs. You can use the commands from the ZK guide to test. For example, to verify whether a ZK instance is the leader:

echo stat | nc ZOOKEEPER_IP ZOOKEEPER_PORT | grep Mode

http://www.corejavaguru.com/bigdata/zookeeper/cli
https://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html#sc_ConnectingToZooKeeper
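The four canary steps above can be approximated by hand with the ZooKeeper CLI. The host, port, and znode path below are hypothetical, and the commands are printed as a dry run (remove the leading `echo` to run them against a live quorum); `zookeeper-client` is the CDH wrapper around zkCli.sh:

```shell
# Dry-run sketch of manually reproducing the Service Monitor canary.
ZK="zk-host:2181"                                        # hypothetical quorum member
echo "echo stat | nc zk-host 2181"                       # 1. check who is leader/follower
echo zookeeper-client -server "$ZK" create /sm-canary x  # 2. create a znode
echo zookeeper-client -server "$ZK" get /sm-canary       # 3. read it back
echo zookeeper-client -server "$ZK" delete /sm-canary    # 4. delete it
```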
02-03-2021
08:41 AM
The disk space occupied by a deleted row is only reclaimable via compaction. Given that you have deleted some data and the space has not been reclaimed, you are probably hitting the bug https://issues.apache.org/jira/browse/KUDU-1625, which stands unresolved.

However, if the goal is to delete data and reclaim disk space, you can drop a partition (if the table is range-partitioned) in order to reclaim space.

Tombstoned tablets have all their data removed from disk and don't consume significant resources. These tablets are necessary for the correct operation of Kudu. See https://docs.cloudera.com/runtime/7.1.0/troubleshooting-kudu/topics/kudu-tombstoned-or-stopped-tablet-replicas.html
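Dropping a range partition from Impala looks like the sketch below. The table name and partition value are hypothetical, and the command is printed as a dry run; this only applies to range-partitioned Kudu tables, and the dropped partition's data is gone for good:

```shell
# Dry-run sketch of reclaiming space by dropping a range partition
# (remove 'echo' to execute against a live cluster).
echo impala-shell -q "ALTER TABLE events DROP RANGE PARTITION VALUE = 20200101"
```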
02-03-2021
03:32 AM
Ideally, once you have dropped the table, the data should be deleted immediately. The metrics in CM may take some time to reflect this; we can verify from the backend whether the table was actually deleted.

Verify whether the table still exists on the Kudu FS by using the kudu ksck command with the -tables flag:

kudu cluster ksck <master_addresses> -tables=<tables>

Note: if the table was created through Impala, use "impala::db.tablename".

If you still see the table in ksck, run the command below to delete it from Kudu:

kudu table delete <master_addresses> <table_name>
02-03-2021
12:20 AM
1 Kudo
Hi @vidanimegh,

Ensure that forward and reverse DNS lookups work and that iptables is off. Perform a CM agent hard restart.

What's the Java version? There's this bug, https://bugs.openjdk.java.net/browse/JDK-8215032, wherein servers with Kerberos enabled stop functioning. That could be a possibility.
01-28-2021
12:04 AM
@mike_bronson7 Adding to @GangWar:

To your question "does this action also affect the data itself on the DataNode machines?": no, it doesn't affect data on the DataNodes directly. This is a metadata operation on the NameNode: when the NameNode fails to progress through the edits or fsimage, it may need to be started with the -recover option. Since the metadata has references to the blocks on the DataNodes, this is a critical operation and may incur data loss.
01-27-2021
11:37 PM
Adding to @smdas: this is one of the Kudu limitations: "There is no way to run compaction manually, but dropping the table will reclaim the space immediately."

You can verify the size from the CM graphs:
- Go to the Kudu service and navigate to the Charts Library tab.
- On the left-hand side menu, click Tables to display the list of tables currently stored in Kudu.
- Click on a table name to view the default dashboard for that table. The Total Tablet Size On Disk Across Kudu Replicas chart displays the total size of the table on disk using a time-series chart.
- Hovering over the line on the chart opens a small pop-up window that displays information about that data point. Click the data stream within the chart to display a larger pop-up window with additional information for the table at the point in time where you clicked.

Reference: http://apache.github.io/kudu/docs/known_issues.html#_other_usage_limitations
11-05-2020
12:56 AM
Hi Prash,

By default, CM will throw an alert on the CM -> HDFS -> Status page when the NameNode is in safe mode.

NameNode Safemode Health Test: enables the health test that checks that the NameNode is not in safemode.
https://docs.cloudera.com/documentation/enterprise/6/properties/6.1/topics/cm_props_cdh5150_hdfs.html

You can configure email alerts using the below:
https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_ag_email.html

Let me know if that helps.

Thanks,
Vipin
02-09-2020
10:24 PM
1 Kudo
@Amn_468,

It depends. If you are receiving the file descriptor warning message across all the tablet servers, and you change the fd threshold across the Kudu TServer roles, then you should do it in non-prod hours or ensure no jobs are running. Please be informed that Kudu doesn't offer a rolling restart feature.

Service-level config for the TS: CM -> Kudu -> Configuration -> file descriptors for TS -> requires a restart of all TS.

However, if the warning appears only for one specific TS, and you plan to change the threshold for that TS only, it shouldn't cause any issue: Kudu is a distributed system, and restarting one worker role won't impact your ongoing operations.

CM -> Kudu -> Instances -> select the TS instance that throws the warning -> Configuration -> set the fd value -> restart that specific TS.
02-06-2020
05:58 AM
Hi @AM,

The alert regarding fds is a warning message that you have crossed the warning threshold of open file descriptors for Kudu. There are three main sources that hold fds in Kudu:
1. The file cache
2. Hot replicas
3. Cold replicas

Typically there is no downside to increasing the fd limit on a Kudu tserver. You can increase it from CM -> Kudu -> Configuration -> file descriptors -> increase the value to 64k, or you may tune the threshold and set the warning at 75%. A restart of the TS is required if you change the file descriptor limit.

Hope this helps!!
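To see where a host currently stands relative to the threshold, you can check the limits locally. The pgrep pattern below is a hypothetical way to find the tserver process, and that line is printed as a dry run:

```shell
# Shell-level fd limit on this host (what a child process would inherit):
ulimit -n
# Per-process view for a running tserver, as a dry run (remove 'echo'):
echo 'grep "open files" /proc/$(pgrep -f kudu-tserver)/limits'
```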
02-06-2020
02:21 AM
Hi @dsht6955,

Can you try running the hdfs oiv command with the -t flag, as shown below, and see if that helps:

hdfs oiv -p Delimited -delimiter delimiterString -t temporaryDir -i fsimage -o output.xml

Thanks,
Vipin
02-04-2020
10:24 AM
It seems the NN refuses to connect to the DN. Either the DN is not in the include file on the NN, or it may be a DNS issue (check /etc/hosts). Can you run the command below and see if the error goes away?

hdfs dfsadmin -refreshNodes

Further, check this link: https://stackoverflow.com/questions/17252955/getting-the-following-error-datanode-denied-communication-with-namenode-while/29598059#29598059
01-06-2020
05:49 AM
CDP isn't 100% open source; you may have to purchase a subscription. A trial can be downloaded from here: https://www.cloudera.com/downloads.html

"THE TRIAL VERSION INCLUDES ALL FEATURES OF THE FULL PRODUCT AND IS VALID FOR 60 DAYS FROM THE TIME OF INSTALLATION."
08-12-2019
04:51 AM
Hi Vijaya,

Under-replicated blocks are blocks that do not meet the target replication factor. HDFS has a self-healing mechanism wherein it creates new replicas of under-replicated blocks until they meet the target replication.

If this is a multi-node cluster, verify whether any DataNodes are down; if it is a single-node cluster, please ensure the replication factor is set to 1.

Thanks.
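The two checks above (DataNode liveness and remaining under-replication) can be sketched as a dry run; remove the leading `echo` to run them on a live cluster:

```shell
# Dry-run sketch of verifying DataNode status and under-replicated blocks.
echo "hdfs dfsadmin -report | grep -i 'Live datanodes'"   # how many DNs are up
echo "hdfs fsck / | grep -i 'Under-replicated blocks'"    # self-healing progress
```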