Can the HDFS Balancer run without interrupting production data?
- Labels: HDFS
Created 09-11-2023 09:21 PM
My production environment has 4 existing DataNodes.
I just added 4 more DataNodes to the cluster and would like to run the HDFS Balancer to balance the data across all 8 DataNodes.
My question is: can I rebalance data while also ingesting and processing new data at the same time? Will doing so interrupt any file operations?
Thanks.
Created 09-11-2023 10:44 PM
@newtocm, Welcome to our community! To help you get the best possible answer, I have tagged in our HDFS experts @willx @SVB who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur, Community Manager
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community:
Created 09-11-2023 11:30 PM
Hello @newtocm. Yes, you can keep running your ingestion and processing jobs while the HDFS Balancer is running. However, it is recommended to run the Balancer when the cluster load is not at its peak, so that job performance is not impacted, because the Balancer is a resource-intensive process.
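To make the off-peak advice concrete, here is a minimal Python sketch, not a definitive setup. It only builds the command lines; nothing is executed. The `hdfs balancer -threshold` and `hdfs dfsadmin -setBalancerBandwidth` commands are standard HDFS CLI, but the off-peak window, the helper names, and the default values below are my own assumptions for illustration.

```python
from datetime import time

# Assumed off-peak window (hypothetical; adjust to your cluster's load pattern).
OFF_PEAK_START = time(22, 0)
OFF_PEAK_END = time(6, 0)


def in_off_peak(now, start=OFF_PEAK_START, end=OFF_PEAK_END):
    """Return True if `now` (a datetime.time) is inside the window.

    The window may wrap past midnight, e.g. 22:00 to 06:00.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def bandwidth_command(bytes_per_second):
    """Build the argv that caps the bandwidth each DataNode uses for balancing."""
    return ["hdfs", "dfsadmin", "-setBalancerBandwidth", str(bytes_per_second)]


def balancer_command(threshold_percent=10):
    """Build the argv that starts the Balancer with a utilization threshold (percent)."""
    return ["hdfs", "balancer", "-threshold", str(threshold_percent)]
```

A wrapper script could check `in_off_peak(datetime.now().time())` before launching the Balancer, and lower the bandwidth cap if the Balancer has to run while jobs are active.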
Created 09-11-2023 11:41 PM
Can the Balancer be paused partway through, during the peak, and resumed after the peak?
Created 09-13-2023 06:56 AM
@newtocm You can't pause the Balancer. You can kill it and start it again, and it will try to balance whatever DFS data remains to be balanced.
Created 09-12-2023 12:26 PM
You can stop the Balancer at any time; it is safe to stop it by pressing Ctrl+C.
https://community.cloudera.com/t5/Support-Questions/When-should-i-stop-the-balancer/td-p/168115
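Since there is no pause, a script can approximate "pause and resume" by stopping the Balancer before the peak and starting a fresh run afterwards. The following is a rough Python sketch under that assumption. `start_balancer` and `stop_balancer` are hypothetical helper names, and the 30-second grace period is an arbitrary choice. It sends SIGINT, which is the signal Ctrl+C delivers on POSIX systems.

```python
import signal
import subprocess


def start_balancer(argv=("hdfs", "balancer")):
    """Launch the Balancer as a child process and return the Popen handle."""
    return subprocess.Popen(list(argv))


def stop_balancer(proc, timeout=30.0):
    """Stop a running Balancer the way Ctrl+C would, then wait for it to exit.

    Falls back to a hard kill if the process has not exited within `timeout` seconds.
    """
    if proc.poll() is None:
        proc.send_signal(signal.SIGINT)
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode
```

Calling `start_balancer()` again after the peak starts a new run, which, as noted in the replies above, will try to balance whatever data still remains to be balanced.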
Created 09-20-2023 06:48 AM
@newtocm, Have any of the replies helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Vidya Sargur, Community Manager
