Can I run the balancer for hdfs?
Labels: Apache HBase, HDFS
Contributor
Created on ‎08-23-2018 06:04 PM - edited ‎09-16-2022 06:37 AM
I use Cloudera CDH 4.0.4.
I run the balancer on HBase.
However, I have 10 DataNodes, and only 5 of those servers are used as HBase RegionServers.
As a result, the DataNodes have become imbalanced.
Could running the Hadoop HDFS balancer cause problems for HBase?
1 ACCEPTED SOLUTION
Mentor
Created ‎08-23-2018 07:38 PM
There will not be any operational problems such as crashes or errors when
running an HDFS balancer on a cluster with HBase running, but there can
be a performance impact, depending on what the balancer decides
to move based on its space thresholds.
The performance impact would come from a loss of data locality: blocks of the
HFiles that a RegionServer serves may end up only on remote DataNodes, so you
may observe slightly higher network usage until the next major
compaction rewrites the files with a local block replica.
If you'd like to narrow the window of impact, run the HDFS
balancer with your desired balancing threshold and, as soon as it
completes, trigger a major compaction on your
latency-sensitive HBase tables.
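
As a rough illustration of that sequence, here is a minimal sketch. The threshold of 10 (percent) and the table name `my_table` are placeholders, not values from this thread, and the `DRY_RUN` switch is a hypothetical convenience that only prints the commands so you can review them first. It assumes the `hdfs` and `hbase` commands are on the PATH and that you run it as a user allowed to run the balancer.

```shell
#!/bin/sh
# Sketch: run the HDFS balancer, then request a major compaction on the
# given HBase tables. With DRY_RUN=1 (the default here) the commands are
# only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# usage: rebalance_then_compact <threshold-percent> <table> [<table> ...]
rebalance_then_compact() {
    threshold="$1"; shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "hdfs balancer -threshold $threshold"
    else
        # Blocks until the balancer finishes; its exit status is not
        # checked here, so consult your Hadoop version's documentation
        # before relying on it.
        hdfs balancer -threshold "$threshold"
    fi
    for table in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "echo \"major_compact '$table'\" | hbase shell"
        else
            # Pipe the command into the HBase shell non-interactively.
            echo "major_compact '$table'" | hbase shell
        fi
    done
}

# Placeholder values: a 10% threshold and a table named my_table.
rebalance_then_compact 10 my_table
```

Set `DRY_RUN=0` to actually execute the commands. Note that `major_compact` only requests the compaction; the RegionServers carry it out in the background, so locality recovers once the compaction itself has finished.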
