
How to increase size of EBS volume on datanode

I would like to increase the storage available on a cluster.

 

I am under the impression that it is possible to simply increase the existing EBS volumes on the data nodes rather than adding new ones.

 

If I were to simply increase the volume size through AWS, what other steps would I have to take?

 

I presume that I would need to expand the filesystem with resize2fs, as described in the AWS docs.

 

  1. Do I need to stop the cluster at any point in the process?
  2. Do I need to expand the volumes of all the datanodes equally?
  3. Are there any post-modification requirements such as rebalancing the cluster or 'Actions > Upgrade HDFS Metadata'?

 


Re: How to increase size of EBS volume on datanode

If I were to simply increase the volume size through AWS, what other steps would I have to take?

I presume that I would need to expand the drive with resize2fs as described in the AWS docs

Yes, I believe that would be enough.
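A minimal sketch of that flow, with placeholder names (the volume ID, device name, and mount point below are examples, not values from your cluster):

```shell
# 1. Grow the EBS volume on the AWS side (can also be done in the console).
#    vol-0123456789abcdef0 is a placeholder volume ID.
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 500

# 2. On the instance, confirm the kernel sees the new size.
lsblk /dev/xvdf

# 3. Grow the ext4 filesystem to fill the enlarged volume
#    (online resize works while it is mounted).
sudo resize2fs /dev/xvdf

# If the data sits on a partition rather than the whole device,
# the partition must be grown first, e.g.:
#   sudo growpart /dev/xvdf 1
# For XFS, xfs_growfs is used on the mount point instead:
#   sudo xfs_growfs /data/dfs
```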

 

Do I need to stop the cluster at any point in the process?

I'm not sure, so I won't take a stand on this one.

 

Do I need to expand the volumes of all the datanodes equally?

No, this is not mandatory, but it is usually done that way.

 

Are there any post-modification requirements such as rebalancing the cluster or 'Actions > Upgrade HDFS Metadata'?

Not that I'm aware of. Increasing the size of HDFS by growing existing disks has no connection to rebalancing. Rebalancing only becomes relevant if you add new nodes to increase the size of HDFS, and even then it is optional.
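For completeness, if you did decide to rebalance (again, optional), the standard HDFS balancer can be run from any node; the threshold value here is illustrative:

```shell
# Optional: rebalance block distribution across datanodes.
# -threshold is the allowed deviation (in percentage points) of each
# datanode's utilization from the cluster average; 10 is the default.
sudo -u hdfs hdfs balancer -threshold 10
```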


Re: How to increase size of EBS volume on datanode

@epowell

 

If I were to simply increase the volume size through AWS, what other steps would I have to take?

 

Simply increasing the volume may not help unless the additional disk space is added to the correct partition on your Linux system.

 

Go to Cloudera Manager -> HDFS -> Configuration -> DataNode Data Directory (search for either dfs.data.dir or dfs.datanode.data.dir).

 

Find the disk partition that the datanode data directory belongs to and make sure the new disk space was added to that partition; otherwise the datanode will not use it.
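A quick way to check this (the path /data/dfs/dn below is only an example; use the actual value of dfs.datanode.data.dir from Cloudera Manager):

```shell
# Show which partition the DataNode data directory lives on,
# and whether the new space is visible on it.
df -h /data/dfs/dn

# Cross-check with HDFS's own view of configured capacity per datanode.
sudo -u hdfs hdfs dfsadmin -report
```

If df shows the old size, the extra space went to a different partition (or the filesystem was never grown), and HDFS will not see it.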


If needed, do the same for yarn.nodemanager.local-dirs, Impala's scratch_dirs, etc.

 

 
