Explorer
Posts: 13
Registered: ‎12-30-2014
Accepted Solution

Add disk in data node manually

Hi ,

I am trying to add a new disk to a data node and have some questions:

1. To add a disk to a data node, do I have to add disks to all nodes together?

2. Do I have to update both the dfs.name.dir and dfs.data.dir parameters?

3. Do I have to update dfs.name.dir/dfs.data.dir on each data node if I only add a disk on one data node?

Cloudera Employee
Posts: 576
Registered: ‎01-20-2014

Re: Add disk in data node manually

> 1. To add a disk to a data node, do I have to add disks to all nodes together?

Each node can have a different number of disks. It's not ideal, but it isn't wrong.


> 2. Do I have to update both the dfs.name.dir and dfs.data.dir parameters?

If you're using the new disk for storing HDFS blocks, update dfs.data.dir. If you're using it for NameNode metadata, update dfs.name.dir.
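To make this concrete, here is a sketch of what the dfs.data.dir change might look like when adding a new data disk. The mount point /data/3 is a made-up example; use whatever path the new disk is actually mounted on. (On newer Hadoop releases the property is named dfs.datanode.data.dir.)

```xml
<!-- hdfs-site.xml on the datanode: append the new mount point -->
<!-- to the comma-separated list of existing data directories. -->
<property>
  <name>dfs.data.dir</name>
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>
```

After editing, restart the DataNode role so it picks up the new directory.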


> 3. Do I have to update dfs.name.dir/dfs.data.dir on each data node if I only add a disk on one data node?

dfs.data.dir can be different on each datanode, since the nodes may have different mount points and/or numbers of disks.
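For example, two datanodes with different disk counts can simply carry different local values (all paths here are illustrative, not required names):

```xml
<!-- hdfs-site.xml on datanode A (three data disks) -->
<property>
  <name>dfs.data.dir</name>
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>

<!-- hdfs-site.xml on datanode B (two data disks) -->
<property>
  <name>dfs.data.dir</name>
  <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
</property>
```

The NameNode does not require the lists to match across nodes; each DataNode just reports whatever storage it has.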


Regards,
Gautam Gopalakrishnan
Cloudera Support
New Contributor
Posts: 3
Registered: ‎03-07-2018

Re: Add disk in data node manually

Hi Gautam, team

 

I have follow-up questions on this topic:

1. A customer has one cluster with 17 nodes and wants to add more storage (with no intention of increasing compute), taking node density to >= 48 TB. Is this recommended? If so, please share some pointers.

2. This reference doc http://i.dell.com/sites/doccontent/business/large-business/en/Documents/Dell-Cloudera-Apache-Hadoop-.... mentions: "For drive capacities greater than 4TB or node storage density over 48TB, special consideration is required for HDFS setup. Configurations of this size are close to the limit of Hadoop per-node storage capacity". Please share insights in this regard as well.

3. Are there any side effects of doing this (e.g., job performance), and what are the considerations?

 

Regards,

Prad

Champion
Posts: 744
Registered: ‎05-16-2016

Re: Add disk in data node manually

@prad

 

People usually consider the following during hardware sizing:

1. The number of disk spindles and their throughput.

2. The total time it takes to re-replicate the lost data when one of the nodes fails.

Considering the above, 12 x 2 TB works better than 12 x 4 TB.
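The second point can be put into rough numbers. A back-of-the-envelope sketch (every input here is an illustrative assumption, not a measurement): when a node dies, the surviving nodes must re-create its block replicas, so denser nodes mean a longer window of reduced redundancy:

```python
# Rough re-replication time after losing one datanode.
# All inputs are illustrative assumptions, not measured values.

def rereplication_hours(disks, disk_tb, fill, nodes, per_node_net_mbs):
    """Hours for the surviving nodes to re-create the lost node's replicas."""
    lost_tb = disks * disk_tb * fill          # data that must be re-replicated
    # Aggregate rebuild bandwidth: each remaining node contributes some
    # network/disk bandwidth to re-replication (MB/s converted to TB/h).
    agg_tb_per_h = (nodes - 1) * per_node_net_mbs * 3600 / 1e6
    return lost_tb / agg_tb_per_h

# 12 x 2 TB vs 12 x 4 TB nodes, 70% full, 17-node cluster,
# assuming ~100 MB/s per node available for rebuild traffic.
small = rereplication_hours(12, 2, 0.7, 17, 100)
big   = rereplication_hours(12, 4, 0.7, 17, 100)
print(f"12x2TB node: {small:.1f} h, 12x4TB node: {big:.1f} h")
# → 12x2TB node: 2.9 h, 12x4TB node: 5.8 h
```

Doubling per-node capacity doubles the re-replication window, which is one reason node densities beyond ~48 TB call for the "special consideration" the Dell doc mentions.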
New Contributor
Posts: 3
Registered: ‎03-07-2018

Re: Add disk in data node manually

@csguna

 

Thanks.

 

1. But what are the config changes required to perform these steps?

2. I think we also need to consider the current workloads that are running; more storage in the same node may cause some performance issues?

 

- Prad

Champion
Posts: 744
Registered: ‎05-16-2016

Re: Add disk in data node manually

2. The number of roles and their memory allocation does affect performance (e.g., swapping, GC pauses), so you have to calculate them carefully during the deployment phase.

1. "But what are the config changes required to perform these steps" - Could you be more specific?

New Contributor
Posts: 3
Registered: ‎03-07-2018

Re: Add disk in data node manually

@csguna

 

Regarding #1... I am looking for any additional configuration steps required, or confirmation that no 'additional' config is needed.
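For what it's worth, the usual sequence looks roughly like this. The paths, the hdfs:hadoop ownership, and the port are assumptions for illustration, and the hot-swap command requires a reasonably recent Hadoop release; on older releases a DataNode restart is needed instead:

```shell
# 1. Mount the new disk and create the datanode directory on it
mkdir -p /data/3/dfs/dn
chown -R hdfs:hadoop /data/3/dfs/dn

# 2. Append /data/3/dfs/dn to dfs.data.dir in hdfs-site.xml
#    (the property is dfs.datanode.data.dir on newer releases)

# 3. Either restart the DataNode role, or hot-swap the new directory
#    (host:port is the DataNode's IPC address, an example value here):
hdfs dfsadmin -reconfig datanode dn-host:9867 start

# 4. Verify the added capacity
hdfs dfsadmin -report
```

Beyond that, no other HDFS configuration is normally required; you may optionally run the balancer afterwards so the new disk fills gradually rather than attracting all new writes.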

 

Thanks.
