I realized I set up a couple of hosts with the data node role, but they do not need to have that role. I have three other hosts with data node role.
How can I remove the role? I do not see any way to simply remove the role.
If I understand correctly, you need to remove the DataNode role from one of your nodes?
Are you doing this in a testing environment or on a production cluster?
What is the replication factor in your cluster?
Please let me know.
I just need to remove the Data Node role from a node that doesn't need to be a data node.
This is in a testing environment.
Replication factor = 3
It's very easy. In the CM interface, go to the service that contains this role, click Instances, check the role on the host you want to remove it from, and click remove.
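Before removing the role, it may be worth a quick sanity check that the DataNodes left behind can still hold every replica. Here is a minimal sketch using the numbers from this thread. The function name is hypothetical, "a couple of hosts" is assumed to mean two, and the check only counts nodes. It says nothing about free disk space.

```python
def can_remove_datanodes(total_datanodes: int, to_remove: int, replication_factor: int) -> bool:
    """Return True if enough DataNodes remain to hold one replica each."""
    remaining = total_datanodes - to_remove
    # HDFS puts each replica of a block on a different DataNode, so at least
    # `replication_factor` DataNodes must remain for blocks to stay fully replicated.
    return remaining >= replication_factor


# Assumed numbers: 3 DataNodes to keep + 2 to remove, replication factor 3.
print(can_remove_datanodes(total_datanodes=5, to_remove=2, replication_factor=3))  # prints True
```

With three DataNodes left and a replication factor of 3 this passes, but there is no spare node, so losing one more DataNode later would leave blocks under-replicated.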
Whether this is Prod or Test, if you have a huge amount of data on the DataNode you are going to decommission, first increase the following settings to speed up re-replication:
DataNode network bandwidth
Maximum number of replication threads on a DataNode
Hard limit on the number of replication threads on a DataNode
Then click on the host that has the DataNode role and perform Decommission. You can monitor the progress in the NameNode Web UI (port 50070) until the decommission is finished.
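If you would rather check the progress from a shell than from the Web UI, the output of "hdfs dfsadmin -report" also lists a decommission status per DataNode. Below is a rough sketch that pulls that status out of the report text. The sample report, the hostnames, and the function name are made up for illustration, and the exact report layout may differ between Hadoop versions.

```python
# Made-up excerpt in the style of `hdfs dfsadmin -report` output (not real cluster data).
SAMPLE_REPORT = """\
Name: 10.0.0.11:50010 (dn1.example.com)
Hostname: dn1.example.com
Decommission Status : Normal

Name: 10.0.0.14:50010 (dn4.example.com)
Hostname: dn4.example.com
Decommission Status : Decommission in progress

Name: 10.0.0.15:50010 (dn5.example.com)
Hostname: dn5.example.com
Decommission Status : Decommissioned
"""


def decommission_status(report: str) -> dict:
    """Map each DataNode hostname to its 'Decommission Status' value."""
    status = {}
    hostname = None
    for raw_line in report.splitlines():
        line = raw_line.strip()
        if line.startswith("Hostname:"):
            # Remember the host until its status line shows up.
            hostname = line.split(":", 1)[1].strip()
        elif line.startswith("Decommission Status") and hostname is not None:
            status[hostname] = line.split(":", 1)[1].strip()
            hostname = None
    return status


for host, state in decommission_status(SAMPLE_REPORT).items():
    print(host, "->", state)
```

On a real cluster you would feed it the actual report, for example the stdout of subprocess.run(["hdfs", "dfsadmin", "-report"], capture_output=True, text=True), and wait until the node shows "Decommissioned" before deleting the role.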
Sometimes I would rather leave the node in the decommissioned state. If you later want to bring the instance back for scalability, follow the same procedure as above, but click Recommission instead of Decommission so that it rejoins the cluster.
If you want to delete the instance after decommissioning, please follow the manual.
If you need more information, please let me know.