Explorer
Posts: 24
Registered: ‎10-18-2017

Define your own rack topology script

Hi,

I want to write my own script to define the rack topology. In CM I changed the HDFS configuration property net.topology.script.file.name to my script's location: /tmp/JOB/topology.sh. The script is placed only on the NameNode. It is the same shell script shown at the bottom of this page: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/RackAwareness.html, which treats each host as its own rack, numbered by the last octet of its IP address. My understanding was that Hadoop would call this script with the host IPs as input, so I tested it with the following input:
. topology.sh xxx.xxx.11.10 xxx.xxx.11.11
we get the expected output :
/rack-10
/rack-11
I set the owner/group to root:hadoop and gave the script full permissions (chmod 777), so that its ownership and permissions match those of the automatically generated script in /etc/hadoop/conf.
I expected that after restarting the HDFS service (I also tried restarting the whole cluster), CM would show rack-10 and rack-11 under Hosts -> Rack instead of default. However, this is not the case. I notice that a topology.map file is created in /etc/hadoop/conf upon restart, in which the racks are still called 'default'. This is also what I see in Cloudera Manager.
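For reference, a script of the kind described above can be sketched as follows. This is a minimal sketch, not the exact script from the Hadoop documentation: it assumes every argument is a dotted IPv4 address, and the function names rack_for and resolve_racks and the 192.0.2.x addresses in the usage note are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch of a topology script: print one rack path per argument, in order.
# Assumption: each argument is a dotted IPv4 address, and the rack name is
# "/rack-" followed by the last octet (each host is treated as its own rack).

rack_for() {
  # ${1##*.} removes everything up to and including the last dot.
  printf '/rack-%s\n' "${1##*.}"
}

resolve_racks() {
  local host
  for host in "$@"; do
    rack_for "$host"
  done
}

# Hadoop may pass several addresses in a single call.
resolve_racks "$@"
```

Called as `bash topology.sh 192.0.2.10 192.0.2.11`, this should print /rack-10 and /rack-11 on separate lines, matching the test output shown above.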
 
 
Does anyone have a clue what I might be doing wrong?
 
Thanks!
Cloudera Employee
Posts: 508
Registered: ‎07-30-2013

Re: Define your own rack topology script

Hi,

Normally, CM uses the rack configured by the user in the CM UI to populate the topology scripts for the cluster. If you override the topology script, Hadoop should use your custom one, but CM does not invoke that script when deciding which rack to display in the UI, so CM will show something different from what your cluster is actually using.

The topology.map is not used if you've customized the topology script to do something else.

-Darren
Explorer
Posts: 24
Registered: ‎10-18-2017

Re: Define your own rack topology script

Hi,

Many thanks for the reply!

If the custom script is used but not shown in Cloudera Manager, is there any other place where I can see that my script is indeed used rather than the default? I have looked in the logs, but I do not know exactly where to look. Do you have any idea what the best way is to verify that the script did its job?

 

Thanks for all your input.

Cloudera Employee
Posts: 508
Registered: ‎07-30-2013

Re: Define your own rack topology script

It looks like the YARN ResourceManager web UI will tell you the racks it sees if you click on Nodes.

You could also temporarily modify your script to touch /tmp/didmyscriptrun and make sure that timestamp gets updated at some point.
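A temporary debugging variant along these lines might look like the sketch below. It assumes the last-octet rack rule from the original post. The MARKER variable and the function names are illustrative. It appends a line per invocation instead of a bare touch, which still updates the file's timestamp and also records which hosts were passed in. The log goes to a file and not to standard output, because Hadoop reads the rack names from standard output.

```shell
#!/usr/bin/env bash
# Temporary debugging variant: record each invocation, then resolve racks.
# Assumptions: the MARKER path and the last-octet rack rule are illustrative.

MARKER="${MARKER:-/tmp/didmyscriptrun}"

log_invocation() {
  # Appending updates the file's mtime (like touch) and keeps the arguments.
  printf '%s args: %s\n' "$(date '+%F %T')" "$*" >> "$MARKER"
}

resolve_racks() {
  local host
  for host in "$@"; do
    # Only rack paths go to stdout; nothing else may be printed there.
    printf '/rack-%s\n' "${host##*.}"
  done
}

log_invocation "$@"
resolve_racks "$@"
```

After restarting the service, `ls -l /tmp/didmyscriptrun` on the host where the script is installed shows whether and when it was invoked. Remove the logging again once you are done.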
Explorer
Posts: 24
Registered: ‎10-18-2017

Re: Define your own rack topology script

Hi,

I notice now that my script did indeed run. In Cloudera Manager I see the rack names that I assigned using CM (Hosts -> Assign Rack), and in the YARN UI (http://<resourcemanager-IP>:8088/cluster/nodes) I see the 'default-rack' names.

 

To make sure my script has run, I added touch statements to it that create files named after each rack it prints. These files were created, so the script did run and produced the desired output (I have 4 nodes and the output was rack1, rack2, rack3, rack4). If you have an idea why I cannot see these rack names, please let me know.

 

 

Thanks for any further feedback!

 

 

Cloudera Employee
Posts: 508
Registered: ‎07-30-2013

Re: Define your own rack topology script

I'm not sure why that didn't work, sorry. Hopefully a YARN / Hadoop expert can chime in.
Explorer
Posts: 24
Registered: ‎10-18-2017

Re: Define your own rack topology script

Thanks for all your feedback.
