Node labels were released in Apache Hadoop 2.6. Is this feature enabled in CDH? Where can I find documentation on configuring it?
Node labels are not considered production-ready by Cloudera, or even by the upstream community. The basis for node labels was added in Hadoop 2.6 with a large number of limitations. The only scheduler that currently implements node label support is the CapacityScheduler. None of the other schedulers supports it yet. Cloudera recommends, for a number of reasons, that you use the FairScheduler in your cluster.
Setting up node labels is partially supported through the command line interface, but it still requires manual steps and configuration. Support for labels is also limited to one (1) label per YARN application. Using labels requires you to add them on the command line when an application is submitted. In the current release, MapReduce does not implement any node label support yet (MAPREDUCE-6304).
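As a rough illustration of the manual configuration mentioned above, the fragment below is a minimal sketch of the ResourceManager settings involved. It assumes the property names used by upstream Apache Hadoop and has not been checked against any particular CDH release. The store path is a hypothetical placeholder.

```xml
<!-- yarn-site.xml on the ResourceManager: sketch only, upstream Apache Hadoop property names assumed -->
<property>
  <!-- Turns the node label feature on -->
  <name>yarn.node-labels.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Where the ResourceManager persists label definitions and node-to-label mappings -->
  <!-- Hypothetical path; replace with a durable location of your own -->
  <name>yarn.node-labels.fs-store.root-dir</name>
  <value>hdfs://namenode.example.com:8020/yarn/node-labels</value>
</property>
```

The labels themselves still have to be created and assigned to nodes separately, and the CapacityScheduler queues need to be granted access to them.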
Because of their limited implementation, node labels can also cause a large increase in scheduling delays, which makes using them counterproductive. We are working with the community to make node labels ready for production, but they are not there yet.
Are there any updates on node labels? I am able to create and use node labels, but when I restart YARN the node labels are deleted.
Node labels are not supported in the FairScheduler, not even in C6. C6 has resource types, which allow you to do most of what node labels do in a scalable way. There are still some enhancements to resource types being worked on for upcoming releases; apart from that, the feature is supported and working.
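To give an idea of what resource types look like in practice, here is a minimal sketch that assumes the upstream Apache Hadoop 3 file and property names. The resource name "ssd" is purely hypothetical, and the exact setup in C6 may differ.

```xml
<!-- resource-types.xml (cluster-wide): declares a custom resource; "ssd" is a hypothetical name -->
<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>ssd</value>
  </property>
</configuration>

<!-- node-resources.xml (separate file, only on NodeManagers that should offer the resource) -->
<configuration>
  <property>
    <!-- Number of units of the hypothetical "ssd" resource this node advertises -->
    <name>yarn.nodemanager.resource-type.ssd</name>
    <value>1</value>
  </property>
</configuration>
```

Under these assumptions, an application that requests one unit of the custom resource can only be placed on nodes that advertise it, which is how resource types can stand in for a label.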