We have an application outside CDH that needs to write to HDFS, so it needs to know the active NameNode URL.
We have a nameservice with HA configured within the CDH cluster, but we want to access it from outside CDH as well.
We are using CDH 5.9.
One quick solution is to add the application host to the cluster and give it (only) the HDFS Gateway role.
The application can then use the NameNode nameservice name as the URI, e.g. hdfs://nameservice1/.
This can be done with CM (Cloudera Manager).
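For context, hdfs://nameservice1/ resolves on a Gateway host because the deployed client configuration maps the logical nameservice name to the individual NameNodes. Below is a rough sketch of the kind of HA client properties involved. The host names, the NameNode IDs (nn1, nn2) and the helper functions are hypothetical, made up for illustration; the real values come from the client configuration that CM deploys, and this is not meant as a replacement for it.

```python
import xml.etree.ElementTree as ET


def ha_client_properties(nameservice, namenodes):
    """Build the HDFS client properties that map a logical nameservice
    to its NameNodes.

    namenodes: dict of NameNode ID -> "host:rpc_port" (hypothetical values).
    """
    props = {
        "dfs.nameservices": nameservice,
        "dfs.ha.namenodes." + nameservice: ",".join(namenodes),
        # Client-side class that tries each NameNode until it finds the active one.
        "dfs.client.failover.proxy.provider." + nameservice:
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    }
    for nn_id, address in namenodes.items():
        props["dfs.namenode.rpc-address.%s.%s" % (nameservice, nn_id)] = address
    return props


def to_hdfs_site_xml(props):
    """Serialize the properties in the hdfs-site.xml <configuration> layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical hosts; replace with whatever your cluster actually uses.
    example = ha_client_properties(
        "nameservice1",
        {"nn1": "namenode1.example.com:8020", "nn2": "namenode2.example.com:8020"},
    )
    print(to_hdfs_site_xml(example))
```

With properties like these on the client side, the application never needs to know which NameNode is currently active; it addresses the nameservice and the failover proxy provider picks the active one.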
- Configuration can be managed centrally.
- The host comes under CM management, i.e. it is monitored by the agent, can be assigned other roles, and so on.
- Files are distributed to the host, e.g. under /opt/cloudera, which consumes disk space (take log space into account as well).
- Some ports, 9000 and 9001, are used by the agent.
These points may cross administrative boundaries, so please consider them carefully and make a plan beforehand.
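As part of that planning, it may help to check whether ports 9000 and 9001 are already taken on the application host before it is added to the cluster. A minimal sketch using only the Python standard library; the function name is made up for illustration, and a failed bind can also mean a permission problem rather than a conflict, so treat the result as a hint only.

```python
import socket

# Ports mentioned above as being used by the agent.
AGENT_PORTS = (9000, 9001)


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"; could also be a permission error.
            return False
        return True


if __name__ == "__main__":
    for port in AGENT_PORTS:
        print(port, "free" if port_is_free(port) else "in use")
```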