Hadoop HA, Standby Namenode principal Configuration

New Contributor
Hi,

I am trying to set up a secured Hadoop cluster, and I cannot find a property key for specifying the principals of the active and standby NameNodes.

There is one property key, "dfs.namenode.kerberos.principal", in which I can specify only one NameNode's principal. There is also "dfs.secondary.namenode.kerberos.principal", but that is not useful here, since Hadoop HA does not use a Secondary NameNode. The NameNodes themselves can be defined as shown below, but I cannot find any way to specify principals for them. Could somebody help me with this?

dfs.nameservices=syracuse
dfs.ha.automatic-failover.enabled=true
dfs.ha.namenodes.syracuse=nn1,nn2
# The fully-qualified RPC address for each NameNode to listen on
dfs.namenode.rpc-address.syracuse.nn1=namenode1:8020
dfs.namenode.rpc-address.syracuse.nn2=namenode2:8020
# The fully-qualified HTTP address for each NameNode to listen on
dfs.namenode.http-address.syracuse.nn1=namenode1:50070
dfs.namenode.http-address.syracuse.nn2=namenode2:50070

Regards

Pushparaj Motamari, there is no separate property for the active and standby NameNodes. You can configure a single principal such as nn/_HOST@EXAMPLE.COM.

You can refer to the following docs:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_Security_Guide/content/kerb-config-hdfs-...

https://ambari.apache.org/1.2.5/installing-hadoop-using-ambari/content/ambari-kerb-2-3-1.html

New Contributor

On the client side, from which you are trying to access the Hadoop cluster, if you use the same key for both NameNodes, one overrides the other:

dfs.namenode.kerberos.principal=namenode1/_HOST@xyz.com

dfs.namenode.kerberos.principal=namenode2/_HOST@xyz.com

New Contributor

thanks for sharing!

+1


@Pushparaj Motamari - In an HA environment you do not need to specify different properties for "dfs.namenode.kerberos.principal". It is just one property which holds true for both the Namenodes.

Consider this scenario:

  • The hostname for the first NameNode is xxx1xxx.abc.site
  • The hostname for the second NameNode is xxx2xxx.abc.site

In this case we will define a few important properties as below:

<configuration>
   <property>
      <name>dfs.nameservices</name>
      <value>nameservice</value>
   </property>
   <property>
      <name>dfs.ha.namenodes.nameservice</name>
      <value>nn1,nn2</value>
   </property>
   <property>
      <name>dfs.https.port</name>
      <value>50470</value>
   </property>
   <property>
      <name>dfs.namenode.https-address.nameservice.nn1</name>
      <value>xxx1xxx.abc.site:50470</value>
   </property>
   <property>
      <name>dfs.namenode.https-address.nameservice.nn2</name>
      <value>xxx2xxx.abc.site:50470</value>
   </property>
   <property>
      <name>dfs.namenode.rpc-address.nameservice.nn1</name>
      <value>xxx1xxx.abc.site:8020</value>
   </property>
   <property>
      <name>dfs.namenode.rpc-address.nameservice.nn2</name>
      <value>xxx2xxx.abc.site:8020</value>
   </property>
   <property>
      <name>dfs.namenode.kerberos.principal</name>
      <value>nn/_HOST@EXAMPLE.COM</value>
   </property>
   <property>
      <name>dfs.namenode.keytab.file</name>
      <value>/etc/security/keytabs/nn.service.keytab</value>
   </property>
   <property>
      <name>dfs.datanode.kerberos.principal</name>
      <value>dn/_HOST@EXAMPLE.COM</value>
   </property>
   <property>
      <name>dfs.datanode.keytab.file</name>
      <value>/etc/security/keytabs/dn.service.keytab</value>
   </property>
</configuration>

Per these properties, we specify dfs.ha.namenodes.nameservice = nn1,nn2 and dfs.https.port = 50470.

  • So, the property for dfs.namenode.https-address.nameservice.nn1 = xxx1xxx.abc.site:50470
  • And for dfs.namenode.https-address.nameservice.nn2 = xxx2xxx.abc.site:50470
  • Similarly dfs.namenode.rpc-address.nameservice.nn1 = xxx1xxx.abc.site:8020
  • And dfs.namenode.rpc-address.nameservice.nn2 = xxx2xxx.abc.site:8020

All of the address properties above are keyed by the nameservice and the NameNode ID.
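To make the key naming scheme concrete, here is a minimal Python sketch that reads a Hadoop-style configuration and builds the per-NameNode keys from dfs.nameservices and dfs.ha.namenodes.<nameservice>. It is only an illustration, not how Hadoop itself loads its configuration: load_properties and namenode_addresses are hypothetical helper names, and the embedded XML is a trimmed copy of the example above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example hdfs-site.xml from this answer.
HDFS_SITE = """<configuration>
   <property>
      <name>dfs.nameservices</name>
      <value>nameservice</value>
   </property>
   <property>
      <name>dfs.ha.namenodes.nameservice</name>
      <value>nn1,nn2</value>
   </property>
   <property>
      <name>dfs.namenode.rpc-address.nameservice.nn1</name>
      <value>xxx1xxx.abc.site:8020</value>
   </property>
   <property>
      <name>dfs.namenode.rpc-address.nameservice.nn2</name>
      <value>xxx2xxx.abc.site:8020</value>
   </property>
</configuration>"""


def load_properties(xml_text):
    """Parse <property><name>/<value> pairs into a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def namenode_addresses(props, kind="rpc-address"):
    """Build each dfs.namenode.<kind>.<nameservice>.<nn-id> key and look it up."""
    addresses = {}
    for ns in props["dfs.nameservices"].split(","):
        for nn_id in props[f"dfs.ha.namenodes.{ns}"].split(","):
            key = f"dfs.namenode.{kind}.{ns.strip()}.{nn_id.strip()}"
            addresses[key] = props.get(key)  # None if the key is not configured
    return addresses


for key, value in namenode_addresses(load_properties(HDFS_SITE)).items():
    print(key, "=", value)
```

Each address needs its own key because the value differs per NameNode, which is exactly what the nameservice and NameNode ID suffixes encode.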

When it comes to Kerberos, however, the properties are not defined per NameNode:

  • dfs.namenode.kerberos.principal = nn/_HOST@EXAMPLE.COM
  • dfs.namenode.keytab.file = /etc/security/keytabs/nn.service.keytab

The value of _HOST is resolved at run time to the fully qualified hostname of the machine on which the service is running.

For instance, running klist on the NameNode service keytab on host 1 (xxx1xxx.abc.site) will return:

[root@xxx1xxx ~]# klist -kt /etc/security/keytabs/nn.service.keytab
Keytab name: FILE:/etc/security/keytabs/nn.service.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 03/16/2017 23:57:54 nn/xxx1xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx1xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx1xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx1xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx1xxx.abc.site@EXAMPLE.COM

Whereas running klist on the NameNode service keytab on host 2 (xxx2xxx.abc.site) will return:

[root@xxx2xxx ~]# klist -kt /etc/security/keytabs/nn.service.keytab
Keytab name: FILE:/etc/security/keytabs/nn.service.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 03/16/2017 23:57:54 nn/xxx2xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx2xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx2xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx2xxx.abc.site@EXAMPLE.COM
   1 03/16/2017 23:57:54 nn/xxx2xxx.abc.site@EXAMPLE.COM

I hope this explains why we don't need two properties for Kerberos, unlike the other nameservice properties.
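The _HOST substitution described above can be sketched in a few lines of Python. This is a simplified illustration of the idea, not Hadoop's actual implementation: resolve_principal is a hypothetical helper, and it assumes the pattern has the form primary/_HOST@REALM and that the hostname is lowercased when substituted.

```python
import socket

HOST_PATTERN = "_HOST"


def resolve_principal(pattern, hostname=None):
    """Replace the _HOST placeholder in a primary/instance@REALM pattern.

    Patterns that are not of that form, or whose instance part is not
    exactly _HOST, are returned unchanged.
    """
    if "/" not in pattern or "@" not in pattern:
        return pattern  # e.g. a headless user principal such as user1@EXAMPLE.COM
    primary, rest = pattern.split("/", 1)
    instance, realm = rest.split("@", 1)
    if instance != HOST_PATTERN:
        return pattern  # a fixed hostname was configured, nothing to substitute
    if not hostname:
        hostname = socket.getfqdn()  # fall back to the local machine's FQDN
    return f"{primary}/{hostname.lower()}@{realm}"


if __name__ == "__main__":
    # The same configured value yields a different principal on each NameNode host.
    for host in ("xxx1xxx.abc.site", "xxx2xxx.abc.site"):
        print(resolve_principal("nn/_HOST@EXAMPLE.COM", host))
```

With the two example hostnames, the results match the principals shown in the two klist listings, which is why one configured value is enough for both NameNodes.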

New Contributor

@Namit Maheshwari do we have to set

dfs.namenode.kerberos.principal = nn/_HOST@EXAMPLE.COM when trying to access the cluster from the client as well?

@Pushparaj Motamari - I am not sure why you need dfs.namenode.kerberos.principal on the client. This is a service principal, which is supposed to be used by HDFS internally.

For all the service users, there need to be user keytabs and principals, like the one below:

[root@xxx2xxx keytabs]$ klist -kt user1.headless.keytab
Keytab name: FILE:user1.headless.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   2 03/17/2017 00:18:14 user1@EXAMPLE.COM
   2 03/17/2017 00:18:14 user1@EXAMPLE.COM
   2 03/17/2017 00:18:14 user1@EXAMPLE.COM
   2 03/17/2017 00:18:14 user1@EXAMPLE.COM
   2 03/17/2017 00:18:14 user1@EXAMPLE.COM
   2 03/17/2017 00:18:14 user1@EXAMPLE.COM
   2 03/17/2017 00:18:14 user1@EXAMPLE.COM

Below is a pretty good guide for Kerberos:

https://community.hortonworks.com/articles/1327/kerberos-the-missing-guide.html

New Contributor
@Namit Maheshwari

The client has to contact the NameNode, and for that it has to get a ticket from the KDC. In order to get a ticket from the KDC, it has to know the principal beforehand. How does the client know the principals unless some entity in the cluster informs it, or we provide them in the configuration? P.S.: I am accessing a secured Hadoop cluster from a Java program on a machine that is in the same subnet as the Hadoop cluster and the KDC.
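A hedged note on the question above: as far as I understand Hadoop's RPC client, it does read dfs.namenode.kerberos.principal from the client-side hdfs-site.xml and replaces _HOST with the hostname of whichever NameNode it is connecting to, so the single property covers both NameNodes on the client as well. The Python sketch below only illustrates that derivation. It is not Hadoop code: namenode_principals is a hypothetical helper, and the hostnames and realm are the example values used earlier in this thread.

```python
def namenode_principals(conf, nameservice):
    """Derive the service principal a client would expect for each NameNode.

    Assumes the client configuration carries the HA NameNode IDs, their RPC
    addresses, and one dfs.namenode.kerberos.principal pattern with _HOST.
    """
    pattern = conf["dfs.namenode.kerberos.principal"]
    principals = {}
    for nn_id in conf[f"dfs.ha.namenodes.{nameservice}"].split(","):
        nn_id = nn_id.strip()
        address = conf[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"]
        host = address.rsplit(":", 1)[0]  # drop the ":8020" port suffix
        principals[nn_id] = pattern.replace("_HOST", host.lower())
    return principals


# Client-side properties, using the example values from this thread.
client_conf = {
    "dfs.nameservices": "nameservice",
    "dfs.ha.namenodes.nameservice": "nn1,nn2",
    "dfs.namenode.rpc-address.nameservice.nn1": "xxx1xxx.abc.site:8020",
    "dfs.namenode.rpc-address.nameservice.nn2": "xxx2xxx.abc.site:8020",
    "dfs.namenode.kerberos.principal": "nn/_HOST@EXAMPLE.COM",
}

for nn_id, principal in namenode_principals(client_conf, "nameservice").items():
    print(nn_id, principal)
```

The user still authenticates with their own principal (for example via kinit or a user keytab); the derived name is only the service principal for which a ticket is requested when talking to a given NameNode.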
