
Configure NiFi for multiple kerberized HDP clusters


Hi everyone,

 

I'm working on a new feature for an existing NiFi cluster: a new service that interfaces with several kerberized HDP clusters.

 

I would like to know whether a single NiFi cluster can use several realms in the same krb5 file.

According to the official documentation, NiFi can do it (https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#kerberos_properties): "If necessary the krb5 file can support multiple realms."

 

That seems fine, but I currently have no way to test against several Hadoop clusters (note: my NiFi cluster already works with one kerberized Hadoop cluster). Can anybody confirm or reject this design: one NiFi cluster using different realms to communicate with multiple kerberized HDP clusters?

 

Thanks for your help. As soon as I have several kerberized Hadoop clusters for testing, I will update this post.

 


12 REPLIES

Master Mentor

@dupuy_gregory 

Yes, a krb5.conf can contain multiple realms in it.

The HDFS processor components offered through NiFi can each be configured with different Hadoop configuration resources (core-site.xml, hdfs-site.xml, etc.) and different Kerberos credentials (keytab and principal).

This will allow various NiFi dataflows to interact with different target Hadoop clusters.
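
As a rough illustration of this setup (not NiFi code), the per-processor settings can be modeled and checked against the realms declared in krb5.conf. This is only a minimal Python sketch: the class, the function, the principal names, and the file paths are all hypothetical and would need to be replaced with real values.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class HdfsProcessorSettings:
    """Hypothetical model of one HDFS processor's Kerberos-related settings."""
    name: str
    config_resources: Tuple[str, ...]  # core-site.xml and hdfs-site.xml of the target cluster
    principal: str                     # written as "user@realm"
    keytab: str

    @property
    def realm(self) -> str:
        # The realm is whatever follows the last "@"; empty if no realm is given.
        _, sep, realm = self.principal.rpartition("@")
        return realm if sep else ""


def realms_missing_from_krb5(processors: Iterable[HdfsProcessorSettings],
                             krb5_realms: Iterable[str]) -> List[str]:
    """Return realms used by the processors but not declared under [realms]."""
    declared = set(krb5_realms)
    return sorted({p.realm for p in processors} - declared)


# Hypothetical values: one processor per target cluster.
processors = [
    HdfsProcessorSettings(
        name="GetHDFS romulus",
        config_resources=("/etc/nifi/romulus/core-site.xml",
                          "/etc/nifi/romulus/hdfs-site.xml"),
        principal="nifi-romulus@romulus",
        keytab="/etc/security/keytabs/nifi-romulus.keytab",
    ),
    HdfsProcessorSettings(
        name="GetHDFS remus",
        config_resources=("/etc/nifi/remus/core-site.xml",
                          "/etc/nifi/remus/hdfs-site.xml"),
        principal="nifi-remus@remus",
        keytab="/etc/security/keytabs/nifi-remus.keytab",
    ),
]

# Realms that still need an entry under [realms] in krb5.conf.
print(realms_missing_from_krb5(processors, ["romulus", "remus"]))
```

A check like this only confirms that every realm is declared; it does not prove that a processor can actually authenticate.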

If you found that this response addressed your query, please take a moment to log in and click on "Accept as Solution".

Thank you,

Matt


Thanks for your answer!

 

After waiting several days for another Hadoop cluster, I now have one.
After some configuration and checks, it is working, but with some trouble.

Let me explain my issue.

 

I have two kerberized Hadoop clusters: romulus and remus.
For each one, the Kerberos configuration was added to krb5.conf.

Here is an extract of this configuration:

 

[libdefaults]
  default_realm = romulus
  dns_lookup_realm = false
  dns_lookup_kdc = true
  rdns = false
  dns_canonicalize_hostname = false
  ticket_lifetime = 168h 0m 0s
  renew_lifetime = 90d
  forwardable = true
  udp_preference_limit = 0
  ...

[realms]
  romulus = {
    admin_server = <...>
    kdc = <...>
  }
  remus = {
    admin_server = <...>
    kdc = <...>
  }

 

If I use the kinit client, it works fine.

 

I configured two GetHDFS processors, each with its own core-site.xml and hdfs-site.xml as Hadoop Configuration Resources, plus the associated keytab and principal.

Capture.JPG

 

Case one: default_realm is romulus and I start the romulus GetHDFS processor.

=> OK: I get the HDFS file (you can see the FlowFile in the queue).

 

Case two: default_realm is still romulus and I start the remus GetHDFS processor.

=> KO:

Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: nn/<ip namenode active>@<ROMULUS>, expecting: nn/<ip namenode active>@<REMUS>

NiFi tries to connect using the default realm, romulus:

 

Server has invalid Kerberos principal: nn/<ip namenode active>@<ROMULUS>

 

The correct realm is remus:

 

expecting: nn/<ip namenode active>@<REMUS>
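
To make the mismatch explicit, the two realms can be pulled out of the exception text. The following Python sketch assumes the message format shown above; the function name and the host name in the sample message are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the tail of the Hadoop exception text quoted in this post.
PRINCIPAL_MISMATCH = re.compile(
    r"Server has invalid Kerberos principal: (?P<actual>.+?), expecting: (?P<expected>.+)"
)


def realms_from_mismatch(message: str) -> Optional[Tuple[str, str]]:
    """Return (realm reported as invalid, realm that was expected), or None."""
    match = PRINCIPAL_MISMATCH.search(message)
    if match is None:
        return None
    actual = match["actual"].strip()
    expected = match["expected"].strip()
    # The realm is the part after the last "@" of each principal.
    return actual.rpartition("@")[2], expected.rpartition("@")[2]


# Sample message with a hypothetical name node host.
sample = (
    "Failed on local exception: java.io.IOException: Couldn't set up IO streams: "
    "java.lang.IllegalArgumentException: Server has invalid Kerberos principal: "
    "nn/nn1.remus.example.com@ROMULUS, expecting: nn/nn1.remus.example.com@REMUS"
)
print(realms_from_mismatch(sample))
```

If the first realm is the default realm and the second one is the realm of the target cluster, the host was resolved to the wrong realm.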

 

 

To sum up: NiFi works well with the default realm but has an issue with the other realm.

 

Is there a configuration I missed?

 

Thanks

 

 

Master Mentor

@dupuy_gregory 

 

Can you share the configurations for both your GetHDFS processors?


Sure!

 

For your information, because I hadn't mentioned it: if I switch the default realm in krb5.conf from romulus to remus, romulus stops working and remus works. It's as if NiFi gets lost when the default realm differs from the principal's realm and tries to use another realm.

 

Furthermore, remus and romulus are not the real names, so I need to change paths, server names, IPs, and other objects before sharing them with you 😉

 

Remus :

remus.JPG

Romulus :

romulus.JPG


I made some changes just to try:

- added the name node IPs and server names to /etc/hosts

- changed dns_lookup_kdc from true to false

- changed udp_preference_limit from 0 to 1

 

No effect. It's still not working.


@dupuy_gregory 

When using multiple realms, the KDC servers must have a cross-realm trust set up.

Here is an article with links to good resources that explain in detail: How does a cross realm trust work? 

One of the links is broken in the article, here is a good link for the setup of cross-realm trust: Setting up Cross-Realm Kerberos Trusts 

 

As a test to make sure it is working, run the kinit command in a terminal window on the NiFi node for the principal in the non-default realm, for example, kinit hdfs-romulus@ROMULUS
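
The same test can be scripted. This is a minimal Python sketch that assumes MIT Kerberos kinit is installed and that a keytab is used instead of a password prompt; the function names, principal, and paths are hypothetical. Note that a successful kinit writes the ticket to the default credential cache of the user running it.

```python
import os
import subprocess
from typing import List, Optional


def kinit_command(principal: str, keytab: str) -> List[str]:
    """Build a kinit call that authenticates from a keytab.

    -k requests keytab authentication and -t gives the keytab path.
    """
    return ["kinit", "-k", "-t", keytab, principal]


def can_get_ticket(principal: str, keytab: str,
                   krb5_conf: Optional[str] = None) -> bool:
    """Return True if kinit exits successfully for the given principal."""
    env = dict(os.environ)
    if krb5_conf:
        # KRB5_CONFIG points the Kerberos library at a specific krb5.conf.
        env["KRB5_CONFIG"] = krb5_conf
    try:
        result = subprocess.run(
            kinit_command(principal, keytab),
            env=env,
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # kinit is not installed or not on the PATH.
        return False
    return result.returncode == 0


# Hypothetical usage on the NiFi node, for the non-default realm:
# can_get_ticket("nifi-remus@remus",
#                "/etc/security/keytabs/nifi-remus.keytab",
#                krb5_conf="/etc/krb5.conf")
```

A successful kinit only shows that the KDC of that realm is reachable and that the keytab is valid; it does not exercise the realm resolution for the name node host that NiFi performs.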


Hello!


Sorry, I was away for a few months.

There is no need to set up a cross-realm trust, because the communication goes in one direction only.

 

hadoop.JPG


Here is the solution, and it is now running:

[realms]
  romulus = {
    admin_server = <...>
    kdc = <...>
  }

  remus = {
    admin_server = <...>
    kdc = <...>
  }

[domain_realm]

  <IP Name Node 1 romulus cluster> = romulus
  <IP Name Node 2 romulus cluster> = romulus

  <IP Name Node 1 remus cluster> = remus
  <IP Name Node 2 remus cluster> = remus

 

Let me explain :

NiFi needs a default realm. The default realm is not used to communicate with the project's kerberized Hadoop clusters (romulus and remus).

To help NiFi, you must map the name node hostnames to Kerberos realms in the [domain_realm] section.

 

Without this mapping, NiFi will try to use both the default realm and the realm of the Kerberos principal defined in the project's HDFS processor, and will fail.

hadoop2.JPG
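
To verify that every name node is covered by the mapping, the [domain_realm] section can be checked with a small script. This is a minimal Python sketch, not a full krb5.conf parser: the function names and host names are hypothetical, and it only handles exact host entries and parent-domain entries that start with a dot.

```python
from typing import Dict, Optional


def parse_domain_realm(krb5_text: str) -> Dict[str, str]:
    """Collect the "name = realm" entries of the [domain_realm] section."""
    mapping: Dict[str, str] = {}
    section = None
    for raw in krb5_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "domain_realm" and "=" in line:
            name, _, realm = line.partition("=")
            mapping[name.strip().lower()] = realm.strip()
    return mapping


def realm_for_host(host: str, mapping: Dict[str, str]) -> Optional[str]:
    """Look up an exact host entry, then parent domains such as ".example.com"."""
    host = host.lower()
    if host in mapping:
        return mapping[host]
    labels = host.split(".")
    for i in range(1, len(labels)):
        domain = "." + ".".join(labels[i:])
        if domain in mapping:
            return mapping[domain]
    return None


def wrongly_mapped(krb5_text: str,
                   expected: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return hosts whose mapped realm differs from the expected one."""
    mapping = parse_domain_realm(krb5_text)
    found = {host: realm_for_host(host, mapping) for host in expected}
    return {host: realm for host, realm in found.items()
            if realm != expected[host]}


# Hypothetical krb5.conf extract and name node hosts.
SAMPLE = """
[realms]
  romulus = {
    kdc = kdc.romulus.example.com
  }

[domain_realm]
  nn1.romulus.example.com = romulus
  nn2.romulus.example.com = romulus
  nn1.remus.example.com = remus
"""

expected = {
    "nn1.romulus.example.com": "romulus",
    "nn1.remus.example.com": "remus",
    "nn2.remus.example.com": "remus",
}
print(wrongly_mapped(SAMPLE, expected))
```

Any host reported by this check has no matching entry (or the wrong one) and would be a candidate for falling back to the default realm.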

It was a little bit tricky 😉

Cloudera Employee

@dupuy_gregory - From your last comment, it seems you have a solution. Is that correct? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.



Regards,

Chris McConnell,
Community Manager



Hi Christopher,

 

It's done! I hope this post helps other people!