
How to use Apache Falcon to copy data from a non-Kerberized cluster to a Kerberized one?


We need to copy data from a non-Kerberized (NK) cluster to a Kerberized one (K). I am registering the entities with the Falcon server running on the K cluster. I was able to register the K cluster with that Falcon server without any problems, but registering the NK cluster with it runs into multiple issues.

Falcon seems to expect a Kerberos principal by default, and it is not clear whether this property refers to the K cluster or the NK cluster. Right now I am using the NameNode principal of the K cluster itself (not sure if this is right). While registering the NK entity, I get the error "Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections". So I added a property like this: <property name="ipc.client.fallback-to-simple-auth-allowed" value="true" />. But this does not help either.

Can someone help me do this with Falcon? Also, please point me to some good tutorials on Falcon. Thank you.


Cloudera Employee
@Sarnath K

Your cluster definition might be wrong. What's your HDP version?

You can try it with HDP 2.5.3. Create the cluster definition using the Falcon GUI and try again.


@Ram Baskaran

Thanks for getting back to me on this. HDP 2.5.3 is the version on the Kerberized cluster, where the Falcon server is running. HDP 2.4.2 is running on the non-Kerberized cluster.

When I add the cluster entity XML for the non-Kerberized cluster, this is the error message I receive:

ERROR: Bad Request;default/Invalid storage server or port: hdfs://namenodehostOfNonKBCluster:8020, Cluster definition missing required namenode credential property: dfs.namenode.kerberos.principal CausedBy: Cluster definition missing required namenode credential property: dfs.namenode.kerberos.principal

hdfs://namenodehostOfNonKBCluster:8020 is the verbatim copy of the "fs.defaultFS" property in the "Advanced core-site" config section in the Ambari server running on the Non-Kerberized cluster.
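For reference, here is a minimal sketch of the kind of cluster entity being submitted, showing where the two properties discussed in this thread would sit. Only the write endpoint and the two property names come from this thread. The entity name, the other hosts and ports, the interface versions, the location paths, and the principal value are placeholders assumed for illustration. This shows the layout only and is not a confirmed fix for the error above.

```xml
<!-- Sketch of a Falcon cluster entity for the non-Kerberized cluster.
     Everything except the write endpoint and the two property names
     is an assumed placeholder. -->
<cluster name="nonKerberizedCluster" description="NK source cluster" colo="default"
         xmlns="uri:falcon:cluster:0.1">
  <interfaces>
    <!-- Write endpoint: the fs.defaultFS value quoted in this thread -->
    <interface type="write" endpoint="hdfs://namenodehostOfNonKBCluster:8020" version="2.7.1"/>
    <!-- Placeholder endpoints; substitute the real hosts and ports -->
    <interface type="readonly" endpoint="hftp://namenodehostOfNonKBCluster:50070" version="2.7.1"/>
    <interface type="execute" endpoint="resourcemanagerhost:8050" version="2.7.1"/>
    <interface type="workflow" endpoint="http://ooziehost:11000/oozie/" version="4.2.0"/>
    <interface type="messaging" endpoint="tcp://falconhost:61616?daemon=true" version="5.1.6"/>
  </interfaces>
  <locations>
    <!-- Placeholder paths -->
    <location name="staging" path="/apps/falcon/nonKerberizedCluster/staging"/>
    <location name="temp" path="/tmp"/>
    <location name="working" path="/apps/falcon/nonKerberizedCluster/working"/>
  </locations>
  <properties>
    <!-- Property named in the error message; the value is a placeholder principal -->
    <property name="dfs.namenode.kerberos.principal" value="nn/_HOST@EXAMPLE.COM"/>
    <!-- Property tried earlier in this thread to allow fallback to SIMPLE auth -->
    <property name="ipc.client.fallback-to-simple-auth-allowed" value="true"/>
  </properties>
</cluster>
```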

Now, how do I tell Falcon not to look for the Kerberos principal property?

Also, I assume that the "locations" for staging, working, etc. for both the K and NK clusters are created on the cluster where the Falcon server is running. In my setup, this is the K cluster. Please let me know whether this is correct. Thank you very much for your time.
