
Which nodes of a 3-node NiFi cluster need hdfs-site.xml and core-site.xml to interact with Hadoop?

New Contributor

Hi All,

I have a 3-node NiFi cluster and a 3-node Hadoop cluster. To interact with HDFS, do I have to copy the configuration files (hdfs-site.xml, core-site.xml) to all 3 NiFi nodes, or only to the NiFi master node?

8 REPLIES

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

Hi @AnjiReddy Anumolu,

Even though you create a flow in the NCM UI, it runs on every NiFi cluster node. So the configuration files must be placed on all the cluster nodes, under the same directory on each node, for this to work. (They are not required on the NCM.)

- If the files are missing on any node, you may see errors similar to the ones below, and the processor will be in an invalid state.

[Screenshot: 5282-screen-shot-2016-06-28-at-123945-am.png]
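As a rough sketch of the "same directory on every node" idea, the snippet below only builds the copy commands and the comma-separated value you would then enter in the processor's Hadoop Configuration Resources property. The host names, source directory, and target directory are made-up placeholders, not values from this thread, and nothing is executed; it just prints what you might run.

```python
import shlex

# Hypothetical values: replace with your own hosts and paths.
NIFI_NODES = ["nifi-node1", "nifi-node2", "nifi-node3"]
CONFIG_FILES = ["core-site.xml", "hdfs-site.xml"]
SOURCE_DIR = "/etc/hadoop/conf"        # assumed location on a Hadoop node
TARGET_DIR = "/opt/nifi/hadoop-conf"   # assumed identical path on every NiFi node


def build_copy_commands(nodes, files, source_dir, target_dir):
    """Return one scp command per NiFi node (dry run, nothing is executed)."""
    commands = []
    for node in nodes:
        sources = " ".join(shlex.quote(f"{source_dir}/{name}") for name in files)
        commands.append(f"scp {sources} {node}:{shlex.quote(target_dir + '/')}")
    return commands


def hadoop_configuration_resources(files, target_dir):
    """Return the comma-separated path list to enter in the processor property."""
    return ",".join(f"{target_dir}/{name}" for name in files)


if __name__ == "__main__":
    for command in build_copy_commands(NIFI_NODES, CONFIG_FILES, SOURCE_DIR, TARGET_DIR):
        print(command)
    print(hadoop_configuration_resources(CONFIG_FILES, TARGET_DIR))
```

Because the property value is the same on every node, the target directory has to exist with the same path on each of them.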

Thanks!!

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

New Contributor

Hi @Jobin George, I have a query regarding your answer. I have a 3-node NiFi cluster setup and a 3-node HDP setup. Though I faced the same issue accessing the UI from the NCM, I did not get any error when I accessed it from a browser on the Hadoop Namenode.

I referenced the config files from inside the Namenode and data was transferred from NiFi to HDFS directory successfully.

It may not be a good approach to access NiFi from the Namenode in production, but for experimentation and learning purposes, can you please try the above and let me know whether it utilizes all the NiFi nodes or runs on a single node (which defeats the purpose of the cluster)?

Also, if the above method does work, any suggestions to suit the production environment?

Thanks.

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

Hi @Saisubramaniam Gopalakrishnan,

Do you have an issue accessing the cluster UI, or do you only see the same error as in my screenshot?

Thanks,

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

New Contributor

I do not have any issues, @Jobin George. I am able to transfer data from NiFi into HDFS from a browser on the Namenode, by referencing the path of the configuration files inside the Hadoop directory on the Namenode.

I want to know whether, with this method, NiFi is able to run in full clustered mode (since the config files are not copied to the other NiFi nodes), or whether it internally runs as a single-node setup.

Thanks.

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

Hi @Saisubramaniam Gopalakrishnan

The browser on the Namenode is what is confusing me. Are you accessing the NiFi cluster NCM URL, or another instance of NiFi running on the Namenode?

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

New Contributor

The NiFi cluster NCM URL :-)

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

@Saisubramaniam Gopalakrishnan: Oh, OK. Do you have a total of 6 separate machines running a 3-node HDP cluster and a 3-node NiFi cluster, or do both clusters run on the same 3 nodes?

Re: In Which node(i have 3 node nifi cluster) i have to copy that hdfs-site.xml,core-site.xml if i want to interact with hadoop

New Contributor

Thank you for the suggestion @Jobin George, you are right. I have 4 machines, one Namenode, one NCM, and two Datanodes/NiFi nodes. I guess that is why I did not face the error. Apologies for the comments.
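For anyone who wants to confirm this kind of thing on their own setup, here is a small check that could be run on each NiFi node. It takes the same comma-separated value that is set in the processor's Hadoop Configuration Resources property and reports which paths are missing or unreadable on the local machine. The function name and the example paths are my own placeholders; treat it as a sketch, not as how NiFi itself validates the property.

```python
import os


def missing_config_files(resources):
    """Return the paths from a comma-separated list that are not readable files here."""
    paths = [p.strip() for p in resources.split(",") if p.strip()]
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]


if __name__ == "__main__":
    # Hypothetical property value: use the one configured in your processor.
    value = "/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml"
    missing = missing_config_files(value)
    if missing:
        print("Missing or unreadable on this node:", ", ".join(missing))
    else:
        print("All configuration files are present on this node.")
```

On NiFi nodes that share a machine with a Datanode, the Hadoop configuration directory is typically already there, so a check like this would find the files even though nobody copied them.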

Is this setup of 4 machines with shared components a good approach, or do you suggest having separate machines for the NiFi nodes? I will not be putting much load on the Datanodes, only when there is a need for nightly model re-training and during model predictions. (Please also have a look at my query in your "NiFi + Spark : Feeding Data to Spark Streaming" thread.)

Thanks for your time and patience :-)