Created 08-02-2024 02:46 PM
Hi,
I have been playing with NiFi on Docker lately, and it has been quite a challenge and a learning experience. To better understand how to use Docker for NiFi, I'm hoping the community can help me with the following observations and questions:
1. Most of the examples I found on the internet, including the official NiFi Docker page, seem to be suited to single-host deployment. I find this strange (unless I'm missing something): doesn't that defeat the purpose of having a cluster with no single point of failure? In what scenarios would someone want to deploy a single-host, multi-container cluster rather than a multi-host cluster with one container per host?
2. While learning Docker networking, I found that if I want to create a multi-host cluster that is visible on our work network, the ideal way to do it is "host" networking. Is this correct, or is there a better way (maybe overlay networking with Swarm)? If I go that route later, how will I access non-Docker servers on my network?
3. If "host" networking is one of the options, why doesn't the official NiFi Docker image mention how to set the HTTPS host name as one of the environment properties, similar to what we do locally by setting "" in the ? On other sites and images I found that the property "NIFI_WEB_HTTPS_HOST" can be used for that, and it works. Is there another way of setting the host?
4. Initially I tried to use the embedded ZooKeeper setup, but it did not work no matter how hard I tried. I found a lot of people recommending an external ZooKeeper, which is what I ended up doing. It turns out there is a Jira bug for the problem I was facing, but it is still unresolved despite having been open for a couple of years. Why is that? Is it ever going to be fixed, or is the recommendation to use an external ZooKeeper? If so, that should at least be mentioned somewhere.
5. Do the environment variables listed on the official Docker page cover everything, or are there more? Where can we find a comprehensive list of all the environment properties? For example, this image seems to list more environment properties.
6. This one is really important because it is what I struggled with the most: how do we set the node identities so that they are included in the authorizers.xml file? I could not find any clear instructions on this, and I was getting the "Untrusted Proxy" error. The only way I got it to work was to update this file manually (using docker cp), but I also had to delete the generated users.xml and authorizations.xml files while the container was running, because it seems you can't do that while the container is stopped. I don't think this is the proper way of doing it, and I hope there is a better way that can be done in the yml file itself.
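To make the setup behind questions 2 to 4 concrete, here is a rough sketch of the kind of compose service I mean for a single node on one host. It is illustrative only and not my actual file: the hostnames and port values are placeholders, and apart from NIFI_WEB_HTTPS_HOST the variable names are the ones I understand the official image to document, so please treat them as assumptions.

```yaml
# Illustrative sketch only: one NiFi node per host, pointing at an external ZooKeeper.
# Hostnames (nifi-node1.example.com, zk1.example.com) and ports are placeholders.
services:
  nifi:
    image: apache/nifi            # tag omitted on purpose; pin the version you actually run
    network_mode: host            # container shares the host's network stack, so no port mappings
    environment:
      NIFI_WEB_HTTPS_HOST: nifi-node1.example.com
      NIFI_WEB_HTTPS_PORT: "8443"
      NIFI_CLUSTER_IS_NODE: "true"
      NIFI_CLUSTER_ADDRESS: nifi-node1.example.com
      NIFI_CLUSTER_NODE_PROTOCOL_PORT: "11443"
      NIFI_ZK_CONNECT_STRING: zk1.example.com:2181
      # TLS keystore/truststore and authorizer settings are left out of this sketch
```

The same service would be repeated on each host with its own host name; what I cannot work out is where the node identities from question 6 would go in a file like this.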
I would really appreciate the community's feedback on this, especially from experts like @MattWho, @steven-matison, @pvillard