Created 05-28-2024 03:15 AM
docker run -d --name nifi24 -p 8443:8443 -e SITE_TO_SITE_SECURE=false -v ~/tools/nifi24_conf/conf:/opt/nifi/nifi-current/conf -v ~/tools/nifi24_conf/lib:/opt/nifi/nifi-current/lib -v ~/tools/nifi24_conf/nar_extensions:/opt/nifi/nifi-current/extensions apache/nifi:1.24.0
# Site to Site properties
nifi.remote.input.host=c30abd07b4ba
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10000
nifi.remote.input.http.enabled=false
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs
nifi.web.http.host=
nifi.web.http.port=
nifi.web.http.network.interface.default=
#############################################
nifi.web.https.host=c30abd07b4ba
nifi.web.https.port=8443
nifi.web.https.network.interface.default=
nifi.web.https.application.protocols=http/1.1
nifi.web.jetty.working.directory=./work/jetty
nifi.web.jetty.threads=200
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
nifi.web.proxy.host=
nifi.web.max.content.size=
nifi.web.max.requests.per.second=30000
nifi.web.max.access.token.requests.per.second=25
nifi.web.request.timeout=60 secs
nifi.web.request.ip.whitelist=
nifi.web.should.send.server.version=true
nifi.web.request.log.format=%{client}a - %u %t "%r" %s %O "%{Referer}i" "%{User-Agent}i"
Unfortunately, I am not able to get it working. As I understand it, it is not possible to configure Site-to-Site with security disabled while also running NiFi with HTTPS; those settings go together.
Please advise on how to get this working. Many thanks.
Created 06-12-2024 08:32 PM
Hi @MattWho ,
I have figured it out.
I set the access policy "receive data via site-to-site" and it has now started to work.
I used an API call to set the value, referring to this:
Access Policies | CDP Private Cloud (cloudera.com)
Thank you so much for your help.
To summarize:
nifi.properties
bash-4.4$ cat conf/nifi.properties | grep remote
nifi.remote.input.host=nifi-0.nifi-headless.namespace.svc.cluster.local
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10443
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs
in another pod
nifi.remote.input.host=nifi-1.nifi-headless.namespace.svc.cluster.local
nifi.web.https.host=nifi-0.nifi-headless.namespace.svc.cluster.local
nifi.web.https.port=9443
and respectively on another pod
nifi.web.https.host=nifi-1.nifi-headless.namespace.svc.cluster.local
nifi.web.https.port=9443
Set the access policies.
Created the reporting task.
The URL is set to the pod's service hostname and HTTPS port, e.g.
https://nifi-0.nifi-headless.doc-norc.svc.cluster.local:9443/nifi
Set the management controller service.
Created an input port and a remote process group to send data.
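The exact API call used for the access policy was not shared in this thread. As a rough sketch only, the request body for such a call might look like the following. It assumes the NiFi REST API's policies endpoint (POST /nifi-api/policies) and assumes that "receive data via site-to-site" maps to a write action on the resource /data-transfer/input-ports/<port UUID>. The function name and the placeholder IDs are hypothetical, so check the REST API documentation for your NiFi version before relying on it.

```python
import json


def s2s_input_port_policy(port_id, user_ids, group_ids=()):
    """Build a request body granting 'receive data via site-to-site' on one input port.

    Assumption: the policy is a 'write' action on /data-transfer/input-ports/<port_id>,
    submitted as an access policy entity. Nothing is sent over the network here.
    """
    return {
        "revision": {"version": 0},  # a new policy starts at revision 0
        "component": {
            "action": "write",
            "resource": f"/data-transfer/input-ports/{port_id}",
            "users": [{"id": uid} for uid in user_ids],
            "userGroups": [{"id": gid} for gid in group_ids],
        },
    }


if __name__ == "__main__":
    # Placeholder IDs; real values come from your own NiFi instance.
    body = s2s_input_port_policy("<input-port-uuid>", ["<user-or-node-uuid>"])
    print(json.dumps(body, indent=2))
```

In a cluster, the identities that need this policy are typically the client nodes themselves, since the node certificate is what the target sees during the S2S transfer.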
Created 05-28-2024 06:42 AM
@scoutjohn
The Site-To-Site (S2S) configuration properties control how your NiFi instance handles both inbound and outbound S2S connections. It is the receiving instance of NiFi that determines whether S2S communication is secure or not.
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10000
nifi.remote.input.http.enabled=false
First you need to understand how S2S works.
The instance of NiFi with a Remote Process Group (RPG) or an S2S reporting task is the client side of the connection. When that client component (RPG or S2S reporting task) executes, it needs to communicate with the target NiFi. That initial communication is always over HTTP(S) to the target NiFi. So if the target NiFi is secured (nifi.web.https.port configured) and the URL provided to the RPG or S2S reporting task is "HTTPS", the initial connection is going to be secure. This initial connection is used to fetch S2S details from the target NiFi. Included in those S2S details are numerous bits of information to include:
With the setup you shared, your NiFi is set up with only the nifi.web.https.port configured, meaning that this NiFi can only support HTTPS communications from S2S connections.
Not sure why you would want to send your data unsecured over your network. Why not send it securely, since your NiFi is already secured over HTTPS?
Now if you were to also configure the nifi.web.http.port (which makes no sense since you would be exposing your NiFi UI unsecured over http as well as secured over https), does it still force nifi.remote.input.secure back to true from false? I have not configured http and https at the same time for a very long time (it was only done rarely, when there were different internal and external networks). I could not find any Apache Jiras stating that this is no longer an option, but it is possible that this has changed. Even if it is possible, I still question using unsecured S2S when your NiFi is already secured.
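To make the pairing of the web and S2S settings concrete, here is a minimal sketch that reads nifi.properties text and flags the mismatch discussed above. The function names are hypothetical, and the rule it encodes (HTTPS port set implies nifi.remote.input.secure=true) is simply the behaviour described in this thread, not an authoritative validation of NiFi's startup logic.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_s2s_security(props):
    """Return warnings when web and S2S security settings disagree."""
    warnings = []
    https_on = bool(props.get("nifi.web.https.port"))
    secure = props.get("nifi.remote.input.secure", "").lower() == "true"
    if https_on and not secure:
        warnings.append(
            "nifi.web.https.port is set but nifi.remote.input.secure is not true"
        )
    if secure and not https_on:
        warnings.append(
            "nifi.remote.input.secure=true but nifi.web.https.port is empty"
        )
    return warnings


if __name__ == "__main__":
    sample = "nifi.web.https.port=8443\nnifi.remote.input.secure=false\n"
    for warning in check_s2s_security(parse_properties(sample)):
        print("WARNING:", warning)
```

Running it against the properties from the original post (https port 8443, secure=true) would report nothing, which matches the point that the NiFi side was already consistent.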
Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.
Thank you,
Matt
Created 05-28-2024 09:14 PM
Hi @MattWho , Thank you for your response.
I am not trying to send data unsecured over the network. NiFi running on my local machine is on HTTPS and I want it to stay that way. But I would also like to fetch the provenance data.
I was trying to configure my NiFi, running in standalone mode, based on what is described in the document.
I have changed nifi.remote.input.http.enabled to true.
I also tried adding a StandardRestrictedSSLContextService.
I used the same keystore and truststore values that are in nifi.properties.
I can see logs like this:
// Another save pending = false
2024-05-29 04:05:25,991 INFO [Timer-Driven Process Thread-7] o.a.n.c.s.TimerDrivenSchedulingAgent SiteToSiteProvenanceReportingTask[id=a971da9d-018f-1000-2b00-6824f28134d8] started.
2024-05-29 04:05:26,214 INFO [Flow Service Tasks Thread-1] o.a.nifi.controller.StandardFlowService Saved flow controller org.apache.nifi.controller.FlowController@c18025a // Another save pending = false
2024-05-29 04:05:36,347 INFO [pool-7-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2024-05-29 04:05:36,348 INFO [pool-7-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 23 records in 0 milliseconds
2024-05-29 04:05:37,621 INFO [Timer-Driven Process Thread-6] o.a.n.p.store.WriteAheadStorePartition Successfully rolled over Event Writer for Provenance Event Store Partition[directory=./provenance_repository] due to MAX_TIME_REACHED. Event File was 17.76 KB and contained 10 events.
2024-05-29 04:05:56,348 INFO [pool-7-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile Repository
2024-05-29 04:05:56,352 INFO [pool-7-thread-1] o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 24 Records and 0 Swap Files in 3 milliseconds (Stop-the-world time = 1 milliseconds), max Transaction ID 48
But unfortunately, I do not see any data flowing into the input port
Created 05-29-2024 05:42 AM
@scoutjohn
The article you are using for reference was written back in 2016, before NiFi was changed to start secure out of the box. It is written entirely around that unsecured NiFi example. You could always unsecure your NiFi and test out the S2S capability. That would at least allow you to test and evaluate the functionality.
When NiFi is secure, both authentication and authorization must be handled. This includes authentication and authorization for S2S operations. An out-of-box installation of NiFi uses self-generated, self-signed certificates to create the keystore and truststore files needed for mutual TLS. It also uses a very basic, non-production single-user-provider for user authentication and a single-user-authorizer for user/client authorization. These basic providers make it easy to evaluate NiFi, but are not robust enough to support all features. Is this what you are still using, or have you created your own keystore and truststore files and set up non-single-user authentication and authorization providers?
To be honest, I always set up production-ready NiFi instances and clusters that don't use the auto-generated self-signed certificates or the single user providers. I can't say that I have tried using S2S in such an out-of-box environment, so I can't say whether the single-user-authorizer supports the needed authorizations.
That being said, I see you set nifi.remote.input.http.enabled=true, but all that property does is allow the HTTP transport protocol, which means that NiFi would support transferring FlowFiles over the HTTP protocol. That does not mean unsecured; it could be http or https depending on the destination URL. The S2S properties in nifi.properties need to be modified to support secure S2S by setting nifi.remote.input.secure=true (you did not say whether you made that change).
1. Is your S2SProvenanceReportingTask producing any bulletin messages?
2. Are you seeing any not authorized related log lines in the nifi-user.log?
3. What keystore and truststore did you configure in the StandardRestrictedSSLContextService controller service?
I'll try to mess around with an out-of-box setup, if that is what you are using, to see whether what you are trying to do is possible in such a non-production-ready setup when I have some time.
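The distinction between the transport protocol and whether the connection is secured can be sketched as below. This is only an illustration of the explanation above, under the assumption that the RAW transport uses nifi.remote.input.socket.port and is protected by TLS when nifi.remote.input.secure=true, while the HTTP transport follows the scheme of the destination URL. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def s2s_transports(props, destination_url):
    """Describe which S2S transports a target NiFi would offer (illustrative only)."""
    secure = props.get("nifi.remote.input.secure", "").strip().lower() == "true"
    result = {}

    # RAW transport: a dedicated socket port, secured according to the secure flag.
    raw_port = props.get("nifi.remote.input.socket.port", "").strip()
    if raw_port:
        result["RAW"] = f"socket port {raw_port}, " + ("TLS" if secure else "no TLS")

    # HTTP transport: rides on the web port, so the URL scheme decides http vs https.
    if props.get("nifi.remote.input.http.enabled", "").strip().lower() == "true":
        scheme = urlsplit(destination_url).scheme
        result["HTTP"] = f"web port, scheme {scheme}"

    return result


if __name__ == "__main__":
    props = {
        "nifi.remote.input.secure": "true",
        "nifi.remote.input.socket.port": "10000",
        "nifi.remote.input.http.enabled": "true",
    }
    print(s2s_transports(props, "https://localhost:8443/nifi"))
```

With http.enabled=false, only the RAW entry would remain, which is why enabling that property adds a transport option but does not by itself change whether traffic is encrypted.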
Thank you,
Matt
Created on 05-29-2024 10:03 PM - edited 05-29-2024 10:08 PM
Hello @MattWho, thank you for looking into it.
# Site to Site properties 8c92690b14e6
nifi.remote.input.host=cd8e8c899db6
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10000
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs
2024-05-30 04:48:43,666 INFO [Timer-Driven Process Thread-9] o.a.n.c.s.StandardControllerServiceNode Successfully enabled StandardControllerServiceNode[service=SSLContextService[id=c27f79ba-018f-1000-ada5-343b2ba8f4e2], name=StandardRestrictedSSLContextService, active=true]
2024-05-30 04:49:07,157 INFO [Timer-Driven Process Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent SiteToSiteProvenanceReportingTask[id=a971da9d-018f-1000-2b00-6824f28134d8] started.
nifi.security.autoreload.enabled=false
nifi.security.autoreload.interval=10 secs
nifi.security.keystore=./conf/keystore.p12
nifi.security.keystoreType=PKCS12
nifi.security.keystorePasswd=b465f3c4cb37f83f825a2166a656719f
nifi.security.keyPasswd=b465f3c4cb37f83f825a2166a656719f
nifi.security.truststore=./conf/truststore.p12
nifi.security.truststoreType=PKCS12
nifi.security.truststorePasswd=e20ef7bb480f25c7e2446bbaffc1d95b
Created 05-30-2024 01:27 PM
@scoutjohn
I installed an out-of-the-box Apache NiFi 1.26 using single user providers and the NiFi self-signed generated certificates.
I was able to send provenance events via the S2SProvenanceReportingTask successfully back to a Remote Input Port on the same NiFi with no issues. So authorization is not an issue here. I tested using both HTTP and RAW transport protocols successfully.
I also validated that S2S was working by setting up a Remote Process Group to send FlowFiles to a Remote Input Port as well. Here is the dataflow I set up:
You can see in the above that I generated some FlowFiles that were sent over S2S to the "Input1" remote port. You can also see that my "prov" port received provenance events from the S2SProvenanceReportingTask.
My S2S setting from nifi.properties file:
# Site to Site properties
nifi.remote.input.host=localhost
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10001
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs
My Remote Process Group configuration:
Switching to "HTTP" transport protocol also worked.
S2SProvenanceReportingTask configuration:
While all of this worked correctly, sending provenance events via the S2SProvenanceReportingTask back to the same NiFi is not advisable. It creates an endless loop of provenance events. For every FlowFile received on the "prov" port, another provenance "RECEIVE" event is created, which then gets sent by the reporting task. Thus an infinite loop is created. You would certainly have difficulty with authentication and authorization when sending to another NiFi instance using the out-of-the-box keystore, truststore, and single user providers between two out-of-the-box NiFi deployments. But for testing purposes this works.
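The feedback loop can be illustrated with a toy model. It assumes the reporting task bundles pending events into FlowFiles of up to batch_size events, and that each FlowFile arriving on the "prov" port produces exactly one new RECEIVE event; both the function and the batch_size value are illustrative assumptions, and events from the rest of the flow are ignored.

```python
def simulate_feedback(initial_events, runs, batch_size=1000):
    """Return the number of pending provenance events after each reporting task run.

    Toy model: every run ships all pending events as ceil(pending / batch_size)
    FlowFiles, and each FlowFile received on the local port adds one RECEIVE event.
    """
    pending = initial_events
    history = []
    for _ in range(runs):
        if pending > 0:
            flowfiles = -(-pending // batch_size)  # ceiling division
            pending = flowfiles  # one new RECEIVE event per FlowFile received
        history.append(pending)
    return history


if __name__ == "__main__":
    # Once anything has been sent, the pending count never drops back to zero.
    print(simulate_feedback(10, 5))
```

In this model the count settles at one event per run rather than growing, but it never reaches zero, so the task keeps reporting its own deliveries indefinitely.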
Now I see from your configuration that you set:
nifi.remote.input.host=cd8e8c899db6
Makes me wonder if that given hostname is:
keytool -v -list -keystore keystore.p12
Try changing that property to "localhost" and see if it resolves your issue.
Thank you,
Matt
Created 05-31-2024 04:51 AM
I was able to get this working by running NiFi on Windows, using the same configuration as yours.
My Docker runs on WSL, and NiFi was not coming up when I changed the host name to localhost.
Thank you so much for your time.
Created on 06-06-2024 03:29 AM - edited 06-06-2024 05:33 AM
Hello @MattWho,
Sorry to come back on this topic.
I am trying to implement the same S2S reporting task in a Kubernetes environment.
We have NiFi running in cluster mode with 2 pods (it is usually 3, on embedded ZooKeeper; we are just working with 2 nodes for the time being).
The configuration for S2S is set as follows:
bash-4.4$ cat conf/nifi.properties | grep remote
nifi.remote.input.host=nifi-0.nifi-headless.namespace.svc.cluster.local
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10443
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs
I have tried both the RAW and HTTP transport protocols.
We have also granted the user the "retrieve site-to-site details" permission.
We have set the destination URL as https://<fullyqualifiedDNS>:portnumber
https://nifi-0.nifi-headless.namespace.svc.cluster.local:9443
the host is
nifi.remote.input.host=nifi-0.nifi-headless.doc-norc.svc.cluster.local
port number we're using is
nifi.web.https.port=9443
But the events are not coming into the port.
The logs say that the authentication is successful:
{"type":"log", "facility":"25", "host":"ao0059-cjts5-worker-0-lqm8m", "level":"INFO", "event-type":"N_USER_OPER", "systemid":"nifi","neid":"706546b360714e94b74591ca351b0655", "system":"nifi-0", "time":"2024-06-06T12:06:13.189Z" ,"timezone":"UTC", "log":"[NiFi Web Server-1830] o.a.n.w.s.NiFiAuthenticationFilter Authentication Started 10.255.15.73 [CN=nifi-api-admin] GET https://nifi-0.nifi-headless.namespace.svc.cluster.local:9443/nifi-api/site-to-site"}
{"type":"log", "facility":"25", "host":"ao0059-cjts5-worker-0-lqm8m", "level":"INFO", "event-type":"N_USER_OPER", "systemid":"nifi","neid":"706546b360714e94b74591ca351b0655", "system":"nifi-0", "time":"2024-06-06T09:51:41.644Z" ,"timezone":"UTC", "log":"[NiFi Web Server-854] o.a.n.w.s.NiFiAuthenticationFilter Authentication Success [CN=nifi-api-admin] 10.255.15.73 GET https://nifi-0.nifi-headless.namespace.svc.cluster.local:9443/nifi-api/site-to-site"}
The log also says
No events to send due to 'events' being null or empty.
{"type":"log", "host":"ao0059-cjts5-worker-0-lqm8m", "level":"DEBUG", "event-type":"N_USER_OPER", "systemid":"nifi","neid":"706546b360714e94b74591ca351b0655", "system":"nifi-0", "time":"2024-06-06T10:05:39.499Z" ,"timezone":"UTC", "log":"[Timer-Driven Process Thread-8] o.a.n.r.SiteToSiteProvenanceReportingTask SiteToSiteProvenanceReportingTask[id=ecacc388-018f-1000-ffff-ffff8c5138a7] Returning LOCAL State: StandardStateMap[version=-1, values={}]"}
{"type":"log", "host":"ao0059-cjts5-worker-0-lqm8m", "level":"DEBUG", "event-type":"N_USER_OPER", "systemid":"nifi","neid":"706546b360714e94b74591ca351b0655", "system":"nifi-0", "time":"2024-06-06T10:05:39.499Z" ,"timezone":"UTC", "log":"[Timer-Driven Process Thread-8] o.a.n.r.SiteToSiteProvenanceReportingTask SiteToSiteProvenanceReportingTask[id=ecacc388-018f-1000-ffff-ffff8c5138a7] No events to send due to 'events' being null or empty."}
{"type":"log", "host":"ao0059-cjts5-worker-0-lqm8m", "level":"DEBUG", "event-type":"N_USER_OPER", "systemid":"nifi","neid":"706546b360714e94b74591ca351b0655", "system":"nifi-0", "time":"2024-06-06T10:05:44.501Z" ,"timezone":"UTC", "log":"[Timer-Driven Process Thread-4] o.a.n.r.SiteToSiteProvenanceReportingTask SiteToSiteProvenanceReportingTask[id=ecacc388-018f-1000-ffff-ffff8c5138a7] Returning LOCAL State: StandardStateMap[version=-1, values={}]"}
{"type":"log", "host":"ao0059-cjts5-worker-0-lqm8m", "level":"DEBUG", "event-type":"N_USER_OPER", "systemid":"nifi","neid":"706546b360714e94b74591ca351b0655", "system":"nifi-0", "time":"2024-06-06T10:05:44.502Z" ,"timezone":"UTC", "log":"[Timer-Driven Process Thread-4] o.a.n.r.SiteToSiteProvenanceReportingTask SiteToSiteProvenanceReportingTask[id=ecacc388-018f-1000-ffff-ffff8c5138a7] No events to send due to 'events' being null or empty."}
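Since these log lines are JSON objects, the reporting task's messages can be pulled out with a few lines of Python instead of reading the raw output. This is a minimal sketch that assumes one JSON object per line with "time", "level" and "log" fields, as in the excerpt above; the function name is hypothetical.

```python
import json


def reporting_task_messages(lines, needle="SiteToSiteProvenanceReportingTask"):
    """Return (time, level, message) for JSON log lines that mention the needle."""
    matches = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        message = record.get("log", "")
        if needle in message:
            matches.append((record.get("time"), record.get("level"), message))
    return matches


if __name__ == "__main__":
    sample = [
        '{"type":"log", "level":"DEBUG", "time":"2024-06-06T10:05:39.499Z", '
        '"log":"[Timer-Driven Process Thread-8] o.a.n.r.SiteToSiteProvenanceReportingTask '
        "No events to send due to 'events' being null or empty.\"}",
        "not json at all",
    ]
    for time, level, message in reporting_task_messages(sample):
        print(time, level, message)
```

Filtering on level "ERROR" in the same loop would surface entries such as the "Could not find Port" message without scanning the DEBUG noise.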
There are entries in Data Provenance.
There was also an issue where the task complained that the input port was not available:
{"type":"log", "host":"ao0059-cjts5-worker-0-z5b82", "level":"ERROR", "event-type":"N_USER_OPER", "systemid":"nifi","neid":"174577b6145e4b87a86e5d9c397c8f75", "system":"nifi-0", "time":"2024-06-04T11:20:41.566Z" ,"timezone":"UTC", "log":"[Timer-Driven Process Thread-3] o.a.n.r.SiteToSiteProvenanceReportingTask SiteToSiteProvenanceReportingTask[id=e2e547c5-018f-1000-0000-00004876faee] Error running task SiteToSiteProvenanceReportingTask[id=e2e547c5-018f-1000-0000-00004876faee] due to org.apache.nifi.processor.exception.ProcessException: Failed to send Provenance Events to destination due to IOException:Could not find Port with name 'prov' for remote NiFi instance"}
But this error no longer appears, though we have not made any changes.
There are no other errors in the logs. I have enabled debug logging for
org.apache.nifi.reporting
I tried the nifi-api endpoints
https://nifi-0.nifi-headless.namespace.svc.cluster.local:9443/nifi-api/site-to-site/peers
and
https://nifi-0.nifi-headless.namespace.svc.cluster.local:9443/nifi-api/nifi-api/site-to-site
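For reference, the authentication log earlier in this post shows the reporting task requesting /nifi-api/site-to-site with a single nifi-api prefix. A small sketch for deriving the S2S endpoints from whatever destination URL is configured, so the paths are built consistently, could look like this. The function name is hypothetical; the two paths are the site-to-site details and peers endpoints mentioned above.

```python
from urllib.parse import urlsplit, urlunsplit


def s2s_api_urls(base_url):
    """Derive the site-to-site REST endpoints from a NiFi base or UI URL."""
    parts = urlsplit(base_url)
    # Keep only scheme and host:port; drop any path such as /nifi.
    root = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    return {
        "details": f"{root}/nifi-api/site-to-site",
        "peers": f"{root}/nifi-api/site-to-site/peers",
    }


if __name__ == "__main__":
    urls = s2s_api_urls("https://nifi-0.nifi-headless.namespace.svc.cluster.local:9443/nifi")
    for name, url in urls.items():
        print(name, url)
```

On a secured NiFi these endpoints still require an authenticated and authorized client, so an anonymous request to them is expected to be rejected.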
Can you please help us with this? Thank you for your time
Created 06-07-2024 07:39 AM
@scoutjohn
I don't have a Kubernetes env to mess around with currently.
But a couple of things I see from your response:
Thank you,
Matt
Created on 06-07-2024 10:27 PM - edited 06-07-2024 11:15 PM
Hi @MattWho , Thank you for your response.
nifi.web.https.host=nifi-0.nifi-headless.namespace.svc.cluster.local
nifi.web.https.port=9443
and respectively on the other pod
nifi.web.https.host=nifi-1.nifi-headless.namespace.svc.cluster.local
nifi.web.https.port=9443
We also have a proxy host configured:
nifi.web.proxy.context.path=/apigw/namespace/nifi
nifi.web.proxy.host=ckng.apps.ao0059.tre.nsn-rdnet.net:443, nifi-headless.namespace.svc.cluster.local:9443
which is the same for both pods.
bash-4.4$ ping nifi-1.nifi-headless.namespace.svc.cluster.local
PING nifi-1.nifi-headless.namespace.svc.cluster.local (10.255.8.118) 56(84) bytes of data.
64 bytes from nifi-1.nifi-headless.namespace.svc.cluster.local (10.255.8.118): icmp_seq=1 ttl=64 time=0.019 ms
64 bytes from nifi-1.nifi-headless.namespace.svc.cluster.local (10.255.8.118): icmp_seq=2 ttl=64 time=0.027 ms
and for nifi-0
bash-4.4$ ping nifi-0.nifi-headless.namespace.svc.cluster.local
PING nifi-0.nifi-headless.namespace.svc.cluster.local (10.255.8.118) 56(84) bytes of data.
64 bytes from nifi-0.nifi-headless.namespace.svc.cluster.local (10.255.8.118): icmp_seq=1 ttl=64 time=0.019 ms
64 bytes from nifi-0.nifi-headless.namespace.svc.cluster.local (10.255.8.118): icmp_seq=2 ttl=64 time=0.027 ms
The configurations match for both pods.