Member since
06-06-2018
4
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
6039 | 06-13-2018 03:51 PM |
06-13-2018
03:51 PM
Apologies to anyone reading this; we discovered the error was actually due to another process on our system interfering with Java operations. NiFi was not at fault.
... View more
06-12-2018
04:05 PM
Hi @kkawamura, Sorry for the delay, here are the logs as requested. We haven't made any changes to nifi.sh or any of the startup logic; our only changes are to conf files. nifi-bx-stdout.txt
... View more
06-08-2018
05:05 PM
Hi @kkawamura, thanks for your response. Attached are snippets of the nifi logs. Let me know if you see anything amiss. nifi-app-snippet.txt nifi-bootstrap-snippet.txt
... View more
06-07-2018
01:47 PM
Hi everyone, We are attempting to set up a NiFi (v1.5.0) cluster on Amazon ECS, with a load balancer and DNS entry pointing to the UIs on every node. This works from a UI perspective, and we are able to get site-to-site working between cluster nodes using a Remote Process Group. However, when the cluster is coming up fresh, we have this issue: When we start NiFi, it attempts to refresh the status of the RPG, which we ping via the load balancer. This fails with a 502 error, because the UI is not yet up behind the load balancer. However, instead of continuing to bring the UI up, the service receives a SHUTDOWN signal from bootstrap and restarts. This causes the service to loop through its startup process indefinitely. HOWEVER - sometimes, eventually, magically, the UI will actually start. We believe that there is a race condition between a thread starting the UI and a thread starting the site-to-site components. When the UI thread finishes first, NiFi goes live. Otherwise, it loops. This loop can go on anywhere from 10 minutes to several hours. My question is: is this a bug or an intended function? Are there any known workarounds? I've been unable to find documentation or forum posts on this issue. Here is a stack trace from the nifi-app.log file during a failed start. As you can see, it throws the 502 error, then begins restarting the whole service without an indication as to exactly why this is happening. Ideal behavior here would be that this error is non-blocking, and that the UI would still start successfully. A bulletin or warning should then appear informing the user that the remote process group is unreachable. Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:06,143 WARN [Remote Process Group 47a43fe6-2c9b-39aa-4fae-5851f048d58b Thread-1] o.a.n.r.util.SiteToSiteRestApiClient Failed to get controller from http://<URL>:8080/nifi-api due to org.apache.nifi.remote.util.SiteToSiteRestApiClient$HttpGetFailedException: response code 502:Bad Gateway with explanation: null
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:06,143 DEBUG [Remote Process Group 47a43fe6-2c9b-39aa-4fae-5851f048d58b Thread-1] o.a.n.r.util.SiteToSiteRestApiClient
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: org.apache.nifi.remote.util.SiteToSiteRestApiClient$HttpGetFailedException: response code 502:Bad Gateway with explanation: null
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at org.apache.nifi.remote.util.SiteToSiteRestApiClient.execute(SiteToSiteRestApiClient.java:1203)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at org.apache.nifi.remote.util.SiteToSiteRestApiClient.execute(SiteToSiteRestApiClient.java:1237)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at org.apache.nifi.remote.util.SiteToSiteRestApiClient.fetchController(SiteToSiteRestApiClient.java:419)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at org.apache.nifi.remote.util.SiteToSiteRestApiClient.getController(SiteToSiteRestApiClient.java:394)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at org.apache.nifi.remote.util.SiteToSiteRestApiClient.getController(SiteToSiteRestApiClient.java:361)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at org.apache.nifi.remote.util.SiteToSiteRestApiClient.getController(SiteToSiteRestApiClient.java:346)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at org.apache.nifi.remote.StandardRemoteProcessGroup$InitializationTask.run(StandardRemoteProcessGroup.java:1177)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: at java.lang.Thread.run(Thread.java:748)
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:06,143 WARN [Remote Process Group 47a43fe6-2c9b-39aa-4fae-5851f048d58b Thread-1] o.a.n.remote.StandardRemoteProcessGroup org.apache.nifi.remote.StandardRemoteProcessGroup$InitializationTask@42a30cb When communicating with remote instance, got unexpected result. Bad Gateway
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:07,185 INFO [main] o.apache.nifi.controller.FlowController Successfully synchronized controller with proposed flow
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:07,256 DEBUG [main] o.a.nifi.controller.StandardFlowService First controller initialization, initializing controller...
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:07,949 DEBUG [Listen to Bootstrap] org.apache.nifi.BootstrapListener Listening for Bootstrap Requests
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:09,951 DEBUG [Listen to Bootstrap] org.apache.nifi.BootstrapListener Listening for Bootstrap Requests
Jun 6 20:59:12 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:11,953 DEBUG [Listen to Bootstrap] org.apache.nifi.BootstrapListener Listening for Bootstrap Requests
Jun 6 20:59:22 nifi-cluster--01f8453d323eeea25 nifi-app: 2018-06-06 20:59:15,697 INFO [main] org.apache.nifi.NiFi Launching NiFi... Other thoughts: - There is no obvious difference in logs between a failed launch and the eventual success. Even in the eventual success, the 502 error appears. The success log just keeps going along with normal startup logs instead of starting from the beginning again. - Having nifi.flowcontroller.autoResumeState=false does not solve the issue - The issue is the same for both service restarts and whole node reboots.
... View more
Labels:
- Labels:
-
Apache NiFi