Created 06-06-2024 07:11 AM
Hi Team
We recently upgraded our AKS cluster from 1.28.5 to 1.29.4. Apache NiFi version 1.16.3 was already installed in it and running successfully. After the upgrade to 1.29.4, one of the NiFi pods keeps restarting and goes into CrashLoopBackOff. When we investigated, the server container kept restarting and there were no logs showing any error, because the status changes to Running and then to Not Ready very quickly, and the pod reports CrashLoopBackOff. Because of this, we are also unable to log in to the pod. At times, when the status is Running, it allows a connection for a few seconds, and we noticed that the KeyStore.jks and truststore.jks from cert-manager are not present in the tls folder.
We are not sure what is causing the continuous restart of the server container after the upgrade of the AKS cluster to 1.29.4.
When we compared the logs of the nifi-0 and nifi-1 pods, the nifi-0 pod shows the log line below and runs successfully, whereas nifi-1 and nifi-2 do not have this log line.
/opt/nifi/nifi-current/tls/truststore.jks is not readable! Waiting for cert-manager sidecar to populate it.
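The readability check that this log line describes can be reproduced by hand during the few seconds the container accepts a connection. The following is only a minimal sketch, assuming a Python interpreter is available wherever the tls folder can be reached (it may not be inside the NiFi image). The function name `unreadable_tls_files` is hypothetical, and the keystore file name is an assumption passed in as a parameter; only the truststore path comes from the log above.

```python
import os
from pathlib import Path


def unreadable_tls_files(tls_dir, names=("keystore.jks", "truststore.jks")):
    """Return the expected file names that are missing or not readable.

    The default names are assumptions; adjust them to match what
    cert-manager actually writes in your deployment.
    """
    base = Path(tls_dir)
    problems = []
    for name in names:
        path = base / name
        # A file counts as a problem if it does not exist as a regular file
        # or the current user has no read permission on it.
        if not path.is_file() or not os.access(path, os.R_OK):
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Path taken from the log line in this post.
    print(unreadable_tls_files("/opt/nifi/nifi-current/tls"))
```

An empty list would mean both files are present and readable; any names returned are the ones the startup script would still be waiting for.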
Could you please share some inputs to help resolve this issue?
Best Regards
Dhinesh Kumar Ganeshan
Created 06-06-2024 10:39 AM
@GDK Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi expert @steven-matison who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 06-18-2024 08:25 AM
@steven-matison, requesting you to let us know if you have any suggestions on this issue.
Created 06-25-2024 12:41 PM
Created on 07-10-2024 09:32 AM - edited 07-10-2024 09:34 AM
Hi Team
We would like to share some additional findings. While observing the node server logs, we noticed that the order in which the NiFi pod executes its startup steps seems to have changed.
A healthy NiFi pod has logs in the following order:
After upgrading to AKS 1.29 and restarting the pods, the logs show a different order of execution:
2. TLS CERT MANAGER DIRECTORY CREATION
1. PROP REPLACE IS THE FIRST STEP EXECUTED BY THE SERVER CONTAINER, FOLLOWED BY APP LOG, BOOTSTRAP AND USER LOG.
3. COPY AND CLASS PATH EXECUTION
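To compare the step order between a healthy pod and a restarting one without reading the logs by eye, the comparison could be scripted. This is only a rough sketch: the function name `first_seen_order` is hypothetical, and the marker substrings in the usage example are placeholders, not actual NiFi log text. They would need to be replaced with strings that really appear in the saved container logs.

```python
def first_seen_order(log_lines, markers):
    """Return marker names ordered by the first log line they appear in.

    log_lines: iterable of log lines (strings).
    markers:   dict mapping a step name to a substring that identifies it.
    Markers that never appear are left out of the result.
    """
    positions = {}
    for index, line in enumerate(log_lines):
        for name, needle in markers.items():
            # Record only the first occurrence of each marker.
            if name not in positions and needle in line:
                positions[name] = index
    return sorted(positions, key=positions.get)


if __name__ == "__main__":
    # Placeholder lines and substrings, for illustration only.
    sample = ["creating tls directory", "running prop_replace", "copy and classpath"]
    steps = {
        "prop_replace": "prop_replace",
        "tls_dir": "tls directory",
        "classpath": "classpath",
    }
    print(first_seen_order(sample, steps))
```

Running it once on the log of nifi-0 and once on the log of a failing pod would give two lists that can be compared directly.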
We are not sure why NiFi behaves this way. We are running NiFi version 1.16.3 with Java 8.
Any inputs would certainly help.