Why does NiFi ship with WriteAheadProvenanceRepository and UseG1GC configured but later recommend against using both together?

Hello NiFi community,

I have been investigating a problem where the most recent data provenance segment from a NiFi session that has been shut down is not available the next time NiFi is started.

My investigation has been to repeatedly (20+ times) stop NiFi, start NiFi (and let it generate a bit of data provenance), and then inspect the data provenance once the UI becomes available to confirm that all previous data provenance segments are accessible. Then repeat, and repeat, and repeat.

At first, I was able to reproduce the problem fairly reliably every second or third time that I stopped NiFi and started it again. The data provenance generated during the previous NiFi session was not displayed in the current session. I could still see the .prov files in the provenance_repository on disk, so I concluded that the most recent .prov file from the previous session had been corrupted.

I then discovered this recommendation (https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.1.1/bk_user-guide/content/bootstrap-conf.html) that suggests that you should comment out UseG1GC in bootstrap.conf when using the write-ahead configuration for provenance.

I followed the instructions and commented out the line in bootstrap.conf. After that, the data provenance generated during previous NiFi sessions was no longer missing from subsequent sessions. The change appeared to stop whatever problem was occurring. Assuming that it was a corruption problem, disabling UseG1GC seemed to prevent the corruption that I had been observing.
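For anyone who wants to check an install for this combination, here is a minimal Python sketch. It is an illustration, not a NiFi tool. It assumes the collector is enabled by an uncommented `java.arg.N=-XX:+UseG1GC` line in bootstrap.conf (the number N may differ between releases) and that the provenance implementation is set with the `nifi.provenance.repository.implementation` key in nifi.properties. The function names are made up for this example.

```python
G1_FLAG = "-XX:+UseG1GC"
IMPL_KEY = "nifi.provenance.repository.implementation"  # assumed property key
WRITE_AHEAD = "org.apache.nifi.provenance.WriteAheadProvenanceRepository"


def active_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def g1gc_enabled(bootstrap_text):
    """True if any uncommented java.arg.N line sets -XX:+UseG1GC."""
    props = active_properties(bootstrap_text)
    return any(
        key.startswith("java.arg.") and value == G1_FLAG
        for key, value in props.items()
    )


def uses_write_ahead(nifi_properties_text):
    """True if the write-ahead provenance repository is configured."""
    return active_properties(nifi_properties_text).get(IMPL_KEY) == WRITE_AHEAD


def risky_combination(bootstrap_text, nifi_properties_text):
    """True if G1GC and the write-ahead repository are both active."""
    return g1gc_enabled(bootstrap_text) and uses_write_ahead(nifi_properties_text)
```

To use it, read the contents of conf/bootstrap.conf and conf/nifi.properties into strings and pass them to `risky_combination`. A commented-out line such as `#java.arg.13=-XX:+UseG1GC` is treated as disabled.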

This behaviour (to me) seems to confirm that the stability of WriteAheadProvenanceRepository becomes questionable when UseG1GC is configured.

I have a few questions related to what I have seen:

  1. Does this problem only affect provenance when NiFi is being stopped, or have other problems been observed in long-running NiFi sessions?
  2. If UseG1GC is as much at odds with WriteAheadProvenanceRepository as it seems, why does NiFi ship with both configured out of the box?
  3. Is there a means by which I can validate the integrity of the .prov files programmatically?

Thank you, kindly, for your time and attention.

Sean


1 ACCEPTED SOLUTION

@Sean Dockery

1. The problem with G1GC in Java 8 is not unique to provenance. G1GC in Java 8 and earlier was still considered experimental. It was observed that G1GC performed better with the larger heap sizes commonly used in NiFi setups, so early on it was the recommended GC for NiFi. Although G1GC has bugs that can corrupt the Java heap space, we had not encountered that corruption before the new WriteAheadProvenance implementation was introduced. The G1GC issues have been resolved as of Java 9, but those fixes were not backported to earlier versions of Java. NiFi currently supports only Java 8, so we decided to stop recommending G1GC with the new high-performance WriteAheadProvenance implementation.
https://wiki.apache.org/lucene-java/JavaBugs

2. The default provenance implementation changed from PersistentProvenance to WriteAheadProvenance only recently, and it appears no one updated bootstrap.conf at that time to comment out the G1GC line.
It is also very likely that we will recommend G1GC again once NiFi supports newer versions of Java in which these G1GC issues have been addressed.

3. I have no answer for number 3. Perhaps someone else can comment on that question.
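On the third question, a narrow partial check is possible without knowing NiFi's record format. The sketch below assumes that rolled-over event files are gzip-compressed and named `*.prov.gz`, which depends on the repository's compression setting. It only confirms that each gzip stream decompresses cleanly to the end. It does not validate the provenance records themselves, and it does not cover an uncompressed, still-active .prov file. The function names are hypothetical.

```python
import gzip
import zlib
from pathlib import Path


def gzip_stream_ok(path, chunk_size=1 << 20):
    """Return True if the file decompresses to the end without error."""
    try:
        with gzip.open(path, "rb") as stream:
            # Read and discard the data; we only care whether it decodes.
            while stream.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Bad header or CRC, truncated stream, or corrupt deflate data.
        return False


def find_damaged_archives(repo_dir):
    """List *.prov.gz files under repo_dir that fail to decompress."""
    return sorted(
        path
        for path in Path(repo_dir).rglob("*.prov.gz")
        if not gzip_stream_ok(path)
    )
```

Running `find_damaged_archives` against the provenance_repository directory while NiFi is stopped would list any compressed files whose gzip stream is truncated or fails its checksum. An empty result means only that the compression layer is intact.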

Thank you,
Matt


REPLIES

@Sean Dockery
A Jira has been filed to comment out the G1GC line in the NiFi bootstrap.conf in the next Apache NiFi release:

https://issues.apache.org/jira/browse/NIFI-6132

Thanks, Matt. Hopefully this change will avoid headaches for users who don't visit every single configuration option when implementing NiFi.