
Access nar bundle/version/etc. info from within my processor


From within my custom processor, how can I access some details from the NAR manifest? e.g. I'd like to get some string that includes the version. I know NiFi parses and retains this info because it displays in the web UI.

The closest I've gotten so far is ProcessorInitializationContext.getIdentifier().



nifi-nar-utils.jar (which is in lib/ and thus on your classpath) contains a class called org.apache.nifi.nar.NarClassLoadersHolder. Its static method getInstance() returns a NarClassLoaders object. Call getBundles() on that object and iterate through the bundles, calling Class.forName() for each one with your processor's class name (via this.getClass().getName()) and the bundle's classloader.

If Class.forName() succeeds, you know that bundle contains your processor's class. You can then call getBundleDetails() on the Bundle object and getCoordinate() on the resulting BundleDetails object. The coordinate has a few methods that return details about the bundle, such as getVersion(). The following is a Groovy script I tried in ExecuteScript. Note that I use a hardcoded class name because the script itself is not a Processor, so it does not exist in any NAR. You'd use this.getClass().getName() from a proper Processor implementation:

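import org.apache.nifi.nar.*

def flowFile = session.get()
if (!flowFile) return
try {
   // Keep only the bundles whose classloader can load the target class
   NarClassLoadersHolder.instance.bundles.findAll { b ->
      try {
         Class.forName('org.apache.nifi.processors.elasticsearch.PutElasticsearch', false, b.classLoader)
      } catch (e) { false }
   }.each { x ->
      // Illustrative only: record the bundle version as a flow file attribute
      // ('bundle.version' is an assumed attribute name)
      flowFile = session.putAttribute(flowFile, 'bundle.version', x.bundleDetails.coordinate.version)
   }
   session.transfer(flowFile, REL_SUCCESS)
} catch (final Throwable t) {
   session.transfer(flowFile, REL_FAILURE)
}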
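
For anyone who wants to see the lookup idea outside of NiFi, here is a minimal, self-contained Java sketch of the same technique: probe each candidate classloader with Class.forName() and report the version attached to the first one that can see the class. BundleInfo, findVersion and demoBundles are hypothetical stand-ins I made up so the example runs with only the JDK. In a real processor the list would come from NarClassLoadersHolder.getInstance().getBundles() and the version from getBundleDetails().getCoordinate().getVersion(). The version strings are made-up sample values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class BundleLookup {

    /** Hypothetical stand-in for a NiFi Bundle: a version plus the bundle's classloader. */
    static final class BundleInfo {
        final String version;
        final ClassLoader classLoader;

        BundleInfo(String version, ClassLoader classLoader) {
            this.version = version;
            this.classLoader = classLoader;
        }
    }

    /** Returns the version of the first bundle whose classloader can load className. */
    static Optional<String> findVersion(String className, List<BundleInfo> bundles) {
        for (BundleInfo bundle : bundles) {
            try {
                // initialize=false: resolve the class without running its static initializers
                Class.forName(className, false, bundle.classLoader);
                return Optional.of(bundle.version);
            } catch (ClassNotFoundException | LinkageError e) {
                // This classloader cannot see the class; try the next bundle
            }
        }
        return Optional.empty();
    }

    /** Builds two fake bundles: one isolated loader and one that can see this class. */
    static List<BundleInfo> demoBundles() {
        // A loader with no parent only sees JDK classes, so it cannot load BundleLookup
        ClassLoader isolated = new ClassLoader(null) { };
        List<BundleInfo> bundles = new ArrayList<>();
        bundles.add(new BundleInfo("0.9.0", isolated));
        bundles.add(new BundleInfo("1.1.32", BundleLookup.class.getClassLoader()));
        return bundles;
    }

    public static void main(String[] args) {
        String version = findVersion(BundleLookup.class.getName(), demoBundles()).orElse("unknown");
        System.out.println("Loaded from bundle version " + version);
    }
}
```

One caveat with this approach in general: a classloader that delegates to a parent can also load classes the parent owns, so a successful Class.forName() shows the class is visible to that loader, not necessarily that the loader defined it. Comparing the loaded class's getClassLoader() against the bundle's classloader would be a stricter check if that matters for your NAR layout.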

If there is a good reason for a processor to know the bundle information, then I think we can provide it through one of the context APIs. I'm curious to know what the use case is for the processor needing to know this, not saying we shouldn't add it, just want to understand.


Thanks for the replies. Although it's a bit of the long way around, it looks like I can get what I need through this mechanism.

My use case is for filling out a provenance field in the data I produce. Specifically, I have a custom processor that translates data between different XML and JSON schemas. I'd like to be able to include my NiFi processor version (e.g. 1.1.32 or whatever) as a property in the data my processor produces. (The translated data ultimately goes to Elasticsearch.)

Down the road (could be months or even years later), if a user discovers an issue with a particular record in Elastic, e.g. let's say a critical calculation was wrong, I want to be able to look at the record and see that it was processed with version 1.1.32 of our processor. I can then go back to that version of the source code for my processor and begin piecing together what happened, find other data that might be affected, etc.

I think it does make sense to expose this via a context API. It is not uncommon for software to want this kind of knowledge about itself, for various reasons.

Thanks for the insight, that makes total sense.

Now that you mention provenance, I'm wondering whether NiFi should automatically put the bundle info of the component into the provenance events, since that is important information to know when looking at the history.