Store data required by a custom NiFi processor

HDP-2.5.3.0, NiFi 1.1.1

I am writing a custom processor in NiFi. There are several String and Timestamp fields that I need to store somewhere so that they are available on every node.

@Tags({ "example" })
@CapabilityDescription("Provide a description")
@SeeAlso({})
@ReadsAttributes({ @ReadsAttribute(attribute = "", description = "") })
@WritesAttributes({ @WritesAttribute(attribute = "", description = "") })
public class MyProcessor extends AbstractProcessor {
    .
    .
    .
    private List<PropertyDescriptor> descriptors;
    private Set<Relationship> relationships;

    /* Persist these, probably in ZK */
    private Timestamp lastRunAt;
    private String startPoint;
    .
    .
    .

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
        FlowFile flowFile = session.get();

        /* Retrieve lastRunAt and startPoint and use them here */
        .
        .
        .
    }
}

Note that HDFS is NOT an option, as NiFi may run without any Hadoop installation in the picture.

What are my options? I was wondering whether ZooKeeper could store this data, since it is small and NiFi is backed by ZK. I tried, in vain, to find a way to use the ZooKeeper API to persist these fields.

1 ACCEPTED SOLUTION

Hi @Kaliyug Antagonist,

NiFi provides a State API that allows you to do what you are looking for (using Zookeeper). Have a look here:

https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#state_manager

Many existing processors use this state management, and I believe reading them is the best way to understand how it works. For example, all ListX processors should store state so that they pick up at the right position after a restart.
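
As a rough illustration of the idea, here is a minimal sketch of how the two fields from the question could be mapped onto processor state. NiFi state is a Map<String, String>, so the Timestamp has to be converted to a string and back. The class name ProcessorStateCodec and the key names are assumptions made up for this sketch, not part of NiFi. The NiFi calls themselves (getStateManager, getState, setState, the @Stateful annotation) are shown only as comments because they need nifi-api on the classpath; check them against the developer guide for your NiFi version.

```java
import java.sql.Timestamp;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of NiFi): converts the two fields from the
 * question to and from the Map<String, String> form that NiFi state uses.
 */
final class ProcessorStateCodec {

    // Key names are assumptions chosen for this sketch.
    static final String KEY_LAST_RUN_AT = "lastRunAt";
    static final String KEY_START_POINT = "startPoint";

    private ProcessorStateCodec() {
    }

    /** Builds the map that would be handed to StateManager.setState(...). */
    static Map<String, String> toStateMap(Timestamp lastRunAt, String startPoint) {
        Map<String, String> state = new HashMap<>();
        if (lastRunAt != null) {
            // Stored as epoch milliseconds; sub-millisecond precision is dropped.
            state.put(KEY_LAST_RUN_AT, Long.toString(lastRunAt.getTime()));
        }
        if (startPoint != null) {
            state.put(KEY_START_POINT, startPoint);
        }
        return state;
    }

    /** Reads lastRunAt back; returns null when nothing has been stored yet. */
    static Timestamp readLastRunAt(Map<String, String> state) {
        String millis = state.get(KEY_LAST_RUN_AT);
        return millis == null ? null : new Timestamp(Long.parseLong(millis));
    }

    /** Reads startPoint back; returns null when nothing has been stored yet. */
    static String readStartPoint(Map<String, String> state) {
        return state.get(KEY_START_POINT);
    }
}

// Inside the processor (needs nifi-api, so shown as comments only):
//
//   @Stateful(scopes = Scope.CLUSTER, description = "Stores lastRunAt and startPoint")
//   public class MyProcessor extends AbstractProcessor { ... }
//
//   // In onTrigger; getState and setState can throw IOException.
//   StateManager stateManager = context.getStateManager();
//   Map<String, String> previous = stateManager.getState(Scope.CLUSTER).toMap();
//   Timestamp lastRunAt = ProcessorStateCodec.readLastRunAt(previous);
//   String startPoint = ProcessorStateCodec.readStartPoint(previous);
//   ...
//   stateManager.setState(
//       ProcessorStateCodec.toStateMap(newLastRunAt, newStartPoint), Scope.CLUSTER);
```

Scope.CLUSTER is the scope that makes the values visible to every node; Scope.LOCAL would keep them on a single node. On the first run nothing has been stored yet, so both read methods return null and the processor has to handle that case.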

Hope this helps.
