About bbende

bbende · ‎12-16-2016

You shouldn't need an additional load balancer. The URL you enter in the RPG is only used for the initial connection to learn about the nodes in the cluster; from there the RPG talks directly to all nodes. So the main failure case is if your NiFi restarts at a time when the URL in the RPG happens to be down. This will be addressed in an upcoming release, which will allow you to enter multiple URLs: https://issues.apache.org/jira/browse/NIFI-3026

bbende · ‎12-15-2016

If you do a GET for the root group using /process-groups/{id}, the response will contain all the other process groups under it. Look at the ProcessGroupEntity example JSON response and follow component -> contents -> processGroups or processors. The root group id won't change, so that should be your starting point.
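To make the component -> contents -> processGroups / processors path concrete, here is a minimal sketch that walks an already-fetched response held as a Python dict. The sample data is made up, and the sketch assumes that nested groups inside contents carry their own name and contents fields directly; check this against the example JSON for your NiFi version.

```python
from typing import Iterator, Tuple


def walk_process_group(entity: dict, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, kind, name) for a group and everything nested under it."""
    component = entity.get("component") or {}
    yield depth, "process-group", component.get("name", "")
    contents = component.get("contents") or {}
    for processor in contents.get("processors") or []:
        yield depth + 1, "processor", processor.get("name", "")
    for child in contents.get("processGroups") or []:
        # Assumption: child groups are plain DTOs, so wrap each one
        # in {"component": ...} to reuse the same function recursively.
        yield from walk_process_group({"component": child}, depth + 1)


# Hypothetical response body, shaped like the path described in the post.
sample = {
    "component": {
        "id": "root-id",
        "name": "NiFi Flow",
        "contents": {
            "processors": [{"id": "p1", "name": "GetFile"}],
            "processGroups": [
                {
                    "id": "g1",
                    "name": "Ingest",
                    "contents": {
                        "processors": [{"id": "p2", "name": "PutHDFS"}],
                        "processGroups": [],
                    },
                }
            ],
        },
    }
}

for depth, kind, name in walk_process_group(sample):
    print("  " * depth + f"{kind}: {name}")
```

In a real script the dict would come from parsing the body of the GET request rather than from an inline sample.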

bbende · ‎12-14-2016

Interesting, can you confirm which version of NiFi? I'm assuming 1.1 since you have the new UI with colors.

bbende · ‎12-14-2016

There are currently a couple of options. If you just want to figure out where a flow file came from for troubleshooting/debugging, provenance can tell you this: look at the transit URI of the RECEIVE event. If you want to make a decision somewhere in your NiFi flow based on which MiNiFi sent the flow file, then currently you would need to set an attribute on the MiNiFi side, like "minifi.host = ${hostname()}" in an UpdateAttribute processor, so that the attribute is there when the flow file gets transferred to NiFi. There is a pull request open to make NiFi automatically create this attribute for you when receiving flow files via site-to-site, so basically the same info that is available on the RECEIVE event would be available in attributes: https://issues.apache.org/jira/browse/NIFI-2585 https://github.com/apache/nifi/pull/1320

bbende · ‎12-14-2016

You should stop the processor that is after the queue, and wait until the number in the top-right corner of that processor goes away. In your photo the next processor shows "1", which means it is currently running and likely operating on some of the flow files, so those can't be cleared from the queue. Once the processor is stopped and the number is gone, you can try clearing. If you don't care about any of the data in your flow and just want to wipe everything, you can stop NiFi, delete the directories whose names end in "_repository", and start back up. That will clear ALL data from your flow.
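For the wipe-everything option, here is a minimal sketch that lists the directories it would delete before removing anything. It assumes the repositories sit directly under the NiFi home directory (they can be relocated in nifi.properties) and that NiFi is already stopped. The function names are hypothetical, and the demo runs against a throwaway temp directory.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def find_repositories(nifi_home: Path) -> List[Path]:
    """Return directories directly under nifi_home whose names end in '_repository'."""
    return sorted(
        p for p in nifi_home.iterdir()
        if p.is_dir() and p.name.endswith("_repository")
    )


def wipe_repositories(nifi_home: Path, dry_run: bool = True) -> List[Path]:
    """Report the repository directories; delete them only when dry_run is False."""
    found = find_repositories(nifi_home)
    if not dry_run:
        for repo in found:
            shutil.rmtree(repo)
    return found


if __name__ == "__main__":
    # Demo on a temporary directory instead of a real NiFi install.
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp)
        for name in ("content_repository", "flowfile_repository", "conf"):
            (home / name).mkdir()
        print([p.name for p in wipe_repositories(home, dry_run=True)])
```

Defaulting to dry_run=True is deliberate: deleting the repositories is irreversible, so it is worth reading the list first.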

bbende · ‎12-14-2016

To go from CSV or JSON to Avro you would use ConvertCsvToAvro or ConvertJsonToAvro; there is currently no conversion for XML. With both processors you can enter the schema directly in the properties of the processor, or you can try using InferAvroSchema before them. InferAvroSchema will guess a schema and put it into a flow file attribute, which you can then reference as ${inferred.avro.schema} in the conversion processor. For ConvertAvroSchema you can either enter schemas directly into the properties, or you can reference flow file attributes using expression language (if you have the schemas in flow file attributes).
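For reference, the schema text that goes into a property (or into an attribute such as inferred.avro.schema) is an Avro schema in JSON form. Here is a minimal sketch that builds one and prints it; the record name and field names are purely hypothetical.

```python
import json

# Hypothetical record schema; replace the name and fields with your own.
schema = {
    "type": "record",
    "name": "SensorReading",
    "fields": [
        {"name": "device_id", "type": "string"},
        # A union with "null" makes the field optional.
        {"name": "temperature", "type": ["null", "double"], "default": None},
    ],
}

schema_text = json.dumps(schema, indent=2)
print(schema_text)
```

The printed JSON is what you would paste into the schema property if you choose not to infer it.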

bbende · ‎12-13-2016

There currently isn't a way to do this. With expression language you are referencing either a function (listed here: https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html) or a variable. The variables could be flow file attributes, system properties passed through bootstrap.conf, or variables from the variable registry (https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#custom_properties). I don't know of a reason why there couldn't be an EL function like ${processGroupName()} that returns the name of the enclosing process group. It just hasn't come up before.

bbende · ‎12-12-2016

NONE means don't use a codec at all. AUTOMATIC is not meant to be used with PutHDFS; it is only for the read side (GetHDFS and FetchHDFS). DEFAULT means use the DefaultCodec class provided by Hadoop, which I believe uses zlib compression (slide 6 here: http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2).

bbende · ‎12-12-2016

The data in the content repository should be controlled through the settings in nifi.properties:

# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=./content_repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=/nifi-content-viewer/

Is the data in the archive folders (underneath each of the folders in the content_repository)? If so, you can turn off archiving using the nifi.content.repository.archive.enabled property above, or reduce the thresholds for how long to hold on to archived data. If the data is not in the archive folders, then it is still deemed to be active in the flow. Since multiple flow files are bundled together in a content claim, you can have a content claim with some flow files still active, and the whole claim can't be archived until none of them are in the flow.
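To check which archive settings are in effect, here is a minimal sketch that parses properties text and pulls out the three archive-related keys quoted above. The helper names are hypothetical and the parser only handles plain key=value lines; in practice you would read the text from your own conf/nifi.properties.

```python
from typing import Dict, Optional, Union


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def archive_summary(props: Dict[str, str]) -> Dict[str, Union[bool, Optional[str]]]:
    """Collect the content repository archive settings from parsed properties."""
    prefix = "nifi.content.repository.archive."
    return {
        "enabled": props.get(prefix + "enabled", "").lower() == "true",
        "max_retention_period": props.get(prefix + "max.retention.period"),
        "max_usage_percentage": props.get(prefix + "max.usage.percentage"),
    }


# Values copied from the settings quoted in the post.
SAMPLE = """\
# Content Repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
"""

print(archive_summary(parse_properties(SAMPLE)))
```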

bbende · ‎12-12-2016

Yes, that is a lot less than 700 devices generating 1GB each per hour 🙂 Most people want some kind of redundancy for production, so you may still want a smaller cluster, say 3 nodes, but yes, you should be able to do that on one node.

Last Visited	‎09-10-2020 01:23 PM

Member Since	‎09-29-2015 04:02 PM
Posts	871
Kudos received	709

Cloudera Community

Re: Using nifi registry in a nifi cluster.

Re: Is there a way to enable a stateful status upd...

Re: Automated Start/Stop of a NiFi Processor

Re: PublishKafkaRecord_0_10 1.2.0.3.0.1.1-5 Error:...

Re: how to configure mergecontent processor

Re: Load Balance NiFi Cluster

Re: NIFI-PROCESSOR : Monitoring

Re: Unable to clear Nifi Queue

Re: How does the input port know which device data...

Re: Unable to clear Nifi Queue

Re: Convert from one AVRO schema to another

Re: [Nifi] Capture Process Group Name Within Runni...

Re: Enabling LZO compression using NiFi PutHDFS

Re: In Apache NiFi 1.0, Can I delete older content...

Re: NiFi Hardware Recommendation