Member since 09-29-2015
871 Posts
721 Kudos Received
255 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1772 | 12-03-2018 02:26 PM
 | 1187 | 10-16-2018 01:37 PM
 | 2207 | 10-03-2018 06:34 PM
 | 1326 | 09-05-2018 07:44 PM
 | 970 | 09-05-2018 07:31 PM
12-07-2018
04:08 PM
When you are running in a cluster, there should be a clustered state manager defined in conf/state-management.xml, which will likely be the ZooKeeper cluster state provider. If you are using a correctly set up ZooKeeper quorum, typically 3 nodes, then you really shouldn't need to back up anything since the information would be replicated on all 3 ZK instances.
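For reference, the provider typically looks like this in conf/state-management.xml (a sketch; the Connect String value is a placeholder for your own quorum):

```xml
<!-- ZooKeeper cluster provider in conf/state-management.xml; the
     Connect String below is a placeholder for your own quorum -->
<cluster-provider>
    <id>zk-provider</id>
    <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
    <property name="Connect String">zk1:2181,zk2:2181,zk3:2181</property>
    <property name="Root Node">/nifi</property>
    <property name="Session Timeout">10 seconds</property>
    <property name="Access Control">Open</property>
</cluster-provider>
```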
12-05-2018
09:46 PM
Is there any pattern you can find to reproduce this? If you restart NiFi Registry after this happens, is it working ok again, or does it have an issue on start up?
12-03-2018
02:26 PM
4 Kudos
NiFi Registry is a separate application and works the same whether NiFi is clustered or standalone. You can run NiFi Registry on one of the NiFi nodes or on a completely separate node; either way you just tell your NiFi cluster the location of NiFi Registry.
11-29-2018
06:12 PM
Thanks for the insight, that makes total sense. Now that you mention provenance, I'm wondering if NiFi should automatically be putting the bundle info of the component into the provenance events, since that is important information to know when looking at the history.
11-29-2018
05:33 PM
If there is a good reason for a processor to know the bundle information, then I think we can provide it through one of the context APIs. I'm curious to know what the use case is for the processor needing to know this; not saying we shouldn't add it, just want to understand.
11-27-2018
05:22 PM
1 Kudo
The processor has a pool of consumers that is created when the processor is started, so you would have to use the REST API to stop the processor, change the value of the topic names property, and then start the processor again.
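A rough sketch of that stop / update / start sequence with Python against the REST API (the base URL, processor id, and the "topic" property name are placeholders; check the property name your ConsumeKafka version actually uses):

```python
# Hedged sketch: stop a processor, update a property, and start it again
# via the NiFi REST API. Base URL, processor id, and property name are
# placeholders for your environment.
import requests

base = "http://localhost:8080/nifi-api"
proc_id = "00000000-0000-0000-0000-000000000000"  # placeholder id

def update_processor(component):
    """PUT an update, re-fetching the entity first to get a fresh revision."""
    entity = requests.get(f"{base}/processors/{proc_id}").json()
    component["id"] = proc_id
    return requests.put(f"{base}/processors/{proc_id}",
                        json={"revision": entity["revision"],
                              "component": component})

update_processor({"state": "STOPPED"})  # stop; threads may take a moment to finish
update_processor({"config": {"properties": {"topic": "new-topic-name"}}})
update_processor({"state": "RUNNING"})  # start again with the new topic
```

Since stopping is asynchronous, in practice you may want to poll the processor until it actually reports STOPPED before changing the property.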
11-01-2018
06:00 PM
FetchHDFS is to fetch a file, and PutHDFS is to write a file... so if there is a directory with no files then there is nothing to fetch or put.
10-17-2018
01:45 PM
1 Kudo
Your whole flow will perform better if you have flow files with many records, rather than 1 record per flow file. PutDatabaseRecord will set auto-commit to false on the connection, then start executing the statements, and if any failure happens it will call rollback on the connection.
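To illustrate the all-or-nothing behavior this gives you, here is a minimal sketch using Python's sqlite3 as a stand-in for the target database (PutDatabaseRecord does the equivalent over JDBC):

```python
# Minimal sketch of the transactional pattern: execute every statement in
# one transaction, commit if they all succeed, roll back on any failure.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")

records = [(1, "a"), (2, "b"), (3, "c")]  # one flow file, many records
try:
    # sqlite3 opens an implicit transaction here (auto-commit is off)
    conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", records)
    conn.commit()    # all records land together
except sqlite3.Error:
    conn.rollback()  # any failure undoes the whole batch
    raise
```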
10-16-2018
02:37 PM
Yes, it is up to the processor to decide what to log; the framework doesn't really know what the processor is doing. In this case, the stuff that you would see in the log would be all the calls to the logger here: https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/GetHTTP.java#L492-L553 I suppose some logging statements could be added around the call to client.execute(), but I'm not sure how you'd get more info about what the execute call is doing; you would just know it started or completed.
10-16-2018
01:43 PM
When you create a topic there are two different concepts: partitions and replication. If you have 3 brokers and create a topic with 1 partition, then the entire topic exists only on one of those brokers. If you create a topic with 3 partitions, then 1/3 of the topic is on broker 1 as partition 1, 1/3 on broker 2 as partition 2, and 1/3 on broker 3 as partition 3. If you create a topic with 3 partitions AND a replication factor of 2, then it's the same as above except there is also a copy of each partition on another node. So partition 1 may be on broker 1 with a copy on broker 2, partition 2 may be on broker 2 with a copy on broker 3, and partition 3 may be on broker 3 with a copy on broker 1. In general, replication ensures that if a broker goes down then another broker still has the data, and partitioning allows for higher read/write throughput by dividing up the data across multiple nodes.
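To make that concrete, here is a sketch using the kafka-python client to create a topic like the one in the example (the broker address and topic name are placeholders):

```python
# Illustrative sketch with kafka-python: create a topic with 3 partitions
# and a replication factor of 2, as in the example above.
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="broker1:9092")  # placeholder
admin.create_topics([
    NewTopic(name="example-topic", num_partitions=3, replication_factor=2)
])
```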
10-16-2018
01:37 PM
Once a processor is started it is then running according to its configured schedule. For example, a Run Schedule of 5 seconds means it is executed every 5 seconds. Depending on the processor, each execution may produce one or more flow files that are transferred to a relationship, or it may produce an error, which is reported by a bulletin that shows a red error icon in the corner of the processor. So generally you should either be seeing flow files being transferred to one of the relationships, or bulletins.
10-03-2018
06:34 PM
2 Kudos
When you start a processor it is considered "scheduled", which means it then executes according to the scheduling strategy. The two scheduling strategies are "Timer Driven", like "Run every 5 mins", and "CRON Driven", which executes according to a cron expression. You can also use Timer Driven with a very low Run Schedule like 1 second, and then use the REST API to start and stop the processor when you need to.
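For illustration, a couple of hypothetical Quartz-style cron expressions (NiFi's CRON Driven strategy takes six fields: second, minute, hour, day of month, month, day of week):

```
# Hypothetical cron expressions for the CRON Driven strategy
# (fields: second minute hour day-of-month month day-of-week)
0 0 2 * * ?          run every day at 2:00 AM
0 0/15 9-17 * * ?    run every 15 minutes between 9 AM and 5 PM
```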
09-26-2018
02:57 PM
It only supports expression language from system properties and environment variables, but not from incoming flow files. This is because when the processor is started it will initialize a pool of UDP clients, so the port and host must be known beforehand. It would probably be possible to make the UDP case more dynamic, but part of the issue is that PutUDP and PutTCP share a significant amount of underlying code.
09-06-2018
01:01 PM
I think in the controller service for Hortonworks Schema Registry the URL needs to be http://localhost:7788/api/v1, whereas in your screenshot it is missing the http://. Also, if you can upgrade to HDF 3.2, there is an option in the writers to inherit the schema from the reader, so the writer doesn't even need to be configured with the schema registry; only the reader would need it.
09-05-2018
07:44 PM
The configuration of the reader and writer is not totally correct... you have selected the strategy as "Schema Text", which means the schema will come from the value of the "Schema Text" property, which you have left at the default value of ${avro.schema}. In UpdateAttribute you then set avro.schema to "AllItems", so it is trying to parse the string "AllItems" into an Avro schema and failing because it is not a JSON schema. If you want to use the "Schema Text" strategy, then in UpdateAttribute the value of avro.schema needs to be the full text of the schema that you showed in your post. If you want to use the schema from the HWX schema registry, then the access strategy needs to be "Schema Name" and the "Schema Name" property needs to reference the name of the schema in the HWX schema registry (this part you already have set up correctly, so just changing the strategy should work).
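For reference, the avro.schema attribute would need to contain a full JSON schema definition along these lines (the fields here are made up, since the original schema isn't shown in this thread):

```json
{
  "type": "record",
  "name": "AllItems",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "price", "type": "double" }
  ]
}
```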
09-05-2018
07:31 PM
1 Kudo
I answered this on stackoverflow: https://stackoverflow.com/questions/52188619/mergecontent-processor-is-not-giving-expected-result
08-27-2018
05:29 PM
3 Kudos
This behavior is currently how it is designed to work. The presence of variables is captured with their initial value at the time, but then changes to the variable values do not trigger changes to be committed. The idea is that you create your flow in one environment with some number of variables, then promote it to another environment and change the variable values to be whatever is needed for the new environment, which should not require anything to be committed back to registry.
08-14-2018
02:33 PM
The parquet data itself has the schema, and your writer should be configured with the schema access strategy "Inherit Record Schema". This will produce a flow file with many records. If you need 1 record per flow file then you would use SplitRecord after this, however generally it is better to keep many records together.
08-14-2018
01:38 PM
1 Kudo
I've responded on stackoverflow... https://stackoverflow.com/questions/51835130/how-exactly-apache-nifi-consumekafka-1-0-processor-works
08-14-2018
01:37 PM
FetchParquet has a property for a record writer... when it fetches the parquet, it will read it record by record using Parquet's Avro reader, and then pass each record to the configured writer. So if you configured it with a JSON record writer, then the resulting flow file that is fetched will contain JSON. If you wanted to fetch raw parquet then you wouldn't use FetchParquet, but would instead just use FetchHDFS which fetches bytes unmodified.
08-07-2018
01:37 PM
1 Kudo
https://pierrevillard.com/2017/01/24/integration-of-nifi-with-ldap/
https://ijokarumawak.github.io/nifi/2016/11/15/nifi-auth/
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#ldap_login_identity_provider
08-01-2018
01:24 PM
1 Kudo
How do you plan to determine the schema from your JSON? Are you saying you want to infer a schema based on the data? Typically this approach doesn't work that well because it is hard to guess the correct type for a given field. Imagine the first record has a field "id" with the value "1234", so it looks like a number, but the second record has id as "abcd"; if it guesses a number based on the first record then it will fail on the second record because it's not a number. There is a processor that attempts to do this though, InferAvroSchema... you could probably do something like InferAvroSchema -> ConvertJSONToAvro -> PutParquet with an Avro reader.
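A toy sketch of the failure mode described above:

```python
# Toy illustration: a type guessed from the first record breaks as soon
# as a later record disagrees.
records = [{"id": "1234"}, {"id": "abcd"}]

inferred_type = int  # "1234" parses as a number, so we guess int
for record in records:
    inferred_type(record["id"])  # fine for "1234", raises ValueError on "abcd"
```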
07-31-2018
04:43 PM
The response of the POST should be the process group entity with the id populated, and in addition there should be a header that has the URI of the created process group.
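A hedged sketch of that call with Python (the base URL and parent group id are placeholders; the created group's URI typically comes back in the Location header):

```python
# Sketch: create a process group via the NiFi REST API and inspect
# the response. Base URL and parent group id are placeholders.
import requests

base = "http://localhost:8080/nifi-api"
parent_id = "root"  # "root" resolves to the top-level process group

resp = requests.post(
    f"{base}/process-groups/{parent_id}/process-groups",
    json={"revision": {"version": 0},
          "component": {"name": "example-group",
                        "position": {"x": 0.0, "y": 0.0}}},
)
entity = resp.json()
print(entity["id"])                   # id populated by the server
print(resp.headers.get("Location"))   # URI of the created process group
```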
07-26-2018
01:26 PM
Just wanted to add some more info... The Parquet Java API only allows reading and writing to and from Hadoop's Filesystem API, which is why NiFi currently can't provide a standard record reader and writer; those require reading and writing to Java's InputStream and OutputStream, which Parquet doesn't provide. So PutParquet can be configured with a record reader to handle any incoming data, and then converts it to Parquet and writes to HDFS; basically it has a record writer encapsulated in it that can only write to HDFS. FetchParquet does the reverse: it can read Parquet files from HDFS and then can be configured with a record writer to write them out in any form, in your case CSV. You can always create a core-site.xml with a local filesystem to trick the Parquet processors into using local disk instead of HDFS.
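For the local-disk trick, a minimal core-site.xml along these lines will do it; point the processor's Hadoop Configuration Resources property at this file:

```xml
<!-- Minimal core-site.xml that maps the default filesystem to local disk -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>file:///</value>
    </property>
</configuration>
```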
07-03-2018
05:07 PM
If ListFTP is showing an error in the UI then that error message is in nifi-app.log somewhere, please provide the full stacktrace that goes with that error.
07-03-2018
04:50 PM
Custom components have access to an API, not to the implementations. So a processor has access to ProcessSession, which is the API, but not StandardProcessSession, which is the implementation. This is done on purpose so that the NiFi framework can evolve behind the scenes as long as it adheres to the API, which is the contract with the processors/components. The only way to get the information you are looking for would be to have an upstream processor add attributes to the flow file containing the relevant information, and then use flowFile.getAttribute(...) to get the attributes you are interested in.
07-02-2018
04:52 PM
When you say it is "not working", what exactly is happening? There can only really be three possible outcomes:

a) it fetched successfully
b) it did not fetch successfully, and there is a red error on the processor in the UI and an error in nifi-app.log
c) it is stuck connecting, and there is a little number icon in the top-right of the processor which shows that 1 thread is still running trying to execute

Which of these choices describes the result?
07-02-2018
02:24 PM
1 Kudo
The remote input host is the hostname that the current node will advertise when another NiFi instance asks for information about the cluster, so it can only be one value. As an example, take NiFi cluster #1 sending data to NiFi cluster #2... the remote input host is only relevant on cluster #2 here. There will be a Remote Process Group (RPG) on cluster #1 with a URL (or comma-separated list of URLs) for cluster #2; it will then ask cluster #2 for cluster information, and cluster #2 will respond with the hostnames of the nodes in cluster #2, which will be based off the value of remote input host on each node in cluster #2.
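Concretely, that advertised hostname comes from nifi.properties on each node of cluster #2; for example (hostname and port values are placeholders):

```
# Site-to-site entries in nifi.properties on one node of cluster #2
nifi.remote.input.host=node1.cluster2.example.com
nifi.remote.input.socket.port=10443
nifi.remote.input.secure=true
```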
06-26-2018
03:11 PM
ListSFTP is a source processor that does not accept incoming flow files. It is made to track the state of a specific directory on a specific host and find new files, so if those values can change at any time then the state needs to be reset. This is challenging if those values can change on any given execution while the processor is running, so for this reason the processor does not support incoming flow files, and the values can only be changed by stopping and starting the processor with new configuration.