Member since
09-29-2015
871
Posts
723
Kudos Received
255
Solutions
My Accepted Solutions
Views | Posted
---|---
3380 | 12-03-2018 02:26 PM
2320 | 10-16-2018 01:37 PM
3637 | 10-03-2018 06:34 PM
2412 | 09-05-2018 07:44 PM
1836 | 09-05-2018 07:31 PM
12-03-2018
02:26 PM
4 Kudos
NiFi Registry is a separate application and works the same whether NiFi is clustered or standalone. You can run NiFi Registry on one of the NiFi nodes or on a completely separate node; either way, you just tell your NiFi cluster the location of NiFi Registry.
10-17-2018
01:45 PM
1 Kudo
Your whole flow will perform better if each flow file contains many records, rather than 1 record per flow file. PutDatabaseRecord sets auto-commit to false on the connection, then starts executing the statements, and if any failure happens it calls rollback on the connection.
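To illustrate the commit/rollback pattern described above, here is a minimal sketch using Python's built-in sqlite3 module. It is not PutDatabaseRecord's actual code; the `users` table, the `insert_records` helper, and the sample records are assumptions made up for the example.

```python
import sqlite3


def insert_records(conn, records):
    """Insert every record in one transaction; undo all of them on any failure.

    Hypothetical helper that mimics the pattern: no per-statement commit,
    one commit at the end, rollback if any statement fails.
    """
    try:
        for rec in records:
            conn.execute(
                "INSERT INTO users (id, name) VALUES (?, ?)",
                (rec["id"], rec["name"]),
            )
        conn.commit()
        return True
    except sqlite3.Error:
        # One bad record discards the whole batch, not just the failing row.
        conn.rollback()
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.commit()

good_batch = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
bad_batch = [{"id": 3, "name": "c"}, {"id": 3, "name": "duplicate key"}]

ok_good = insert_records(conn, good_batch)
ok_bad = insert_records(conn, bad_batch)
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

In this sketch the second batch fails on the duplicate primary key, so neither of its rows is kept. Putting many records in one flow file means one transaction for the whole batch instead of one per record.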
10-16-2018
02:37 PM
Yes, it is up to the processor to decide what to log; the framework doesn't really know what the processor is doing. In this case, what you would see in the log is all the calls to the logger here: https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/GetHTTP.java#L492-L553 I suppose some logging statements could be added around the call to client.execute(), but I'm not sure how you'd get more info about what the execute call is doing; you would just know that it started or completed.
10-16-2018
01:43 PM
When you create a topic there are two different concepts: partitions and replication. If you have 3 brokers and create a topic with 1 partition, then the entire topic exists on only one of those brokers. If you create a topic with 3 partitions, then 1/3 of the topic is on broker 1 as partition 1, 1/3 on broker 2 as partition 2, and 1/3 on broker 3 as partition 3. If you create a topic with 3 partitions AND a replication factor of 2, then it's the same as above except there is also a copy of each partition on another node. So partition 1 may be on broker 1 with a copy on broker 2, partition 2 may be on broker 2 with a copy on broker 3, and partition 3 may be on broker 3 with a copy on broker 1. In general, replication ensures that if a broker goes down then another broker still has the data, and partitioning allows for higher read/write throughput by dividing up the data across multiple nodes.
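The layout in that example can be sketched as a simple round-robin placement. This is only an illustration of the idea, not Kafka's real assignment algorithm (which chooses starting brokers differently), and the `assign_replicas` function is a made-up name. Partitions are numbered from 1 here to match the description above, whereas Kafka itself numbers them from 0.

```python
def assign_replicas(brokers, num_partitions, replication_factor):
    """Return {partition: [brokers holding a copy]} using plain round-robin.

    Simplified illustration only; real Kafka placement differs in detail.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for p in range(num_partitions):
        # First broker in the list holds the partition, the next ones hold copies.
        assignment[p + 1] = [
            brokers[(p + r) % len(brokers)] for r in range(replication_factor)
        ]
    return assignment


one_partition = assign_replicas([1, 2, 3], num_partitions=1, replication_factor=1)
three_partitions = assign_replicas([1, 2, 3], num_partitions=3, replication_factor=1)
replicated = assign_replicas([1, 2, 3], num_partitions=3, replication_factor=2)
```

With 3 brokers, 3 partitions, and a replication factor of 2, every broker holds two partition copies, so losing any single broker still leaves a copy of each partition somewhere else.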
10-16-2018
01:37 PM
Once a processor is started, it runs according to its configured schedule. For example, a Run Schedule of 5 seconds means it is executed every 5 seconds. Depending on the processor, each execution may produce one or more flow files that are transferred to a relationship, or it may produce an error that is reported as a bulletin, which shows up as a red error in the corner of the processor. So generally you should be seeing either flow files being transferred to one of the relationships, or bulletins.
10-03-2018
06:34 PM
2 Kudos
When you start a processor it is considered "scheduled", which means it then executes according to its scheduling strategy. The two scheduling strategies are "Timer Driven", for example "run every 5 mins", and "CRON Driven", which executes according to a CRON expression. You can also use Timer Driven with a very low Run Schedule like 1 second, and then use the REST API to start and stop the processor when you need to.
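As a rough sketch of the REST approach, the snippet below builds (but does not send) the request that would change a processor's state. It assumes the `PUT /nifi-api/processors/{id}/run-status` endpoint found in newer NiFi releases, so check the REST API docs for your version. The base URL, processor id, and revision number are placeholders; the real revision has to be read from a GET on the processor first.

```python
import json
import urllib.request


def build_run_status_request(base_url, processor_id, revision_version, state):
    """Build a PUT request asking NiFi to start or stop a processor.

    Hypothetical helper: the endpoint path and body shape are assumptions
    based on the run-status endpoint of newer NiFi versions.
    """
    if state not in ("RUNNING", "STOPPED"):
        raise ValueError("state must be RUNNING or STOPPED")
    body = {"revision": {"version": revision_version}, "state": state}
    return urllib.request.Request(
        url=f"{base_url}/processors/{processor_id}/run-status",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values; nothing is sent until urllib.request.urlopen(req) is called.
req = build_run_status_request(
    "http://localhost:8080/nifi-api", "hypothetical-processor-id", 3, "RUNNING"
)
```

An external scheduler or script could send one request with "RUNNING" when the work should begin and another with "STOPPED" afterwards. A secured NiFi instance would also need authentication, which is left out here.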
09-06-2018
01:01 PM
I think in the controller service for Hortonworks Schema Registry the URL needs to be http://localhost:7788/api/v1, whereas in your screenshot it is missing the http:// prefix. Also, if you can upgrade to HDF 3.2, there is an option in the writers to inherit the schema from the reader, so the writer doesn't even need to be configured with the schema registry; only the reader would need it.
09-05-2018
07:44 PM
The configuration of the reader and writer is not totally correct. You have selected the strategy "Schema Text", which means the schema will come from the value of the "Schema Text" property, which you have left at its default value of ${avro.schema}. Then in UpdateAttribute you set avro.schema to "AllItems", so it is trying to parse the string "AllItems" as an Avro schema and failing because that string is not JSON schema text. If you want to use the "Schema Text" strategy, then in UpdateAttribute the value of avro.schema needs to be the full text of the schema that you showed in your post. If you want to use the schema from the HWX schema registry, then the access strategy needs to be "Schema Name" and the "Schema Name" property needs to reference the name of the schema in the HWX schema registry (this part you already have set up correctly, so just changing the strategy should work).
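To make the difference concrete, here is a small sketch of why the bare name fails while full schema text works. It uses a plain JSON check as a rough stand-in for the real Avro schema parser, and the record and field names in the example schema are invented, since your actual schema is in your post.

```python
import json


def parses_as_schema_text(text):
    """Rough stand-in for schema parsing: valid JSON object with a "type" key.

    This is not the Avro parser, only an approximation for illustration.
    """
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and "type" in parsed


# What avro.schema was set to: a schema *name*, not schema text.
bare_name = "AllItems"

# What "Schema Text" expects: the full schema definition (fields are hypothetical).
full_schema = json.dumps(
    {
        "type": "record",
        "name": "AllItems",
        "fields": [{"name": "id", "type": "string"}],
    }
)

name_ok = parses_as_schema_text(bare_name)
schema_ok = parses_as_schema_text(full_schema)
```

The name alone is only useful with the "Schema Name" strategy, where it is looked up in the registry rather than parsed.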
09-05-2018
07:31 PM
1 Kudo
I answered this on Stack Overflow: https://stackoverflow.com/questions/52188619/mergecontent-processor-is-not-giving-expected-result
08-27-2018
05:29 PM
3 Kudos
This is currently how it is designed to work. The presence of variables is captured along with their values at the time of the commit, but later changes to the variable values do not register as changes that need to be committed. The idea is that you create your flow in one environment with some number of variables, then promote it to another environment and change the variable values to whatever is needed for the new environment, which should not require anything to be committed back to the registry.