Member since: 09-29-2015
Posts: 871
Kudos Received: 723
Solutions: 255
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4263 | 12-03-2018 02:26 PM |
| | 3203 | 10-16-2018 01:37 PM |
| | 4311 | 10-03-2018 06:34 PM |
| | 3168 | 09-05-2018 07:44 PM |
| | 2425 | 09-05-2018 07:31 PM |
12-09-2016
12:32 PM
3 Kudos
This is an extremely rough guess and it really depends on a lot of factors, but if you assume all devices produce 1GB an hour, that's 700GB/hour, which is about 200MB/second. You are probably looking at a 7-8 node NiFi cluster, where each node has 16-24 CPUs, 16+ GB RAM, and 10 Gigabit NICs. You will also need to ensure proper separation of the repositories on different disks.
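The arithmetic behind that estimate can be sketched as follows. This is only a back-of-the-envelope calculation: the device count of 700 is inferred from the 700GB/hour figure, decimal units (1 GB = 1000 MB) are assumed, and the 8-node split is just an illustrative pick from the 7-8 node range.

```python
# Rough throughput estimate. Assumptions (not measurements):
# 700 devices, 1 GB per device per hour, decimal units, 8 NiFi nodes.
DEVICES = 700
GB_PER_DEVICE_PER_HOUR = 1.0
ASSUMED_NODES = 8


def aggregate_mb_per_second(devices: int, gb_per_device_per_hour: float) -> float:
    """Convert a per-device hourly volume into an aggregate rate in MB/s."""
    total_gb_per_hour = devices * gb_per_device_per_hour
    return total_gb_per_hour * 1000 / 3600  # GB/hour -> MB/second


rate = aggregate_mb_per_second(DEVICES, GB_PER_DEVICE_PER_HOUR)
per_node = rate / ASSUMED_NODES
print(f"{rate:.1f} MB/s total, {per_node:.1f} MB/s per node")
```

Real sizing also depends on flow complexity, file sizes, and disk layout, so treat the per-node number as a starting point for testing rather than a target.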
12-09-2016
12:17 PM
1 Kudo
Unfortunately, not every property currently supports using the variable registry. You can tell by looking at the property description in the documentation for a processor. If it says "Supports Expression Language: true" then you can reference variables. For example, with PutHDFS it looks like only the Directory property currently supports it: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/index.html
12-08-2016
03:17 PM
@Avijeet Dash Take a look at this template for some examples. avroschemascenarios.xml
12-08-2016
03:01 PM
It depends on how you construct your dataflow in NiFi. You could set it up so that you have several logical streams that each have their own ConvertCsvToAvro processor, or you could have several processors feeding into the same ConvertCsvToAvro processor. Kafka itself does not enforce anything related to a schema, but Confluent has a schema registry with serializers and deserializers, which can enforce that any message being written to a topic must conform to the schema for that topic.
12-08-2016
02:31 PM
Since NiFi 1.0, there are two types of controller services. You can create controller services from the global hamburger menu in the top right, but these can only be used by reporting tasks. For example, the site-to-site provenance reporting task can use an SSL context service. Controller services for processors have to be created through the context palette on the left-hand side. If you are on the root canvas with nothing selected and click the configuration icon in the palette, you can create a controller service at the root group. It should be visible to any processors in the root group and within any sub-groups. If you go into a process group and create a controller service there, it is visible to anything in that group and below, but not to anything above it.
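The visibility rule for processor-level controller services can be illustrated with a toy model. This is not NiFi code or its API. The `ProcessGroup` class, the `is_visible` function, and the group names are all hypothetical and only mirror the "visible in the defining group and below" behavior described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessGroup:
    """Hypothetical stand-in for a NiFi process group; parent is None for root."""
    name: str
    parent: Optional["ProcessGroup"] = None


def is_visible(service_group: ProcessGroup, component_group: ProcessGroup) -> bool:
    """True if a service defined in service_group is usable from component_group.

    Walks up from the component's group toward the root and checks whether
    the service's group is the same group or one of its ancestors.
    """
    group: Optional[ProcessGroup] = component_group
    while group is not None:
        if group is service_group:
            return True
        group = group.parent
    return False


# Hypothetical hierarchy: root -> ingest -> parse
root = ProcessGroup("root")
ingest = ProcessGroup("ingest", parent=root)
parse = ProcessGroup("parse", parent=ingest)

print(is_visible(root, parse))    # service at root, processor two levels down
print(is_visible(ingest, root))   # service in a sub-group, processor at root
```

In this model a service created at the root is reachable from every group, while one created inside `ingest` is reachable from `ingest` and `parse` but not from the root canvas.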
12-08-2016
02:25 PM
1 Kudo
Each CSV or JSON that comes into InferAvroSchema could be different, so it will infer the schema for each flow file and put the schema where you specify in the schema destination, either the flow file content or a flow file attribute. Then you can use that attribute in ConvertCsvToAvro as the schema by referencing ${inferred.avro.schema}. If you are sending only one type of CSV into ConvertCsvToAvro, then it would be more efficient to define the Avro schema you want and not use InferAvroSchema.
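As a rough illustration of defining the schema yourself, the sketch below builds an Avro record schema as JSON. The CSV layout (columns `id`, `name`, `temperature`), the record name, and the namespace are all made up for the example, so substitute your own fields.

```python
import json

# Hypothetical schema for a CSV with columns: id,name,temperature.
# Field names, record name, and namespace are illustrative assumptions.
schema = {
    "type": "record",
    "name": "SensorReading",
    "namespace": "example.nifi",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        # A union with "null" first lets the column be empty; default is null.
        {"name": "temperature", "type": ["null", "double"], "default": None},
    ],
}

schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

The printed JSON is the kind of text you would supply as the schema for ConvertCsvToAvro in place of the ${inferred.avro.schema} reference, which avoids inferring the schema again for every flow file.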
12-07-2016
06:11 PM
2 Kudos
This is a known problem with PutKafka: https://issues.apache.org/jira/browse/NIFI-2671 What versions of Kafka and NiFi are you using? PutKafka should only be used with a 0.8 Kafka broker; PublishKafka and PublishKafka_0_10 should be used with Kafka 0.9 and 0.10 respectively. PublishKafka and PublishKafka_0_10 were fixed to send messages larger than 1MB: https://issues.apache.org/jira/browse/NIFI-2614
12-07-2016
06:04 PM
So what happens when you copy a file into /home/test? There should be something in the minifi-app.log when this happens.
12-07-2016
05:37 PM
If you ping master.hadoop.com from the Raspberry Pi, does it get a response?