Member since: 02-02-2018 · Posts: 20 · Kudos Received: 0 · Solutions: 0
07-27-2018 03:33 PM
Matt, thanks a lot for all your help. I was able to refactor my dataflow, reducing the number of process groups and keeping everything simple in a single dynamic flow. To elaborate a little, here's what I did. Data comes in CSV format separated by pipes, e.g. (transaction #, sequence #, table code):

123|456|35|
123|456|36|
123|456|100|

First I split the flowfile into multiple ones using SplitText >> then I used the ExtractText processor to grab the 3rd field (the table code) >> LookupAttribute to set the user-defined attribute schema.name (used by the AvroSchemaRegistry controller service) >> pushed the data to Kafka and Hive using the appropriate processors. Thanks a lot!
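The per-record logic described above (extract the third pipe-delimited field, then look up the schema name for it) can be sketched in plain Python. This is only an illustration of the ExtractText + LookupAttribute behavior, not NiFi code; the table codes and schema names in the mapping are invented for the example.

```python
import re

# Hypothetical mapping from table code to schema name, mirroring the
# LookupAttribute configuration (codes and names are examples only).
SCHEMA_BY_TABLE_CODE = {"35": "orders", "36": "order_items", "100": "audit_log"}

def schema_for_line(line):
    """Extract the third pipe-delimited field (the table code) and
    return the schema name the record should route to, or None."""
    match = re.match(r"^([^|]*)\|([^|]*)\|([^|]*)\|", line)
    if not match:
        return None
    return SCHEMA_BY_TABLE_CODE.get(match.group(3))
```

In the real flow, the looked-up value lands in the schema.name attribute, which the AvroSchemaRegistry uses to resolve the right schema downstream.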
07-25-2018 06:47 PM
Hi Matt. First of all, thank you so much for the explanation. My scenario falls into the 3rd one you described: I have multiple table codes coming in a single flowfile. Could you please elaborate on how to use the PartitionRecord processor? I tried using the CSVReader and CSVRecordSetWriter controller services, but they ask for an Avro schema as well. All the tables I'm working with share only the first 3 fields (the last of them being the table code); the rest of the fields vary, so I got a little confused about how to set this Avro schema.
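For the shared leading fields, a minimal Avro record schema could look like the fragment below. The field names are assumptions (the post only says the first three fields are transaction #, sequence #, and table code); PartitionRecord would then use a RecordPath such as /table_code in a dynamic property to group records with the same code into the same outgoing flowfile.

```json
{
  "type": "record",
  "name": "generic_table_row",
  "fields": [
    { "name": "transaction_id", "type": "string" },
    { "name": "sequence_id", "type": "string" },
    { "name": "table_code", "type": "string" }
  ]
}
```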
07-25-2018 05:53 PM
Hi experts, good day! I've been using NiFi for a couple of months, so I'm still learning lots of new things every day. I'm building a dataflow to get CSV data (separated by pipes, '|') and push it to different targets (e.g. Hive, SQL Server, and Kafka). The project started fine, but the dataflow kept getting bigger and bigger, and now I'm finding it difficult to manage. I just wanted to ask for some help understanding whether I'm working with the best possible design. More details below.

I'm getting data from a ListenHTTP processor. Data comes as CSV separated by pipes. One of the fields is a code that identifies which table the data should be pushed to, so I've created one process group for each "table". Here's where I think the dataflow gets complicated: each of those groups (23, to be precise) contains 4 other groups, each responsible for pushing data to a specific target. Since I have a Hive dataflow inside these groups, I had to create an Avro schema defining the structure of each table.

I was wondering if I could replace this dataflow with a single one that evaluates the code in the CSV and "chooses" the correct Avro schema to use. I did some research but couldn't progress further. If there's a way to do it, I could substitute those 23 groups with a single dynamic dataflow.

Hopefully you can help me with this scenario. Thanks in advance! Sincerely, Cesar Rodrigues
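The dynamic alternative being asked about — one flow keyed on the table code instead of 23 static branches — amounts to grouping records by that code. A rough Python sketch of that dispatch (not NiFi code; the field position is taken from the examples in this thread, where the table code is the third field):

```python
from collections import defaultdict

def group_by_table_code(lines, code_field_index=2, sep="|"):
    """Group pipe-delimited records by their table-code field: the
    single-flow equivalent of fanning out to one process group per table."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split(sep)
        if len(fields) > code_field_index:
            groups[fields[code_field_index]].append(line)
    return dict(groups)
```

Each resulting group would then carry its code as an attribute, letting one schema registry pick the matching Avro schema instead of hard-coding 23 variants.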
Labels: Apache NiFi
02-26-2018 04:17 PM
Hi, guys, I have a dataflow in NiFi that gets a file from the server and converts it to Avro to stream data to Hive. In this flow, I have some sensitive information that I need to hash (SHA2_512). I checked that NiFi has a couple of processors that work with hashes, but it seems they only hash the whole file. Is there a way to hash a specific field? Before converting to Avro, my flowfiles come from the server as fields separated by pipes ('|'). Thanks in advance! Cheers
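The per-field hashing being asked for can be sketched in plain Python with the standard hashlib module — the kind of logic one might run in a scripting processor rather than a whole-file hash. The field index is an assumption for illustration:

```python
import hashlib

def hash_field(line, field_index, sep="|"):
    """Replace one pipe-delimited field with its SHA-512 hex digest,
    leaving the other fields of the record untouched."""
    fields = line.split(sep)
    fields[field_index] = hashlib.sha512(
        fields[field_index].encode("utf-8")
    ).hexdigest()
    return sep.join(fields)
```

Applied line by line before the Avro conversion, only the sensitive column is replaced by its 128-character hex digest.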
Labels: Apache NiFi
02-14-2018 06:17 PM
Thanks, @Matt Burgess! This helped a lot 😉
02-07-2018 08:08 PM
@Abdelkrim Hadjidj Thank you! Could you please provide more details on how to use the schema registry? I'm having some trouble with that.
02-07-2018 08:06 PM
@Matt Burgess I've never used an Avro schema before. Could you please explain how to name the fields in it? I checked the documentation, but it's a little confusing. Thanks in advance!
02-07-2018 05:45 PM
Hi, guys, So I have an incoming FlowFile whose content is text delimited by pipes ('|'), and I want to send this information to several destinations. To convert it to JSON, for example, I know I can use the AttributesToJSON processor, but how exactly can I access the FlowFile content and convert it to attributes? e.g.

original FlowFile content:
1234567891285|37797|1|the brown fox

FlowFile attributes (after converting):
id = 1234567891285
sequence = 37797
category = 1
text = the brown fox

... and after that I could use AttributesToJSON to generate my JSON file. Any ideas on how to achieve this? Thanks in advance! Cheers.
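The ExtractText approach typically used here — one regex capture group per attribute — can be sketched in Python. The attribute names come from the example above; the regex itself is an assumption about the record layout:

```python
import json
import re

# One named capture group per attribute, mirroring an ExtractText
# configuration with user-defined properties id/sequence/category/text.
PATTERN = re.compile(
    r"^(?P<id>[^|]*)\|(?P<sequence>[^|]*)\|(?P<category>[^|]*)\|(?P<text>.*)$"
)

def content_to_json(content):
    """Turn one pipe-delimited record into the JSON that
    ExtractText followed by AttributesToJSON would produce."""
    match = PATTERN.match(content)
    if not match:
        raise ValueError("content does not match the expected layout")
    return json.dumps(match.groupdict())
```

In NiFi terms: ExtractText writes each capture group to a flowfile attribute, and AttributesToJSON serializes the selected attributes into the JSON content.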
Labels: Apache NiFi
02-05-2018 02:03 PM
@Shu, thank you very much. It worked perfectly!
02-02-2018 09:28 PM
Hello, guys, I'm trying to use NiFi to split a text file into 2 other files. The catch is that I need to split the lines based on their category type. e.g.

FlowFile content:
Some fixed text |1| more text
Another field |8| more text
Last one |1| more text

With that, I'd like to split this file into, for example:

first FlowFile:
Some fixed text |1| more text
Last one |1| more text

second FlowFile:
Another field |8| more text

Do you guys have any idea how to accomplish that using NiFi? I appreciate any help you can provide. Thanks in advance, Cheers!
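The split being described — one output flowfile per category value — is what RouteText with a grouping regex achieves; a rough Python sketch of that behavior (not NiFi code, and the category regex is an assumption based on the |1| / |8| markers above):

```python
import re
from collections import defaultdict

# Assumed pattern: the category code sits between pipes, e.g. "|1|".
CATEGORY = re.compile(r"\|(\d+)\|")

def split_by_category(text):
    """Split a multi-line flowfile into one output per category code,
    keeping the original line order within each group."""
    outputs = defaultdict(list)
    for line in text.splitlines():
        match = CATEGORY.search(line)
        key = match.group(1) if match else "unmatched"
        outputs[key].append(line)
    return {key: "\n".join(lines) for key, lines in outputs.items()}
```

Running this on the example content yields one text per category: lines with |1| together, the |8| line on its own.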
Labels: Apache NiFi