Dataflow question and special case with duplicates

Hello community,  

 

I started using Apache NiFi for my bachelor's thesis. The basics of the data flow are already working, but there are some cases I can't quite get a grasp on.

 

I get my files via HTTP, and they are mostly in TXT, CSV or XML.

 

What my workflow (data flow?) should look like:

- Multiple data sources (Question 1)

- Splitting the values into multiple lines

- Adding a timestamp as a column to each line (Question 2)

- Adding the source (name) as a column to each line (Question 2)

- Checking if the value was already seen (Question 3)

- Adding a new column to each line with the value "already seen" or "first seen" (Question 3)

- Merging the content

- Changing the filename

- PutFile (Question 4)
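
The splitting and column-adding steps in the list above can be sketched in plain Python. This is only an illustration of the logic, not NiFi configuration; the function name `enrich_lines`, the column order, and the sample values are assumptions made for the example. Because the timestamp and the source name are the same for every line, the sketch computes them once and attaches them while splitting.

```python
from datetime import datetime, timezone


def enrich_lines(raw_text, source_name, timestamp=None):
    """Split raw text into lines and add a timestamp and a source column.

    Hypothetical helper for illustration; returns one [value, timestamp,
    source] list per non-empty input line.
    """
    if timestamp is None:
        # One timestamp per batch, so every line of the same fetch matches.
        timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    rows = []
    for line in raw_text.splitlines():
        value = line.strip()
        if not value:
            continue  # skip blank lines
        rows.append([value, timestamp, source_name])
    return rows


# Sample input and source name are made up for the example.
rows = enrich_lines("a.example\n\nb.example\n", "feed1",
                    timestamp="2024-01-01T00:00:00+00:00")
```

Passing the source name in as a parameter is what would let one flow serve several sources, since the name can later be reused for the output file name.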

 

Question 1

Do I need to build a new data flow for each new source? Otherwise the files all end up with either the same or a totally random file name.

 

Question 2

If I add a column with the same value to each line, is it better to add the value before, during, or after splitting the text?

 

Question 3

Right now my data is saved in separate files, for example: data_feed1.csv, data_feed2.csv.

How do I check whether a value in the current data flow is already in my locally saved data (CSV)?

I don't want to get rid of the duplicates, but I need to add a column that indicates whether the value was already seen or not. How is this possible?
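
To make the requirement concrete, here is a minimal sketch in plain Python of the "first seen" / "already seen" check. It is not a NiFi feature or configuration; the helper names `load_seen_values` and `tag_rows`, and the assumption that the value sits in the first column of each saved CSV, are made up for the example.

```python
import csv
from pathlib import Path


def load_seen_values(csv_paths, value_column=0):
    """Collect every value already stored in the given CSV files."""
    seen = set()
    for path in csv_paths:
        path = Path(path)
        if not path.exists():
            continue  # a feed that has not been saved yet has no history
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle):
                if row:
                    seen.add(row[value_column])
    return seen


def tag_rows(rows, seen):
    """Append 'already seen' or 'first seen' to each row; keep duplicates."""
    tagged = []
    for row in rows:
        value = row[0]
        status = "already seen" if value in seen else "first seen"
        seen.add(value)  # a repeat later in the same batch counts as seen
        tagged.append(list(row) + [status])
    return tagged
```

Rereading every saved file on each run gets slower as the files grow, which is one reason a lookup store kept outside the flow may be the more practical choice.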

 

Question 4

Lastly, I am struggling with how to save my files, because I need them to be saved separately and additionally appended to a combined file. Saving the separate files is simple and works fine. As for appending to a file, I have read about different solutions, mostly involving Groovy scripts.

Is ExecuteGroovyScript the right way to go?
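
Independent of which processor or script ends up doing it, the "separate file plus combined file" behavior can be sketched in plain Python. This is an illustration only, not a Groovy script or NiFi configuration; the function name `save_rows` and the file paths in the usage note are assumptions.

```python
import csv
from pathlib import Path


def save_rows(rows, separate_path, combined_path):
    """Write rows to their own file and append the same rows to a combined file."""
    # Mode "w" replaces the per-feed file on every run.
    with Path(separate_path).open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    # Mode "a" creates the combined file if needed, otherwise appends to it.
    with Path(combined_path).open("a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
```

Calling `save_rows(rows, "data_feed1.csv", "combined.csv")` for each feed would keep the per-feed files separate while `combined.csv` grows with every call, so the combined file needs some size or rotation plan.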

 

I hope you can help me, and I am looking forward to your answers. 

 

 

Best regards

 

Maurice

 
1 ACCEPTED SOLUTION

2 REPLIES



Hello Dennis, 

 

Thank you for the reply. It really helped a lot!

 

Q1: That worked very well with the UpdateAttribute processor.

Q2: This also worked. I had the settings of the CSV writer service (UpdateRecord) messed up, but it works fine now.

Q3: That is a bummer. I had hoped it would be a piece of cake to implement. But I will look into one of the mentioned tools and figure it out.

Q4: True that. The file is going to explode with data.