Member since: 01-31-2021
Posts: 7
Kudos Received: 0
Solution: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
815 | 02-08-2021 10:57 AM |
03-21-2021
01:07 PM
Hello everyone,

I need your help because I do not know how to proceed. Right now I have a PostgreSQL database with the following table "domains":

domain, source, timestamp
domainA, yourdomains, 128989372
domainB, yourdomaisn, 128923892
domainA, cyberclub, 13934829
domainD, cyberclub, 184994420
domainA, securityTeam, 118382938

My goal is to make some comparisons and alter the table. The most important one is to check every row for duplicates in the "domain" column (like the first, third and last rows) and compare their timestamps. The row with the lowest timestamp gets the number 1 in a new column, the next one gets 2, and so on. At the end I should be able to see which source has how many ones.

Which tool should I use? Apache Flink and Spark were recommended to me. Or another SQL tool? Or plain SQL with scripts?

I am happy for every tip!

Best regards
Maurice
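To make the ranking rule in the post concrete, here is a minimal sketch in plain Python, not a recommendation of any particular tool. It groups rows by domain, numbers them by ascending timestamp, and counts how often each source holds rank 1. The function names `rank_by_domain` and `count_first_seen` are made up for illustration, and the rows are copied from the post, spelling included. In PostgreSQL the same numbering is commonly written with the window function `ROW_NUMBER() OVER (PARTITION BY domain ORDER BY "timestamp")`.

```python
from collections import Counter, defaultdict

# Sample rows copied from the post: (domain, source, timestamp).
rows = [
    ("domainA", "yourdomains", 128989372),
    ("domainB", "yourdomaisn", 128923892),
    ("domainA", "cyberclub", 13934829),
    ("domainD", "cyberclub", 184994420),
    ("domainA", "securityTeam", 118382938),
]


def rank_by_domain(rows):
    """Return each row extended with a rank; 1 = lowest timestamp in its domain."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)

    ranked = []
    for domain_rows in groups.values():
        ordered = sorted(domain_rows, key=lambda r: r[2])
        for rank, row in enumerate(ordered, start=1):
            ranked.append(row + (rank,))
    return ranked


def count_first_seen(ranked):
    """Count, per source, how many rows carry rank 1."""
    return Counter(source for _, source, _, rank in ranked if rank == 1)


ranked = rank_by_domain(rows)
firsts = count_first_seen(ranked)

for row in ranked:
    print(row)
print(dict(firsts))
```

Rows with identical timestamps within one domain would need an explicit tie-break rule, which the post does not specify.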
Labels:
- Apache Flink
- Apache NiFi
- Apache Spark
02-08-2021
10:57 AM
Solved. I had the settings in the CSV writer wrong.
02-04-2021
01:57 AM
Hello Dennis,

thank you for the reply. It really helped a lot!

Q1: That worked very well with the UpdateAttribute processor.
Q2: This also worked. I had the settings of the CSV writer service (UpdateRecord) messed up, but it works fine now.
Q3: That is a bummer. I had hoped it would be a piece of cake to implement. But I will look into one of the mentioned tools and figure it out.
Q4: True that. The file is going to explode with data.
02-04-2021
01:51 AM
Hello everyone,

I have data which has only one value per line, often without even a header. My goal is to convert the data to CSV and format it afterwards (remove the comments and add some columns).

Example base data:

#This is my base data#
#################
value1
value2
value3

Example finished data:

value-description, source, timestamp
value1, source1, 984359345
value2, source1, 98249732
value3, source1, 984u9834

Right now I am able to achieve the output if the input is already in CSV format (comma separated). But if I try it with the example data above, the output is the following:

header-descriptionvalue1value2value3

This happens after the ReplaceText processor, which should delete blank lines. The source and timestamp columns are not yet added. This is the config of the processor:

I am glad about every response.

Best regards
Maurice
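As an illustration of the target transformation only (this is plain Python, not a NiFi processor configuration), the sketch below drops comment and blank lines and writes one CSV record per remaining value with a source and timestamp column. The function name `lines_to_csv` and its parameters are hypothetical; treating every line that starts with `#` as a comment is an assumption based on the example data.

```python
import csv
import io
import time


def lines_to_csv(text, source, timestamp=None):
    """Turn one-value-per-line text into CSV with source and timestamp columns."""
    if timestamp is None:
        # Assumption: a Unix timestamp in seconds is wanted when none is given.
        timestamp = int(time.time())

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["value-description", "source", "timestamp"])

    for line in text.splitlines():
        value = line.strip()
        # Skip blank lines and comment lines (assumed to start with '#').
        if not value or value.startswith("#"):
            continue
        writer.writerow([value, source, timestamp])
    return out.getvalue()


raw = "#This is my base data#\n#################\nvalue1\n\nvalue2\nvalue3\n"
result = lines_to_csv(raw, "source1", timestamp=984359345)
print(result)
```

Each record ends with its own newline here, which is exactly what the collapsed output `header-descriptionvalue1value2value3` is missing.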
Labels:
- Apache NiFi
01-31-2021
02:31 AM
Hello community,

I started using Apache NiFi for my bachelor thesis. The basics of the data flow are already working, but there are some cases I cannot really get a grasp on. I get my files via HTTP, and they are mostly TXT, CSV or XML.

How my data flow should look:
- Multiple data sources (Question 1)
- Splitting the values into multiple lines
- Adding a timestamp as a column to each line (Question 2)
- Adding the source (name) as a column to each line (Question 2)
- Checking if the value was already seen (Question 3)
- Adding a new column to each line with the value "already seen" or "first seen" (Question 3)
- Merging the content
- Changing the filename
- PutFile (Question 4)

Question 1
Do I need to create a new data flow for each new source? Otherwise the files all end up with the same or a totally random file name.

Question 2
If I add a column with the same value to each line, is it better to add the value before, during or after splitting the text?

Question 3
Right now my data gets saved in separate files, for example: data_feed1.csv, data_feed2.csv. How do I check if a value in the current data flow is already in my locally saved data (CSV)? I don't want to get rid of the duplicates, but I need to add a column which indicates whether the value was already seen or not. How is this possible?

Question 4
Lastly, I am struggling with how to save my files, because I need them to be saved separately and additionally appended to a combined file. The separate files are basic and working fine. About appending to a file I have read about different solutions, mostly involving Groovy scripts. Is ExecuteGroovyScript the right way to go?

I hope you can help me, and I am looking forward to your answers.

Best regards
Maurice
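The "already seen" / "first seen" logic from Question 3 can be sketched independently of NiFi. The plain-Python example below is only an illustration of the rule, under the assumption that the previously saved CSV data has no header row and holds the value in its first column. The helper names `load_seen` and `mark_seen` are hypothetical.

```python
import csv
import io


def load_seen(csv_text):
    """Collect first-column values from previously saved CSV text (no header assumed)."""
    return {row[0] for row in csv.reader(io.StringIO(csv_text)) if row}


def mark_seen(values, seen):
    """Label each value without dropping duplicates; updates `seen` in place."""
    marked = []
    for value in values:
        status = "already seen" if value in seen else "first seen"
        seen.add(value)
        marked.append((value, status))
    return marked


# Previously saved data, inlined here instead of being read from a file.
saved = "value1,source1,984359345\nvalue2,source1,98249732\n"
seen = load_seen(saved)

marked = mark_seen(["value2", "value4", "value4"], seen)
for value, status in marked:
    print(value, status)
```

Because `seen` is updated while iterating, a value that repeats within the same batch is marked "first seen" only once. For the combined file in Question 4, plain Python would append by opening the file in mode `"a"`; how best to do that inside NiFi is a separate question that this sketch does not answer.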
Labels:
- Apache NiFi