Created 07-14-2021 11:38 AM
Hi everyone,
My goal is to collect files in parallel, eventually across several servers, count the number of files and the number of lines in them, and store those two values in a database each day.
I installed Apache Cassandra and created a database and a table. My flow is: GetFile --> MergeContent --> CountText --> ReplaceText --> PutCassandraRecord
When I insert the number of lines and documents and then check my table, I don't see any data.
Here is the configuration of the PutCassandraRecord processor.
Created 07-15-2021 07:53 AM
I work a lot with NiFi and Cassandra. Please update your post with the incoming flowfile format, the CSV reader configuration, and any errors you see when you run your flow. These will help me or others give a more precise reply and, hopefully, a solution.
Created 07-15-2021 09:20 AM
Hi @stevenmatison, the goal is to collect files in parallel, eventually across several servers, count the number of lines and the number of files, and store those values in a database each day. Here is the type of data contained in the data file:
|226789|23-Feb-1996|0|1|1|0|0|0|1|0|3|0|0|6|0|2|0|0|6|9|7
|226780|08-Mar-1996|4|0|2|0|0|1|0|0|0|3|0|0|0|0|0|0|0|0|8
|222507|01-Jan-1995|0|0|5|0|0|1|0|0|0|0|6|0|0|0|0|0|0|0|5
|22308|01-Jan-1995|0|1|8|0|0|6|0|0|0|2|0|4|0|0|0|6|0|0|4
|222707|01-Jan-1995|0|1|0|0|5|0|0|0|1|0|0|6|0|0|7|0|1|0|2
I collect files and count the number of files and the number of lines in them. I want to store these values in a database. I installed Apache Cassandra and created a database and a table. When I insert the number of lines and documents and then check my table, I don't see any data. These are the processors that make up the NiFi flow I set up: GetFile --> MergeContent --> CountText --> ReplaceText --> PutCassandraRecord.
I want to store the number of files and the number of lines in Cassandra, if possible. Could you suggest a configuration for PutCassandraRecord?
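To make the target record concrete, here is a minimal Python sketch, outside NiFi, of the two daily counts and of a one-row CSV that a CSV record reader could parse. The column names `day`, `file_count`, and `line_count` and both function names are hypothetical; they are not taken from the actual table or flow.

```python
import csv
import io
from pathlib import Path


def count_files_and_lines(directory):
    """Return (file_count, line_count) for the regular files in `directory`."""
    file_count = 0
    line_count = 0
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        file_count += 1
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            line_count += sum(1 for _ in handle)
    return file_count, line_count


def build_csv_record(day, file_count, line_count):
    """Build a header line plus one data row (column names are assumptions)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["day", "file_count", "line_count"])
    writer.writerow([day, file_count, line_count])
    return buffer.getvalue()
```

In the flow, the content reaching PutCassandraRecord would need to have a comparable shape, with field names matching the table's columns.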
Created 07-16-2021 04:48 AM
I understand.
First, a NiFi development recommendation I always make: do not route success and failure to the same connection. Keep them separate, because you need to know if a flowfile goes to failure. Also, if you are ignoring certain relationships (failure, retry, others), make a habit of routing all of them to an output port so you can see where each flowfile goes. This will help you know where a flowfile went after you push play.
One of my dev flows looks like:
Once I am satisfied that the flow works and my success flowfile is at the bottom, I can auto-terminate those failures. However, depending on your flow, you may want to do something different with a failure, such as log it or send an email.
Next, I think that if you separate the success and failure routes as described and run your flow, you might see that the flowfile does NOT reach PutCassandraRecord. If it does make it, update the post with the content of the flowfile and any errors from PutCassandraRecord. We need to see those errors and what content you are delivering to the processor.
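As a rough way to inspect the content being delivered, the Python sketch below checks whether a flowfile's text parses as CSV with the header a record reader would expect. It assumes a CSV reader with a header line; the column names in `EXPECTED_COLUMNS` and the function `check_csv_content` are hypothetical and would need to match the real table.

```python
import csv
import io

# Hypothetical column names; replace with the columns of the real table.
EXPECTED_COLUMNS = ["day", "file_count", "line_count"]


def check_csv_content(content):
    """Return a list of problems found in the content; an empty list means none."""
    rows = list(csv.reader(io.StringIO(content)))
    if not rows:
        return ["content is empty"]
    problems = []
    header, data = rows[0], rows[1:]
    if header != EXPECTED_COLUMNS:
        problems.append(f"unexpected header: {header}")
    if not data:
        problems.append("no data rows after the header")
    for number, row in enumerate(data, start=2):
        if len(row) != len(EXPECTED_COLUMNS):
            problems.append(
                f"line {number}: expected {len(EXPECTED_COLUMNS)} fields, got {len(row)}"
            )
    return problems
```

Pasting the flowfile content from the queue before PutCassandraRecord into `check_csv_content` gives a quick indication of whether the reader has anything it can map to columns.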
Created 07-16-2021 07:12 AM
Hi @stevenmatison, I don't see any data in the Cassandra database, and I don't see any errors either. Here are the screenshots.