Support Questions

NiFi / Postgres Database Problems Inserting Records




Is there anyone out there that can help me with the following two problems?  Thanks in advance.


Problem #1:  I have a very simple CSV input file (it has 6 mock satellite names).  I am using the ExtractText processor to capture the satellite name into a user-defined property that becomes an attribute [Property: satellite_name, Value: (.{1,200})]

ExtractText makes 3 copies of the same attribute; I only want one.  I can see this when I look at the processor's data provenance - it makes: satellite_name, satellite_name.0, satellite_name.1, fills these with the first of the 6 mock satellite names, then picks up the other five and puts them in satellite_name.2, satellite_name.3, etc.
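For what it's worth, the same multi-match behavior can be reproduced with plain Python `re` outside of NiFi (the names below are made-up stand-ins for my mock data):

```python
import re

# Stand-in for the CSV content: one mock satellite name per line.
csv_text = "SAT-ALPHA\nSAT-BRAVO\nSAT-CHARLIE\n"

# The same pattern I gave ExtractText.  Since '.' does not match newlines
# by default, the pattern matches once per line, not once for the whole file.
matches = re.findall(r"(.{1,200})", csv_text)

print(matches)  # ['SAT-ALPHA', 'SAT-BRAVO', 'SAT-CHARLIE']
```

That lines up with what I'm seeing: the first match fills satellite_name/.0/.1, and the remaining matches land in .2, .3, and so on.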


Problem #2:   I have tried a lot of different settings that I found in Google searches, but when I run the processor group, it does not stop inserting the flowfile data into the database, and it only inserts the first satellite name.


The "matched" flow of my NiFi processors is: GetFile -> ExtractText -> PutSQL -> LogAttribute.  (The "unmatched" flow is GetFile -> ExtractText -> (different) LogAttribute.)


I do not have any problems using NiFi to access the Postgres database.  NiFi is running on Windows, Postgres is running on a Linux VM (on the same PC).  In the PutSQL processor, this is my insert statement:  

INSERT INTO satellite.satname (satellite_name) VALUES('${satellite_name}');


That's the vital information about my problems; I can provide screenshots if needed if you think you can help and would like more information.  I sincerely appreciate any time/effort anyone puts forward to helping me.  I am new to NiFi and have come a long way getting it to talk to a database running on a VM.


Thanks again,




@johndcal   I am excited that you are learning NiFi.  It is my favorite tool to solve any Use Case I have.   


Problem 1:   Your ExtractText logic is matching all of the values, not just one, and putting them into attributes as an array (.0, .1, .2, .3, etc.).   It's okay to use it in this manner; just use the values you need and ignore the rest. There are things you can do to single out each value, but I would recommend that you use a CSVRecordReader controller service to parse the CSV.  This allows much more control over the values as well as the schema.
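If it helps to see the record idea outside of NiFi, here is a rough Python sketch of what a record reader does with that file (I'm assuming a single column named satellite_name and no header row, since that's what your data looks like; the names are invented):

```python
import csv
import io

# Stand-in for the FlowFile content: no header row, one name per line.
csv_text = "SAT-ALPHA\nSAT-BRAVO\nSAT-CHARLIE\n"

# A record reader applies a schema: here, one field called satellite_name.
reader = csv.DictReader(io.StringIO(csv_text), fieldnames=["satellite_name"])
records = list(reader)

for rec in records:
    print(rec["satellite_name"])  # one clean value per record, no .0/.1 suffixes
```

Each record is a separate, named value, which is exactly the structure the record-oriented processors downstream can work with.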


Problem 2:  Can you send a screenshot of your flow?  You said it doesn't stop inserting the FlowFiles... is the first processor in your flow always on?  What I mean is, if that first processor's run schedule is 0 sec, it will always run and continuously generate FlowFiles.  When I am creating a flow for the first time, I set the first processor to some controllable run schedule, for example 30 seconds.  I push play, immediately stop it, then step the FlowFiles through each downstream processor one at a time, checking at each queue that the FlowFile attributes are as expected.  Once I know I have a fully operating flow, I then address how to trigger the flow to start, or the appropriate timing to always run.




If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your Use Case, please create a separate topic and feel free to tag me in your post.






Thanks much for your quick reply.  You were correct; I had my GetFile run schedule set to zero, so I reset it to 30 seconds.  I do have a CSVReader configured, and I made one change so that the first line would not be read as column headers (there are two settings for that; I had the first one correct, but not the second).


My current text data file looks like this:






After a quick test, only the first value went into the database.  Looking at the Provenance Event for my ExtractText processor, here are the attributes it captured:


(eight captured attributes, each displayed as "No value set")


I wish there were one attribute (satellite_name) and each name were picked up as a separate record, not in an array.  The array is okay; I just don't know how to deal with it using SQL and NiFi Expression Language.


The ExtractText Output Claim content is perfect; again, only the first name is going to the database.


Thanks again for your help.



Super Guru

Use a QueryRecord processor, configured with a CSVReader and a JSON record set writer, with a query like:



SELECT satellite_name FROM FLOWFILE



The next processor can then grab each value as an attribute.
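Roughly, QueryRecord with a CSVReader and a JSON writer does the equivalent of this Python sketch (assuming the schema has a satellite_name column; the names are made-up stand-ins):

```python
import csv
import io
import json

# CSVReader side: parse the incoming FlowFile content into records.
csv_text = "SAT-ALPHA\nSAT-BRAVO\nSAT-CHARLIE\n"
records = csv.DictReader(io.StringIO(csv_text), fieldnames=["satellite_name"])

# The query "SELECT satellite_name FROM FLOWFILE" keeps just that column,
# one row per record.
selected = [{"satellite_name": r["satellite_name"]} for r in records]

# JSON writer side: the outgoing FlowFile is a JSON array of records, which
# a downstream processor can pull individual values out of.
print(json.dumps(selected))
```

So each satellite name comes out as its own record, which is what lets the insert run once per name instead of once per file.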