Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

nifi -Ingesting a file from SFTP and insert into MYSQL using nifi

avatar
Rising Star

Hi community,

I would like to leverage nifi to ingest a file from sftp and insert its data into mysql database.I currently have been unsuccessful in doing so and will require all assistance in the right direction.Much appreciated for the help in advance and below are the specifics

(1)I should be able to list the files on sftp

(2) select a particular file or files

(3)Fetch the "desired file"

(4) Ingest "desired file" into database.

Below is more detail.

(i)Sample content of sftp file to ingest into via nifi via sftp processors later to insert into MYSQL

7246-samplecontentofsftpfile.jpg

(ii)Current designed nifi workflow

7249-nifiwrkflow.jpg

(iii)Error am getting with this workflow

ConvertJSONToSQL ERROR: failed to parse standardflowfilerecord due to processor exception as JSON unexpected character (''' (code39)) expected a valid value (number,string,array,object,'true','false' or 'null')

PutSQL ERROR: failed to update a database due to a failed batch update.There were a total of 1 flow file that failed, 0 that succeeded and 0 that were not excuted and will be rerouted to retry.

(iv)Processor Configs

1. ReplaceTxext

7262-replacetextattribute.jpg

sql statement: INSERT INTO NiFiUsecase001 (Column1, Column2, Column3, Column4, Column5) VALUES ('${Column1}', '${Column2}','${Column3}','${Column4}','${Column5}')

2. ConvertJSONToSQL

7263-convertjsontosql.jpg

3.PutSQL

7264-putsql.jpg

1 ACCEPTED SOLUTION

avatar
Master Guru

After your FetchSFTP, the bar-delimited content will be in the content of the flow file, not the attributes. That is followed by an AttributesToJson processor which will overwrite the flow file content with a JSON document containing attributes such as sftp.remote.host, sftp.remote.port, etc. (see the doc for AttributesToJson).

I think you may want a SplitText processor after your FetchSFTP processor, to create one flow file for each line in your file. Then you could have an ExtractText processor which could use a regex (with grouping) to create attributes such as column.1, column.2, etc. Then your ReplaceText can use those attributes.

View solution in original post

12 REPLIES 12

avatar
Master Guru

After your FetchSFTP, the bar-delimited content will be in the content of the flow file, not the attributes. That is followed by an AttributesToJson processor which will overwrite the flow file content with a JSON document containing attributes such as sftp.remote.host, sftp.remote.port, etc. (see the doc for AttributesToJson).

I think you may want a SplitText processor after your FetchSFTP processor, to create one flow file for each line in your file. Then you could have an ExtractText processor which could use a regex (with grouping) to create attributes such as column.1, column.2, etc. Then your ReplaceText can use those attributes.

avatar
Rising Star

Hi @Matt Burgess thanks a lot for responding I updated my workflow with the specifics you suggested but I still cannot insert into mysql db, am getting the same error as mentioned previously.I also attached my configs as well.

PutSQL ERROR: failed to update a database due to a failed batch update.There were a total of 1 flow file that failed, 0 that succeeded and 0 that were not excuted and will be rerouted to retry.

(i)Updated Workflw

7265-updatednifiwrkfl.jpg

(ii)SplitText Config

7266-splitxml.jpg

(iii)ExtractText Config

7268-extracttext.jpg

(iv) Both ReplaceText and PutSQL remain unchanged.

ps: What is the tiny number "1" that appears when processors are running.

Cheers

avatar
Rising Star

Hi @Matt Burgess could there other reasons such as available RAM on this nifi compute? Am currently using a t2 medium in AWS .

avatar
Rising Star

Hi All, @mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Pierre Villard, @Andrew Grande I would appreciate if you could get assist me on the above.Please refer to the above for details and explanation.Thanks a lot!

avatar

Could you share the information you will find in the application log file? (./logs/nifi-app.log)

avatar
Rising Star

Hi All,

@Pierre Villard ,@mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Andrew Grande , @Matt Burgess I have been able to insert into MYSQL by setting "Obtain Generated Keys =true" in PutSQL configuration (pics below) but the problem now is, there is an insane number of duplicates that got ingested into the mysql table(pics below). I would like to know what might be going on with my flow to cause this and how to fix it.Thanks a lot!!

(i)PutSql Configuration

7322-newputsql.jpg

(ii)ExtractText

7323-newextracttext.jpg

(iii)Original data to be ingested into MYSQL

7326-originaldata.jpg

(iv)Table count after PutSQL nifi ingest

7327-countfromflowfile.jpg

avatar

Are you sure that you don't ingest multiple times the same file with List/FetchSFTP?

avatar
Rising Star

@Pierre Villard sorry for the late reply. How do I verify am not ingesting multiple times the same file with List/FetchSFTP?

avatar
Rising Star

Hi All,

@Pierre Villard ,@mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Andrew Grande , @Matt Burgess could you please assist me with the above.Greatly appreciate the help please.I have explained everything thoroughly.