Support Questions


NiFi - Ingesting a file from SFTP and inserting it into MySQL

Contributor

Hi community,

I would like to leverage NiFi to ingest a file from SFTP and insert its data into a MySQL database. I have so far been unsuccessful in doing so and would appreciate any assistance pointing me in the right direction. Thanks in advance for the help; the specifics are below.

(1) I should be able to list the files on SFTP

(2) Select a particular file or files

(3) Fetch the "desired file"

(4) Ingest the "desired file" into the database.

Below is more detail.

(i) Sample content of the SFTP file, to be ingested via the NiFi SFTP processors and later inserted into MySQL

7246-samplecontentofsftpfile.jpg

(ii) Currently designed NiFi workflow

7249-nifiwrkflow.jpg

(iii) Errors I am getting with this workflow

ConvertJSONToSQL ERROR: failed to parse StandardFlowFileRecord due to processor exception: JSON unexpected character (''' (code 39)): expected a valid value (number, string, array, object, 'true', 'false' or 'null')

PutSQL ERROR: failed to update a database due to a failed batch update. There were a total of 1 flow file that failed, 0 that succeeded, and 0 that were not executed and will be rerouted to retry.
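For reference, the ConvertJSONToSQL error above means the flow file content is not valid JSON: the parser hit a single quote, and JSON only accepts double-quoted keys and strings. A minimal sketch of the distinction (Python here, purely for illustration — the column names and values are made up):

```python
import json

# JSON requires double quotes; single-quoted strings trigger exactly the
# "unexpected character ''' (code 39)" error quoted above.
valid = '{"Column1": "a", "Column2": "b"}'
invalid = "{'Column1': 'a', 'Column2': 'b'}"

print(json.loads(valid))  # parses fine

try:
    json.loads(invalid)
except json.JSONDecodeError as e:
    print("parse failed:", e)
```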

(iv) Processor Configs

1. ReplaceText

7262-replacetextattribute.jpg

sql statement: INSERT INTO NiFiUsecase001 (Column1, Column2, Column3, Column4, Column5) VALUES ('${Column1}', '${Column2}','${Column3}','${Column4}','${Column5}')

2. ConvertJSONToSQL

7263-convertjsontosql.jpg

3. PutSQL

7264-putsql.jpg
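The `${ColumnN}` placeholders in the ReplaceText SQL statement above are NiFi Expression Language references to flow-file attributes, so the INSERT only becomes valid SQL if those attributes actually exist on the flow file. A rough sketch of what the substitution produces (Python, with hypothetical attribute values standing in for whatever an upstream processor would set):

```python
# Hypothetical attribute values; in NiFi these would come from an
# upstream processor such as ExtractText.
attributes = {
    "Column1": "alice", "Column2": "29", "Column3": "NY",
    "Column4": "engineer", "Column5": "2016-01-01",
}

template = ("INSERT INTO NiFiUsecase001 (Column1, Column2, Column3, Column4, Column5) "
            "VALUES ('${Column1}', '${Column2}', '${Column3}', '${Column4}', '${Column5}')")

# Mimic the Expression Language substitution ReplaceText performs.
sql = template
for name, value in attributes.items():
    sql = sql.replace("${" + name + "}", value)

print(sql)
```

If an attribute is missing, the placeholder is replaced with an empty string, which is one way to end up with malformed statements downstream.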

1 ACCEPTED SOLUTION

Super Guru

After your FetchSFTP, the bar-delimited content will be in the content of the flow file, not the attributes. That is followed by an AttributesToJson processor which will overwrite the flow file content with a JSON document containing attributes such as sftp.remote.host, sftp.remote.port, etc. (see the doc for AttributesToJson).

I think you may want a SplitText processor after your FetchSFTP processor, to create one flow file for each line in your file. Then you could have an ExtractText processor which could use a regex (with grouping) to create attributes such as column.1, column.2, etc. Then your ReplaceText can use those attributes.
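To illustrate the SplitText → ExtractText step, here is a sketch of the kind of grouping regex ExtractText could use on one bar-delimited line (Python, assuming five `|`-separated fields — the actual file layout may differ, and the sample values are made up):

```python
import re

# One line of the file after SplitText; the field values are hypothetical.
line = "alice|29|NY|engineer|2016-01-01"

# A grouping regex like this, set as a dynamic property named "column"
# in ExtractText, would yield attributes column.1 through column.5.
pattern = re.compile(r"^([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)$")

match = pattern.match(line)
if match:
    attributes = {f"column.{i}": v for i, v in enumerate(match.groups(), start=1)}
    print(attributes)
```

ReplaceText could then reference `${column.1}` … `${column.5}` when building the INSERT statement.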


12 Replies


Contributor

Hi @Matt Burgess, thanks a lot for responding. I updated my workflow with the specifics you suggested, but I still cannot insert into the MySQL DB; I am getting the same error as mentioned previously. I have also attached my configs below.

PutSQL ERROR: failed to update a database due to a failed batch update. There were a total of 1 flow file that failed, 0 that succeeded, and 0 that were not executed and will be rerouted to retry.

(i) Updated Workflow

7265-updatednifiwrkfl.jpg

(ii) SplitText Config

7266-splitxml.jpg

(iii) ExtractText Config

7268-extracttext.jpg

(iv) Both ReplaceText and PutSQL remain unchanged.

P.S.: What is the tiny number "1" that appears when processors are running?

Cheers

Contributor

Hi @Matt Burgess, could there be other reasons, such as the available RAM on this NiFi node? I am currently using a t2.medium in AWS.

Contributor

Hi All, @mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Pierre Villard, @Andrew Grande, I would appreciate it if you could assist me with the above. Please refer to the above for details and explanation. Thanks a lot!

Could you share the information you will find in the application log file? (./logs/nifi-app.log)

Contributor

Hi All,

@Pierre Villard, @mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Andrew Grande, @Matt Burgess, I have been able to insert into MySQL by setting "Obtain Generated Keys = true" in the PutSQL configuration (pics below), but the problem now is that an insane number of duplicates got ingested into the MySQL table (pics below). I would like to know what might be going on in my flow to cause this and how to fix it. Thanks a lot!!

(i) PutSQL Configuration

7322-newputsql.jpg

(ii) ExtractText

7323-newextracttext.jpg

(iii) Original data to be ingested into MySQL

7326-originaldata.jpg

(iv) Table count after the PutSQL NiFi ingest

7327-countfromflowfile.jpg

Are you sure you aren't ingesting the same file multiple times with List/FetchSFTP?

Contributor

@Pierre Villard, sorry for the late reply. How do I verify that I am not ingesting the same file multiple times with List/FetchSFTP?

Contributor

Hi All,

@Pierre Villard, @mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Andrew Grande, @Matt Burgess, could you please assist me with the above? I would greatly appreciate the help. I have explained everything thoroughly.

I am experiencing the same issue with the PutSQL processor. Regardless of how many properly formatted SQL INSERT statements I pass to the processor, it fails, saying that the entire batch failed because it was expecting N statements, where N is the number of SQL statements I passed in, e.g. 30, 50, or 100. It never seems to get past this point.

Expert Contributor

@David Kjerrumgaard Did you manage to get past this challenge? In my case, I get an "invalid number" error after I changed the Obtain Generated Keys property to true.

Expert Contributor

@David Kjerrumgaard I found out the issue was with my CSV file; it had to do with the column names I was ingesting. Setting the Obtain Generated Keys property to false also worked in my case.
