Created on 09-01-2016 05:29 PM - edited 08-19-2019 02:21 AM
Hi community,
I would like to use NiFi to ingest a file from SFTP and insert its data into a MySQL database. So far I have been unsuccessful and would appreciate any pointers in the right direction. Thanks in advance for the help; the specifics are below.
(1) List the files on the SFTP server
(2) Select a particular file or files
(3) Fetch the "desired file"
(4) Ingest the "desired file" into the database
Below is more detail.
(i) Sample content of the SFTP file to be ingested through the NiFi SFTP processors and later inserted into MySQL
(ii) Currently designed NiFi workflow
(iii) Errors I am getting with this workflow
ConvertJSONToSQL ERROR: failed to parse standardflowfilerecord due to processor exception as JSON unexpected character (''' (code39)) expected a valid value (number,string,array,object,'true','false' or 'null')
PutSQL ERROR: failed to update a database due to a failed batch update.There were a total of 1 flow file that failed, 0 that succeeded and 0 that were not excuted and will be rerouted to retry.
(iv) Processor Configs
1. ReplaceText
sql statement: INSERT INTO NiFiUsecase001 (Column1, Column2, Column3, Column4, Column5) VALUES ('${Column1}', '${Column2}','${Column3}','${Column4}','${Column5}')
2. ConvertJSONToSQL
3. PutSQL
Created 09-01-2016 06:21 PM
After your FetchSFTP, the bar-delimited content will be in the content of the flow file, not the attributes. That is followed by an AttributesToJSON processor, which will overwrite the flow file content with a JSON document containing attributes such as sftp.remote.host, sftp.remote.port, etc. (see the doc for AttributesToJSON).
I think you may want a SplitText processor after your FetchSFTP processor, to create one flow file for each line in your file. Then you could have an ExtractText processor which could use a regex (with grouping) to create attributes such as column.1, column.2, etc. Then your ReplaceText can use those attributes.
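The split / extract / replace sequence described above can be sketched outside NiFi as follows. This is only an illustration of the logic, not NiFi code: the sample data and the function names are hypothetical (the actual file content was posted as an image), and the five-column layout is assumed from the INSERT statement in the question.

```python
import re

# Hypothetical sample; stands in for the bar-delimited file fetched over SFTP.
SAMPLE = "a1|b1|c1|d1|e1\na2|b2|c2|d2|e2\n"

# One capture group per bar-delimited field (five columns assumed).
LINE_RE = re.compile(r"^([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)\|([^|]*)$")


def split_lines(content):
    """Rough analogue of SplitText: one record per non-empty line."""
    return [line for line in content.splitlines() if line.strip()]


def extract_columns(line):
    """Rough analogue of ExtractText: capture groups become column.1 .. column.5."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"expected five bar-delimited fields: {line!r}")
    return {f"column.{i}": value for i, value in enumerate(match.groups(), start=1)}


def build_insert(attrs):
    """Rough analogue of ReplaceText: build the SQL text from the attributes."""
    # Single quotes are doubled so a value cannot terminate the SQL string literal.
    values = ", ".join(
        "'" + attrs[f"column.{i}"].replace("'", "''") + "'" for i in range(1, 6)
    )
    return (
        "INSERT INTO NiFiUsecase001 "
        "(Column1, Column2, Column3, Column4, Column5) "
        f"VALUES ({values})"
    )


statements = [build_insert(extract_columns(line)) for line in split_lines(SAMPLE)]
for statement in statements:
    print(statement)
```

Note that the attribute names used in ReplaceText have to match what ExtractText produces (column.1, column.2, and so on in this sketch), otherwise the placeholders resolve to empty strings.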
Created on 09-01-2016 09:12 PM - edited 08-19-2019 02:20 AM
Hi @Matt Burgess, thanks a lot for responding. I updated my workflow with the changes you suggested, but I still cannot insert into the MySQL database; I am getting the same PutSQL error as before. I have attached my configs as well.
PutSQL ERROR: failed to update a database due to a failed batch update.There were a total of 1 flow file that failed, 0 that succeeded and 0 that were not excuted and will be rerouted to retry.
(i) Updated Workflow
(ii) SplitText Config
(iii) ExtractText Config
(iv) Both ReplaceText and PutSQL remain unchanged.
P.S. What is the tiny number "1" that appears on processors while they are running?
Cheers
Created 09-02-2016 04:41 PM
Hi @Matt Burgess, could there be other reasons, such as the available RAM on this NiFi instance? I am currently using a t2.medium in AWS.
Created 09-02-2016 05:27 PM
Hi All, @mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Pierre Villard, @Andrew Grande, I would appreciate it if you could assist me with the above. Please refer to the earlier posts for details and explanation. Thanks a lot!
Created 09-02-2016 06:15 PM
Could you share the information you will find in the application log file? (./logs/nifi-app.log)
Created on 09-02-2016 09:21 PM - edited 08-19-2019 02:20 AM
Hi All,
@Pierre Villard, @mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Andrew Grande, @Matt Burgess, I have been able to insert into MySQL by setting "Obtain Generated Keys" to true in the PutSQL configuration (pics below). The problem now is that a huge number of duplicate rows got ingested into the MySQL table (pics below). I would like to know what in my flow might be causing this and how to fix it. Thanks a lot!
(i) PutSQL Configuration
(ii) ExtractText
(iii) Original data to be ingested into MySQL
(iv) Table count after PutSQL NiFi ingest
Created 09-07-2016 06:32 PM
Are you sure that you are not ingesting the same file multiple times with List/FetchSFTP?
Created 09-21-2016 01:35 AM
@Pierre Villard, sorry for the late reply. How do I verify that I am not ingesting the same file multiple times with List/FetchSFTP?
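One rough way to approach this check from the data side, sketched here as a heuristic rather than a NiFi feature: compare how often each source row appears in the table. If every row is repeated the same whole number of times, the whole file was probably processed that many times; uneven counts point to something else in the flow. The function name and the sample rows below are hypothetical, and the rows are assumed to have already been read from the source file and from the table.

```python
from collections import Counter


def duplication_report(source_rows, table_rows):
    """Return, per distinct source row, how many times it appears in the
    table relative to how many times it appears in the source file."""
    source_counts = Counter(source_rows)
    table_counts = Counter(table_rows)
    return {row: table_counts[row] / count for row, count in source_counts.items()}


# Hypothetical data: two source rows, each present three times in the table.
source = [("a1", "b1"), ("a2", "b2")]
table = source * 3

report = duplication_report(source, table)
for row, factor in report.items():
    print(row, factor)
```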
Created 09-07-2016 05:35 PM
Hi All,
@Pierre Villard, @mclark, @Bryan Bende, @Brandon Wilson, @jfrazee, @Andrew Grande, @Matt Burgess, could you please assist me with the above? I would greatly appreciate the help. I have explained everything in detail in the earlier posts.