Created 06-19-2018 04:43 PM
I've got a data file that needs to be divided.
Inside the text file, the sections are divided:
How can I do it?
Created 06-20-2018 12:15 PM
Is your data on HDFS? If so, you would use the GetHDFS processor to load your file into a FlowFile. If your data is on your local NiFi node, then you would use a GetFile processor to load the file.
Next, if you want to split by newline, you could use the SplitText processor to split your file into multiple FlowFiles. If you only want to split on your '#@' and '#$' markers, you can use the SplitContent processor. That processor splits on a sequence of bytes; set 'Byte Sequence Format' to 'Text' and you can enter '#@' as the sequence to split on. I'm not sure exactly how you'd like to divide your data, but that should give you a starting point. You can chain several SplitContent processors together to split on multiple character sequences. Ultimately, your one file on disk will be converted into multiple FlowFiles in NiFi.
Take a look at the SplitContent processor for more info: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache...
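To make the chaining idea concrete, here is a rough Python sketch of what two SplitContent steps in a row do conceptually. It is not NiFi code: the function names and the sample data are made up for illustration (the actual file layout wasn't posted), and this sketch simply drops the delimiters and any empty pieces.

```python
from typing import List


def split_content(data: bytes, delimiter: bytes) -> List[bytes]:
    """Split data on a byte sequence, dropping the delimiter and blank pieces."""
    return [part for part in data.split(delimiter) if part.strip()]


def split_chain(data: bytes, delimiters: List[bytes]) -> List[bytes]:
    """Apply one split per delimiter, like chaining several split steps."""
    pieces = [data]
    for delimiter in delimiters:
        # Every piece produced so far is split again on the next delimiter.
        pieces = [part for piece in pieces for part in split_content(piece, delimiter)]
    return pieces


# Hypothetical sample content; the real file format is an assumption.
sample = b"#@header one\nrow a\n#$detail one\n#@header two\n#$detail two\n"
flowfiles = split_chain(sample, [b"#@", b"#$"])

for index, piece in enumerate(flowfiles):
    print(index, piece)
```

Each element of `flowfiles` stands in for one FlowFile coming out of the last split step.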
Created 06-19-2018 08:46 PM
Hey @Vladislav Shcherbakov!
You can try the ExtractText processor and add a property for each value that you want to get, using a regex.
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.5.0/org.apache...
Hope this helps
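As a rough illustration of the idea (plain Python, not NiFi itself): each named property holds a regex, and the text captured by its first group is stored under that name. The property names, patterns, and sample line below are all hypothetical, since the real data wasn't shown.

```python
import re
from typing import Dict


def extract_text(content: str, properties: Dict[str, str]) -> Dict[str, str]:
    """Return {property name: first capture group} for every regex that matches."""
    attributes = {}
    for name, pattern in properties.items():
        match = re.search(pattern, content)
        if match:
            attributes[name] = match.group(1)
    return attributes


# Hypothetical content and property definitions, for illustration only.
sample = "id=42;name=sensor-7;status=OK"
properties = {
    "record.id": r"id=(\d+)",
    "record.name": r"name=([^;]+)",
}
attributes = extract_text(sample, properties)
print(attributes)
```

A pattern that doesn't match simply produces no entry in this sketch.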
Created 06-20-2018 10:32 AM
If you could show me an example, it would be really nice, because I don't quite understand how it works... Thanks!
Created 06-20-2018 12:26 PM
The files are on an FTP server. I get them with ListFTP and FetchFTP.
Then I use RouteText (I think) to filter them by name.
After that I need to parse the data and divide it into parts (with SplitContent, I'll try).
Finally, I load the data into SQL Server with PutDatabaseRecord.