Divide file in nifi

Explorer

I have a text data file that needs to be split into parts.
Inside the file, the sections are delimited as follows:

  • # @ marks the beginning of a section with data,
  • # $ marks the start of the data block,
  • the next # marks the end of the section.

How can I do this in NiFi?

1 ACCEPTED SOLUTION

Re: Divide file in nifi

Expert Contributor

Is your data on HDFS? If so, you would use the GetHDFS processor to load your file into a FlowFile. If your data is on your local NiFi node, then you would use a GetFile processor to load the file.

Next, if you want to split by newline, you could use the SplitText processor to split your file into multiple FlowFiles. If you only want to split on your '#@' and '#$' markers, you can use the SplitContent processor. That processor splits the content on a byte sequence: set 'Byte Sequence Format' to 'Text' and enter '#@' as the 'Byte Sequence' to split on. I'm not sure exactly how you'd like to divide your data, but that should give you a starting point. You can chain several SplitContent processors together to split on multiple character sequences. Ultimately, your one file on disk will be converted into multiple FlowFiles in NiFi.

Take a look at the SplitContent processor for more info: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache...
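
To make the splitting idea concrete, here is a minimal Python sketch of what chaining two splits does to the content. It is not NiFi code and does not reproduce SplitContent exactly: the function `split_content`, the sample data, and the choice to drop the marker and any empty pieces are all assumptions made for illustration.

```python
from typing import List


def split_content(content: bytes, byte_sequence: bytes) -> List[bytes]:
    """Split content on byte_sequence, dropping the sequence itself.

    Pieces that contain only whitespace are discarded (an assumption of
    this sketch, not a statement about NiFi's behaviour).
    """
    if not byte_sequence:
        raise ValueError("byte_sequence must not be empty")
    return [piece for piece in content.split(byte_sequence) if piece.strip()]


# Hypothetical file layout, guessed from the question.
SAMPLE = (
    b"#@ section A\n"
    b"#$\n"
    b"1;2;3\n"
    b"#\n"
    b"#@ section B\n"
    b"#$\n"
    b"4;5;6\n"
    b"#\n"
)

# First split: one piece per section (like one FlowFile per section).
sections = split_content(SAMPLE, b"#@")

# Second, chained split: separate each section header from its data block.
blocks = [split_content(section, b"#$") for section in sections]

for header, data in blocks:
    print(repr(header), repr(data))
```

In this sketch the closing '#' line stays attached to the data block, so a further cleanup step would still be needed to drop it.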

4 REPLIES

Re: Divide file in nifi

Hey @Vladislav Shcherbakov!
You can try the ExtractText processor: add a property for each value that you want to extract, with a regular expression as the property's value.
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.5.0/org.apache...

Hope this helps
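
As a rough illustration of the regex approach, the sketch below pulls the header, the lines before the data block, and the data lines out of each section with a single pattern. It uses Python's `re` module, whose syntax differs slightly from the Java regular expressions NiFi uses (for example, named groups are written `(?P<name>...)` here). The pattern, the group names, and the sample layout are assumptions based on the question, not a tested ExtractText configuration.

```python
import re
from typing import Dict, List

# One section: "#@ <header>", optional lines, "#$", data lines, then a lone "#".
SECTION_PATTERN = re.compile(
    r"^#@[ \t]*(?P<header>[^\n]*)\n"  # section start marker and its header text
    r"(?P<meta>.*?)"                  # anything between the header and the data block
    r"^#\$[ \t]*\n"                   # start of the data block
    r"(?P<data>.*?)"                  # the data lines themselves
    r"^#[ \t]*$",                     # a line holding only '#' ends the section
    re.MULTILINE | re.DOTALL,
)


def extract_sections(text: str) -> List[Dict[str, object]]:
    """Return one dict per section found in text."""
    return [
        {
            "header": match.group("header").strip(),
            "meta": match.group("meta").splitlines(),
            "data": match.group("data").splitlines(),
        }
        for match in SECTION_PATTERN.finditer(text)
    ]


# Hypothetical sample content.
SAMPLE_TEXT = (
    "#@ section A\n"
    "unit=kg\n"
    "#$\n"
    "1;2;3\n"
    "4;5;6\n"
    "#\n"
    "#@ section B\n"
    "#$\n"
    "7;8;9\n"
    "#\n"
)

for section in extract_sections(SAMPLE_TEXT):
    print(section)
```

Each capture group here corresponds to one value you would want to pull out; in NiFi the equivalent multiline and dot-matches-newline behaviour would have to be enabled on the processor.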

Re: Divide file in nifi

Explorer

If you could show me an example, that would be really nice, because I don't quite understand how it works... Thanks!

Re: Divide file in nifi

Explorer

The files are on an FTP server. I fetch them with ListFTP and FetchFTP.
Then I use RouteText (I think) to filter them by name.
After that I need to parse the data and divide it into parts (I'll try SplitContent).
Finally, I load the data into an SQL server with PutDatabaseRecord.
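
For the parse-and-divide step, a line-by-line pass is another option. The sketch below turns each data line into one flat record tagged with its section, since a record-oriented step such as PutDatabaseRecord reads its input through a Record Reader and JSON is one format such a reader can handle. The function `section_rows`, the field names `section`, `row`, and `value`, and the sample layout are hypothetical; this is plain Python, not a NiFi script.

```python
import json
from typing import Dict, Iterable, Iterator, Optional, Union

Row = Dict[str, Union[str, int]]


def section_rows(lines: Iterable[str]) -> Iterator[Row]:
    """Yield one record per data line, tagged with its section header."""
    section: Optional[str] = None  # header of the section being read, if any
    in_data = False                # True once '#$' has been seen
    row = 0                        # 1-based index of the data line in its section
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("#@"):
            section, in_data, row = line[2:].strip(), False, 0
        elif section is None:
            continue  # ignore anything outside a section
        elif line.startswith("#$"):
            in_data = True
        elif line.strip() == "#":
            section, in_data = None, False  # end of the section
        elif in_data:
            row += 1
            yield {"section": section, "row": row, "value": line}
        # lines between '#@' and '#$' are skipped in this sketch


# Hypothetical sample content.
SAMPLE_LINES = [
    "#@ section A",
    "unit=kg",
    "#$",
    "1;2;3",
    "4;5;6",
    "#",
    "#@ section B",
    "#$",
    "7;8;9",
    "#",
]

rows = list(section_rows(SAMPLE_LINES))
print(json.dumps(rows, indent=2))
```

Splitting `value` into real columns would depend on the actual field separator in the files, which the thread does not state.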
