
How can I enrich flow file attributes with data read from a file?


New Contributor

I've got to watch a directory that includes subdirectories. Every file I receive in any subdirectory needs to be processed, but the details depend on the subdirectory and the filename: for some subdirectories I have to send an email, for others I have to execute one or more commands, store the file at a specific location, etc. Multiple actions per flow file are possible as well.

I'm currently stuck on setting the attributes of the flowfile.

My current plan is as follows:

  1. A GetFile processor watches directory A and creates flow file B
  2. An InvokeScriptedProcessor uses attributes like B.filename and B.path to read and parse file C from the filesystem, then sets attributes on flow file B, such as the sftp server, email details (addresses, subject and body) and the commands to execute
  3. RouteOnAttribute to the sftp workflow
  4. RouteOnAttribute to the email workflow
  5. RouteOnAttribute to the command execution workflow

Is there a sane alternative to writing my own processor for task 2? The data file format of file C is pretty much up to me to define.
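
Since the format of file C is open, one possible shape is a JSON list of rules, each naming a subdirectory, a filename glob, and the attributes to set. The sketch below is standalone Python, not NiFi API code; the rule layout, the attribute names (`email.to`, `sftp.server`, `command`) and the `lookup_attributes` helper are all hypothetical, chosen only to illustrate the lookup that step 2 would perform.

```python
import fnmatch
import json

# Hypothetical contents of file C: one rule per subdirectory/filename pattern.
RULES_JSON = """
[
  {"path": "invoices/", "pattern": "*.pdf",
   "attributes": {"email.to": "billing@example.com",
                  "sftp.server": "sftp.example.com"}},
  {"path": "reports/", "pattern": "*.csv",
   "attributes": {"command": "import-report"}}
]
"""


def lookup_attributes(rules, path, filename):
    """Merge the attributes of every rule matching the given path and filename."""
    merged = {}
    for rule in rules:
        if rule["path"] == path and fnmatch.fnmatch(filename, rule["pattern"]):
            merged.update(rule["attributes"])
    return merged


rules = json.loads(RULES_JSON)
print(lookup_attributes(rules, "invoices/", "june.pdf"))
print(lookup_attributes(rules, "reports/", "june.pdf"))
```

Because every matching rule is merged, a single file can pick up attributes for several actions at once, which the routing steps could then test for.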

Accepted Solutions

Re: How can I enrich flow file attributes with data read from a file?

Master Guru

@Daniel Frank

What format is your data in? (text?)

Is all the information you need in the content of these files?

The GetFile processor already writes attributes for the following on every FlowFile it creates:

[Screenshot: list of attributes written by GetFile]

You could use the ExtractText processor to read the FlowFile content and extract the parts you need into FlowFile attributes.

Thanks,

Matt
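
To illustrate the ExtractText suggestion: the processor applies regular expressions to the FlowFile content and stores what they capture as attributes. The following standalone Python sketch mimics that idea with the `re` module. It is not NiFi code, and the sample content, the patterns and the `extract_attributes` helper are assumptions made for illustration.

```python
import re

# Hypothetical text content of a FlowFile.
CONTENT = "recipient: ops@example.com\nsubject: Nightly report\n"

# Attribute name -> regex with one capture group (hypothetical).
PATTERNS = {
    "email.to": r"^recipient:\s*(.+)$",
    "email.subject": r"^subject:\s*(.+)$",
}


def extract_attributes(content, patterns):
    """Return an attribute for each pattern whose capture group matches the content."""
    attributes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, content, flags=re.MULTILINE)
        if match:
            attributes[name] = match.group(1).strip()
    return attributes


print(extract_attributes(CONTENT, PATTERNS))
```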

Re: How can I enrich flow file attributes with data read from a file?

New Contributor

My description was unclear about an important point; I have updated it.

Using the filename and path attributes of the flow file, I have to parse a file in the filesystem (or maybe query a database) to figure out what exactly needs to be done to the flow file contents.

Re: How can I enrich flow file attributes with data read from a file?

Master Guru

@Daniel Frank

Unless you use @Matt Clarke in your response, I do not get an email notification.

I am not following how you use the filename and path to file (B) to parse a totally different file (C) from the filesystem.

Have you looked at the FetchFile processor? It accepts a FlowFile as input and uses attributes set on the incoming FlowFile to specify what file to fetch and from where.

So you could use GetFile to pick up file (B), extract what you need from file (B) into attributes, and have FetchFile use those attributes to get file (C). FetchFile will stream the content of file (C) into the FlowFile originally belonging to file (B); however, the resulting FlowFile will retain all the FlowFile attributes that already existed on FlowFile (B).

Thanks,

Matt

If you found this answer addressed your question, please mark as accepted to close out this thread in the community.
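
The behaviour described in this reply (content replaced, attributes kept) can be modelled in a few lines of Python. This is a toy model, not NiFi code: the `FlowFile` dataclass and the `fetch_file` function are hypothetical stand-ins, and the attribute name `config.path` is an assumption.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a FlowFile: a content payload plus string attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def fetch_file(flowfile, path_attribute):
    """Replace the content with the file named by an attribute; keep all attributes."""
    with open(flowfile.attributes[path_attribute], "rb") as handle:
        new_content = handle.read()
    return FlowFile(content=new_content, attributes=dict(flowfile.attributes))


# Write a hypothetical file C to a temporary directory so the sketch is runnable.
with tempfile.TemporaryDirectory() as tmp:
    c_path = os.path.join(tmp, "c.conf")
    with open(c_path, "wb") as handle:
        handle.write(b"sftp.server=sftp.example.com\n")

    b = FlowFile(
        content=b"payload of B",
        attributes={"filename": "b.txt", "config.path": c_path},
    )
    fetched = fetch_file(b, "config.path")

print(fetched.content)
print(fetched.attributes["filename"])
```

In this model the returned FlowFile no longer carries the original content of B, so anything needed from it has to be captured in attributes before the fetch.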

Re: How can I enrich flow file attributes with data read from a file?

New Contributor

@Matt Clarke

Thanks, that answers my question. I have to move the contents of flow file B to its attributes, then fetch file C and parse it with more processors to set more attributes on the flow file, and finally restore the contents of the flow file from what I saved to the attributes.

Sounds like I'll go with a scripted processor instead; that will save me a lot of headache :)
