
How to read content of FlowFile

New Contributor

I would like to know how I can retrieve the content of my FlowFile.

First, I have a Kafka consumer processor that consumes messages from a Kafka cluster and creates a FlowFile for each message. The flow then passes to an ExecuteScript processor, where I need to read the message from the FlowFile so that I can create a new FlowFile attribute from it and send the FlowFile to the next processor.


I can view the message by clicking List Queue --> View, which shows the content of the FlowFile (the actual Kafka message). But I don't know how to read the content of the FlowFile in the ExecuteScript processor.

 

Please suggest. 

 

Thank You, 

2 REPLIES

Contributor

Hi, you haven't mentioned which scripting language you plan to use. Assuming you are using Groovy, here is a sample script that should work:

 

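import java.nio.charset.StandardCharsets
import org.apache.nifi.processor.io.StreamCallback

def flowFile = session.get()
if (!flowFile) return

flowFile = session.write(flowFile, { inputStream, outputStream ->
    BufferedReader br = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))

    // Existing attributes can be read at any point
    String dummy = flowFile.getAttribute('dummy')
    // Only needed if you want to process the content line by line
    br.eachLine { line ->
        // your logic here
        // Write each line back; anything not written to outputStream is dropped from the content
        outputStream.write((line + '\n').getBytes(StandardCharsets.UTF_8))
    }
} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)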
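
If the goal is only to read the content and copy it into an attribute, without rewriting the content, a read-only callback may be a closer fit. The following is a minimal sketch along the lines of the ExecuteScript Cookbook, not a definitive implementation. It assumes the Kafka message is small UTF-8 text (attributes are held in memory, so large payloads are a poor fit), and the attribute name 'kafka.message' is a hypothetical choice for illustration.

```groovy
import java.nio.charset.StandardCharsets
import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.InputStreamCallback

def flowFile = session.get()
if (!flowFile) return

// Read the whole content into a String; the content itself is left unchanged
def text = ''
session.read(flowFile, { inputStream ->
    text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
} as InputStreamCallback)

// 'kafka.message' is a hypothetical attribute name; pick whatever suits your flow
flowFile = session.putAttribute(flowFile, 'kafka.message', text)

session.transfer(flowFile, REL_SUCCESS)
```

Note that session.putAttribute returns an updated FlowFile reference, which is why the result is assigned back to flowFile before the transfer.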

 

Hope this helps. If this comment helps you find a solution or move forward, please accept it as a solution for the benefit of other community members.

Super Guru

@DanMcCray1 Once you have the content from Kafka as a FlowFile, your options are not limited to ExecuteScript. Depending on the type of content, you can use the following ideas:

 

  1. EvaluateJsonPath - if the content is a single JSON object and you need one or more values from it, this is an easy way to get those values into attributes.
  2. ExtractText - if the content is text or some raw format, ExtractText lets you match regular expressions against the content and put the captured values into attributes.
  3. QueryRecord with a Record Reader and Record Writer - this is the most recommended method. Assuming your data has structure (text, CSV, JSON, etc.) and/or multiple rows/objects, you can define a reader with a schema and an output format (record writer), and query the results very effectively.
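
For comparison, here is a rough ExecuteScript (Groovy) sketch of what option 1 does, in case the JSON case ends up being handled in a script anyway. It is only an illustration under assumptions: the content is taken to be a single UTF-8 JSON object, and the field name 'id' and attribute name 'message.id' are hypothetical.

```groovy
import groovy.json.JsonSlurper
import org.apache.nifi.processor.io.InputStreamCallback

def flowFile = session.get()
if (!flowFile) return

// Parse the content as JSON; the content itself is left unchanged
def parsed = null
session.read(flowFile, { inputStream ->
    parsed = new JsonSlurper().parse(inputStream, 'UTF-8')
} as InputStreamCallback)

// 'id' is a hypothetical field name used only for illustration
def id = (parsed instanceof Map) ? parsed.id : null

if (id != null) {
    // 'message.id' is a hypothetical attribute name
    flowFile = session.putAttribute(flowFile, 'message.id', id.toString())
    session.transfer(flowFile, REL_SUCCESS)
} else {
    // Not a JSON object, or the field is missing
    session.transfer(flowFile, REL_FAILURE)
}
```

The dedicated processors above usually need no code at all, so a script like this is mainly worthwhile when extra logic has to run alongside the extraction.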

 

If you do want to work with ExecuteScript, you should start here:

 

https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922

https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-2/ta-p/249018

https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-3/ta-p/249148

 

 

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

 

Thanks,


Steven @ DFHZ