Created 08-08-2018 08:36 PM
I found Matt's cookbooks and I'm following the recipe for overwriting a FlowFile. It seems very simple and straightforward and I'm not sure what I'm missing.
My code is supposed to read the PDF from the FlowFile, use PDFBox to extract the first and last name from the form (it's an I9), and then write the results as JSON, which gets sent to REL_SUCCESS. Instead, it just sends the original PDF to REL_SUCCESS. I'm not sure whether the file is never being read (leaving nothing to write out), whether I'm writing the output incorrectly, or something else.
import java.nio.charset.StandardCharsets
import org.apache.pdfbox.io.IOUtils
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.util.PDFTextStripperByArea
import java.awt.Rectangle
import org.apache.pdfbox.pdmodel.PDPage
import com.google.gson.Gson
import java.nio.charset.StandardCharsets

def flowFile = session.get()
flowFile = session.write(flowFile, { inputStream, outputStream ->
    try {
        //Load Flowfile contents
        PDDocument document = PDDocument.load(inputStream)
        PDFTextStripperByArea stripper = new PDFTextStripperByArea()

        //Get the first page
        List<PDPage> allPages = document.getDocumentCatalog().getAllPages()
        PDPage page = allPages.get(0)
    } catch (Exception e){
        System.out.println(e.getMessage())
        session.transfer(flowFile, REL_FAILURE)
    }

    //Define the areas to search and add them as search regions
    stripper = new PDFTextStripperByArea()
    Rectangle lname = new Rectangle(25, 226, 240, 15)
    stripper.addRegion("lname", lname)
    Rectangle fname = new Rectangle(276, 226, 240, 15)
    stripper.addRegion("fname", fname)

    //Load the results into a JSON
    def boxMap = [:]
    stripper.setSortByPosition(true)
    stripper.extractRegions(page)
    regions = stripper.getRegions()
    for (String region : regions) {
        String box = stripper.getTextForRegion(region)
        boxMap.put(region, box)
    }
    Gson gson = new Gson()

    //Remove random noise from the output
    json = gson.toJson(boxMap, LinkedHashMap.class)
    json = json.replace('\\n', '')
    json = json.replace('\\r', '')
    json = json.replace(',"', ',\n"')

    //Overwrite flowfile contents with JSON
    outputStream.write(json.getBytes(StandardCharsets.UTF_8))
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)
Help appreciated!
Created 08-09-2018 05:09 PM
This issue was caused by me not using try/catch properly. The variables declared inside the try block (the document and the page) weren't visible to the rest of my code outside the try/catch, so the script was returning the original PDF.
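For anyone who lands here with the same symptom, here is a rough sketch of how the script could be restructured so that everything that depends on the loaded document stays in one scope. It is not the exact code I ended up with. It assumes PDFBox 1.8.x (the same API the question's imports point to), Gson on the module path, and the standard ExecuteScript bindings session, log, REL_SUCCESS and REL_FAILURE. The trim() call replaces the string clean-up from the original script and is my own simplification.

```groovy
import java.awt.Rectangle
import java.nio.charset.StandardCharsets

import com.google.gson.Gson
import org.apache.nifi.processor.io.StreamCallback
import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.pdmodel.PDPage
import org.apache.pdfbox.util.PDFTextStripperByArea

def flowFile = session.get()
if (!flowFile) return

try {
    flowFile = session.write(flowFile, { inputStream, outputStream ->
        // Everything that uses the document lives in the same scope as the load
        PDDocument document = PDDocument.load(inputStream)
        try {
            // Get the first page (PDFBox 1.8.x API)
            List<PDPage> allPages = document.getDocumentCatalog().getAllPages()
            PDPage page = allPages.get(0)

            // Define the areas to search and add them as search regions
            PDFTextStripperByArea stripper = new PDFTextStripperByArea()
            stripper.setSortByPosition(true)
            stripper.addRegion('lname', new Rectangle(25, 226, 240, 15))
            stripper.addRegion('fname', new Rectangle(276, 226, 240, 15))
            stripper.extractRegions(page)

            // Collect the text of each region; trim() is an assumed simplification
            def boxMap = [:]
            for (String region : stripper.getRegions()) {
                boxMap.put(region, stripper.getTextForRegion(region).trim())
            }

            // Overwrite the FlowFile contents with the JSON
            String json = new Gson().toJson(boxMap)
            outputStream.write(json.getBytes(StandardCharsets.UTF_8))
        } finally {
            document.close()
        }
    } as StreamCallback)
    session.transfer(flowFile, REL_SUCCESS)
} catch (Exception e) {
    // Route to failure outside the callback instead of swallowing the error inside it
    log.error('Failed to extract names from PDF', e)
    session.transfer(flowFile, REL_FAILURE)
}
```

The main design choice is that the callback no longer catches exceptions itself. Any error propagates out of session.write, and the transfer to REL_FAILURE happens once, outside the callback.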