Hi,
I got this problem
I need to do some cleaning within columns in my flowfile. the columns are seperated by tabs, sometimes there can be \n within the columns so i cannot use splitlines to access the data, so i tried to use the csv library to read the inputstream.
But i get an error :
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
Caused by: javax.script.ScriptException: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Cannot create PyString with non-byte value in <script> at line number 21
Here are my script which i am running in the ExecuteScript body
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
import csv
# Define a subclass of StreamCallback for use in session.write()
class PyStreamCallback(StreamCallback):
def __init__(self):
pass
def process(self, inputStream, outputStream):
newText =''
Text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
reader = csv.reader(Text,delimiter=' ')
for row in reader:
newText+=(' '.join(row)).rstrip('\n\r')
outputStream.write(newText.encode('utf-8'))
# end class
flowFile = session.get()
if(flowFile != None):
flowFile = session.write(flowFile, PyStreamCallback())
session.transfer(flowFile, REL_SUCCESS)
# implicit return at the end