NiFi ExecuteScript Processor: error using string in python with special characters

Hi,

I am trying to use the ExecuteScript processor with Python to convert strings containing special characters such as é. I use latin-1 for the encoding. The script is:

text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
....
outputStream.write(bytearray(out.encode('latin-1')))

With this I get the following error:

org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: TypeError: write(): 1st arg can't be coerced to int, byte[] in <script> at line number 33

If I loop over the bytearray:

text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
....
out = bytearray(out.encode('latin-1'))
for o in out :
   outputStream.write(o)

I don't get this error.

Thanks for your help

Chris

1 ACCEPTED SOLUTION

Contributor

Hi @chris herssens,

I think it should work without the bytearray() wrapper in outputStream.write().

In the Jython REPL (I just downloaded the Jython jar and ran the following command):

java -jar jython-standalone-2.7.0.jar

I was able to write a Latin-1-encoded Python string to an output stream and use the result to construct a Java string that matched the input:

>>> from java.io import ByteArrayOutputStream
>>> os = ByteArrayOutputStream()
>>> text = u'abcdé'
>>> os.write(text.encode('latin-1'))
>>> from java.lang import String
>>> String(os.toByteArray(), 'ISO-8859-1')
abcdé
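
For anyone without a Jython jar at hand, here is a rough plain CPython 3 analogue of the same round trip. It is only a sketch: io.BytesIO stands in for the Java ByteArrayOutputStream, and the variable names are made up for illustration. It does not reproduce Jython's Java type coercion, only the encode/decode step.

```python
import io

text = u'abcdé'

# encode() produces one byte per character in Latin-1 (é becomes 0xE9)
encoded = text.encode('latin-1')

# io.BytesIO is a stand-in for java.io.ByteArrayOutputStream
out = io.BytesIO()
out.write(encoded)

# Decoding the written bytes with the same charset recovers the input
round_tripped = out.getvalue().decode('latin-1')
print(round_tripped == text)
```

The point is the same as in the REPL session: the encoded string can be handed to write() directly, and decoding with the matching charset gives back the original text.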

Super Guru

Correct, encode() already returns a byte string, so the bytearray() wrapper is not needed.

Explorer

Hi, I tried this example, but I don't understand how to use it in the Python ExecuteScript processor.

Could you share an example of a NiFi Python script, or help me change my code?

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
import os

class ModJSON(StreamCallback):
  def __init__(self):
        pass
  def process(self, inputStream, outputStream):
    text = u'abcdé'
    outputStream.write(text)

flowFile = session.get()
if flowFile is not None:
  flowFile = session.write(flowFile, ModJSON())
  flowFile = session.putAttribute(flowFile, "filename","test")
  # Transfer only when a flow file was actually received
  session.transfer(flowFile, REL_SUCCESS)
session.commit()

Hello,

In the "process" method you can, for instance, read data from the input stream, change it, and write the result to the output stream.

See https://community.hortonworks.com/content/kbentry/75032/executescript-cookbook-part-1.html for more information on ExecuteScript.

 text = IOUtils.toString(inputStream, IOUtils.toString(inputStream,StandardCharsets.ISO_8859_1)
...
 outputStream.write(text)

Explorer

I get this error: "java.nio.charset.IllegalCharsetNameException: test ²éÃ"

with this input file: "test ²éà"

and this code:

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
from java.io import ByteArrayOutputStream
from java.lang import String
import os


class ModJSON(StreamCallback):
  def __init__(self):
        pass
  def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, IOUtils.toString(inputStream,StandardCharsets.ISO_8859_1))
        outputStream.write(text)
flowFile = session.get()
if flowFile is not None:
  flowFile = session.write(flowFile, ModJSON())
  flowFile = session.putAttribute(flowFile, "filename", '_translated.json')
  # Transfer only when a flow file was actually received
  session.transfer(flowFile, REL_SUCCESS)
session.commit()



Super Guru

There are too many IOUtils.toString() calls there. The inner call returns the file content, which the outer call then treats as a charset name. The "text" line should read:

text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)

Explorer

Thank you, it works perfectly!

Can you replace

outputStream.write(text)

with

outputStream.write(text.encode('latin-1'))
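
Putting the pieces of this thread together, the read / modify / write pattern could be sketched as follows. This is plain CPython 3 and not a NiFi script: io.BytesIO objects stand in for the Java input and output streams, the class name Latin1Rewrite and the replace() step are hypothetical, and in ExecuteScript the read would be the IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1) call shown above.

```python
import io


class Latin1Rewrite(object):
    """Hypothetical stand-in for a StreamCallback; not the NiFi API."""

    def process(self, input_stream, output_stream):
        # Decode exactly once, with the charset the file was written in
        text = input_stream.read().decode('latin-1')

        # Placeholder modification; any real transformation goes here
        modified = text.replace(u'test', u'checked')

        # Encode explicitly before writing so the stream receives bytes
        output_stream.write(modified.encode('latin-1'))


# Usage with in-memory streams instead of a flow file
source = io.BytesIO(u'test ²éà'.encode('latin-1'))
sink = io.BytesIO()
Latin1Rewrite().process(source, sink)
result = sink.getvalue()
print(result.decode('latin-1'))
```

Using the same charset for decoding and encoding keeps characters such as ², é and à at one byte each; mixing charsets between the two steps is what usually garbles them.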