NiFi ExecuteScript Processor: error using string in python with special characters

New Member

Hi,

I am trying to use the ExecuteScript processor with Python to convert strings that contain special characters such as é. I use latin-1 as the encoding. The script is:

text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
....
outputStream.write(bytearray(out.encode('latin-1')))

With this I get the following error:

org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: TypeError: write(): 1st arg can't be coerced to int, byte[] in <script> at line number 33

If I loop over the bytearray and write one byte at a time:

text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
....
out = bytearray(out.encode('latin-1'))
for o in out :
   outputStream.write(o)

I don't get this error.

Thanks for your help

Chris

1 ACCEPTED SOLUTION

Rising Star

Hi @chris herssens,

I think it should work without the bytearray in the outputStream.write()

I tried it in the Jython REPL (I just downloaded the Jython jar and ran the following command):

java -jar jython-standalone-2.7.0.jar

I was able to write a Python string encoded as latin-1 to an output stream, then use the result to construct a Java string that matched the input:

>>> from java.io import ByteArrayOutputStream
>>> os = ByteArrayOutputStream()
>>> text = u'abcdé'
>>> os.write(text.encode('latin-1'))
>>> from java.lang import String
>>> String(os.toByteArray(), 'ISO-8859-1')
abcdé
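
As a rough illustration of the same idea outside NiFi, here is a minimal plain-Python sketch. It assumes io.BytesIO as a stand-in for the Java output stream (my substitution, not part of the REPL session above). It also shows that iterating over a bytearray yields integers, which is probably why the per-byte loop in the question matched the write(int) overload while the bytearray itself matched neither write(int) nor write(byte[]).

```python
import io

text = u'abcd\u00e9'              # 'abcd' followed by e-acute

# encode() already returns a byte string, one byte per character in Latin-1.
encoded = text.encode('latin-1')

# io.BytesIO is only a stand-in for the Java OutputStream (assumption).
stream = io.BytesIO()
stream.write(encoded)

# Decoding the written bytes with the same charset recovers the input.
round_trip = stream.getvalue().decode('latin-1')

# Iterating over a bytearray yields integers, not bytes objects.
byte_values = list(bytearray(encoded))
```

In the ExecuteScript case the equivalent call would be outputStream.write(out.encode('latin-1')), with no bytearray() around it.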


8 REPLIES 8

Master Guru

Correct, the encode() method already returns a byte string, which Jython can pass to write() as a byte[], so the bytearray() wrapper is not needed.

New Member

Hi, I tried this example, but I don't understand how to use it in a Python ExecuteScript processor.

Could you share an example of a NiFi Python script, or help me change my code?

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
import os

class ModJSON(StreamCallback):
  def __init__(self):
        pass
  def process(self, inputStream, outputStream):
    text = u'abcdé'
    outputStream.write(text)

flowFile = session.get()
if (flowFile != None):
  flowFile = session.write(flowFile, ModJSON())
  flowFile = session.putAttribute(flowFile, "filename","test")
session.transfer(flowFile, REL_SUCCESS)
session.commit()

New Member

Hello,

In the "process" method you can, for instance, read data from the input stream, change it, and write the result to the output stream.

See https://community.hortonworks.com/content/kbentry/75032/executescript-cookbook-part-1.html for more information on ExecuteScript.

 text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
...
 outputStream.write(text)

New Member

I get this error: " java.nio.charset.IllegalCharsetNameException: test ²éÃ"

with this input file: "test ²éà"

and this code:

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
from java.io import ByteArrayOutputStream
from java.lang import String
import os


class ModJSON(StreamCallback):
  def __init__(self):
        pass
  def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, IOUtils.toString(inputStream,StandardCharsets.ISO_8859_1))
        outputStream.write(text)
flowFile = session.get()
if (flowFile != None):
  flowFile = session.write(flowFile, ModJSON())
  flowFile = session.putAttribute(flowFile, "filename", '_translated.json')
session.transfer(flowFile, REL_SUCCESS)
session.commit()



Master Guru

There are too many IOUtils.toString() calls there; the "text" line should read:

text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
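
To illustrate why the nested call most likely fails: the second argument is where the charset goes, so the inner call's result (the file content) ends up being treated as a charset name. The plain-Python sketch below is only an analogue of that mistake, not NiFi or Commons IO code; the names raw and content_as_charset are hypothetical, and Python raises LookupError where Java raises IllegalCharsetNameException.

```python
# Hypothetical stand-in for the flow file content (not a NiFi object).
raw = u'test content'.encode('latin-1')

# Mistake: the decoded content is passed where the charset name belongs.
content_as_charset = raw.decode('latin-1')
try:
    raw.decode(content_as_charset)
    error = None
except LookupError as exc:
    # Python's analogue of Java's IllegalCharsetNameException.
    error = str(exc)

# Fix: pass the charset name itself as the second argument.
text = raw.decode('latin-1')
```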

New Member

Thank you, it works perfectly!

New Member

Can you replace

outputStream.write(text)

with

outputStream.write(text.encode('latin-1'))
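
The read, modify, write pattern discussed in this thread can be sketched in plain Python as below, keeping the charset explicit on both sides. This is a sketch under assumptions, not the processor script itself: transform() and reencode() are hypothetical names, and the upper-casing is just a placeholder for whatever the real script does to the text.

```python
def transform(text):
    # Hypothetical placeholder for the real modification step.
    return text.upper()


def reencode(raw, charset='latin-1'):
    # Decode the incoming bytes with an explicit charset.
    text = raw.decode(charset)
    out = transform(text)
    # Encode with the same charset before handing bytes to the stream.
    return out.encode(charset)


result = reencode(u'abcd\u00e9'.encode('latin-1'))
```

Inside process() the same shape would presumably be: read with IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1), modify the text, then call outputStream.write(out.encode('latin-1')).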