ExecuteScript processor doesn't support UTF-8 encoding

I have a Python script that parses JSON containing Arabic words, but it doesn't seem to support UTF-8 encoding.

I get the error below.

My script:

import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class ModJSON(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        obj = json.loads(text)
        insertquery = "insert into Tweets_test values ('"+str(obj['id'])+"','"+obj['text'].encode('utf-8')+"','"+str(obj['id_str'])+"');"
        outputStream.write(bytearray(insertquery))

flowFile = session.get()
if (flowFile != None):
    flowFile = session.write(flowFile, ModJSON())
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()

16504-excutescript.png

Re: ExecuteScript processor doesn't support UTF-8 encoding

Super Guru

I use bytearray() in my examples, but I haven't been able to figure out when you need it and when you don't. I suspect it might be when the type is 'unicode' or 'java.lang.String' instead of Jython's 'str' type. The following two lines worked for me:

insertquery = "insert into Tweets_test values ('"+str(obj['id'])+"','"+obj['text']+"','"+str(obj['id_str'])+"');"
outputStream.write(insertquery)

This page says that a Jython String will be coerced to byte[] when necessary, and that seems to be what's going on above.
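
If relying on that implicit coercion is a concern, one option is to encode the statement to UTF-8 yourself before writing it. The sketch below is plain Python with no NiFi or Java classes, so it can be run outside the processor. `build_insert_statement` and the sample tweet are hypothetical names made up for illustration, and the behaviour inside the Jython engine of ExecuteScript is assumed here rather than verified.

```python
# -*- coding: utf-8 -*-
import json


def build_insert_statement(json_text):
    """Hypothetical helper: parse a tweet and return the INSERT statement as UTF-8 bytes."""
    obj = json.loads(json_text)
    # Keep everything as text while assembling the statement ...
    query = u"insert into Tweets_test values ('{0}','{1}','{2}');".format(
        obj['id'], obj['text'], obj['id_str'])
    # ... and encode exactly once, at the point where bytes are required.
    return query.encode('utf-8')


# Made-up sample; the escapes spell an Arabic word, so the source file stays ASCII.
sample = u'{"id": 1, "id_str": "1", "text": "\u0645\u0631\u062d\u0628\u0627"}'
payload = build_insert_statement(sample)
```

Inside `process`, the assumed equivalent would be `outputStream.write(bytearray(payload))`, so the stream receives bytes that were encoded deliberately instead of depending on how a particular string type is coerced.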

Re: ExecuteScript processor doesn't support UTF-8 encoding

Thanks Matt,

I get the error above when the JSON contains Arabic words in the text field.

Re: ExecuteScript processor doesn't support UTF-8 encoding

Super Guru

I tested this with Arabic characters in my text field, and it worked fine. You're saying you still get the error when using my suggested lines?
