NiFi ExecuteScript Processor UnicodeEncodeError: 'ascii' codec can't encode character u'\xed'

Hi,

I wrote an ExecuteScript processor, following the example, to convert a JSON string to a CSV string. I picked Python, and it works well until it hits special characters like 'í'. Can I use an encoding other than UTF_8 in 'text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)' and 'outputStream.write(bytearray(msg_csv.encode('utf-8')))'?

Error: UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 14: ordinal not in range(128) in <script> at line number 52

Thanks a lot for the help!

Stephanie

2 Replies

Re: NiFi ExecuteScript Processor UnicodeEncodeError: 'ascii' codec can't encode character u'\xed'

Yes, ExecuteScript has access to everything in StandardCharsets (either through its static fields, such as StandardCharsets.UTF_8, or by name, as you mention), and in Jython you should have access to all of Jython's own codecs too. Do you know the encoding of the incoming flow file? If it varies but is available as an attribute (perhaps as part of the mime.type attribute), you can try passing that value to IOUtils.toString() and/or msg_csv.encode(). Read it with flowFile.getAttribute('mime.type') and parse off the parameter: MIME type parameters follow the type and are delimited by semicolons, and I think the parameter name is 'charset'.
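As a rough sketch of the parsing step described above: the helper below pulls the 'charset' parameter out of a MIME type string and falls back to a default when it is missing. The function name charset_from_mime_type and the 'utf-8' default are assumptions made for illustration; they are not part of NiFi or of the original script.

```python
def charset_from_mime_type(mime_type, default='utf-8'):
    """Return the charset parameter of a MIME type string, or the default.

    Hypothetical helper for illustration; not a NiFi API.
    """
    if not mime_type:
        # The attribute may be absent, in which case there is nothing to parse.
        return default
    # Parameters come after the type itself, separated by semicolons.
    for part in mime_type.split(';')[1:]:
        name, sep, value = part.partition('=')
        if sep and name.strip().lower() == 'charset':
            # Drop surrounding whitespace and optional quotes.
            value = value.strip().strip('"\'')
            return value or default
    return default
```

Inside the script, the value of flowFile.getAttribute('mime.type') could be passed to this helper, and the result handed to IOUtils.toString(inputStream, charset) and msg_csv.encode(charset). Charset names are not always spelled the same way in Java and in Jython's codecs, so that is worth checking for the encodings you actually see.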

In your case you might just try StandardCharsets.UTF_16 and/or msg_csv.encode('utf-16') to see if that fixes it.
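To try different charsets outside NiFi, a minimal sketch like the following may help. It assumes a flat JSON object and a known list of field names, does no CSV quoting or escaping, and the function name json_to_csv_line is hypothetical. It keeps every value as text and encodes exactly once at the end. One thing worth checking, as an assumption the thread does not confirm: in Jython (Python 2), an implicit conversion such as str(value) on a unicode value uses the 'ascii' codec and raises this same error no matter which charset is used for reading and writing.

```python
import json


def json_to_csv_line(text, field_names, charset='utf-8'):
    """Convert one flat JSON object to an encoded CSV line.

    Hypothetical helper for illustration; no quoting or escaping is done.
    """
    record = json.loads(text)
    # u'%s' keeps each cell as text (unicode under Python 2 / Jython),
    # so no implicit ASCII conversion happens while building the line.
    cells = [u'%s' % record.get(name, u'') for name in field_names]
    line = u','.join(cells) + u'\n'
    # Encode once, at the very end, with the charset chosen by the caller.
    return line.encode(charset)
```

In the processor, the returned bytes would be what goes into outputStream.write(bytearray(...)), with charset set to 'utf-8', 'utf-16', or whatever the downstream consumer expects.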

Re: NiFi ExecuteScript Processor UnicodeEncodeError: 'ascii' codec can't encode character u'\xed'

Hi Matt,

Thanks for the quick reply! I will try it out!
