Morphlines - converting a byte array field to a UTF-8 string

Contributor

I've got a record in Morphlines that includes a byte array field. I want to convert that to a UTF-8 string, i.e. the equivalent of new String(field, "UTF-8") in Java.

I can see that the readClob command exists, but it works on a whole record rather than a single field. Is there an alternative?

I appreciate that the data should have been stored differently, but it's not my format, and that's what ETL tools (morphlines) are for! 😉

Thanks in advance.

1 ACCEPTED SOLUTION

Super Collaborator
You can try the java command, like so:

java {
  code: """
    String str = new String((byte[]) record.getFirstValue("myInputField"), "UTF-8");
    record.put("myOutputField", str);
    return getChild().process(record); // pass record to next command in chain
  """
}

Wolfgang.
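
For anyone who wants to check the decoding step outside a morphline, here is a minimal standalone sketch of the same conversion. The class name Utf8FieldDecode and the helper decodeUtf8 are hypothetical and are not part of Morphlines. It assumes the field value really is a byte[], and it uses the StandardCharsets.UTF_8 overload of the String constructor, which avoids the checked UnsupportedEncodingException that the "UTF-8" string overload declares.

```java
import java.nio.charset.StandardCharsets;

public class Utf8FieldDecode {

    // Hypothetical helper: decodes a field value that is expected to be a byte[] as UTF-8.
    // Fails with a descriptive message instead of a bare ClassCastException.
    public static String decodeUtf8(Object fieldValue) {
        if (!(fieldValue instanceof byte[])) {
            String actual = (fieldValue == null) ? "null" : fieldValue.getClass().getName();
            throw new IllegalArgumentException("expected byte[] but got " + actual);
        }
        return new String((byte[]) fieldValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café"; the accented character takes two bytes in UTF-8.
        byte[] raw = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(raw).equals("caf\u00e9")); // prints "true"
    }
}
```

The type check matters because record.getFirstValue returns null when the field is absent, and the plain cast in the snippet above would then pass null to the String constructor and fail with a NullPointerException.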


5 REPLIES

Contributor

Perfect, thanks for this.

I was also looking at something like:

{ setValues { _attachment_body : "@{myInputField}" } }
{ readClob { charSet : "UTF-8" } }

Out of interest, are there any disadvantages to that approach?

Thanks.

Super Collaborator
That would work fine as well.

Wolfgang.

Expert Contributor

Hi, what kind of source do you use to get attachment_body? Is it possible to use HTTPSource as the source and MorphlineSolrSink to process the received payload (XML data)?

Explorer

When reading a Parquet file, how can I convert the Parquet DECIMAL data type to a String?