Morphlines - converting a byte array field to a UTF-8 string

Explorer

I've got a record in Morphlines that includes a byte array field. I want to convert that to a UTF-8 string, i.e. the equivalent of new String(field, "UTF-8") in Java.

 

I can see the readClob command exists, but that works on a whole record rather than a single field. Is there an alternative?

 

I appreciate that the data should have been stored differently, but it's not my format, and that's what ETL tools (morphlines) are for! ;)

 

Thanks in advance.

Accepted Solutions

Re: Morphlines - converting a byte array field to a UTF-8 string

Expert Contributor
You can try the java command, like so:

java {
  code: """
    // Decode the raw bytes of the input field as UTF-8 and store the result in a new field
    String str = new String((byte[]) record.getFirstValue("myInputField"), "UTF-8");
    record.put("myOutputField", str);
    return getChild().process(record); // pass record to next command in chain
  """
}

Wolfgang.
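
For anyone who wants to try the conversion outside a morphline first, here is a minimal standalone sketch of the same decoding step. The class name Utf8FieldDecoder and the method decodeUtf8 are hypothetical names chosen for illustration and are not part of the Morphlines API. The sketch assumes the field value may be missing (getFirstValue returns null when a field has no values) or may not be a byte[], and it uses StandardCharsets.UTF_8 instead of the "UTF-8" string, which avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the Morphlines API.
class Utf8FieldDecoder {

    // Decodes a raw field value as UTF-8.
    // Returns null when the value is absent, and rejects values that are not byte[].
    public static String decodeUtf8(Object fieldValue) {
        if (fieldValue == null) {
            return null; // field missing from the record
        }
        if (!(fieldValue instanceof byte[])) {
            throw new IllegalArgumentException(
                "Expected byte[] but got " + fieldValue.getClass().getName());
        }
        return new String((byte[]) fieldValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "h\u00e9llo" contains a non-ASCII character, so it encodes to 6 bytes in UTF-8.
        byte[] raw = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        String decoded = decodeUtf8(raw);
        System.out.println(decoded.length() + " chars from " + raw.length + " bytes");
    }
}
```

The same null and type checks could be placed inside the java command's code block if some records may lack the input field; whether that is needed depends on your data.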


Re: Morphlines - converting a byte array field to a UTF-8 string

Explorer

Perfect, thanks for this.

 

I was also having a look at something like:

 

{ setValues { _attachment_body : "@{myInputField}" } }
{ readClob { charset : "UTF-8" } }

 

Out of interest, are there any disadvantages with that approach?

 

Thanks.

Re: Morphlines - converting a byte array field to a UTF-8 string

Expert Contributor
That would work fine as well.

Wolfgang.

Re: Morphlines - converting a byte array field to a UTF-8 string

Expert Contributor

Hi, what kind of source do you use to get _attachment_body? Is it possible to use HttpSource as the source and MorphlineSolrSink to process the accepted payload (XML data)?

Re: Morphlines - converting a byte array field to a UTF-8 string

New Contributor

When reading a Parquet file, how can I convert the Parquet DECIMAL data type to a String?