Morphlines - converting a byte array field to a UTF-8 string

Explorer

I've got a record in Morphlines that includes a byte array field. I want to convert that to a UTF-8 string, i.e. the equivalent of new String(field, "UTF-8") in Java.

I can see that the readClob command exists, but that works on a whole record rather than a single field. Is there an alternative?

I appreciate that the data should have been stored differently, but it's not my format, and that's what ETL tools (Morphlines) are for! ;)

Thanks in advance.

Accepted Solutions

Re: Morphlines - converting a byte array field to a UTF-8 string

Expert Contributor
You can try the java command, like so:

java {
  code: """
    String str = new String((byte[]) record.getFirstValue("myInputField"), "UTF-8");
    record.put("myOutputField", str);
    return getChild().process(record); // pass record to next command in chain
  """
}

Wolfgang.
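
For anyone who wants to check the conversion outside Morphlines first, here is a minimal standalone sketch of the same decode in plain Java. The class and method names (Utf8FieldDecoder, decodeUtf8) are made up for illustration and are not part of the Morphlines API. It passes StandardCharsets.UTF_8 instead of the "UTF-8" string, which avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Morphlines.
public class Utf8FieldDecoder {

    /** Decodes the given bytes as UTF-8, like new String(bytes, "UTF-8"). */
    public static String decodeUtf8(byte[] bytes) {
        // The Charset overload does not throw a checked exception.
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Round trip: encode a string with a non-ASCII character, then decode it.
        byte[] raw = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(raw));
    }
}
```

Inside the java command the same one-liner could replace the string-based constructor, assuming java.nio.charset.StandardCharsets is referenced by its fully qualified name or imported.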

Re: Morphlines - converting a byte array field to a UTF-8 string

Explorer

Perfect, thanks for this.

I was also having a look at something like:

{ setValues { _attachment_body : "@{myInputField}" } }
{ readClob { charSet : "UTF-8" } }

Out of interest, are there any disadvantages with that approach?

Thanks.

Re: Morphlines - converting a byte array field to a UTF-8 string

Expert Contributor
That would work fine as well.

Wolfgang.

Re: Morphlines - converting a byte array field to a UTF-8 string

Expert Contributor

Hi, what kind of source do you use to get _attachment_body? Is it possible to use HttpSource as the source and MorphlineSolrSink to process the accepted payload (XML data)?

Re: Morphlines - converting a byte array field to a UTF-8 string

New Contributor

While reading a Parquet file, how can I convert the Parquet DECIMAL data type to a String?
