Choosing the right processors for a use case

Hello,

I need to process 100 GB per day:

A- Merge several JSON messages every 2 seconds into a single message (about 20 KB).
B- Transform the merged JSON into another JSON message (same size).
C- Compute some values on the final JSON message: sums of some values, and the difference between two timestamps.
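
As a rough sizing check, the figures above can be turned into rates. This is only back-of-the-envelope arithmetic, assuming decimal units (1 GB = 10^9 bytes, 1 KB = 1000 bytes) and assuming the 100 GB is the daily volume of merged 20 KB messages; the constant names are just for illustration.

```python
# Back-of-the-envelope sizing. Assumptions: decimal units, and the
# 100 GB/day figure refers to the merged 20 KB messages.
BYTES_PER_DAY = 100 * 10**9          # 100 GB per day
MERGED_MESSAGE_BYTES = 20 * 10**3    # about 20 KB per merged message
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY
messages_per_day = BYTES_PER_DAY // MERGED_MESSAGE_BYTES
messages_per_second = messages_per_day / SECONDS_PER_DAY

print(f"{bytes_per_second / 1e6:.2f} MB/s")
print(f"{messages_per_day} merged messages per day")
print(f"{messages_per_second:.1f} merged messages per second")
```

Under those assumptions the sustained load is on the order of 1 MB/s and a few dozen merged messages per second.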

For the merge, I am thinking of using the MergeContent processor.
For the transform, I am thinking of using the JoltTransformJSON processor.
For the compute step, I am thinking of using the ExecuteScript processor.
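
To make the compute step (C) concrete, here is a minimal sketch in plain Python of the kind of logic I have in mind. It is standalone code, not the NiFi scripting API, and the field names `values`, `start_ts`, `end_ts`, `total` and `duration_seconds` are hypothetical placeholders for my real message layout.

```python
import json
from datetime import datetime


def compute_metrics(message: str) -> str:
    """Add a sum and a timestamp difference to one JSON message.

    All field names here are hypothetical; the real schema differs.
    """
    doc = json.loads(message)

    # Addition of some values.
    doc["total"] = sum(doc["values"])

    # Difference between two timestamps, in seconds (ISO 8601 assumed).
    start = datetime.fromisoformat(doc["start_ts"])
    end = datetime.fromisoformat(doc["end_ts"])
    doc["duration_seconds"] = (end - start).total_seconds()

    return json.dumps(doc)


# Example with a made-up message.
sample = json.dumps({
    "values": [1.5, 2.5, 4.0],
    "start_ts": "2019-06-01T12:00:00",
    "end_ts": "2019-06-01T12:00:02",
})
result = json.loads(compute_metrics(sample))
print(result["total"], result["duration_seconds"])
```

Note that this sketch parses the whole message into memory, which is the same concern I raise about the transform step below.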

But I am worried about JoltTransformJSON performance, because it loads each JSON message into memory.
On the other hand, ExecuteScript is a "stream processor", so presumably it could perform better if it did both the transform and the compute steps? But its usage documentation says the processor is "Experimental: Impact of sustained usage not yet verified."

So, could you tell me which processors are right for my use case, for high performance and high reliability?

Thanks,

Bertrand.
