
Near real-time emoji extraction from text


I use HDF 3.2 and have a stream of text that may contain emoji. I need to extract the emoji from each message into an array in near real time and then store the result in Elasticsearch. The stream flows through Apache NiFi at approximately 100 tweets per second.
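
For illustration, the extraction step itself could look roughly like the Python sketch below, using only the standard `re` module. The names `extract_emojis` and `to_document` and the field names `text` and `emojis` are hypothetical, and the code point ranges are an approximation of the Unicode emoji set (they miss some emoji and include a few non-emoji symbols), so treat this as a starting point rather than a complete implementation.

```python
import re

# Assumption: approximate emoji ranges, not the full Unicode emoji list.
_BASE = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"   # pictographs, emoticons, symbols, dingbats
_MOD = "[\uFE0F\U0001F3FB-\U0001F3FF]"           # variation selector-16 or skin-tone modifier
_FLAG = "[\U0001F1E6-\U0001F1FF]{2}"             # pair of regional indicators (flags)

# A flag, or a base emoji with an optional modifier, optionally joined
# to further emoji by zero-width joiners (U+200D), e.g. family emoji.
EMOJI_RE = re.compile(f"{_FLAG}|{_BASE}{_MOD}?(?:\u200D{_BASE}{_MOD}?)*")


def extract_emojis(text):
    """Return every emoji sequence found in text, in order of appearance."""
    return EMOJI_RE.findall(text)


def to_document(text):
    """Build a dict with the original text and its emoji array.

    The field names are hypothetical; adjust them to the target index.
    """
    return {"text": text, "emojis": extract_emojis(text)}


if __name__ == "__main__":
    print(to_document("pizza \U0001F355 and sun \u2600\uFE0F"))
```

Because the pattern is compiled once and each call is a single pass over a short string, this kind of function is cheap enough to run per message; whether it lives in a web service, a NiFi scripted processor, or a streaming job is the architecture question asked below.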

What is the best (or at least a better) solution or architecture for this need? I have a couple of ideas, listed below.

A) Create a web service that extracts emoji from the input text, send the NiFi flow files to it, and then collect the responses.


B) The same as option A, but with Apache Kafka added.


C) Change the architecture to use a stream-processing engine such as Apache Spark, Storm, or Flink.


D) Use an Elasticsearch custom mapping?
