10-24-2018 07:46 AM - edited 10-24-2018 07:47 AM
I have a task that requires ingesting emails as soon as they are received in Outlook, extracting some information from them with a keyword-based search, and storing the extracted information in Hive:
Near-real-time email ingestion --> extract values --> store in Hive
I read that NiFi can do the job, but it isn't included in Cloudera.
My question: is there any Cloudera service (Flume, Kafka, Spark, ...) that can connect to Outlook and capture emails that satisfy certain criteria, or do I have to write Python code using imaplib and run it with cron at a fixed interval?
Any hint is appreciated.
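For reference, the imaplib fallback I have in mind would look roughly like the sketch below. It is only a minimal outline under assumptions: the keywords, the "keyword: value" pattern, and the function names are all hypothetical, and whether the mail server allows IMAP access with a plain password login depends on how it is configured. Writing the extracted rows to Hive is left out.

```python
import email
import imaplib
import re

# Hypothetical keywords to look for; not taken from any real requirement.
KEYWORDS = ("invoice", "order")


def body_text(msg):
    """Return the concatenated text/plain parts of a parsed message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(parts)


def extract_values(raw_bytes, keywords=KEYWORDS):
    """Parse a raw RFC 822 message and pull out 'keyword: value' pairs.

    The 'keyword: value' / 'keyword = value' layout is an assumption.
    """
    msg = email.message_from_bytes(raw_bytes)
    text = body_text(msg)
    found = {"subject": msg.get("Subject", "")}
    for kw in keywords:
        match = re.search(rf"{re.escape(kw)}\s*[:=]\s*(\S+)", text, re.IGNORECASE)
        if match:
            found[kw] = match.group(1)
    return found


def fetch_unseen(host, user, password, mailbox="INBOX"):
    """Yield extracted values for every unseen message in the mailbox.

    host, user, and password are supplied by the caller (placeholders here).
    Fetching with RFC822 marks the messages as seen on the server.
    """
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select(mailbox)
        status, data = conn.search(None, "UNSEEN")
        if status != "OK":
            return
        for num in data[0].split():
            status, msg_data = conn.fetch(num, "(RFC822)")
            if status == "OK":
                yield extract_values(msg_data[0][1])
```

A cron entry would then call `fetch_unseen(...)` at each interval and append the resulting dictionaries to whatever staging file or table feeds Hive. Polling this way is only as "real-time" as the cron interval, which is why I am asking whether a Cloudera service can do it better.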