<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Near/real-time Outlook email ingestion in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/81453#M84555</link>
    <description>&lt;P&gt;Hello everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a task that requires ingesting emails as soon as they are received in Outlook, extracting some information with a keyword-based search, and storing the extracted information in Hive:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Near/real-time email ingestion --&amp;gt; extract values --&amp;gt; store in Hive&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I read that NiFi can do the job, but it isn't included in Cloudera.&lt;/P&gt;&lt;P&gt;My question: is there any Cloudera service (Flume, Kafka, Spark, ...) that can connect to Outlook and capture emails that satisfy certain criteria, or do I have to write a Python script using &lt;SPAN&gt;imaplib&lt;/SPAN&gt; and run it with cron at a fixed interval?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any hint is appreciated.&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 13:49:52 GMT</pubDate>
    <dc:creator>NAITTOU</dc:creator>
    <dc:date>2022-09-16T13:49:52Z</dc:date>
    <item>
      <title>Near/real-time Outlook email ingestion</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/81453#M84555</link>
      <description>&lt;P&gt;Hello everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a task that requires ingesting emails as soon as they are received in Outlook, extracting some information with a keyword-based search, and storing the extracted information in Hive:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Near/real-time email ingestion --&amp;gt; extract values --&amp;gt; store in Hive&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I read that NiFi can do the job, but it isn't included in Cloudera.&lt;/P&gt;&lt;P&gt;My question: is there any Cloudera service (Flume, Kafka, Spark, ...) that can connect to Outlook and capture emails that satisfy certain criteria, or do I have to write a Python script using &lt;SPAN&gt;imaplib&lt;/SPAN&gt; and run it with cron at a fixed interval?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any hint is appreciated.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:49:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/81453#M84555</guid>
      <dc:creator>NAITTOU</dc:creator>
      <dc:date>2022-09-16T13:49:52Z</dc:date>
    </item>
    <item>
      <title>Re: Near/real-time Outlook email ingestion</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/87623#M84556</link>
      <description>You might have a look here: &lt;A href="https://dzone.com/articles/how-to-ingest-email-into-apache-hadoop-in-real-tim" target="_blank"&gt;https://dzone.com/articles/how-to-ingest-email-into-apache-hadoop-in-real-tim&lt;/A&gt;</description>
      <pubDate>Mon, 11 Mar 2019 22:55:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/87623#M84556</guid>
      <dc:creator>RobertM</dc:creator>
      <dc:date>2019-03-11T22:55:19Z</dc:date>
    </item>
    <item>
      <title>Re: Near/real-time Outlook email ingestion</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/89943#M84557</link>
      <description>&lt;P&gt;Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Shortly afterwards, the integrations with CDH were also completed, so NiFi is now a fully supported and integrated component.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hence the question already contains the answer: please look into NiFi for solving this use case.&lt;/P&gt;</description>
      <pubDate>Mon, 06 May 2019 11:42:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/89943#M84557</guid>
      <dc:creator>DennisJaheruddi</dc:creator>
      <dc:date>2019-05-06T11:42:13Z</dc:date>
    </item>
    <item>
      <title>Re: Near/real-time Outlook email ingestion</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/89945#M84558</link>
      <description>&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/16251694/unicodeencodeerror-ascii-codec-cant-encode-character-u-u2026" target="_blank" rel="noopener"&gt;This&lt;/A&gt; seems relevant:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In Python 2, unicode objects can only be printed if they can be converted to ASCII. If an object can't be encoded in ASCII, you'll get that error. You probably want to encode it explicitly and then print the resulting str:&lt;/P&gt;&lt;PRE&gt;print post.text.encode('utf-8')&lt;/PRE&gt;</description>
      <pubDate>Mon, 06 May 2019 11:47:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/89945#M84558</guid>
      <dc:creator>DennisJaheruddi</dc:creator>
      <dc:date>2019-05-06T11:47:17Z</dc:date>
    </item>
    <item>
      <title>Re: Near/real-time Outlook email ingestion</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/89958#M84559</link>
      <description>&lt;P&gt;Hello guys,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yeah, that was a long time ago. I managed to get the job done using the following pipeline:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Logstash -&amp;gt; Kafka -&amp;gt; Spark&lt;/P&gt;</description>
      <pubDate>Mon, 06 May 2019 14:33:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Near-real-time-Outlook-email-ingestion/m-p/89958#M84559</guid>
      <dc:creator>NAITTOU</dc:creator>
      <dc:date>2019-05-06T14:33:46Z</dc:date>
    </item>
  </channel>
</rss>

