<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Execute sqoop on NiFi in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143384#M52378</link>
    <description>&lt;P&gt;Is there any NiFi processor I can use to execute Sqoop?&lt;/P&gt;&lt;P&gt;If not, is there a data flow I can use to fetch my table and save it to HDFS?&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
    <pubDate>Tue, 24 Jan 2017 14:20:03 GMT</pubDate>
    <dc:creator>regie_canada</dc:creator>
    <dc:date>2017-01-24T14:20:03Z</dc:date>
    <item>
      <title>Execute sqoop on NiFi</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143384#M52378</link>
      <description>&lt;P&gt;Is there any NiFi processor I can use to execute Sqoop?&lt;/P&gt;&lt;P&gt;If not, is there a data flow I can use to fetch my table and save it to HDFS?&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jan 2017 14:20:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143384#M52378</guid>
      <dc:creator>regie_canada</dc:creator>
      <dc:date>2017-01-24T14:20:03Z</dc:date>
    </item>
    <item>
      <title>Re: Execute sqoop on NiFi</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143385#M52379</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/14397/regiecanada.html" nodeid="14397"&gt;@regie canada&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;If you really want to use Sqoop, you would need something like the ExecuteStreamCommand / ExecuteProcess processors. However, I would not recommend this unless you need features that only Sqoop provides.&lt;/P&gt;&lt;P&gt;If you want a solution fully provided by NiFi, then depending on your source database, you can use the JDBC processors to get the data from your table and then use something like PutHDFS to send the data into HDFS. A common approach is something like GenerateTableFetch on the primary node and QueryDatabaseTable on all nodes. The first processor will generate SQL queries to fetch the data by "page" of specified size, and the second will actually get the data. This way, all nodes of your NiFi cluster can be used to get the data from the database.&lt;/P&gt;&lt;P&gt;You can have a look at the documentation here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.GenerateTableFetch/index.html" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.GenerateTableFetch/index.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.QueryDatabaseTable/index.html" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.QueryDatabaseTable/index.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;There are additional SQL/JDBC processors available depending on your needs.&lt;/P&gt;&lt;P&gt;This article should get you started:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/articles/51902/incremental-fetch-in-nifi-with-querydatabasetable.html" target="_blank"&gt;https://community.hortonworks.com/articles/51902/incremental-fetch-in-nifi-with-querydatabasetable.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jan 2017 17:30:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143385#M52379</guid>
      <dc:creator>pvillard</dc:creator>
      <dc:date>2017-01-24T17:30:41Z</dc:date>
    </item>
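If you take the ExecuteStreamCommand / ExecuteProcess route mentioned in the reply above, the processor ultimately just spawns an ordinary `sqoop` command line. A minimal sketch of assembling such a command follows; the JDBC URL, table name, and paths are illustrative assumptions, not values from this thread:

```python
# Hypothetical sketch of the kind of Sqoop import command that
# ExecuteStreamCommand / ExecuteProcess would spawn.
# The connection string, table, and target directory are assumptions.
import shlex

sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://db-host:3306/sales",  # assumed JDBC URL
    "--table", "orders",                             # assumed source table
    "--target-dir", "/data/landing/orders",          # assumed HDFS destination
    "--num-mappers", "4",                            # parallel map tasks
]

# Render the command as it would appear on a shell line.
print(shlex.join(sqoop_cmd))
```

In ExecuteStreamCommand this would map to the Sqoop binary as the Command Path and the remaining tokens in Command Arguments, split on the processor's Argument Delimiter (`;` by default).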
    <item>
      <title>Re: Execute sqoop on NiFi</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143386#M52380</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/14397/regiecanada.html" nodeid="14397"&gt;@regie canada&lt;/A&gt;, check my blog post on this subject&lt;/P&gt;&lt;P&gt;&lt;A href="http://boristyukin.com/how-to-run-sqoop-from-nifi/"&gt;How to run Sqoop from NiFi&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Boris&lt;/P&gt;</description>
      <pubDate>Wed, 28 Feb 2018 04:58:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143386#M52380</guid>
      <dc:creator>BorisTyukin</dc:creator>
      <dc:date>2018-02-28T04:58:25Z</dc:date>
    </item>
    <item>
      <title>Re: Execute sqoop on NiFi</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143387#M52381</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/5078/pvillard.html" nodeid="5078"&gt;@Pierre Villard&lt;/A&gt;: "A common approach is something like GenerateTableFetch on the primary node and QueryDatabaseTable on all nodes. The first processor will generate SQL queries to fetch the data by "page" of specified size, and the second will actually get the data. This way, all nodes of your NiFi cluster can be used to get the data from the database.": &lt;/P&gt;&lt;P&gt;Will I need a (local) Remote Process Group (RPG) after GenerateTableFetch to get the fetches running in parallel?&lt;/P&gt;&lt;P&gt;Does anyone have experience with the performance of full RDBMS table dumps using this method versus Sqoop?&lt;/P&gt;</description>
      <pubDate>Mon, 09 Jul 2018 17:25:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Execute-sqoop-on-NiFi/m-p/143387#M52381</guid>
      <dc:creator>henrikolsen</dc:creator>
      <dc:date>2018-07-09T17:25:57Z</dc:date>
    </item>
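The paging behaviour asked about above can be illustrated with a small sketch. GenerateTableFetch emits one SQL statement per "page" of rows (the exact SQL it generates is database-adapter specific, and the statements are typically executed downstream by ExecuteSQL); the sketch below only mimics the partitioning idea, and the table name, ordering column, and page size are assumptions:

```python
# Hypothetical sketch of the paged queries GenerateTableFetch emits:
# one SELECT per "page", so downstream nodes can fetch pages in parallel.
# Table/column names and the page size are assumptions, not from the thread.
def page_queries(table, order_col, row_count, page_size):
    """Yield one SELECT per page, mimicking GenerateTableFetch partitioning."""
    for offset in range(0, row_count, page_size):
        yield (f"SELECT * FROM {table} ORDER BY {order_col} "
               f"LIMIT {page_size} OFFSET {offset}")

# A 25-row table with a page size of 10 yields three statements.
for q in page_queries("orders", "order_id", row_count=25, page_size=10):
    print(q)
```

Distributing these flow files across the cluster is where a Remote Process Group pointing back at an input port on the same cluster comes in: the primary node generates the statements, and every node then executes its share.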
  </channel>
</rss>

