<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Are there any benchmarks for SQOOP data transfer rate ? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151040#M20399</link>
    <description>&lt;P&gt;Are there any benchmarks for SQOOP data transfers from an ORACLE RDBMS to a Hadoop cluster?&lt;/P&gt;&lt;P&gt;Both the Hadoop cluster and the ORACLE servers are located in the same datacenter and are connected by a 10G network and 10G TOR switches. What sort of data transfer rates can I realistically expect if I run the transfer at a time when the ORACLE servers are not being used by any other applications? I am able to get a rate of around 200 Mbps, but I am not sure whether that is the maximum I can expect.&lt;/P&gt;</description>
    <pubDate>Sun, 21 Feb 2016 05:56:02 GMT</pubDate>
    <dc:creator>shishir_saxena4</dc:creator>
    <dc:date>2016-02-21T05:56:02Z</dc:date>
    <item>
      <title>Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151040#M20399</link>
      <description>&lt;P&gt;Are there any benchmarks for SQOOP data transfers from an ORACLE RDBMS to a Hadoop cluster?&lt;/P&gt;&lt;P&gt;Both the Hadoop cluster and the ORACLE servers are located in the same datacenter and are connected by a 10G network and 10G TOR switches. What sort of data transfer rates can I realistically expect if I run the transfer at a time when the ORACLE servers are not being used by any other applications? I am able to get a rate of around 200 Mbps, but I am not sure whether that is the maximum I can expect.&lt;/P&gt;</description>
      <pubDate>Sun, 21 Feb 2016 05:56:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151040#M20399</guid>
      <dc:creator>shishir_saxena4</dc:creator>
      <dc:date>2016-02-21T05:56:02Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151041#M20400</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/2820/shishirsaxena3.html" nodeid="2820"&gt;@Shishir Saxena&lt;/A&gt;&lt;P&gt;I don't think there are any benchmarks like that.&lt;/P&gt;&lt;P&gt;You can follow this presentation: &lt;A href="http://www.slideshare.net/alxslva/effective-sqoop-best-practices-pitfalls-and-lessons-40370936" target="_blank"&gt;http://www.slideshare.net/alxslva/effective-sqoop-best-practices-pitfalls-and-lessons-40370936&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Also, make sure that statistics have been generated on the Oracle tables.&lt;/P&gt;&lt;P&gt;Another &lt;A target="_blank" href="http://blog.cloudera.com/blog/2014/11/how-apache-sqoop-1-4-5-improves-oracle-databaseapache-hadoop-integration/"&gt;link&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Direct mode (--direct) and the number of mappers play a big role.&lt;/P&gt;&lt;P&gt;Your setup looks really good, since the source and target are in the same DC on a 10G network.&lt;/P&gt;</description>
      <pubDate>Sun, 21 Feb 2016 06:00:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151041#M20400</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-21T06:00:25Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151042#M20401</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2820/shishirsaxena3.html" nodeid="2820"&gt;@Shishir Saxena&lt;/A&gt;  See this guide &lt;A href="http://www.slideshare.net/gharriso/quest-hadoop-and-rdbms-with-sqoop-hw10-6263893" target="_blank"&gt;http://www.slideshare.net/gharriso/quest-hadoop-and-rdbms-with-sqoop-hw10-6263893&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 21 Feb 2016 06:01:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151042#M20401</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-21T06:01:14Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151043#M20402</link>
      <description>&lt;P&gt;Thanks, Neeraj. This was useful, though I still don't have a benchmark. In the Quest example, they transferred a 50 GB table in 1000 seconds, for an effective rate of 50 MB/s (about 400 Mbps).&lt;/P&gt;&lt;P&gt;I also found some info here&lt;/P&gt;&lt;P&gt;&lt;A href="http://grokbase.com/t/sqoop/user/146jhv8577/sqoop-to-oracle-transfer-rates"&gt;http://grokbase.com/t/sqoop/user/146jhv8577/sqoop-to-oracle-transfer-rates&lt;/A&gt;&lt;/P&gt;&lt;P&gt;and here&lt;/P&gt;&lt;P&gt;&lt;A href="http://blog.cloudera.com/blog/2014/11/how-apache-sqoop-1-4-5-improves-oracle-databaseapache-hadoop-integration/"&gt;http://blog.cloudera.com/blog/2014/11/how-apache-sqoop-1-4-5-improves-oracle-databaseapache-hadoop-integration/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;In the last case, it looks like a 310 GB table took only 100 seconds (with around 25 mappers) in the best case, for a transfer rate of ~3.1 GB/s. That makes much more sense.&lt;/P&gt;&lt;P&gt;I will try to find out more details about my Oracle server configuration to see what else I can do to improve my performance.&lt;/P&gt;</description>
      <pubDate>Sun, 21 Feb 2016 07:05:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151043#M20402</guid>
      <dc:creator>shishir_saxena4</dc:creator>
      <dc:date>2016-02-21T07:05:13Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151044#M20403</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2820/shishirsaxena3.html" nodeid="2820"&gt;@Shishir Saxena&lt;/A&gt; OK. I am going to share these numbers based on my experience; they are not official numbers.&lt;/P&gt;&lt;P&gt;5-node cluster with 96 GB, dual 8-core, over a 10G network from a different datacenter&lt;/P&gt;&lt;P&gt;4 billion rows with 30 mappers = 40 mins&lt;/P&gt;&lt;P&gt;86 million rows ~ 12 mins&lt;/P&gt;&lt;P&gt;My best suggestion is to run a dummy test and estimate the timings based on that.&lt;/P&gt;</description>
      <pubDate>Sun, 21 Feb 2016 08:14:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151044#M20403</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-21T08:14:50Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151045#M20404</link>
      <description>&lt;P&gt;Thank you, Neeraj. I am running benchmarks on our cluster. I just wanted to understand what upper limit I can target. Thank you again for the quick response and so much help.&lt;/P&gt;</description>
      <pubDate>Sun, 21 Feb 2016 09:13:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151045#M20404</guid>
      <dc:creator>shishir_saxena4</dc:creator>
      <dc:date>2016-02-21T09:13:27Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151046#M20405</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/2820/shishirsaxena3.html" nodeid="2820"&gt;@Shishir Saxena&lt;/A&gt;, the Oracle connector for Hadoop, the so-called Oraoop, is included in Sqoop-1.4.5 and 1.4.6 (shipped with HDP-2.3.x). The Sqoop user guide has a very detailed explanation &lt;A href="https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_data_connector_for_oracle_and_hadoop"&gt;here&lt;/A&gt;. It's enabled when "--direct" is used. Regarding benchmarks, it's best to build your own, for example by running Sqoop with and without Oraoop, with different numbers of mappers, various table sizes, etc.&lt;/P&gt;</description>
      <pubDate>Sun, 21 Feb 2016 15:08:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151046#M20405</guid>
      <dc:creator>pminovic</dc:creator>
      <dc:date>2016-02-21T15:08:49Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any benchmarks for SQOOP data transfer rate ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151047#M20406</link>
      <description>&lt;P&gt;Thanks. Looks like that is my only choice.&lt;/P&gt;</description>
      <pubDate>Mon, 22 Feb 2016 10:39:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Are-there-any-benchmarks-for-SQOOP-data-transfer-rate/m-p/151047#M20406</guid>
      <dc:creator>shishir_saxena4</dc:creator>
      <dc:date>2016-02-22T10:39:41Z</dc:date>
    </item>
  </channel>
</rss>

