<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Best way to migrate MS Access Databases to Hadoop in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154992#M20723</link>
    <description>&lt;P&gt;If you can work your way through SQL Server and Sqoop, I agree that's probably the cleanest option.  If you're looking for something that you can automate entirely on the cluster, or you don't have the luxury of pushing the data through SQL Server, then here's another option.&lt;/P&gt;&lt;P&gt;There's a very simple open source toolset called &lt;A target="_blank" href="https://github.com/brianb/mdbtools"&gt;mdbtools&lt;/A&gt; that makes it really easy to extract metadata and data from MS Access databases.  In about 10 lines of shell script, you can get a list of tables in the mdb, dump the data out to text files, import those to HDFS, and wrap a generic Hive schema around the files.&lt;/P&gt;&lt;P&gt;Since you're going through intermediate text files, you might not be able to support some character sets and could run into an issue or two with file formats, which can be cleaned up with a secondary sed or perl script.  If you don't want to go through SQL Server to get the data transferred over, though, this might be a good solution for you.&lt;/P&gt;</description>
    <pubDate>Thu, 25 Feb 2016 03:38:20 GMT</pubDate>
    <dc:creator>paul_boal</dc:creator>
    <dc:date>2016-02-25T03:38:20Z</dc:date>
    <item>
      <title>Best way to migrate MS Access Databases to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154988#M20719</link>
      <description>&lt;P&gt;Team,&lt;/P&gt;&lt;P&gt;One of our customers has thousands of MS Access DBs. What is the best way to migrate them to Hive/Hadoop?&lt;/P&gt;&lt;P&gt;Any tools or experience in this space?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 04:56:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154988#M20719</guid>
      <dc:creator>nasghar</dc:creator>
      <dc:date>2016-02-24T04:56:41Z</dc:date>
    </item>
    <item>
      <title>Re: Best way to migrate MS Access Databases to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154989#M20720</link>
      <description>&lt;P&gt; &lt;A rel="user" href="https://community.cloudera.com/users/151/nasghar.html" nodeid="151"&gt;@nasghar&lt;/A&gt; Although you can export MS Access to a CSV and import that into Hive, I would suggest instead importing the data into SQL Server and using Sqoop. &lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 05:08:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154989#M20720</guid>
      <dc:creator>SQLShaw</dc:creator>
      <dc:date>2016-02-24T05:08:37Z</dc:date>
    </item>
    <item>
      <title>Re: Best way to migrate MS Access Databases to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154990#M20721</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/151/nasghar.html" nodeid="151"&gt;@nasghar&lt;/A&gt;&lt;P&gt;See this thread &lt;A href="https://community.hortonworks.com/questions/4249/can-microsofthortonworks-odbc-driver-provide-bi-di.html#comment-4259" target="_blank"&gt;https://community.hortonworks.com/questions/4249/can-microsofthortonworks-odbc-driver-provide-bi-di.html#comment-4259&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 07:59:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154990#M20721</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-24T07:59:00Z</dc:date>
    </item>
    <item>
      <title>Re: Best way to migrate MS Access Databases to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154991#M20722</link>
      <description>&lt;P&gt;I agree with Scott.  Bringing data from Hadoop into Access is no big deal.  We had an Access database use ODBC to retrieve data from Hadoop, and it was fast enough to be effective.&lt;/P&gt;&lt;P&gt;BUT if you try to send data from Access into Hadoop, it is painfully slow, because it goes row by row.  Sending it off to SQL Server requires very little code in Access, and then you have a durable store of your data.  Sending it from SQL Server to Hadoop is effective and fast.  &lt;/P&gt;&lt;P&gt;Here's a simple sample of a Sqoop script to send data from SQL Server to Hadoop.  It reads table Customer from SQL Server database SQLTestDB and copies it into table Customer in Hive database TestDB.  It overwrites any existing data in the Hive table.  It also uses 1 mapper.   &lt;/P&gt;&lt;PRE&gt;sqoop import --connect "jdbc:sqlserver://&amp;lt;IP Address&amp;gt;:1433;database=SQLTestDB" \
--username root \
--password hadoop \
--table Customer \
--hive-import --hive-overwrite \
--hive-table TestDB.Customer \
-m 1 &lt;/PRE&gt;</description>
      <pubDate>Wed, 24 Feb 2016 19:40:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154991#M20722</guid>
      <dc:creator>bpreachuk</dc:creator>
      <dc:date>2016-02-24T19:40:31Z</dc:date>
    </item>
    <item>
      <title>Re: Best way to migrate MS Access Databases to Hadoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154992#M20723</link>
      <description>&lt;P&gt;If you can work your way through SQL Server and Sqoop, I agree that's probably the cleanest option.  If you're looking for something that you can automate entirely on the cluster, or you don't have the luxury of pushing the data through SQL Server, then here's another option.&lt;/P&gt;&lt;P&gt;There's a very simple open source toolset called &lt;A target="_blank" href="https://github.com/brianb/mdbtools"&gt;mdbtools&lt;/A&gt; that makes it really easy to extract metadata and data from MS Access databases.  In about 10 lines of shell script, you can get a list of tables in the mdb, dump the data out to text files, import those to HDFS, and wrap a generic Hive schema around the files.&lt;/P&gt;&lt;P&gt;Since you're going through intermediate text files, you might not be able to support some character sets and could run into an issue or two with file formats, which can be cleaned up with a secondary sed or perl script.  If you don't want to go through SQL Server to get the data transferred over, though, this might be a good solution for you.&lt;/P&gt;</description>
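For anyone who wants to try the mdbtools route, here is a rough sketch of what such a script might look like. It is not the script from the post above, just one way to do the four steps it lists (list tables, dump to text, copy to HDFS, wrap a Hive schema). The staging directory, the HDFS root, the script name, and the choice of OpenCSVSerde with every column typed as STRING are my own assumptions, and header names containing commas are not handled.

```shell
#!/usr/bin/env bash
# Sketch only: dump each table of one Access file to CSV, copy it to HDFS,
# and declare an all-STRING external Hive table over it.
# The staging directory, HDFS root, and table naming are hypothetical.
set -eu

# Print a generic Hive DDL statement for one exported CSV file.
# Args: Hive table name, CSV header line, HDFS directory holding the file.
hive_ddl() {
  local table="$1" header="$2" location="$3" cols
  # "id,name" becomes "`id` STRING, `name` STRING"; quotes and CRs are dropped.
  cols=$(printf '%s\n' "$header" | tr -d '"\r' |
    sed 's/\([^,][^,]*\)/`\1` STRING/g; s/,/, /g')
  printf 'CREATE EXTERNAL TABLE IF NOT EXISTS %s (%s)\n' "$table" "$cols"
  printf "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'\n"
  printf "STORED AS TEXTFILE LOCATION '%s'\n" "$location"
  printf "TBLPROPERTIES ('skip.header.line.count'='1');\n"
}

# Args: path to the .mdb file, HDFS root directory (both chosen by the caller).
migrate_mdb() {
  local mdb="$1" hdfs_root="$2" workdir table name
  workdir=$(mktemp -d)
  local IFS=$'\n'   # split the table list on newlines only
  set -f            # keep table names from being glob-expanded
  for table in $(mdb-tables -1 "$mdb"); do
    # Hive identifiers: replace anything outside [A-Za-z0-9_] with "_".
    name=$(printf '%s' "$table" | tr -c 'A-Za-z0-9_' '_')
    mdb-export "$mdb" "$table" > "$workdir/$name.csv"
    hdfs dfs -mkdir -p "$hdfs_root/$name"
    hdfs dfs -put -f "$workdir/$name.csv" "$hdfs_root/$name/"
    hive -e "$(hive_ddl "$name" "$(head -n 1 "$workdir/$name.csv")" "$hdfs_root/$name")"
  done
  rm -rf "$workdir"
}

# Run only when arguments are supplied, e.g.: ./mdb2hive.sh sales.mdb /data/access
if [ "$#" -gt 0 ]; then migrate_mdb "$@"; fi
```

Typing every column as STRING keeps the schema generic, as the post suggests; real types could be applied afterwards with a view or a CREATE TABLE AS SELECT once the data has been checked. With thousands of databases you would probably wrap migrate_mdb in a loop over the files and add a per-database prefix to the table names so they do not collide.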
      <pubDate>Thu, 25 Feb 2016 03:38:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Best-way-to-migrate-MS-Access-Databases-to-Hadoop/m-p/154992#M20723</guid>
      <dc:creator>paul_boal</dc:creator>
      <dc:date>2016-02-25T03:38:20Z</dc:date>
    </item>
  </channel>
</rss>

