Created on 11-18-2015 05:18 PM - edited 09-16-2022 02:49 AM
A customer has a Microsoft Access database application and would like to (a) connect to Hive and extract data, and (b) take the resulting Access data and load it over ODBC into an existing Hive table from within Microsoft Access (not via Sqoop or a flat file). I am currently installing the driver and will soon test option (a), but would like to know whether option (b) is possible.
Thanks in advance!
Created 11-21-2015 02:47 PM
@bpreachuk Using the ODBC driver is the cleanest way to do it.
The other option is Access --> RDBMS --> Hive using Sqoop.
Looking forward to seeing the final update.
Kudos to this guy Link
Created 11-18-2015 05:42 PM
Thanks Birender and Neeraj. I was looking into the Access -> SQL Server -> Hive approach since we already have a SQL Server on the box, but if I can get it to work directly from MS Access I will do so. I'll update the thread with how it all turns out, and I will write up what worked if I can easily get the Access -> Hive insert connection working.
Created 11-18-2015 05:44 PM
Great, looking forward to hearing the results.
Created 11-25-2015 07:47 PM
Here is an update on this task:
ODBC read from Hive -> Access works fine (not a surprise).
ODBC from Access to Hive runs as RBAR (row by agonizing row), i.e. one row at a time. This is sub-optimal in the RDBMS world, but REALLY crummy in the Hadoop world. It means 1 Hive session per row updated. We were getting about 150 rows updated every 10 minutes.
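To put the observed rate in perspective, here is a small back-of-the-envelope sketch. The 150 rows per 10 minutes figure is from the test above; the 100,000-row table size is purely a hypothetical number for illustration, not something from the customer's data.

```python
# Observed in the test above: about 150 rows written every 10 minutes.
ROWS_OBSERVED = 150
MINUTES_OBSERVED = 10


def rows_per_minute(rows=ROWS_OBSERVED, minutes=MINUTES_OBSERVED):
    """Throughput of the row-at-a-time ODBC inserts."""
    return rows / minutes


def hours_to_load(total_rows, rate_per_minute):
    """Estimated wall-clock hours to push total_rows at the given rate."""
    return total_rows / rate_per_minute / 60


rate = rows_per_minute()       # rows per minute
seconds_per_row = 60 / rate    # average cost of one row

# HYPOTHETICAL table size, chosen only to illustrate the scale.
example_rows = 100_000
example_hours = hours_to_load(example_rows, rate)

print(f"{rate:.0f} rows/min, {seconds_per_row:.0f} s per row")
print(f"{example_rows} rows would take about {example_hours:.0f} hours")
```

At roughly 4 seconds per row, anything beyond a few thousand rows is impractical over this path, which is why we went with the staged route instead.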
When sending data from Access to Hive, we will use Access -> SQL Server -> Sqoop into Hive.
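For anyone following the same route, the Sqoop step might look roughly like the sketch below. It only builds and prints the command line; it does not run it. The host, database, user, password file and table names are all placeholders I made up for illustration, and the mapper count is just a starting guess. The Sqoop options themselves (`--connect`, `--username`, `--password-file`, `--table`, `--hive-import`, `--hive-table`, `--num-mappers`) are standard `sqoop import` arguments.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file,
                       source_table, hive_table, num_mappers=4):
    """Build the argument list for a `sqoop import` into a Hive table."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the staging SQL Server
        "--username", username,
        "--password-file", password_file,   # keeps the password off the command line
        "--table", source_table,            # staging table loaded from Access
        "--hive-import",                    # load the result into Hive
        "--hive-table", hive_table,         # target Hive table
        "--num-mappers", str(num_mappers),  # parallel import tasks
    ]


# All values below are HYPOTHETICAL placeholders, not the real environment.
cmd = build_sqoop_import(
    jdbc_url="jdbc:sqlserver://sqlhost:1433;databaseName=staging",
    username="sqoop_user",
    password_file="/user/sqoop/sqlserver.password",
    source_table="access_data",
    hive_table="default.access_data",
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(cmd))
```

Without `--hive-overwrite`, the import should add to the existing Hive table rather than replace it, which matches the original requirement of loading into an existing table. Please verify that against your Sqoop version before relying on it.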
We may try out the Simba drivers in the future, more out of curiosity, to see whether they perform better with updates or can batch updates into Hive.
Thanks Neeraj and Birender!
Created 11-25-2015 07:48 PM
Thanks @bpreachuk for updating the thread!!!