- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Hive - Create external table on database (MySQL, Oracle, db2...)
- Labels:
-
Apache Hive
Created 05-02-2017 03:44 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I would like to create an external table in Hive on different databases (MySQL, Oracle, DB2..) because I do not want to move the data, either in HDFS or in Hive directly.
How can I do that?
Created 05-02-2017 03:56 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
There is a Hive Storage Handler for JDBC that allows you to do this: https://github.com/qubole/Hive-JDBC-Storage-Handler
Example HQL:
DROP TABLE HiveTable;
CREATE EXTERNAL TABLE HiveTable(
id INT,
id_double DOUBLE,
names STRING,
test INT
)
STORED BY 'org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler'
TBLPROPERTIES (
"mapred.jdbc.driver.class"="com.mysql.jdbc.Driver",
"mapred.jdbc.url"="jdbc:mysql://localhost:3306/rstore",
"mapred.jdbc.username"="root",
"mapred.jdbc.input.table.name"="JDBCTable",
"mapred.jdbc.output.table.name"="JDBCTable",
"mapred.jdbc.password"="",
"mapred.jdbc.hive.lazy.split"= "false"
);
Created 05-02-2017 03:56 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
There is a Hive Storage Handler for JDBC that allows you to do this: https://github.com/qubole/Hive-JDBC-Storage-Handler
Example HQL:
DROP TABLE HiveTable;
CREATE EXTERNAL TABLE HiveTable(
id INT,
id_double DOUBLE,
names STRING,
test INT
)
STORED BY 'org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler'
TBLPROPERTIES (
"mapred.jdbc.driver.class"="com.mysql.jdbc.Driver",
"mapred.jdbc.url"="jdbc:mysql://localhost:3306/rstore",
"mapred.jdbc.username"="root",
"mapred.jdbc.input.table.name"="JDBCTable",
"mapred.jdbc.output.table.name"="JDBCTable",
"mapred.jdbc.password"="",
"mapred.jdbc.hive.lazy.split"= "false"
);
Created 05-02-2017 04:07 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thank you @Ward Bekker
Can I use it for any DBs?
Created 05-02-2017 04:43 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Looks like any db with a JDBC driver, but I personally never used this handler, so can't vouch for it. I would recommend to test it for you DB's (MySQL, Oracle, db2...).
Created 05-02-2017 07:55 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If you are using Greenplum then there is an existing protocol which will take care of your use case. gphdfs protocol. Its simple and easy but it will support only TEXT and CSV as of now.
Check the above link for gphdfs protocol.
Created on 11-23-2023 02:54 PM - edited 11-23-2023 02:54 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
CREATE EXTERNAL TABLE dwsimp.dim_agrupamento (
id INT,
agrupamento_nome STRING,
agrupamento_ordem INT,
dim_relatorio_id INT,
agrupamento_campo STRING
)
STORED AS ORC
TBLPROPERTIES
(org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler,
mapred.jdbc.driver.class = "oracle.jdbc.OracleDriver",
mapred.jdbc.url = "jdbc:oracle:thin:@//jdbc:oracle:thin:@//host:port/servicename",
mapred.jdbc.username = "user",
mapred.jdbc.password= "password",
mapred.jdbc.input.table.name="JDBCTable",
mapred.jdbc.output.table.name="JDBCTable",
mapred.jdbc.hive.lazy.split"= "false");
Error: Error while compiling statement: FAILED: ParseException line 10:2 cannot recognize input near 'org' '.' 'apache' in table properties list (state=42000,code=40000)
