Created 05-02-2017 03:44 PM
Hi,
I would like to create an external table in Hive over different databases (MySQL, Oracle, DB2...) because I do not want to move the data into HDFS or into Hive directly.
How can I do that?
Created 05-02-2017 03:56 PM
Hi,
There is a Hive Storage Handler for JDBC that allows you to do this: https://github.com/qubole/Hive-JDBC-Storage-Handler
Example HQL:
DROP TABLE HiveTable;
CREATE EXTERNAL TABLE HiveTable(
id INT,
id_double DOUBLE,
names STRING,
test INT
)
STORED BY 'org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler'
TBLPROPERTIES (
"mapred.jdbc.driver.class"="com.mysql.jdbc.Driver",
"mapred.jdbc.url"="jdbc:mysql://localhost:3306/rstore",
"mapred.jdbc.username"="root",
"mapred.jdbc.input.table.name"="JDBCTable",
"mapred.jdbc.output.table.name"="JDBCTable",
"mapred.jdbc.password"="",
"mapred.jdbc.hive.lazy.split"= "false"
);
Created 05-02-2017 04:07 PM
Thank you @Ward Bekker
Can I use it for any DBs?
Created 05-02-2017 04:43 PM
Looks like any DB with a JDBC driver, but I have personally never used this handler, so I can't vouch for it. I would recommend testing it against your DBs (MySQL, Oracle, DB2...).
Created 05-02-2017 07:55 PM
If you are using Greenplum, there is an existing protocol that covers your use case: the gphdfs protocol. It's simple and easy, but as of now it supports only TEXT and CSV formats.
See the Greenplum documentation on the gphdfs protocol for details.
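For illustration, a minimal gphdfs external table on the Greenplum side might look like the following sketch (the namenode host, port, path, and column list are placeholders, not values from this thread):

```sql
-- Run in Greenplum: external table reading a pipe-delimited text file from HDFS
CREATE EXTERNAL TABLE ext_sales (
    id INT,
    amount NUMERIC
)
LOCATION ('gphdfs://namenode:8020/data/sales.txt')
FORMAT 'TEXT' (DELIMITER '|');
```

Note this works in the opposite direction to the Hive storage handler above: Greenplum reads from HDFS, rather than Hive reading from the external database.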
Created on 11-23-2023 02:54 PM - edited 11-23-2023 02:54 PM
CREATE EXTERNAL TABLE dwsimp.dim_agrupamento (
id INT,
agrupamento_nome STRING,
agrupamento_ordem INT,
dim_relatorio_id INT,
agrupamento_campo STRING
)
STORED AS ORC
TBLPROPERTIES
(org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler,
mapred.jdbc.driver.class = "oracle.jdbc.OracleDriver",
mapred.jdbc.url = "jdbc:oracle:thin:@//jdbc:oracle:thin:@//host:port/servicename",
mapred.jdbc.username = "user",
mapred.jdbc.password= "password",
mapred.jdbc.input.table.name="JDBCTable",
mapred.jdbc.output.table.name="JDBCTable",
mapred.jdbc.hive.lazy.split"= "false");
Error: Error while compiling statement: FAILED: ParseException line 10:2 cannot recognize input near 'org' '.' 'apache' in table properties list (state=42000,code=40000)
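The parse error is because the storage handler class belongs in a quoted STORED BY clause, not inside TBLPROPERTIES; STORED AS ORC cannot be combined with STORED BY; all TBLPROPERTIES keys and values must be quoted strings; and the JDBC URL is duplicated (and the last property has a stray quote). A corrected sketch, assuming the Qubole handler mentioned earlier in this thread (host, port, service name, and table names are placeholders):

```sql
CREATE EXTERNAL TABLE dwsimp.dim_agrupamento (
  id INT,
  agrupamento_nome STRING,
  agrupamento_ordem INT,
  dim_relatorio_id INT,
  agrupamento_campo STRING
)
STORED BY 'org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler'
TBLPROPERTIES (
  "mapred.jdbc.driver.class" = "oracle.jdbc.OracleDriver",
  "mapred.jdbc.url" = "jdbc:oracle:thin:@//host:port/servicename",
  "mapred.jdbc.username" = "user",
  "mapred.jdbc.password" = "password",
  "mapred.jdbc.input.table.name" = "JDBCTable",
  "mapred.jdbc.output.table.name" = "JDBCTable",
  "mapred.jdbc.hive.lazy.split" = "false"
);
```

Note that this handler is quite old; on recent Hive versions you may want to check whether the built-in Hive JdbcStorageHandler is available instead.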