Support Questions
Find answers, ask questions, and share your expertise

How can I query HBase from Hive?

hive> CREATE TABLE test3(id string, val decimal) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,0:val");

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:MetaException(message:java.io.IOException: java.lang.reflect.InvocationTargetException

at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)

at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)

at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:413)

at org.apache.hadoop.hbase.client.ConnectionManager.getConnectionInternal(ConnectionManager.java:291)

at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:222)

at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:120)

at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:200)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:665)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:658)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)

at com.sun.proxy.$Proxy5.createTable(Unknown Source)

at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:717)

at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4170)

at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:307)

at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)

at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)

at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)

at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)

at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)

at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)

at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)

at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)

at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)

at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)

at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at org.apache.hadoop.util.RunJar.run(RunJar.java:221)

at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Caused by: java.lang.reflect.InvocationTargetException

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)

... 36 more

Caused by: java.lang.UnsupportedOperationException: Unable to find org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory

at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:36)

at org.apache.hadoop.hbase.ipc.RpcControllerFactory.instantiate(RpcControllerFactory.java:58)

at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.createAsyncProcess(ConnectionManager.java:2242)

at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:690)

at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:630)

... 41 more

Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory

at java.net.URLClassLoader.findClass(URLClassLoader.java:381)

at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)

at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

at java.lang.Class.forName0(Native Method)

at java.lang.Class.forName(Class.java:264)

at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:32)

... 45 more

)

at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:124)

at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:200)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:665)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:658)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)

at com.sun.proxy.$Proxy5.createTable(Unknown Source)

at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:717)

at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4170)

at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:307)

at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)

at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)

at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)

at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)

at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)

at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)

at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)

at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)

at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)

at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)

at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at org.apache.hadoop.util.RunJar.run(RunJar.java:221)

at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

)

1 ACCEPTED SOLUTION

Accepted Solutions

Re: How can I query HBase from Hive?

with help of @Josh Elser

For 2.3.2 with phoenix enabled on ambari, you need to add phoenix jar to hive lib to make it to work, then restart hive cli / hiveserver2:

mkdir /usr/hdp/current/hive-server2/auxlib/
cp /usr/hdp/2.3.2.0-2950/phoenix/phoenix-client.jar /usr/hdp/current/hive-server2/auxlib/

without this jar, you will get erro below, as describe by @Randy Gelhausen

Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory

UPDATE: Interesting that with sandbox 2.3.2, only phoenix-server.jar works, phoenix-client.jar does not work.

View solution in original post

8 REPLIES 8

Re: How can I query HBase from Hive?

Rising Star

@Randy Gelhausen - Can you try the below:

CREATE TABLE test3(id string, val decimal) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,0:val') TBLPROPERTIES ('hbase.table.name' = 'testhbase1');

Re: How can I query HBase from Hive?

Unfortunately not. Hive throws the same error about missing classes. I think the solution will involve setting environment classpath variables, but none of the examples I can find online are working with HDP 2.3.2.

Re: How can I query HBase from Hive?

Master Collaborator

Make sure hbase is installed on the machine where you are running the hive cli from. In presence of hbase install, hive cli automatically includes the required hbase libraries in the classpath.

Re: How can I query HBase from Hive?

@randy Did you try this?

In order to use the HBase library, we need to make the HBase jars and configuration available to the local Hive process (at least until HIVE-5518 is resolved). Do that by specifying a value for the HADOOP_CLASSPATH environment variable before executing the statement.

Re: How can I query HBase from Hive?

Rising Star

Hey @Randy Gelhausen, I went through an exercise like this for a client and took notes on what I did so I can repeat. Note this was with an earlier version of HDP and I haven't tried it HDP 2.3 yet but maybe worth a shot.....

Step 1 Here is how I created the hbase table and placed data in it.

hbase shell
hbase(main):001:0>create 'short_urls', {NAME => 'u'}, {NAME => 's’}
hbase(main):062:0> put 'short_urls', ‘bit.ly/aaaa', 's:hits', '100'
hbase(main):063:0> put 'short_urls', ‘bit.ly/aaaa', 'u:url', 'hbase.apache.org'
hbase(main):062:0> put 'short_urls', ‘bit.ly/abcd', 's:hits', ‘123'
hbase(main):063:0> put 'short_urls', ‘bit.ly/abcd', 'u:url', ‘example.com/foo'
hbase(main):064:0> scan 'short_urls'

ROW  COLUMN+CELL bit.lyaaaa  
column=s:hits, timestamp=1412121062283, value=100 bit.lyaaaa  
column=u:url, timestamp=1412121071821,value=hbase.apache.org
1 row(s) in 0.0080 seconds
Step2: This is how I launched hive and created an external table pointing at HBase.
hive --auxpath /usr/lib/hive-hcatalog/share/hcatalog/storage-handlers/hbase/lib/hive-hcatalog-hbase-storage-handler-0.13.0.2.1.1.0-385.jar,/usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.1.0-385.jar,/usr/lib/hive/lib/zookeeper-3.4.5.2.1.1.0-385.jar,/usr/lib/hive/lib/guava-11.0.2.jar 


CREATE EXTERNAL TABLE short_urls(short_url string, url string, hit_count string)STORED BY ‘org.apache.hadoop.hive.hbase.HBaseStorageHandler’ WITH SERDEPROPERTIES("hbase.columns.mapping" = ":key, u:url, s:hits”)TBLPROPERTIES("hbase.table.name" = "short_urls”); 
Step3: From Hive now you can query HBase.
hive> select * from short_urls;

OK

bit.lyaaaa hbase.apache.org 100
bit.lyabcd example.com/foo 123

Time taken: 0.445 seconds, Fetched: 2 row(s)

Re: How can I query HBase from Hive?

@Dan Rice Pretty cool design pattern. Any cautions or gotcha's when enabling this?

Re: How can I query HBase from Hive?

with help of @Josh Elser

For 2.3.2 with phoenix enabled on ambari, you need to add phoenix jar to hive lib to make it to work, then restart hive cli / hiveserver2:

mkdir /usr/hdp/current/hive-server2/auxlib/
cp /usr/hdp/2.3.2.0-2950/phoenix/phoenix-client.jar /usr/hdp/current/hive-server2/auxlib/

without this jar, you will get erro below, as describe by @Randy Gelhausen

Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory

UPDATE: Interesting that with sandbox 2.3.2, only phoenix-server.jar works, phoenix-client.jar does not work.

View solution in original post

Re: How can I query HBase from Hive?

Contributor

I have made the changes according to answer by @Guilherme Braccialli but after adding the jars and put the following setting in

custom hive-env

export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar

custome hive-site

export HIVE_AUX_JARS_PATH="${HIVE_AUX_JARS_PATH}:/usr/hdp/current/phoenix-client/phoenix-hive.jar"

but not lukc and WebCat Server is not starting :

ERROR from log:-

log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender. log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender. Exception in thread "main" java.lang.IllegalStateException: Variable substitution depth too large: 20 "${HIVE_AUX_JARS_PATH}:/usr/hdp/current/phoenix-client/phoenix-hive.jar" at org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:967) at org.apache.hadoop.conf.Configuration.get(Configuration.java:987) at org.apache.hadoop.hive.conf.HiveConfUtil.dumpConfig(HiveConfUtil.java:77) at org.apache.hadoop.hive.conf.HiveConfUtil.dumpConfig(HiveConfUtil.java:59) at org.apache.hive.hcatalog.templeton.AppConfig.dumpEnvironent(AppConfig.java:256) at org.apache.hive.hcatalog.templeton.AppConfig.init(AppConfig.java:198) at org.apache.hive.hcatalog.templeton.AppConfig.<init>(AppConfig.java:173) at org.apache.hive.hcatalog.templeton.Main.loadConfig(Main.java:97) at org.apache.hive.hcatalog.templeton.Main.init(Main.java:81) at org.apache.hive.hcatalog.templeton.Main.<init>(Main.java:76) at org.apache.hive.hcatalog.templeton.Main.main(Main.java:289) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:233) at org.apache.hadoop.util.RunJar.main(RunJar.java:148)

after