Created 08-14-2019 01:15 AM
Hi all,
I've been looking for solutions to expose Kafka data in Hive. I found a few options, but HiveKafkaStorageHandler caught my attention :).
I wanted to try it out, but I can't find anywhere in the documentation whether it supports connecting to Kafka over SSL (which is a requirement for my setup).
Can someone tell me for sure if this storage handler supports a broker listener that uses SSL?
Thanks a lot in advance !
Created 08-14-2019 10:23 AM
Yes, it supports SSL. You can pass the required parameters in TBLPROPERTIES.
Example:
CREATE EXTERNAL TABLE kafka_hive_table_edgenode_SASL_SSL(
  `Country Name` string,
  `Language` string,
  `_id` struct<`$oid`:string,`name`:string>,
  `account` struct<`$accountstat`:string>
)
STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
TBLPROPERTIES(
  "kafka.topic" = "kafka_hive_topic",
  "kafka.bootstrap.servers" = "<kafka-broker>:<port>",
  "kafka.consumer.security.protocol" = "SASL_SSL",
  "kafka.consumer.ssl.truststore.location" = "<truststore-path>",
  "kafka.consumer.ssl.truststore.password" = "<password>"
);
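To illustrate the idea behind the `kafka.consumer.` prefix: as far as I understand it, table properties carrying that prefix are handed to the Kafka consumer with the prefix removed, so `kafka.consumer.security.protocol` ends up as the regular consumer setting `security.protocol`. The Java sketch below only mimics that mapping with plain `java.util` classes. The class and method names (`ConsumerPropsSketch`, `toConsumerProperties`) are made up for this example and are not the handler's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class ConsumerPropsSketch {
    // Assumed prefix, taken from the table property names used in the example above.
    static final String PREFIX = "kafka.consumer.";

    // Copy every "kafka.consumer.*" entry into a Properties object with the prefix removed.
    // Entries without the prefix (for example "kafka.topic") are ignored.
    static Properties toConsumerProperties(Map<String, String> tableProperties) {
        Properties props = new Properties();
        for (Map.Entry<String, String> entry : tableProperties.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(PREFIX)) {
                props.setProperty(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Same placeholder values as in the CREATE TABLE statement.
        Map<String, String> tableProperties = new LinkedHashMap<>();
        tableProperties.put("kafka.topic", "kafka_hive_topic");
        tableProperties.put("kafka.consumer.security.protocol", "SASL_SSL");
        tableProperties.put("kafka.consumer.ssl.truststore.location", "<truststore-path>");
        tableProperties.put("kafka.consumer.ssl.truststore.password", "<password>");

        Properties consumerProps = toConsumerProperties(tableProperties);
        for (String name : consumerProps.stringPropertyNames()) {
            System.out.println(name + "=" + consumerProps.getProperty(name));
        }
    }
}
```

If that assumption holds, other standard Kafka consumer SSL settings should be passable the same way, by adding the prefix to the usual consumer property name.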
Created 08-14-2019 04:35 PM
Hi @Manohar Vanam,
Thanks a lot for your quick answer.
I tried to test KafkaStorageHandler but have not been able to get it working yet. I am getting the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/metadata/StorageHandlerInfo
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.hive.ql.parse.ParseUtils.ensureClassExists(ParseUtils.java:261)
at org.apache.hadoop.hive.ql.parse.StorageFormat.fillStorageFormat(StorageFormat.java:64)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:11907)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:11040)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11153)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.metadata.StorageHandlerInfo
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
I think this might be due to the version of Hive I am using. My setup uses Hive 2.3.x, but apparently the KafkaStorageHandler library is built for 3.1.x.
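One way to check that assumption is to test whether the class named in the stack trace can be loaded at all from the Hive jars. Below is a minimal Java sketch. The class name `ClasspathCheck` and the helper `isOnClasspath` are just names I picked for this example, and it only tells you whether the class is visible on the classpath you run it with.

```java
public class ClasspathCheck {
    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class that exists but cannot be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class reported as missing in the stack trace above.
        String name = "org.apache.hadoop.hive.ql.metadata.StorageHandlerInfo";
        System.out.println(name + " present: " + isOnClasspath(name));
    }
}
```

Running it with the Hive library directory on the classpath (for example `java -cp "$HIVE_HOME/lib/*:." ClasspathCheck`, assuming `HIVE_HOME` points at the installation) should print `false` if the installed Hive version does not ship that class.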
Is this the issue? Have you had the same problem?
Thanks!
Created 08-18-2019 11:19 PM
Are you using HDP 2.3? If so, you can try HDP 3.1, which includes KafkaStorageHandler.
If you are using open-source Hive 2.3, try Hive 3.1 or later.
Created 08-22-2019 01:08 AM
Hi,
I am actually using Amazon EMR, which does not yet support Hive 3.x in any EMR release, so I may have to drop this idea :(.
Is there any workaround? Perhaps something like a set of libs we can import in Hive 2.x so it supports HiveKafkaStorageHandler?
Thanks again!
Created 09-28-2023 12:02 AM
@saulo_sobreiro I ran into the same issue today. Were you able to resolve it, and if so, with which Hive version does it work?