Member since: 05-22-2017
Posts: 14
Kudos Received: 0
Solutions: 0
08-22-2019
01:08 AM
Hi, I am actually using Amazon EMR, which does not yet support Hive 3.x in any EMR cluster version, so this might make me drop this idea :(. Is there any workaround? Perhaps something like a set of libs we could import into Hive 2.x so it supports HiveKafkaStorageHandler? Thanks again!
08-14-2019
04:35 PM
Hi @Manohar Vanam, Thanks a lot for your quick answer. I tried to test the KafkaStorageHandler, but I have not been able to get it to work yet. I am getting the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/metadata/StorageHandlerInfo
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.hive.ql.parse.ParseUtils.ensureClassExists(ParseUtils.java:261)
at org.apache.hadoop.hive.ql.parse.StorageFormat.fillStorageFormat(StorageFormat.java:64)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:11907)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:11040)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11153)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.metadata.StorageHandlerInfo
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
I think this might be due to the version of Hive I am using. My setup uses Hive 2.3.x, but apparently the KafkaStorageHandler lib targets 3.1.x. Is this the issue? Have you had the same problem? Thanks!
08-14-2019
01:15 AM
Hi all, I've been looking for solutions to expose Kafka data in Hive. I found a few options; however, HiveKafkaStorageHandler caught my attention :). I wanted to try it out, but I cannot find anywhere in the documentation whether it supports connecting to Kafka over SSL (which is a requirement for my setup). Can someone confirm whether this storage handler supports a broker listener that uses SSL? Thanks a lot in advance!
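For context, this is roughly the kind of DDL I was planning to try. The table, topic, column names, and truststore path are made up, and I am assuming that Kafka consumer settings can be passed through table properties with the kafka.consumer. prefix; that assumption is exactly what I am hoping someone can confirm:
-- sketch only: columns are assumed to map to fields of a JSON message payload
CREATE EXTERNAL TABLE kafka_events_ssl (
  event_id string,
  payload string
)
STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
TBLPROPERTIES (
  'kafka.topic' = 'my-topic',
  -- bootstrap servers pointing at the broker's SSL listener (port is illustrative)
  'kafka.bootstrap.servers' = 'broker1:9093',
  -- assumption: consumer settings pass through via the kafka.consumer. prefix
  'kafka.consumer.security.protocol' = 'SSL',
  'kafka.consumer.ssl.truststore.location' = '/path/to/kafka.client.truststore.jks',
  'kafka.consumer.ssl.truststore.password' = 'changeit'
);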
Labels:
- Apache Hive
- Apache Kafka
12-13-2017
02:12 PM
Hi Mark, Did you manage to solve this problem? I am facing the exact same situation.
07-28-2017
10:32 AM
I will try to reproduce the situation and then apply a fix! Thanks a lot for your feedback.
07-27-2017
09:15 AM
So just updating Phoenix to version 4.7 would solve this issue? Is it possible to update just the Phoenix version without upgrading the whole cluster? Thank you.
07-27-2017
09:12 AM
Hi Josh, I am using HDP 2.4.0. The query I am using looks like the following:
upsert into TABLE_HBASE_DENORM
select
TABLE_A.MPK,
TABLE_A.AAAA,
TABLE_A.BBBB,
TABLE_A.CCCC,
TABLE_A.DDDD,
TABLE_A.EEEE,
TABLE_B.AAA,
TABLE_B.BBB,
TABLE_B.CCC,
... 26 other parameters ...
TABLE_A.M4,
TABLE_A.M3,
TABLE_A.M2,
TABLE_A.M1,
TABLE_A.Q4,
TABLE_A.Q3,
TABLE_A.Q2,
TABLE_A.Q1
from
TABLE_A,
TABLE_B,
TABLE_C,
TABLE_F,
TABLE_E,
TABLE_D
where
TABLE_A.AAAA = TABLE_B.AAAA AND
TABLE_A.BBBB = TABLE_C.BBBB AND
TABLE_A.CCCC = TABLE_D.CCCC AND
TABLE_A.DDDD = TABLE_E.DDDD AND
TABLE_A.EEEE = TABLE_F.EEEE AND
TABLE_A.Q3 >= TO_TIMESTAMP('2017-03-04 10:40:05') AND TABLE_A.Q3 < TO_TIMESTAMP('2017-03-05 10:40:05')
;
07-22-2017
08:21 PM
Hi all, I am getting an ArrayIndexOutOfBoundsException on a Phoenix query and I would like to know if you have any suggestions to solve this problem. The error is thrown when I select data from some tables to upsert into another. The select query reads all the data from a table with a considerable amount of data (3,374,590 rows) and joins it with data from 5 smaller tables. I have 4 DataNode/RegionServer/PhoenixQS (almost dedicated) nodes in the cluster (6 vCPU; 32 GB RAM per node), so I believe resources are not the problem. As a workaround I am filtering by date in order to be able to transfer the data between the tables. Oddly, sometimes I am able to upsert 50k rows in one query (a few days of data), and sometimes I am limited to 9k rows (around 2 days) or less. I get the error even when using a hint to change the join algorithm. An example of the errors I am getting:
Error: java.lang.ArrayIndexOutOfBoundsException: -6 (state=08000,code=101)
org.apache.phoenix.exception.PhoenixIOException: java.lang.ArrayIndexOutOfBoundsException: -6
at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:538)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:510)
at org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:176)
at org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91)
at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
at org.apache.phoenix.compile.UpsertCompiler$2.execute(UpsertCompiler.java:737)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:305)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:297)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:295)
at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1255)
at sqlline.Commands.execute(Commands.java:822)
at sqlline.Commands.sql(Commands.java:732)
at sqlline.SqlLine.dispatch(SqlLine.java:808)
at sqlline.SqlLine.begin(SqlLine.java:681)
at sqlline.SqlLine.start(SqlLine.java:398)
at sqlline.SqlLine.main(SqlLine.java:292)
Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -6
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:202)
at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:534)
... 16 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -6
at org.apache.phoenix.util.ByteUtil.vlongFromBytes(ByteUtil.java:329)
at org.apache.phoenix.util.ByteUtil.vintFromBytes(ByteUtil.java:316)
at org.apache.phoenix.schema.KeyValueSchema.next(KeyValueSchema.java:208)
at org.apache.phoenix.schema.KeyValueSchema.iterator(KeyValueSchema.java:165)
at org.apache.phoenix.schema.KeyValueSchema.iterator(KeyValueSchema.java:171)
at org.apache.phoenix.schema.KeyValueSchema.iterator(KeyValueSchema.java:175)
at org.apache.phoenix.expression.ProjectedColumnExpression.evaluate(ProjectedColumnExpression.java:112)
at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:69)
at org.apache.phoenix.jdbc.PhoenixResultSet.getObject(PhoenixResultSet.java:515)
at org.apache.phoenix.compile.UpsertCompiler.upsertSelect(UpsertCompiler.java:164)
at org.apache.phoenix.compile.UpsertCompiler.access$000(UpsertCompiler.java:105)
at org.apache.phoenix.compile.UpsertCompiler$UpsertingParallelIteratorFactory.mutate(UpsertCompiler.java:221)
at org.apache.phoenix.compile.MutatingParallelIteratorFactory.newIterator(MutatingParallelIteratorFactory.java:61)
at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:99)
at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:90)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:172)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
I am using HDP 2.4.0, so Phoenix 4.4. The configurations I am using are:
HBase RegionServer Maximum Memory=12288 (12GB)
HBase Master Maximum Memory=12288 (12GB)
Number of Handlers per RegionServer=30
Memstore Flush Size=128MB
Maximum Record Size=1MB
Maximum Region File Size=10GB
% of RegionServer Allocated to Read Buffers=40%
% of RegionServer Allocated to Write Buffers=40%
HBase RPC Timeout=6min
Zookeeper Session Timeout=6min
Phoenix Query Timeout=6min
Number of Fetched Rows when Scanning from Disk=10000
dfs.client.read.shortcircuit=true
dfs.client.read.shortcircuit.buffer.size=131072
hbase.hstore.min.locality.to.skip.major.compact=0.7
hbase.ipc.server.callqueue.read.ratio=0.8
hbase.ipc.server.callqueue.scan.ratio=0.8
phoenix.coprocessor.maxServerCacheTimeToLiveMs=30000
phoenix.mutate.batchSize=100000
phoenix.query.maxServerCacheBytes=8589934592
phoenix.query.queueSize=7500
phoenix.query.threadPoolSize=512
The other configurations are the defaults. What should I change in order to make a query of this size work properly? Thanks in advance.
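For reference, the date-filter workaround I mentioned is just the same upsert restricted to a narrower window on TABLE_A.Q3 and run repeatedly over consecutive ranges. The sketch below uses illustrative dates and elides most of the column list; only the narrowed time window differs from the full query above:
-- batch 1: roughly a 2-day window; the next run advances both bounds by the same amount
upsert into TABLE_HBASE_DENORM
select
    TABLE_A.MPK,
    TABLE_A.AAAA,
    -- ... same column list as the full query above ...
    TABLE_A.Q1
from
    TABLE_A, TABLE_B, TABLE_C, TABLE_F, TABLE_E, TABLE_D
where
    TABLE_A.AAAA = TABLE_B.AAAA AND
    TABLE_A.BBBB = TABLE_C.BBBB AND
    TABLE_A.CCCC = TABLE_D.CCCC AND
    TABLE_A.DDDD = TABLE_E.DDDD AND
    TABLE_A.EEEE = TABLE_F.EEEE AND
    TABLE_A.Q3 >= TO_TIMESTAMP('2017-03-04 10:40:05') AND
    TABLE_A.Q3 <  TO_TIMESTAMP('2017-03-06 10:40:05');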
Labels:
- Apache HBase
- Apache Phoenix
07-10-2017
11:32 AM
Hi, were you able to solve this issue? I am having the same problem.
06-03-2017
04:40 PM
I will keep it working with the SSH session then. Thank you very much for your feedback.
06-03-2017
04:29 PM
Hi Namit, Thank you for your answer. Yes, I can run the command: [nosuser@RHTPINEC008 ~]$ ps -ef | grep namenode
nosuser 7201 6867 0 16:01 pts/0 00:00:00 grep --color=auto namenode
hdfs 39395 1 5 May31 ? 04:01:49 /usr/jdk64/jdk1.8.0_60/bin/java -Dproc_namenode -Xmx1024m -Dhdp.version=2.4.0.0-169 -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/hadoop/hdfs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop/hdfs -Dhadoop.log.file=hadoop-hdfs-namenode-RHTPINEC008.corporativo.pt.log -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=512m -XX:MaxNewSize=512m -Xloggc:/var/log/hadoop/hdfs/gc.log-201705311529 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms4096m -Xmx4096m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=512m -XX:MaxNewSize=512m -Xloggc:/var/log/hadoop/hdfs/gc.log-201705311529 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms4096m -Xmx4096m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=512m -XX:MaxNewSize=512m -Xloggc:/var/log/hadoop/hdfs/gc.log-201705311529 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms4096m -Xmx4096m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode
As you suggested, I killed the process:
[nosuser@RHTPINEC008 ~]$ sudo kill -9 39395
and started it again through Ambari, which took a while but ended successfully. A few seconds later the NameNode went down again in the Ambari interface; however, I am still able to run:
[hdfs@RHTPINEC008 ~]$ jps
13494 Jps
9832 NameNode
Any ideas? Could it be the Ambari server or agent having problems collecting the NameNode status? Thanks
06-01-2017
05:08 PM
Hi all, I am having a problem with the NameNode status Ambari shows. The following points are verifiable in the system:
- The NameNode keeps going down a few seconds after I start it through Ambari (it looks like it never really goes up, even though the start process runs successfully);
- Despite being DOWN according to Ambari, if I run jps on the server where the NameNode is hosted, it shows that the service is running:
[hdfs@RHTPINEC008 ~]$ jps
39395 NameNode
4463 Jps
and I can access the NameNode UI properly;
- I already restarted both the NameNode and the ambari-agent manually, but the behavior stays the same;
- This problem started after some heavy HBase/Phoenix queries that caused the NameNode to go down (not sure if this is actually related, but the exact same configuration was working well before this episode);
- I've been digging for some hours and I have not been able to find error details in the NameNode logs or the ambari-agent logs that would allow me to understand the problem.
I am using HDP 2.4.0 with no HA options. Can someone help with this? Thanks in advance
Labels:
- Apache Ambari
- Apache Hadoop
05-22-2017
05:57 PM
Dear all, I am working on a cluster with several VMs and I need to periodically run some PySpark code through Oozie on a specific cluster machine; however, I am not able to find a configuration that allows me to do that. My workaround so far is to run an SSH client session from Oozie that spark-submits the script. Is this the only way? Thanks in advance
Labels:
- Apache Oozie