05-13-2019
07:24 AM
Hi Tarek, once I had Hive Interactive set up correctly (tuning of resources is the most critical part; otherwise reads were failing), everything ran smoothly. In the end I rebuilt the whole pipeline on Spark only, because Hive Interactive was no longer needed and it was unstable for large streams or heavy batches: too many connections, some of them already closed, etc. I'm talking about volumes like 1.5 billion with a foreachBatch sink. At the moment I can read a stream and compact at the same time.
01-08-2019
11:01 PM
On a fresh cluster based on HDP 3.1.0 (Kerberized) I'm still facing a problem reading Hive tables from Spark. The connection via HWC is not working. When I try to run hive.table("default.table1").show I get this error message:
java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.createBatchDataReaderFactories(HiveWarehouseDataSourceReader.java:166)
at org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExec.inputRDD$lzycompute(DataSourceV2ScanExec.scala:64)
at org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExec.inputRDD(DataSourceV2ScanExec.scala:60)
at org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExec.inputRDDs(DataSourceV2ScanExec.scala:79)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605)
at org.apache.spark.sql.execution.SparkPlan$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:337)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$collectFromPlan(Dataset.scala:3278)
at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2489)
at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2489)
at org.apache.spark.sql.Dataset$anonfun$52.apply(Dataset.scala:3259)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2489)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2703)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
at org.apache.spark.sql.Dataset.show(Dataset.scala:723)
at org.apache.spark.sql.Dataset.show(Dataset.scala:682)
at org.apache.spark.sql.Dataset.show(Dataset.scala:691)
... 47 elided
Caused by: java.lang.RuntimeException: java.io.IOException: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getSplitsFactories(HiveWarehouseDataSourceReader.java:182)
at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.createBatchDataReaderFactories(HiveWarehouseDataSourceReader.java:162)
... 72 more
Caused by: java.io.IOException: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:298)
at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getSplitsFactories(HiveWarehouseDataSourceReader.java:176)
... 73 more
Caused by: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
at shadehive.org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)
at shadehive.org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)
at shadehive.org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:379)
at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:280)
... 74 more
Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:478)
at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:952)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy72.fetchResults(Unknown Source)
at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564)
at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:792)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
... 3 more
Caused by: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2738)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473)
... 25 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:225)
at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
... 28 more
Caused by: java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:498)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:210)
... 39 more
Caused by: java.lang.NullPointerException: null
at org.apache.hadoop.hive.llap.LlapUtil.generateClusterName(LlapUtil.java:117)
at org.apache.hadoop.hive.llap.coordinator.LlapCoordinator.getLlapSigner(LlapCoordinator.java:103)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:441)
... 40 more
I've checked the official documentation and the GitHub page. All properties are OK, but I still cannot read any data from Hive. I'm using the standard Hive JDBC connection, not the interactive one, since I'm not planning to use the LLAP engine. Any idea what to set or check to avoid this error? PS: I'm able to read metadata via the Hive connector, such as database ..., and even writing to Hive is working, but not reading tables into a DataFrame.
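For anyone reading a long trace like this one, the useful part is usually the innermost `Caused by` entry and the frame directly under it. Below is a minimal sketch in plain Scala (no Spark or Hive on the classpath needed) that pulls those two lines out of pasted trace text. The names `RootCause` and `rootCause` are made up for illustration, and the sample is a shortened copy of the tail of the trace above.

```scala
object RootCause {
  /** Returns (exception line, first stack frame) of the innermost "Caused by" entry, if any. */
  def rootCause(trace: String): Option[(String, String)] = {
    val lines = trace.split("\n").map(_.trim).filter(_.nonEmpty)
    // The last "Caused by:" line in the text is the innermost cause.
    val idx = lines.lastIndexWhere(_.startsWith("Caused by:"))
    if (idx < 0) None
    else {
      val cause = lines(idx).stripPrefix("Caused by:").trim
      // The first "at ..." line after it is where that exception was thrown.
      val frame = lines.drop(idx + 1)
        .find(_.startsWith("at "))
        .map(_.stripPrefix("at ").trim)
        .getOrElse("")
      Some((cause, frame))
    }
  }

  def main(args: Array[String]): Unit = {
    // Shortened copy of the end of the trace in this post.
    val sample =
      """Caused by: java.io.IOException: java.lang.NullPointerException
        |at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:498)
        |... 39 more
        |Caused by: java.lang.NullPointerException: null
        |at org.apache.hadoop.hive.llap.LlapUtil.generateClusterName(LlapUtil.java:117)
        |at org.apache.hadoop.hive.llap.coordinator.LlapCoordinator.getLlapSigner(LlapCoordinator.java:103)
        |... 40 more""".stripMargin

    rootCause(sample).foreach { case (cause, frame) =>
      println(s"$cause thrown at $frame")
    }
  }
}
```

On the trace above this points at `LlapUtil.generateClusterName`, called from `LlapCoordinator.getLlapSigner`. Both are LLAP classes, which suggests the read path goes through LLAP split generation even though the connection is the standard (non-interactive) Hive JDBC one.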