Created 06-01-2019 12:09 AM
When starting spark-shell, I provided the following JAR for Hive-Spark connectivity:
/usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar
I configured the 5 properties below for the connectivity, as described in the link below
Properties:
The following code was executed to fetch data from Hive:
import com.hortonworks.hwc.HiveWarehouseSession
val hive = HiveWarehouseSession.session(spark).build()
val df = hive.executeQuery("show databases")
The following error is displayed:
ERROR LlapBaseInputFormat: Closing connection due to error shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException at shadehive.org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300) at shadehive.org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286) at shadehive.org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:374) at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:263) at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getTableSchema(HiveWarehouseDataSourceReader.java:109) at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.readSchema(HiveWarehouseDataSourceReader.java:123) at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.apply(DataSourceV2Relation.scala:56) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:224) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164) at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.executeQuery(HiveWarehouseSessionImpl.java:62) at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:26) at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31) at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33) at $line16.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:35) at $line16.$read$$iw$$iw$$iw$$iw.<init>(<console>:37) at $line16.$read$$iw$$iw$$iw.<init>(<console>:39) at $line16.$read$$iw$$iw.<init>(<console>:41) at $line16.$read$$iw.<init>(<console>:43) at $line16.$read.<init>(<console>:45) at $line16.$read$.<init>(<console>:49) at $line16.$read$.<clinit>(<console>) at $line16.$eval$.$print$lzycompute(<console>:7) at $line16.$eval$.$print(<console>:6) at $line16.$eval.$print(<console>) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786) at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047) at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638) at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637) at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31) at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19) at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569) at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565) at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807) at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681) at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395) at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415) at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923) at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909) at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909) at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97) at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909) at org.apache.spark.repl.Main$.doMain(Main.scala:76) at org.apache.spark.repl.Main$.main(Main.scala:56) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:465) at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:309) at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:905) at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) at com.sun.proxy.$Proxy51.fetchResults(Unknown Source) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:561) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786) at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837) at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2695) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:460) ... 
24 more Caused by: java.lang.NullPointerException: null at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:272) at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204) at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116) at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ... 27 more
Created on 08-06-2019 04:30 PM - edited 08-17-2019 03:13 PM
hive.executeQuery("query") is for read/write operations on table data; the stack trace shows it going through LLAP (LlapBaseInputFormat.getSplits). Here, however, you are running a catalog operation. For catalog operations, you need to use
hive.execute("statement")
Refer to the API documentation here: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouse...
For your specific case of getting the database list, there is a pre-defined catalog operation
hive.showDatabases()
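Putting the two options together, the snippet from the question could be rewritten roughly as below. This is a minimal sketch for the spark-shell session described in the question: it assumes the HWC assembly JAR is on the classpath, the connector properties are already configured, and `spark` is the SparkSession that spark-shell provides. The value names `viaExecute` and `viaCatalogCall` are just illustrative.

```scala
import com.hortonworks.hwc.HiveWarehouseSession

// Build the HWC session on top of the SparkSession provided by spark-shell.
val hive = HiveWarehouseSession.session(spark).build()

// Option 1: run the catalog statement through execute() instead of executeQuery().
val viaExecute = hive.execute("show databases")
viaExecute.show()

// Option 2: use the pre-defined catalog operation for listing databases.
val viaCatalogCall = hive.showDatabases()
viaCatalogCall.show()
```

Both calls return a DataFrame, so the result can be inspected with show() as above. Keep executeQuery() for queries that actually read table data.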
BTW, this issue has nothing to do with ingestion. Please make sure that the title represents the exact problem.