
Not able to ingest data in hive using spark in HDP3

New Contributor

While initializing spark-shell, I provided the following JAR for Hive-Spark connectivity:

/usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar


I have configured the five properties below, as described at the following link, for the connectivity.

Link: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_configure_a_s...


Properties:

  1. spark.sql.hive.hiveserver2.jdbc.url
  2. spark.datasource.hive.warehouse.metastoreUri
  3. spark.datasource.hive.warehouse.load.staging.dir
  4. spark.hadoop.hive.llap.daemon.service.hosts
  5. spark.hadoop.hive.zookeeper.quorum
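
For reference, these properties are typically passed on the spark-shell command line. The sketch below is a hypothetical invocation: every host name, port, and staging path is a placeholder and must be replaced with the values from your own cluster (the JDBC URL must point at the LLAP-enabled HiveServer2 interactive endpoint, not the regular HiveServer2).

```shell
# Hypothetical launch command; all hosts, ports, and paths are placeholders.
spark-shell \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://hiveserver-interactive-host:10500/" \
  --conf spark.datasource.hive.warehouse.metastoreUri="thrift://metastore-host:9083" \
  --conf spark.datasource.hive.warehouse.load.staging.dir="/tmp/staging" \
  --conf spark.hadoop.hive.llap.daemon.service.hosts="@llap0" \
  --conf spark.hadoop.hive.zookeeper.quorum="zk1:2181,zk2:2181,zk3:2181"
```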


The following code was executed to fetch data from Hive:

import com.hortonworks.hwc.HiveWarehouseSession

val hive = HiveWarehouseSession.session(spark).build()

val df = hive.executeQuery("show databases")


The following error is displayed:

ERROR LlapBaseInputFormat: Closing connection due to error
shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException
    at shadehive.org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)
    at shadehive.org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)
    at shadehive.org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:374)
    at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:263)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getTableSchema(HiveWarehouseDataSourceReader.java:109)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.readSchema(HiveWarehouseDataSourceReader.java:123)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.apply(DataSourceV2Relation.scala:56)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:224)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.executeQuery(HiveWarehouseSessionImpl.java:62)
    at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:26)
    at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
    at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
    at $line16.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:35)
    at $line16.$read$$iw$$iw$$iw$$iw.<init>(<console>:37)
    at $line16.$read$$iw$$iw$$iw.<init>(<console>:39)
    at $line16.$read$$iw$$iw.<init>(<console>:41)
    at $line16.$read$$iw.<init>(<console>:43)
    at $line16.$read.<init>(<console>:45)
    at $line16.$read$.<init>(<console>:49)
    at $line16.$read$.<clinit>(<console>)
    at $line16.$eval$.$print$lzycompute(<console>:7)
    at $line16.$eval$.$print(<console>:6)
    at $line16.$eval.$print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
    at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
    at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
    at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
    at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
    at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
    at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
    at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
    at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
    at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
    at org.apache.spark.repl.Main$.doMain(Main.scala:76)
    at org.apache.spark.repl.Main$.main(Main.scala:56)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:465)
    at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:309)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:905)
    at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy51.fetchResults(Unknown Source)
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:561)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2695)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:460)
    ... 24 more
Caused by: java.lang.NullPointerException: null
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:272)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
    ... 27 more





1 Reply

New Contributor

hive.executeQuery("query") is for read/write operations, but what you are running here is a catalog operation. For catalog operations, you need to use

hive.execute("statement")

Refer API documentation here : https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouse...

For your specific case of getting the database list, there is a pre-defined catalog operation:

hive.showDatabases()
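
Putting the distinction together, a minimal sketch (assuming the session is built exactly as in the question; this requires a live HDP 3 cluster with LLAP, and `some_db.some_table` is a placeholder table name):

```scala
import com.hortonworks.hwc.HiveWarehouseSession

val hive = HiveWarehouseSession.session(spark).build()

// Catalog operations go through execute() or a dedicated helper:
hive.execute("show databases").show()   // catalog statement via HiveServer2
hive.showDatabases().show()             // pre-defined equivalent

// executeQuery() is reserved for data queries, which fetch splits via LLAP:
val df = hive.executeQuery("select * from some_db.some_table limit 10")
df.show()
```

Running `show databases` through executeQuery() is what sends it down the LLAP split-generation path (GenericUDTFGetSplits in your stack trace), which is why it fails with the NullPointerException.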


BTW, this issue has nothing to do with ingestion. Please make sure the title reflects the exact problem.

