Support Questions
Find answers, ask questions, and share your expertise
Not able to ingest data in hive using spark in HDP3

New Contributor

While initializing spark-shell, I provided the following jar for Hive-Spark connectivity:

/usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar


I have configured the five properties below, as described in the following documentation, for the connectivity:

Link: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_configure_a_s...


Properties:

  1. spark.sql.hive.hiveserver2.jdbc.url
  2. spark.datasource.hive.warehouse.metastoreUri
  3. spark.datasource.hive.warehouse.load.staging.dir
  4. spark.hadoop.hive.llap.daemon.service.hosts
  5. spark.hadoop.hive.zookeeper.quorum
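For reference, these properties are typically passed on the spark-shell command line together with the connector jar. A minimal sketch, assuming placeholder values (the host names, ZooKeeper namespace, and staging directory below are illustrative, not values from this cluster):

```shell
# Launch spark-shell with the Hive Warehouse Connector jar and the five
# properties above. All host names and paths are placeholders; substitute
# the values from your own Ambari/Hive configuration.
spark-shell \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://host1:2181,host2:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive" \
  --conf spark.datasource.hive.warehouse.metastoreUri="thrift://host1:9083" \
  --conf spark.datasource.hive.warehouse.load.staging.dir="/tmp/hwc-staging" \
  --conf spark.hadoop.hive.llap.daemon.service.hosts="@llap0" \
  --conf spark.hadoop.hive.zookeeper.quorum="host1:2181,host2:2181"
```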


The following code was executed to fetch data from Hive:

import com.hortonworks.hwc.HiveWarehouseSession

val hive = HiveWarehouseSession.session(spark).build()

val df = hive.executeQuery("show databases")


The following error is displayed:

ERROR LlapBaseInputFormat: Closing connection due to error

shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException

at shadehive.org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)

at shadehive.org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)

at shadehive.org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:374)

at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:263)

at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getTableSchema(HiveWarehouseDataSourceReader.java:109)

at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.readSchema(HiveWarehouseDataSourceReader.java:123)

at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.apply(DataSourceV2Relation.scala:56)

at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:224)

at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)

at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.executeQuery(HiveWarehouseSessionImpl.java:62)

at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:26)

at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)

at $line16.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)

at $line16.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:35)

at $line16.$read$$iw$$iw$$iw$$iw.<init>(<console>:37)

at $line16.$read$$iw$$iw$$iw.<init>(<console>:39)

at $line16.$read$$iw$$iw.<init>(<console>:41)

at $line16.$read$$iw.<init>(<console>:43)

at $line16.$read.<init>(<console>:45)

at $line16.$read$.<init>(<console>:49)

at $line16.$read$.<clinit>(<console>)

at $line16.$eval$.$print$lzycompute(<console>:7)

at $line16.$eval$.$print(<console>:6)

at $line16.$eval.$print(<console>)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)

at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)

at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)

at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)

at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)

at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)

at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)

at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)

at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)

at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)

at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)

at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)

at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)

at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)

at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)

at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)

at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)

at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)

at org.apache.spark.repl.Main$.doMain(Main.scala:76)

at org.apache.spark.repl.Main$.main(Main.scala:56)

at org.apache.spark.repl.Main.main(Main.scala)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)

at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)

at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)

at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)

at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)

at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException

at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:465)

at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:309)

at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:905)

at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)

at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)

at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)

at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)

at com.sun.proxy.$Proxy51.fetchResults(Unknown Source)

at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:561)

at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786)

at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)

at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)

at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)

at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)

at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)

at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.io.IOException: java.lang.NullPointerException

at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)

at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2695)

at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)

at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:460)

... 24 more

Caused by: java.lang.NullPointerException: null

at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:272)

at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)

at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)

at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)

at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)

at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)

at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)

at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)

at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)

at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)

at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)

at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)

at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)

... 27 more






Re: Not able to ingest data in hive using spark in HDP3

New Contributor

hive.executeQuery("query") is for read/write operations, but here you are attempting a catalog operation. For catalog operations, you need to use

hive.execute("statement")

Refer to the API documentation here: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouse...

For your specific case of getting the database list, there is a predefined catalog operation:

hive.showDatabases()
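Putting this together, a minimal sketch of the corrected session (this assumes the connector jar and the five properties are already configured as in the question; the table name mydb.mytable in the comment is a placeholder, not from the original post):

```scala
import com.hortonworks.hwc.HiveWarehouseSession

val hive = HiveWarehouseSession.session(spark).build()

// Catalog operation: returns the database list as a DataFrame
hive.showDatabases().show()

// Equivalent catalog statement, using execute (not executeQuery)
hive.execute("show databases").show()

// executeQuery is for SELECT-style reads served through LLAP, e.g.:
// hive.executeQuery("select * from mydb.mytable").show()
```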


By the way, this issue has nothing to do with ingestion. Please make sure the title represents the exact problem.

