
HDP 3.1 Hive Connectivity Issue

Contributor

I am migrating Spark jobs running on HDP 2.6 to HDP 3.1. When executing the Spark jobs on HDP 3.1, I get the following error.

java.util.NoSuchElementException: spark.sql.hive.hiveserver2.jdbc.url
at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.internal.SQLConf.getConfString(SQLConf.scala:1571)
at org.apache.spark.sql.RuntimeConfig.get(RuntimeConfig.scala:74)
at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrlFromConf(HWConf.java:143)
at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrl(HWConf.java:107)
at com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.build(HiveWarehouseBuilder.java:97)
at com.wunderman.hdp.Hdp3MigrationMain.main(Hdp3MigrationMain.java:16)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

This is how I am creating the Spark session:

private static SparkSession getSparkSession() {
    // Create an instance of SparkSession to connect to the cluster
    SparkSession sparkSession = SparkSession.builder()
            .appName("Hdp3 Migration")
            .master("yarn")
            .getOrCreate();
    return sparkSession;
}

But the HiveServer2 JDBC URL is configured in the Spark config. I have added the following dependency in the pom.xml:

<dependency>
    <groupId>com.hortonworks.hive</groupId>
    <artifactId>hive-warehouse-connector_2.11</artifactId>
    <version>1.0.0.3.1.0.0-78</version>
</dependency>

And I am trying to execute the code below:

String hdp3Enabled = args[0];
Dataset<Row> dataset;
String query = "SELECT * FROM schema.tablename WHERE col1 = 'abc'"; // Sample query
try {
    if ("Y".equalsIgnoreCase(hdp3Enabled)) {
        HiveWarehouseSession hive = HiveWarehouseSession.session(sparkSession).build();
        dataset = hive.executeQuery(query);
    } else {
        dataset = sparkSession.sql(query);
    }
    dataset.show();
} catch (Exception e) {
    e.printStackTrace();
}

Please share your suggestions to fix this issue.

4 REPLIES

Master Mentor

@eswarloges 

From HDP 3.x onwards, to work with Hive databases you should use the HiveWarehouseConnector library /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar, as shown in the example below:

spark-shell --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://FQDN or IP:10000/" \
  --conf spark.datasource.hive.warehouse.load.staging.dir="/staging_dir" \
  --conf spark.hadoop.hive.zookeeper.quorum="zk_Quorum_ip's:2181" \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar
val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()
hive.showDatabases().show(100, false)
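
If you are submitting a Java application rather than using spark-shell, pass the same --conf values and the assembly jar to spark-submit. As a rough diagnostic sketch (HwcConfCheck is just an illustrative class name, not part of your code), you can confirm that the property named in the stack trace is actually visible to the session before the HiveWarehouseSession is built:

import org.apache.spark.sql.SparkSession;

public class HwcConfCheck {

    public static void main(String[] args) {
        // Rough check: read the HWC JDBC URL from the runtime configuration.
        // "NOT SET" means the --conf / spark2-defaults value was not picked up,
        // which is exactly what the NoSuchElementException above indicates.
        SparkSession spark = SparkSession.builder()
                .appName("HWC conf check")
                .master("yarn")
                .getOrCreate();

        String jdbcUrl = spark.conf().get("spark.sql.hive.hiveserver2.jdbc.url", "NOT SET");
        System.out.println("spark.sql.hive.hiveserver2.jdbc.url = " + jdbcUrl);

        spark.stop();
    }
}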

Could you try that and report back?

Contributor

@Shelton I have tried what you suggested and am still getting the same error.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import com.hortonworks.hwc.HiveWarehouseSession;
import com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder;

public class Hdp3MigrationMain extends CommonUtilities {

    public static void main(String[] args) {
        String hdp3Enabled = args[0];
        SparkSession sparkSession = getSparkSession(); // built as shown in my first post
        Dataset<Row> dataset;
        String query = "select * from hive_schema.table1";
        try {
            if ("Y".equalsIgnoreCase(hdp3Enabled)) {
                HiveWarehouseSession hive = HiveWarehouseBuilder.session(sparkSession).build();
                dataset = hive.executeQuery(query);
            } else {
                dataset = sparkSession.sql(query);
            }
            dataset.show();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

And the same error occurs.

java.util.NoSuchElementException: spark.sql.hive.hiveserver2.jdbc.url
        at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
        at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.internal.SQLConf.getConfString(SQLConf.scala:1571)
        at org.apache.spark.sql.RuntimeConfig.get(RuntimeConfig.scala:74)
        at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrlFromConf(HWConf.java:143)
        at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrl(HWConf.java:107)
        at com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.build(HiveWarehouseBuilder.java:97)
        at com.wunderman.hdp.Hdp3MigrationMain.main(Hdp3MigrationMain.java:18)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Master Mentor

@eswarloges 

In one of your previous updates you mentioned that "the hiveserver2 jdbc url is configured in the spark config."

However, it looks like the error you are getting is because the mentioned property is not found in the spark2-defaults config that is on your classpath.

So can you please make sure that you have included the correct CLASSPATH, pointing to the correct spark-defaults, with the following properties added as mentioned in the "Required properties" section of this doc: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_configure_a_spar...

You must add several Spark properties through spark2-defaults in Ambari to use the Hive Warehouse Connector for accessing data in Hive. Alternatively, the configuration can be provided for each job using --conf, or set programmatically on the SparkSession builder, as sketched after the lists below.

  • spark.sql.hive.hiveserver2.jdbc.url

    The URL for HiveServer2 Interactive

  • spark.datasource.hive.warehouse.metastoreUri

    The URI for the metastore

  • spark.datasource.hive.warehouse.load.staging.dir

    The HDFS temp directory for batch writes to Hive, /tmp for example

  • spark.hadoop.hive.llap.daemon.service.hosts

    The application name for LLAP service

  • spark.hadoop.hive.zookeeper.quorum

    The ZooKeeper hosts used by LLAP

Set the values of these properties as follows:

  • spark.sql.hive.hiveserver2.jdbc.url

    In Ambari, copy the value from Services > Hive > Summary > HIVESERVER2 INTERACTIVE JDBC URL.

  • spark.datasource.hive.warehouse.metastoreUri

    Copy the value from hive.metastore.uris. In Hive, at the hive> prompt, enter set hive.metastore.uris and copy the output. For example, thrift://mycluster-1.com:9083.

  • spark.hadoop.hive.llap.daemon.service.hosts

    Copy value from Advanced hive-interactive-site > hive.llap.daemon.service.hosts.

  • spark.hadoop.hive.zookeeper.quorum

    Copy the value from Advanced hive-site > hive.zookeeper.quorum.
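
As a rough sketch only (the JDBC URL, metastore URI, LLAP application name, ZooKeeper hosts, and staging directory below are placeholders, not values from your cluster), the same properties can also be set programmatically where the SparkSession is built, for example in the getSparkSession() method shown in the first post, instead of relying on whichever spark2-defaults ends up on the classpath:

import org.apache.spark.sql.SparkSession;

// Placeholder values only; copy the real ones from Ambari as described above.
SparkSession sparkSession = SparkSession.builder()
        .appName("Hdp3 Migration")
        .master("yarn")
        .config("spark.sql.hive.hiveserver2.jdbc.url",
                "jdbc:hive2://llap-host.example.com:10500/")
        .config("spark.datasource.hive.warehouse.metastoreUri",
                "thrift://metastore-host.example.com:9083")
        .config("spark.datasource.hive.warehouse.load.staging.dir", "/tmp")
        .config("spark.hadoop.hive.llap.daemon.service.hosts", "@llap0")
        .config("spark.hadoop.hive.zookeeper.quorum",
                "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181")
        .getOrCreate();

Whichever way the properties are supplied, they must be visible to the session before HiveWarehouseBuilder.build() is called, since that is where HWConf reads the JDBC URL (as your stack trace shows).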

Contributor

@jsensharma All the configs are set properly in our cluster. I am trying to access external Hive tables using the HiveWarehouseSession.

Could the error be because of that, given the documentation says HiveWarehouseSession is not needed for external tables?