HDP 3.1 Hive Connectivity Issue

Explorer

I am migrating the Spark jobs running on HDP 2.6 to HDP 3.1. When executing the Spark jobs on HDP 3.1, I get the following error.

java.util.NoSuchElementException: spark.sql.hive.hiveserver2.jdbc.url
at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.internal.SQLConf.getConfString(SQLConf.scala:1571)
at org.apache.spark.sql.RuntimeConfig.get(RuntimeConfig.scala:74)
at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrlFromConf(HWConf.java:143)
at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrl(HWConf.java:107)
at com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.build(HiveWarehouseBuilder.java:97)
at com.wunderman.hdp.Hdp3MigrationMain.main(Hdp3MigrationMain.java:16)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

This is how I am creating the Spark session:

private static SparkSession getSparkSession() {
	// Create an instance of SparkSession to connect to the cluster
	SparkSession sparkSession = SparkSession.builder()
			.appName("Hdp3 Migration")
			.master("yarn")
			.getOrCreate();
	return sparkSession;
}

But the HiveServer2 JDBC URL is configured in the Spark config. I have added the following dependency in the pom.xml:

<dependency>
	<groupId>com.hortonworks.hive</groupId>
	<artifactId>hive-warehouse-connector_2.11</artifactId>
	<version>1.0.0.3.1.0.0-78</version>
</dependency>

And I am trying to execute the code below:

String hdp3Enabled = args[0];
Dataset<Row> dataset;
String query = "SELECT * FROM schema.tablename WHERE col1='abc'"; // Sample query
try {
	if ("Y".equalsIgnoreCase(hdp3Enabled)) {
		HiveWarehouseSession hive = HiveWarehouseSession.session(sparkSession).build();
		dataset = hive.executeQuery(query);
	} else {
		dataset = sparkSession.sql(query);
	}
	dataset.show();
} catch (Exception e) {
	e.printStackTrace();
}

Please share your suggestions to fix the issue.


Mentor

@eswarloges 

From HDP 3.x onwards, to work with Hive databases you should use the HiveWarehouseConnector library /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar, as shown in the example below.

spark-shell --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://FQDN or IP:10000/" \
  --conf spark.datasource.hive.warehouse.load.staging.dir="/staging_dir" \
  --conf spark.hadoop.hive.zookeeper.quorum="zk_Quorum_ip's:2181" \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar
val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()
hive.showDatabases().show(100, false)

Could you try that and revert?
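
If you are submitting a compiled Java application rather than using spark-shell, the same properties could also be set when the SparkSession is built. The sketch below is only an illustration of that idea; the class name, JDBC URL, staging directory, and ZooKeeper hosts are placeholders to be replaced with your cluster's values.

import org.apache.spark.sql.SparkSession;

import com.hortonworks.hwc.HiveWarehouseSession;

public class HwcJavaExample {

	public static void main(String[] args) {
		// Placeholder values -- substitute your HiveServer2 Interactive JDBC URL,
		// HDFS staging directory, and ZooKeeper quorum.
		SparkSession spark = SparkSession.builder()
				.appName("Hdp3 Migration")
				.master("yarn")
				.config("spark.sql.hive.hiveserver2.jdbc.url", "jdbc:hive2://hiveserver2-host:10000/")
				.config("spark.datasource.hive.warehouse.load.staging.dir", "/staging_dir")
				.config("spark.hadoop.hive.zookeeper.quorum", "zk-host1:2181,zk-host2:2181,zk-host3:2181")
				.getOrCreate();

		// Build the HiveWarehouseSession only after the properties above are visible to the session
		HiveWarehouseSession hive = HiveWarehouseSession.session(spark).build();
		hive.showDatabases().show(100, false);
	}
}

The job is still submitted with spark-submit and --jars pointing at the hive-warehouse-connector assembly jar shown above.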

Explorer

@Shelton I have tried as you suggested and am still getting the same error.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import com.hortonworks.hwc.HiveWarehouseSession;

import com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder;

public class Hdp3MigrationMain extends CommonUtilities {

	public static void main(String[] args) {
		String hdp3Enabled = args[0];
		// sparkSession is created the same way as in getSparkSession() above
		SparkSession sparkSession = SparkSession.builder().appName("Hdp3 Migration").master("yarn").getOrCreate();
		Dataset<Row> dataset;
		String query = "select * from hive_schema.table1";
		try {
			if ("Y".equalsIgnoreCase(hdp3Enabled)) {
				HiveWarehouseSession hive = HiveWarehouseBuilder.session(sparkSession).build();
				dataset = hive.executeQuery(query);
			} else {
				dataset = sparkSession.sql(query);
			}
			dataset.show();
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
}

And the same error occurs:

java.util.NoSuchElementException: spark.sql.hive.hiveserver2.jdbc.url
        at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
        at org.apache.spark.sql.internal.SQLConf$$anonfun$getConfString$2.apply(SQLConf.scala:1571)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.internal.SQLConf.getConfString(SQLConf.scala:1571)
        at org.apache.spark.sql.RuntimeConfig.get(RuntimeConfig.scala:74)
        at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrlFromConf(HWConf.java:143)
        at com.hortonworks.spark.sql.hive.llap.HWConf.getConnectionUrl(HWConf.java:107)
        at com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.build(HiveWarehouseBuilder.java:97)
        at com.wunderman.hdp.Hdp3MigrationMain.main(Hdp3MigrationMain.java:18)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Super Mentor

@eswarloges 

In one of your previous updates you mentioned that "the hiveserver2 jdbc url is configured in the spark config."

However, it looks like the error you are getting occurs because the mentioned properties are not found in the spark2-defaults config that is on your classpath.

So can you please make sure that you have included the correct CLASSPATH pointing to the correct spark-defaults, with the following properties added as described in the "Required properties" section of the following doc: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_configure_a_spar...


You must add several Spark properties through spark2-defaults in Ambari to use the Hive Warehouse Connector for accessing data in Hive.  Alternatively, configuration can be provided for each job using --conf.

 

  • spark.sql.hive.hiveserver2.jdbc.url

    The URL for HiveServer2 Interactive

  • spark.datasource.hive.warehouse.metastoreUri

    The URI for the metastore

  • spark.datasource.hive.warehouse.load.staging.dir

    The HDFS temp directory for batch writes to Hive, /tmp for example

  • spark.hadoop.hive.llap.daemon.service.hosts

    The application name for LLAP service

  • spark.hadoop.hive.zookeeper.quorum

    The ZooKeeper hosts used by LLAP

Set the values of these properties as follows:

  • spark.sql.hive.hiveserver2.jdbc.url

    In Ambari, copy the value from Services > Hive > Summary > HIVESERVER2 INTERACTIVE JDBC URL.

  • spark.datasource.hive.warehouse.metastoreUri

    Copy the value from hive.metastore.uris. In Hive, at the hive> prompt, enter set hive.metastore.uris and copy the output. For example, thrift://mycluster-1.com:9083.

  • spark.hadoop.hive.llap.daemon.service.hosts

    Copy value from Advanced hive-interactive-site > hive.llap.daemon.service.hosts.

  • spark.hadoop.hive.zookeeper.quorum

    Copy the value from Advanced hive-site > hive.zookeeper.quorum.
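
To confirm whether these properties are actually reaching your job at runtime (that is, whether the spark2-defaults on the classpath is being picked up), a small check along the following lines could be run before building the HiveWarehouseSession. This is only a sketch; the class and application names are placeholders, and the property names are the ones listed above.

import org.apache.spark.sql.SparkSession;

public class HwcConfCheck {

	public static void main(String[] args) {
		SparkSession spark = SparkSession.builder()
				.appName("HWC conf check") // placeholder application name
				.master("yarn")
				.getOrCreate();

		// Print each required HWC property, or "NOT SET" if it is missing from
		// the session's runtime configuration.
		String[] requiredProps = {
				"spark.sql.hive.hiveserver2.jdbc.url",
				"spark.datasource.hive.warehouse.metastoreUri",
				"spark.datasource.hive.warehouse.load.staging.dir",
				"spark.hadoop.hive.llap.daemon.service.hosts",
				"spark.hadoop.hive.zookeeper.quorum"
		};
		for (String prop : requiredProps) {
			System.out.println(prop + " = " + spark.conf().get(prop, "NOT SET"));
		}

		spark.stop();
	}
}

If any of these print NOT SET, the job is not seeing the spark2-defaults that contains them, which would explain the NoSuchElementException on spark.sql.hive.hiveserver2.jdbc.url.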




Explorer

@jsensharma All the configs are set properly in our cluster. I am trying to access external Hive tables using a HiveWarehouseSession.

Could the error be because of that, since the documentation says HiveWarehouseSession is not needed for external tables?
