Created 12-17-2016 08:20 PM
Creating a Spark application with SparkSQL inside:
package SparkSamplePackage

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql._
import org.apache.spark.sql.hive._

object SparkSampleClass {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Sample App")
    conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    conf.set("spark.speculation", "true")
    val sc = new SparkContext(conf)

    // HiveContext is required to query Hive tables such as hr.managers
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import sqlContext.implicits._

    val sampleDF = sqlContext.sql("select code, salary from hr.managers limit 10")
    sampleDF.collect.foreach(println)

    sc.stop()
  }
}
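For reference, the same lookup can also be expressed with the DataFrame API instead of a raw SQL string. This is only a minimal sketch, assuming the same hr.managers Hive table and the sqlContext created above:

// Minimal sketch (assumes the hr.managers table and the sqlContext defined above)
val managersDF = sqlContext.table("hr.managers")  // load the Hive table as a DataFrame
  .select("code", "salary")                       // keep only the two columns of interest
  .limit(10)                                      // same limit as the SQL version
managersDF.show()                                 // print the rows in the driver log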
The spark-submit command looks like this. The cluster runs the latest HDP 2.5.3, and Spark is 1.6.2:
spark-submit \
  --class SparkSamplePackage.SparkSampleClass \
  --master yarn-cluster \
  --num-executors 2 \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 1 \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  target/SparkSample-1.0-SNAPSHOT.jar
I am getting the following error complaining that the Hive metastore client cannot be instantiated:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Please advise on how to address the issue.
Created 12-17-2016 08:30 PM
Hi @Sherry Noah
Can you try passing the DataNucleus jars via --jars? In yarn-cluster mode the Hive metastore client needs them on the classpath of the driver running in the cluster:
spark-submit \
  --class SparkSamplePackage.SparkSampleClass \
  --master yarn-cluster \
  --num-executors 2 \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 1 \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  --jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar \
  target/SparkSample-1.0-SNAPSHOT.jar
Created 12-17-2016 08:35 PM
@Ryan Cicak Yes, it works. Thx