Created 12-17-2016 08:20 PM
Create a Spark application with SparkSQL inside
package SparkSamplePackage

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql._
import org.apache.spark.sql.hive._

object SparkSampleClass {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Sample App")
    conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    conf.set("spark.speculation", "true")
    val sc = new SparkContext(conf)
    // HiveContext is needed to read tables registered in the Hive metastore
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import sqlContext.implicits._
    val sampleDF = sqlContext.sql("select code, salary from hr.managers limit 10")
    sampleDF.collect.foreach(println)
    sc.stop()
  }
}
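For reference, the same query can also be expressed with the Spark 1.6 DataFrame API instead of raw SQL; a minimal sketch, assuming the same hr.managers table is visible through the HiveContext:

// Equivalent to the SQL above, written with the DataFrame API
val managersDF = sqlContext.table("hr.managers") // resolves the table via the Hive metastore
  .select("code", "salary")                      // same projection as the SQL version
  .limit(10)                                     // same row cap
managersDF.show()                                // prints a sample without collecting everything to the driver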
The spark-submit command looks like this. The cluster is running the latest HDP 2.5.3 with Spark 1.6.2.
spark-submit \
--class SparkSamplePackage.SparkSampleClass \
--master yarn-cluster \
--num-executors 2 \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
--files /usr/hdp/current/spark-client/conf/hive-site.xml \
target/SparkSample-1.0-SNAPSHOT.jar
I am getting the following error complaining that it is not able to instantiate the metastore client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Please advise on how to address the issue.
Created 12-17-2016 08:30 PM
Hi @Sherry Noah
Can you try passing the DataNucleus jars with --jars? In yarn-cluster mode the driver does not pick them up from the local Spark client lib directory, and HiveContext needs them to instantiate the metastore client:
spark-submit \
--class SparkSamplePackage.SparkSampleClass \
--master yarn-cluster \
--num-executors 2 \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
--files /usr/hdp/current/spark-client/conf/hive-site.xml \
--jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar \
target/SparkSample-1.0-SNAPSHOT.jar
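As a quick sanity check, you can also verify the metastore connection from spark-shell launched with the same --files and --jars options; a minimal sketch (sqlContext in HDP's Spark 1.6 shell should already be a HiveContext):

// Should list the hr database if hive-site.xml and the DataNucleus jars are picked up
sqlContext.sql("show databases").collect().foreach(println)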
Created 12-17-2016 08:35 PM
@Ryan Cicak Yes, it works. Thx