Member since: 08-17-2019
Posts: 5
Kudos Received: 1
Solutions: 0
12-17-2016
08:35 PM
@Ryan Cicak Yes, it works. Thx
12-17-2016
08:20 PM
I created a Spark application with Spark SQL inside the package SparkSamplePackage:
package SparkSamplePackage

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql._
import org.apache.spark.sql.hive._

object SparkSampleClass {
  def main(args: Array[String]) {
    // Configure the application; enable Kryo serialization and speculative execution
    val conf = new SparkConf().setAppName("Spark Sample App")
    conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    conf.set("spark.speculation", "true")
    val sc = new SparkContext(conf)

    // HiveContext is needed to query Hive tables through Spark SQL
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import sqlContext.implicits._

    // Query a Hive table and print the rows on the driver
    val sampleDF = sqlContext.sql("select code, salary from hr.managers limit 10")
    sampleDF.collect.foreach(println)

    sc.stop()
  }
}
The spark-submit command looks like this (the cluster runs the latest HDP 2.5.3 with Spark 1.6.2):
spark-submit \
--class SparkSamplePackage.SparkSampleClass \
--master yarn-cluster \
--num-executors 2 \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
--files /usr/hdp/current/spark-client/conf/hive-site.xml \
target/SparkSample-1.0-SNAPSHOT.jar
I'm getting the following error complaining that the Hive metastore client cannot be instantiated:
token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: User class threw exception: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Please advise on how to address the issue.
Labels: Apache Spark
07-22-2016
08:45 PM
1 Kudo
How do I save the data inside a DataFrame to a text file in CSV format on HDFS? I tried the following, but csv doesn't seem to be a supported format:
df.write.format("csv").save("/filepath")
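For context, on Spark 1.6 (the version shipped with HDP at the time) the csv source is not built in; a minimal sketch of a workaround, assuming the Databricks spark-csv package has been added to the job (e.g. with --packages com.databricks:spark-csv_2.10:1.5.0 on spark-submit or spark-shell):
// Write the DataFrame to HDFS as CSV using the spark-csv data source
df.write
  .format("com.databricks.spark.csv")
  .option("header", "true")   // optional: include a header row
  .save("/filepath")
On Spark 2.x the built-in writer can be used directly instead, e.g. df.write.csv("/filepath").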
Labels: Apache Spark
05-18-2016
09:31 PM
Thanks Qi, adding quotes solved the problem.
05-18-2016
09:27 PM
I set up Hive HA by installing 2 HiveServer2 instances in the cluster, but I could not connect with beeline following the instructions in http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hadoop-ha/content/ha-hs2-service-discovery.html
The command I used looks like this:
beeline -u jdbc:hive2://zk01:2181,zk02:2181,zk03:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2 -n hive -p hadoop
Thanks,
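Given the follow-up above, where adding quotes resolved the problem, the fix is presumably just quoting the JDBC URL so the shell does not treat the semicolons as command separators; a sketch of the quoted invocation:
beeline -u "jdbc:hive2://zk01:2181,zk02:2181,zk03:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -n hive -p hadoop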
Labels: Apache Hive