05-18-2017 02:07 AM
Thanks for sharing the code of your solution. I've also found that simply making the HiveContext variable lazy works (the lazy val defers creation of the HiveContext until it is first referenced inside the foreachRDD closure):

import org.apache.spark.SparkConf
import org.apache.spark.sql.Row
import org.apache.spark.streaming.{Seconds, StreamingContext}

val sparkConf = new SparkConf().setAppName("StreamHDFSdata")
sparkConf.set("spark.dynamicAllocation.enabled", "false")
val ssc = new StreamingContext(sparkConf, Seconds(5))
ssc.checkpoint("/user/hdpuser/checkpoint")
val sc = ssc.sparkContext

// Each line arriving in the monitored directory is split on ";" and wrapped in a Row
val smDStream = ssc.textFileStream("/user/hdpuser/data")
val smSplitted = smDStream.map( x => x.split(";") ).map( x => Row.fromSeq( x ) )
...
// Declared lazy so the HiveContext is only instantiated on first use
lazy val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
smSplitted.foreachRDD( rdd => {
  // use sqlContext here
} )
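For what it's worth, here is a minimal sketch of the kind of thing the foreachRDD body could do with the lazy sqlContext. The two-column schema and the target table name "stream_table" are assumptions for illustration only, not part of the original solution.

// Hypothetical usage sketch: convert each micro-batch of Rows into a
// DataFrame and append it to a Hive table. The schema below is assumed
// and must match the number of fields produced by split(";").
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val schema = StructType(Seq(
  StructField("col1", StringType),
  StructField("col2", StringType)
))

smSplitted.foreachRDD( rdd => {
  if (!rdd.isEmpty()) {
    val df = sqlContext.createDataFrame(rdd, schema)
    df.write.mode("append").saveAsTable("stream_table")
  }
} )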