Reply
Highlighted
New Contributor
Posts: 2
Registered: ‎02-14-2018

Spark Streaming doesn't receive any data from Flume

created a simple spark streaming application to consume data from Flume using Pull-based approach.

It works well when I run the program from my PC in Eclipse (Run As - Scala Application). But after compiling it into jar and submit the app via spark-submit, it's not receiving any data from Flume. Here's my code:

 

def main(args: Array[String]){
  val conf = new SparkConf().setAppName("twitter").set("spark.streaming.stopGracefullyOnShutdown", "true")
  val ssc = new StreamingContext(conf, Seconds(30))
  val flumeStream = FlumeUtils.createPollingStream(ssc, "172.31.190.31", 9999)
  val tweets = flumeStream.map(e => new String(e.event.getBody.array()))  tweets.print()  tweets.foreachRDD(rdd=>{    rdd.saveAsTextFile("/warehouse/raw/twitter/data")
  })  ssc.start()  ssc.awaitTermination()
}

 

And here's how I submit the app:

spark-submit --master local[*] --deploy-mode client --class co.id.linknet.general.StreamingFlume ./spark/lib/linknet-general-1.0.1.jar

 

Flume config:

TwitterAgent01.sources = Twitter
TwitterAgent01.channels = MemoryChannel01
TwitterAgent01.sinks = HDFS

TwitterAgent01.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent01.sources.Twitter.channels = MemoryChannel01
TwitterAgent01.sources.Twitter.consumerKey = xxxTwitterAgent01.sources.Twitter.consumerSecret = xxxTwitterAgent01.sources.Twitter.accessToken = xxxTwitterAgent01.sources.Twitter.accessTokenSecret = xxxTwitterAgent01.sources.Twitter.keywords = keyword1, keyword2, keywordN

TwitterAgent01.sinks = sparkStreamTwitterAgent01.sinks.sparkStream.type = org.apache.spark.streaming.flume.sink.SparkSink
TwitterAgent01.sinks.sparkStream.hostname = edge01TwitterAgent01.sinks.sparkStream.port = 9999
TwitterAgent01.sinks.sparkStream.channel = MemoryChannel01

TwitterAgent01.channels.MemoryChannel01.type = memoryTwitterAgent01.channels.MemoryChannel01.capacity = 10000
TwitterAgent01.channels.MemoryChannel01.transactionCapacity = 10000

 

Flume and spark submission are in the same server, I'm able to telnet port 9999 from itself.

Did I miss something ?

 

Contributor
Posts: 41
Registered: ‎06-24-2018

Re: Spark Streaming doesn't receive any data from Flume

were you able to solve this issue?

Announcements