Explorer
Posts: 21
Registered: ‎02-08-2016

Twitter Spark Streaming Issue

Hello all,

I am getting a NullPointerException while reading the location field.

Can you please help me resolve it?

Code:

import org.apache.spark._
import org.apache.spark.SparkContext._
import org.apache.spark.streaming._
import org.apache.spark.streaming.twitter._
import org.apache.spark.streaming.StreamingContext._
import org.apache.log4j.Level
object PrintTweets {
  def main(args: Array[String]) {
    type TweetText = String
    type Sentence = Seq[String]
    val sparkConfiguration = new SparkConf().
      setAppName("spark-twitter-stream").
      setMaster("local[*]")
    val sparkContext = new SparkContext(sparkConfiguration)
    val streamingContext = new StreamingContext(sparkContext, Seconds(5))
    val filters = Seq("olympic")
    // Creating a stream from Twitter
    val tweets =
      TwitterUtils.createStream(streamingContext, None)
    val alldata_part1 = tweets.map { status =>
      (
        status.getContributors.toString(),
        status.getCreatedAt.toString(),
        status.getCurrentUserRetweetId.toString(),
        status.getGeoLocation)
    }
    val Tweet_CreatedAt = alldata_part1.map(Fetchfiled => Fetchfiled._2)
    // val Tweet_CurrentUserRetweetId = alldata_part1.map(Fetchfiled => Fetchfiled._3)
    val Tweet_GeoLocation = alldata_part1.map(Fetchfiled => Fetchfiled._4)
    Tweet_CreatedAt.print()
    streamingContext.start()
    streamingContext.awaitTermination()
  }
}

 

Error: 


16/08/17 10:51:55 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.NullPointerException
	at org.test.sparktest.PrintTweets$$anonfun$1.apply(Printtweets.scala:50)
	at org.test.sparktest.PrintTweets$$anonfun$1.apply(Printtweets.scala:44)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:370)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:370)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:354)
	at scala.collection.Iterator$class.foreach(Iterator.scala:750)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:295)
	at scala.collection.AbstractIterator.to(Iterator.scala:1202)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:287)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1202)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:274)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1202)
	at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$28.apply(RDD.scala:1276)
	at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$28.apply(RDD.scala:1276)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
	at org.apache.spark.scheduler.Task.run(Task.scala:70)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
16/08/17 10:51:55 ERROR TaskSetManager: Task 0 in stage 1.0 failed 1 times; aborting job
16/08/17 10:51:55 ERROR JobScheduler: Error running job streaming job 1471411315000 ms.0

 

Cloudera Employee
Posts: 481
Registered: ‎08-11-2014

Re: Twitter Spark Streaming Issue

This is going to be an error in your code rather than Spark, as you can see from the stack trace. It's hard to figure out which line number is which in this paste. I assume you're dereferencing a null field from the results you get back. You need to debug your code.
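
To illustrate what "dereferencing a null field" can look like and one way to guard against it, here is a minimal sketch. It does not use Spark or Twitter4J: FakeStatus is a hypothetical stand-in for the tweet object, with fields that may be null, and the fallback values ("" and "unknown") are arbitrary choices. As far as I recall, Twitter4J documents getContributors and getGeoLocation as possibly returning null, so those would be the first getters to check in the posted code, but that is an assumption you should verify against the stack trace. The idea is to wrap each possibly-null reference in Option before calling a method such as toString on it.

```scala
// Hypothetical stand-in for a tweet object whose getters may return null.
// It is NOT the real twitter4j.Status; it only lets the sketch run on its own.
final case class FakeStatus(
  contributors: Array[Long],   // may be null when a tweet has no contributors
  createdAt: java.util.Date,   // may be null in this sketch
  geoLocation: String          // may be null when a tweet carries no location
)

object NullSafeFields {
  // Wrap each possibly-null reference in Option before calling methods on it.
  // Option(null) is None, so map is skipped and getOrElse supplies a default.
  def extract(s: FakeStatus): (String, String, String) = (
    Option(s.contributors).map(_.mkString(",")).getOrElse(""),
    Option(s.createdAt).map(_.toString).getOrElse(""),
    Option(s.geoLocation).getOrElse("unknown")
  )

  def main(args: Array[String]): Unit = {
    // A record with two missing fields, similar to a tweet without geo data.
    val sparse = FakeStatus(null, new java.util.Date(0L), null)
    // Calling sparse.contributors.toString directly would throw a
    // NullPointerException; the Option-based version falls back to defaults.
    println(extract(sparse))
  }
}
```

In the original job the same pattern would go inside the tweets.map { status => ... } block, for example Option(status.getGeoLocation) instead of using the raw value, so that a tweet with a missing field no longer fails the whole batch.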