How to use saveAsTextFiles in spark streaming

Re: How to use saveAsTextFiles in spark streaming

16/04/06 14:09:52 INFO FileInputDStream: Duration for remembering RDDs set to 60000 ms for org.apache.spark.streaming.dstream.FileInputDStream@4bf57335
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.util.ThreadUtils$.runInNewThread$default$2()Z
	at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:606)
	at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:600)

There's a mismatch between the versions of the dependencies and the runtime, so I did the following:
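
What was changed is not shown in the post. As a rough sketch of one common way to line the versions up, assuming an sbt build (the thread does not say which build tool is used), every Spark artifact is pinned to the same version as the cluster and marked "provided" so the cluster's own jars are used at runtime. The version string below is a placeholder, not a value from this thread.

```scala
// build.sbt (sketch only)
// ASSUMPTION: sparkVersion is a placeholder; set it to the version that
// `spark-submit --version` reports on the cluster that runs the job.
val sparkVersion = "1.6.1"

libraryDependencies ++= Seq(
  // "provided" keeps these jars out of the application jar,
  // so the classes come from the cluster's Spark installation.
  "org.apache.spark" %% "spark-core"      % sparkVersion % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided"
)
```

The Scala version of the build should also match the one the cluster's Spark was built with, since `%%` appends it to the artifact name.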


Now I am getting the following error:

16/04/07 11:23:56 WARN FileInputDStream: Error finding new files Incomplete HDFS URI, no host: "/root/file/test"
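
The "no host" part of the message suggests the path was resolved against HDFS without a NameNode host. One thing to try is passing a fully qualified URI to textFileStream. The sketch below only uses java.net.URI to show which path forms actually carry a host. The host namenode.example.com and port 8020 are placeholders, not values from this thread, and hasSchemeAndHost is a helper name made up for the example.

```scala
import java.net.URI

object HdfsUriCheck {
  // True when the path names both a scheme (e.g. "hdfs") and a host.
  def hasSchemeAndHost(path: String): Boolean = {
    val uri = new URI(path)
    uri.getScheme != null && uri.getHost != null
  }

  def main(args: Array[String]): Unit = {
    val candidates = Seq(
      "/root/file/test",                                 // no scheme, no host
      "hdfs:///root/file/test",                          // scheme, but no host
      "hdfs://namenode.example.com:8020/root/file/test"  // placeholder host and port
    )
    candidates.foreach { p =>
      println(s"$p -> scheme and host present: ${hasSchemeAndHost(p)}")
    }
  }
}
```

Only the last form names a host. If the files are on the local filesystem rather than HDFS, a file:// URI is the usual alternative.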


Re: How to use saveAsTextFiles in spark streaming


I don't know why, but I re-ran it and it works. However, all I get in the directory file1 is an empty _SUCCESS file.

Here is the complete code:

    def main(args: Array[String]) {
      val conf = new SparkConf()
        .setAppName("File Count")

      val sc = new SparkContext(conf)
      val ssc = new StreamingContext(sc, Seconds(1))
      val file = ssc.textFileStream("/root/file/test/file")
      file.foreachRDD(t => {
        // Count how often each "first;second" pair of space-separated fields occurs
        val test = t.map(x => (x.split(" ")(0)+";"+x.split(" ")(1), 1)).reduceByKey((x,y) => x+y)


Re: How to use saveAsTextFiles in spark streaming


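That's because no new files are arriving in the directory after the streaming application starts. textFileStream only picks up files that appear in the directory while the application is running, so files that were already there are ignored.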


You can try "cp" to drop files in the directory after starting the streaming application.
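
For reference, here is a minimal sketch of how the whole job could look with saveAsTextFiles on the DStream itself instead of saving inside foreachRDD. It is not the original poster's code: the two command-line arguments (input directory and output prefix) and the pairKey helper are assumptions made for the example, and unlike the code above it skips lines with fewer than two fields instead of failing on them.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FileCount {
  // Builds the "first;second" key from a space-separated line.
  // Returns None when the line has fewer than two fields.
  def pairKey(line: String): Option[String] = {
    val fields = line.split(" ")
    if (fields.length >= 2) Some(fields(0) + ";" + fields(1)) else None
  }

  def main(args: Array[String]): Unit = {
    // ASSUMPTION: both paths are passed as arguments, ideally as fully qualified URIs.
    val Array(inputDir, outputPrefix) = args

    val conf = new SparkConf().setAppName("File Count")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Only files that show up in inputDir after start() are read.
    val counts = ssc.textFileStream(inputDir)
      .flatMap(line => pairKey(line).toList)
      .map(key => (key, 1))
      .reduceByKey(_ + _)

    // Writes one output directory per batch, named "<outputPrefix>-<batch time in ms>.txt".
    counts.saveAsTextFiles(outputPrefix, "txt")

    ssc.start()
    ssc.awaitTermination()
  }
}
```

A batch in which no new file arrived still produces an output directory, typically holding nothing but the _SUCCESS marker, which matches what was described above. Start the job first, then copy or move a file into the input directory and check the directory written for the following batch.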