How to append to files when writing to HDFS from Spark?

New Contributor

This is my code:

// Direct Kafka stream (Spark 1.x streaming API)
JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
        jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
        kafkaParams, topicsSet);

// Keep only the message value from each (key, value) pair
JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
    @Override
    public String call(Tuple2<String, String> tuple2) { return tuple2._2(); }
});

// Write every batch out to HDFS
lines.dstream().saveAsTextFiles(pathtohdfs);

This is generating a new set of files in HDFS for every batch. I need all of the output appended into a single file instead. How can I do that?

1 REPLY

Expert Contributor

@Apoorva Teja Vanam: It doesn't look like there is a straightforward approach to this. Have you checked http://stackoverflow.com/questions/37017366/how-can-i-make-spark1-6-saveastextfile-to-append-existin... ?
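
One common workaround is to skip saveAsTextFiles() and use foreachRDD() to append each batch to a single HDFS file through the Hadoop FileSystem append API. Below is a minimal sketch, assuming Spark 1.6+ (for the VoidFunction overload of foreachRDD), that HDFS append is enabled (dfs.support.append=true), and that only one writer touches the file; the target path is just an example:

import java.io.PrintWriter;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.VoidFunction;

// Append every micro-batch to one HDFS file instead of calling saveAsTextFiles().
lines.foreachRDD(new VoidFunction<JavaRDD<String>>() {
    @Override
    public void call(JavaRDD<String> rdd) throws Exception {
        if (rdd.isEmpty()) {
            return;                         // nothing arrived in this batch
        }
        List<String> batch = rdd.collect(); // bring the batch to the driver (small batches only)

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path target = new Path("/user/spark/output/all-lines.txt"); // example path

        // Create the file on the first batch, append on every later one.
        try (PrintWriter out = new PrintWriter(
                fs.exists(target) ? fs.append(target) : fs.create(target))) {
            for (String line : batch) {
                out.println(line);
            }
        }
    }
});

If true appending isn't required, another option is to keep saveAsTextFiles() and merge the per-batch output afterwards, e.g. with hadoop fs -getmerge or FileUtil.copyMerge.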