Support Questions


How do I append to files when writing to HDFS from Spark?

New Contributor

This is my code:

JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
        jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
        kafkaParams, topicsSet);

JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
    @Override
    public String call(Tuple2<String, String> tuple2) {
        return tuple2._2();
    }
});

lines.dstream().saveAsTextFiles(pathtohdfs);

This generates a new set of files in HDFS for every batch. I need all the output appended into a single file instead. How can I do that?

1 Reply

Expert Contributor

@Apoorva Teja Vanam: It doesn't look like there is a straightforward approach to this. Have you checked http://stackoverflow.com/questions/37017366/how-can-i-make-spark1-6-saveastextfile-to-append-existin... ?
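
If it helps, one possible workaround is to bypass saveAsTextFiles and write each batch yourself inside foreachRDD, appending to a single HDFS file through the FileSystem API. The following is only a minimal sketch, assuming Spark 1.6's foreachRDD(VoidFunction) overload and that HDFS append is enabled on your cluster; the target path /user/spark/streaming-output.txt is just a placeholder, and collecting each batch to the driver is only reasonable for small batches:

import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.VoidFunction;

lines.foreachRDD(new VoidFunction<JavaRDD<String>>() {
    @Override
    public void call(JavaRDD<String> rdd) throws Exception {
        if (rdd.isEmpty()) {
            return;                                  // nothing to write for this batch
        }
        FileSystem fs = FileSystem.get(new Configuration());
        Path target = new Path("/user/spark/streaming-output.txt");  // placeholder path

        // Append if the file already exists, otherwise create it on the first batch.
        FSDataOutputStream out = fs.exists(target) ? fs.append(target) : fs.create(target);
        try {
            // collect() brings the whole batch to the driver -- only for small batches.
            for (String line : rdd.collect()) {
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            }
        } finally {
            out.close();
        }
    }
});

Alternatively, you can keep saveAsTextFiles as it is and merge the per-batch output afterwards, for example with hadoop fs -getmerge, if you only need a single file at the end of the job.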