
How to append to a single file when writing to HDFS from Spark?

This is my code:

JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
    jssc,
    String.class, String.class,
    StringDecoder.class, StringDecoder.class,
    kafkaParams, topicsSet);

JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
    @Override
    public String call(Tuple2<String, String> tuple2) {
        return tuple2._2();
    }
});

lines.dstream().saveAsTextFiles(pathtohdfs);

This generates a new set of files in HDFS for every batch. I need all of the output appended to a single file. How can I do that?

Re: How to append to a single file when writing to HDFS from Spark?

@Apoorva Teja Vanam: There doesn't appear to be a straightforward way to do this. saveAsTextFiles writes a new output directory for every batch interval and has no append mode. Have you checked: http://stackoverflow.com/questions/37017366/how-can-i-make-spark1-6-saveastextfile-to-append-existin...
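
For what it's worth, the general pattern people tend to reach for is to skip saveAsTextFiles and append each micro-batch to one target file themselves. Below is a minimal sketch of just that append step, written against a local file with java.nio as a stand-in for HDFS so it runs without a cluster. BatchAppender and appendBatch are hypothetical names, not part of Spark or Hadoop. In a real streaming job the equivalent step would presumably run inside foreachRDD and write through Hadoop's FileSystem.append instead, which assumes appends are supported on your cluster and that only one writer touches the file at a time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper for illustration only; not a Spark or Hadoop class.
class BatchAppender {

    // Appends one micro-batch of lines to a single target file.
    // The file is created on the first call and extended on later calls.
    // Assumption: a local java.nio Path stands in for an HDFS path here.
    static void appendBatch(Path target, List<String> batch) throws IOException {
        if (batch.isEmpty()) {
            return; // nothing arrived in this interval, so leave the file alone
        }
        // CREATE + APPEND: open the existing file (or create it) and write at the end.
        // Each element is written as one line followed by the line separator.
        Files.write(target, batch, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

Calling appendBatch once per batch with the same target path keeps all records in one file, in arrival order. The trade-off is that the write is funneled through a single writer, so it only makes sense when each batch is small enough to collect in one place.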
