How to append files when writing to HDFS from Spark?
Labels:
- Apache Hadoop
- Apache Spark
Created 03-16-2017 03:25 AM
This is my code:
import kafka.serializer.StringDecoder;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.kafka.KafkaUtils;
import scala.Tuple2;

// Direct Kafka stream: one (key, value) pair per message.
JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
    jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
    kafkaParams, topicsSet);

// Keep only the message value.
JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
  @Override
  public String call(Tuple2<String, String> tuple2) {
    return tuple2._2();
  }
});

// Note: when called from Java, saveAsTextFiles needs both a prefix and a suffix,
// because Scala default arguments are not visible to Java callers.
lines.dstream().saveAsTextFiles(pathtohdfs, "");
This generates a new set of files in HDFS on every batch interval. I need everything appended into a single file instead. How can I do that?
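For context: saveAsTextFiles(prefix, suffix) writes a separate directory per batch interval, named prefix-TIME_IN_MS[.suffix], each containing part-NNNNN files. With a hypothetical prefix of /user/demo/output, two consecutive batches would produce something like:

/user/demo/output-1489657500000/part-00000
/user/demo/output-1489658000000/part-00000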
Created 03-16-2017 04:38 AM
@Apoorva Teja Vanam: It doesn't look like there is a straightforward approach to this. Have you checked http://stackoverflow.com/questions/37017366/how-can-i-make-spark1-6-saveastextfile-to-append-existin... ?
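As a sketch of the kind of workaround discussed there: instead of saveAsTextFiles(), each micro-batch can be appended to a single HDFS file via the Hadoop FileSystem API inside foreachRDD. This is only a minimal illustration, not a supported pattern: it assumes HDFS append is enabled (dfs.support.append), that each batch is small enough to collect on the driver, and the target path /user/demo/stream.log is hypothetical.

import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.VoidFunction;

// Instead of saveAsTextFiles(), append every micro-batch to one HDFS file.
lines.foreachRDD(new VoidFunction<JavaRDD<String>>() {
  @Override
  public void call(JavaRDD<String> rdd) throws Exception {
    List<String> batch = rdd.collect();  // runs on the driver; small batches only
    if (batch.isEmpty()) {
      return;
    }
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path target = new Path("/user/demo/stream.log");  // hypothetical target file
    // Create the file on the first batch, append on later ones.
    FSDataOutputStream out = fs.exists(target) ? fs.append(target) : fs.create(target);
    try {
      for (String line : batch) {
        out.writeBytes(line + "\n");
      }
    } finally {
      out.close();
    }
  }
});

Alternatively, the per-batch part files that saveAsTextFiles produces can be merged after the fact, for example with hadoop fs -getmerge or FileUtil.copyMerge.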
