- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Apache SPARK - Overwrite data file
- Labels:
-
Apache Spark
Created ‎09-29-2016 06:38 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi experts, How can I overwrite an existing file by a new one (data update). Imagine that I've this:
- result.map(pair => pair.swap).sortByKey(true).saveAsTextFile("FILE/results")
And Imagine that I want to do this:
- test.map(pair => pair.swap).sortByKey(false).saveAsTextFile("FILE/results")
How can I overwrite the results of the var result to the results of the val test in same directory?
Created ‎10-06-2016 04:46 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
RDD's saveAsTextFile does not give us the opportunity to do that (DataFrame's have "save modes" for things like append/overwrite/ignore). You'll have to control this prior before (maybe delete or rename existing data) or afterwards (write the RDD as a diff dir and then swap it out).
Created ‎10-06-2016 04:46 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
RDD's saveAsTextFile does not give us the opportunity to do that (DataFrame's have "save modes" for things like append/overwrite/ignore). You'll have to control this prior before (maybe delete or rename existing data) or afterwards (write the RDD as a diff dir and then swap it out).
