I was able to read a text file from HDFS.
But when I try to write dummy data to HDFS, the data seems to be stored in sequence format instead of text format.
Am I doing something wrong, or is there no direct way to write a file in text format? A workaround could be to create a local file and then use the command hdfs.put to upload it to HDFS.
Thanks in advance.
The rhdfs package has a put function. You should be able to simply write the file using it. See the following link (accepted answer):
https://community.hortonworks.com/questions/36583/how-to-save-data-in-hdfs-using-r.html
Here is how you should do it:
localData <- system.file(file.path("unitTestData", "AirlineDemo1kNoMissing.csv"), package = "rhdfs")
hdfs.mkdir("/test/airline")
hdfs.put(localData, "/test/airline/AirlineDemo1kNoMissing.csv")
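If the data only exists in the R session, the same approach should work by writing it to a temporary local file first. This is a minimal sketch in base R. The upload step is left commented out because it needs a working Hadoop setup with rhdfs initialized, and the target path "/test/dummy.csv" is a hypothetical name.

```r
# Dummy data that currently lives only in the R session.
rows <- c("id,value", "1,hello", "2,world")

# Write it as plain text to a temporary local file.
local_path <- tempfile(fileext = ".csv")
writeLines(rows, local_path)

# Upload step (requires rhdfs and a running cluster; hypothetical target path):
# hdfs.put(local_path, "/test/dummy.csv")
```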
Thanks for the quick response, @mqureshi.
The command hdfs.put uploads a local file to HDFS, but I need to store the data directly, without first storing it in a local file. If there is no other way, I will have to use that approach.
object: The R object to be written to disk.
con: An open HDFS connection returned by ‘hdfs.file’
hsync: If TRUE, the file will be synched after writing
The functions can be used to read and write files both on the
local filesystem and the HDFS. If the object is a raw vector, it
is written directly to the ‘con’ object, otherwise it is
serialized and the bytes written to the ‘con’. No prefix (for
example, length of bytes) are written and it is up to the user to
handle this. ‘hdfs.seek’ seeks to the position ‘n’. It must be
positive. ‘hdfs.tell’ returns the current location of the file
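data <- "hello world"
modelfile <- hdfs.file("test.txt", "w")
data1 <- toJSON(data)          # toJSON comes from a JSON package such as rjson (assumed here)
data2 <- charToRaw(data1)      # raw vectors are written as-is, without serialization
hdfs.write(data2, modelfile)   # write the raw bytes to the open HDFS connection
hdfs.close(modelfile)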
You have to write the data as a raw vector to the modelfile object.
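To make the difference concrete, here is a small sketch in base R that runs without Hadoop. It assumes the serialization mentioned in the help text behaves like R's serialize(). The helper text_to_raw is a hypothetical name and is not part of rhdfs.

```r
# Hypothetical helper: join text lines with newlines and convert to raw bytes.
text_to_raw <- function(lines) {
  charToRaw(paste0(paste(lines, collapse = "\n"), "\n"))
}

payload <- text_to_raw(c("hello world", "second line"))

# The raw bytes are exactly the text, so a file written from them stays readable.
cat(rawToChar(payload))

# Serializing the same text adds R's binary header and metadata, which is
# why a character vector written without charToRaw does not look like text.
serialized <- serialize("hello world", connection = NULL)
length(charToRaw("hello world"))                    # 11 bytes of plain text
length(serialized) > length(charToRaw("hello world"))  # TRUE
```

The payload can then be passed to hdfs.write in place of data2.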