02-03-2017 02:43 AM - last edited on 02-03-2017 05:36 AM by cjervis
I have a few computers combined into a cluster with Cloudera.
I am using MobaXterm to connect to the server.
I launch my code from a specific server (let's say server 1).
I am trying to create a text file located on server 1,
and during the run each server should write to that same file.
So I did something like this:
add the file
path = os.path.join("/home/user/","clousters.txt")
with open(path, "w") as testFile:
Then I have an RDD that I pass to a Python function. The function does a lot of things and has many sub-functions.
One of them gets a list of lists and tries to write the output to the clousters.txt file:
for enter in location_to_cluster_list:
    with open(SparkFiles.get("clousters.txt")) as f:
        writer = csv.writer(f)
        writer.writerow(enter)
But I get this error: File not open for writing
I am using Spark 1.3 with Python 2.6.
02-13-2017 08:26 PM
Spark's addFile is meant for distributing read-only files to workers; the copies are not intended to be written to. Also, Python file operations write to the local disk of whichever machine runs them, so each worker would produce its own file and you would need to collect those files manually if that is the desired result. Instead, you can write to a distributed file system, such as HDFS, that is available to all workers. Or, if the results are sufficiently small, collect them to the driver and write the file locally on the driver.
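As a side note, the immediate "File not open for writing" error comes from calling open() without a mode, which defaults to read-only; fixing that alone would still leave each worker writing to its own local copy. Here is a minimal sketch of the two alternatives, not a definitive implementation. The helper names write_rows_on_driver and to_csv_line, the RDD name result_rdd, and the HDFS output path are all hypothetical, and the sketch assumes the processing function returns its rows instead of writing them. It is written for Python 3; on Python 2 you would open the file with mode "wb" and drop the newline argument.

```python
import csv


def write_rows_on_driver(rows, path):
    """Write already-collected rows to a CSV file on the driver's local disk."""
    # newline="" is what the csv module recommends on Python 3.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)


def to_csv_line(row):
    """Format one row as a comma-separated line (naive: no quoting or escaping)."""
    return ",".join(str(field) for field in row)


# Usage with Spark. Assumes an existing SparkContext and an RDD named
# `result_rdd` whose elements are lists; both are hypothetical here.
#
# Option 1: results are small enough to fit on the driver.
#   write_rows_on_driver(result_rdd.collect(), "/home/user/clousters.txt")
#
# Option 2: let the workers write in parallel to HDFS.
# The output path is a hypothetical example and must not exist yet.
#   result_rdd.map(to_csv_line).saveAsTextFile("hdfs:///tmp/clusters_out")
```

With option 2, saveAsTextFile produces a directory of part files (one per partition) rather than a single file, so a later step would have to merge them if one file is required.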