Member since 05-25-2022
Posts: 2
Kudos Received: 0
Solutions: 0
05-25-2022 01:42 AM
I am new to Spark and need help understanding how executors process data. In a MapReduce job, each mapper writes its output to a file named "file.out" under "hadoop/yarn/local/usercache/hdfs/appcache"; this file contains the mapper's actual output data. I need to find the equivalent output file for an executor in Spark. I can see that Spark caches other temporary data in the same folder, but I cannot locate a file.out file. We are persisting data to both memory and disk. Could anyone please help me with this?
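To make the search concrete, here is a minimal sketch of how one could walk a local directory tree and list files matching a name pattern. The helper name `find_files` is hypothetical, the root path is the relative one quoted above (its absolute prefix depends on the cluster configuration), and the `shuffle_*` pattern is only an assumption about how Spark might name its intermediate files, not something confirmed here.

```python
import fnmatch
import os


def find_files(root, pattern="file.out"):
    """Return the sorted paths of files under `root` whose name matches `pattern`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # fnmatch supports shell-style wildcards such as "shuffle_*"
            if fnmatch.fnmatch(filename, pattern):
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    # Assumed root: the relative path from the post; adjust to the real local dir.
    root = "hadoop/yarn/local/usercache/hdfs/appcache"
    for path in find_files(root, "file.out"):
        print(path)
    # Assumption: Spark intermediate files may use a different name pattern.
    for path in find_files(root, "shuffle_*"):
        print(path)
```

If the root does not exist, `os.walk` simply yields nothing, so the script prints no paths rather than failing.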
Labels: Apache Spark, HDFS, MapReduce
05-25-2022 01:39 AM
I am running a MapReduce job in local mode. The jobs complete successfully, but the cache folder created for each job is not deleted after the job finishes. All files created inside the cache folders are deleted automatically; only the folders themselves remain. The default location of these cache folders is "/tmp/hadoop-root/". Could anyone please let me know whether there is a property that needs to be set explicitly for the local runner to delete these folders?
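To show what is left behind, here is a minimal sketch that lists the empty directories under the cache location. The helper name `empty_dirs` is hypothetical, and the script only reports directories; it does not delete anything and is not a substitute for a Hadoop cleanup setting.

```python
import os


def empty_dirs(root):
    """Return the sorted paths of directories under `root` that contain no entries."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # A directory counts as empty when it has neither files nor subdirectories.
        if not dirnames and not filenames:
            found.append(dirpath)
    return sorted(found)


if __name__ == "__main__":
    # Default cache location mentioned in the post.
    for path in empty_dirs("/tmp/hadoop-root/"):
        print(path)
```

Running it after a job completes should print one line per leftover cache folder, assuming the folders are indeed empty as described.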
Labels: Apache Hadoop, MapReduce