Member since: 10-10-2017
Posts: 5
Kudos Received: 0
Solutions: 0
01-10-2019 08:41 AM
I have replication set up to copy the output of a daily process to another cluster. It uses:
hadoop distcp -update -delete $SOURCE $TARGET
However, it occasionally fails (after the map phase reaches 100%!) with this error:
19/01/10 16:13:44 INFO mapreduce.Job: map 100% reduce 0%
19/01/10 16:15:51 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
19/01/10 16:15:51 INFO mapreduce.Job: map 0% reduce NaN%
19/01/10 16:15:51 INFO mapreduce.Job: Job job_1546553376389_1538 failed with state FAILED due to:
19/01/10 16:15:51 ERROR tools.DistCp: Exception encountered
java.io.IOException: DistCp failure: Job job_1546553376389_1538 has failed:
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:195)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:493)
I'm looking for some advice on how to investigate this problem, as I'm not completely sure where to start. Has anyone encountered something similar? What logs might have useful information for failed tasks like this?
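One possible starting point, sketched here under some assumptions: the client output above only reports that the job failed, so the detail is more likely to be in the YARN container logs and the job history. The helper names below (job_to_app_id, collect_job_diagnostics) are hypothetical, and the sketch assumes YARN log aggregation is enabled and that it runs on a gateway host as the user who submitted the job. Because the failure shows up after the maps finish, the ApplicationMaster log may be worth reading first, since DistCp appears to perform the -delete step at job commit rather than in the map tasks.

```shell
# Sketch only: collect diagnostics for a failed DistCp (MapReduce) job.
# Function names are hypothetical; yarn and mapred are the stock Hadoop CLIs.

# Convert a MapReduce job ID (job_<timestamp>_<seq>) into the matching
# YARN application ID (application_<timestamp>_<seq>).
job_to_app_id() {
  printf 'application_%s\n' "${1#job_}"
}

# Save the aggregated container logs and the job status to local files.
# Assumes log aggregation is enabled; otherwise "yarn logs" returns nothing useful.
collect_job_diagnostics() {
  local job_id="$1"
  local app_id
  app_id="$(job_to_app_id "$job_id")"
  yarn logs -applicationId "$app_id" > "${app_id}.log"
  mapred job -status "$job_id" > "${job_id}.status"
}

# Usage:
#   collect_job_diagnostics job_1546553376389_1538
#   grep -n -E 'Exception|ERROR|FATAL' application_1546553376389_1538.log | head
```

The first exception in the saved log is often more informative than the final "DistCp failure" line, which only wraps the job's failed state.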
11-06-2018 08:51 AM
We're currently using CDH 5.12.1, which ships with Spark 1.6. We have also deployed Spark 2.3 on the cluster; that is the distribution we actively use, and it works fine. However, this means we still have Spark 1.6 binaries on our servers. Our security scans have flagged these as a vulnerability, and we'd like to remove them. Has anyone attempted something like this before, and if so, do you have any advice? My plan was simply to identify the Spark 1.6 files, then write a script that loops through the cluster and removes them. If there is a more "official" way of doing this, that would be preferable; I'm well aware that my proposal wouldn't exactly be supported. As a follow-up, have the Spark 1.6 binaries been removed from more recent CDH versions?
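For the "have a look at what Spark 1.6 files there are" step, a read-only inventory might look like the sketch below. It deletes nothing. The function name list_spark16_candidates and the filename pattern are assumptions for illustration: the pattern guesses that Spark 1.6 artifacts carry both "spark" and "1.6" in their file names, which would need checking against what the scanner actually reported, and it can produce false positives.

```shell
# Sketch only: dry-run inventory of files that look like Spark 1.6 artifacts.
# Nothing is removed. The name pattern is an assumption, not a definitive match.
list_spark16_candidates() {
  local root="$1"
  # Match regular files with "spark" followed later by "1.6" in the file name.
  find "$root" -type f -name '*spark*1.6*' 2>/dev/null | sort
}

# Usage (the parcel directory shown is the usual default; adjust if yours differs):
#   list_spark16_candidates /opt/cloudera/parcels > spark16-files.txt
```

Reviewing the resulting list by hand before any removal makes it easier to spot files that other CDH components might still depend on.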
Labels:
- Apache Spark