Distcp job fails with EOF Exception

Hi,

While running a DistCp job over hsftp to transfer files from an OpenStack cluster to an AWS cluster, the job sometimes fails with the exceptions below. If we re-run it several times, it eventually succeeds.

The directory contains multiple files ranging from a few MB up to 1 GB, and the total directory size is ~20 GB.

Exception 1:

at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:281)
   ... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: javax.net.ssl.SSLException: SSL peer shut down incorrectly
   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:288)
   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:256)
   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToFile(RetriableFileCopyCommand.java:183)
   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:123)
   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
   at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
   ... 11 more
Caused by: javax.net.ssl.SSLException: SSL peer shut down incorrectly
   at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:596)
   at sun.security.ssl.InputRecord.read(InputRecord.java:532)
   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
   at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
   at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
   at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
   at sun.net.www.MeteredStream.read(MeteredStream.java:134)
   at java.io.FilterInputStream.read(FilterInputStream.java:133)
   at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3335)
   at org.apache.commons.io.input.BoundedInputStream.read(BoundedInputStream.java:121)
   at org.apache.hadoop.hdfs.web.ByteRangeInputStream.read(ByteRangeInputStream.java:229)
   at java.io.DataInputStream.read(DataInputStream.java:100)
   at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:80)
   at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:283)
   ... 16 more

Exception 2:

Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: Got EOF but currentPos = 336175104 < filelength = 836475643
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:288)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:256)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToFile(RetriableFileCopyCommand.java:183)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:123)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
    ... 11 more
Caused by: java.io.IOException: Got EOF but currentPos = 336175104 < filelength = 836475643
    at org.apache.hadoop.hdfs.web.ByteRangeInputStream.update(ByteRangeInputStream.java:214)
    at org.apache.hadoop.hdfs.web.ByteRangeInputStream.read(ByteRangeInputStream.java:229)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:80)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:283)
    ... 16 more


Re: Distcp job fails with EOF Exception

@subacini balakrishnan, DistCp works by distributing the work of copying files across all of the nodes in a cluster. Have you noticed if these failures occur on specific nodes? If so, then it might indicate a misconfiguration or a network connectivity problem on those particular nodes.
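
One way to check whether the failures cluster on particular nodes is to tally the copy exceptions per host from the job's aggregated task logs. The sketch below is only an illustration: `failures_by_host` is a hypothetical helper, and it assumes each container's section in the log starts with a line of the form `Container: <container id> on <host>_<port>`, which may differ between Hadoop versions.

```shell
# Hypothetical helper: reads aggregated task logs on stdin and prints
# "<count> <host>", where <count> is the number of log lines on that host
# that mention CopyReadException.
# Assumption: each container section starts with a header line such as
#   Container: container_... on <host>_<port>
failures_by_host() {
  awk '
    /^Container: / {
      host = $NF                 # last field, e.g. node1.example.com_45454
      sub(/_[0-9]+$/, "", host)  # strip the trailing _<port>
      next
    }
    /CopyReadException/ && host != "" { count[host]++ }
    END { for (h in count) print count[h], h }
  ' | sort -rn
}
```

If log aggregation is enabled, the input could come from `yarn logs -applicationId <application_id> | failures_by_host`, where `<application_id>` is the ID of the failed DistCp job. A host that dominates the output would be the first place to look for a network or SSL configuration problem.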

Re: Distcp job fails with EOF Exception

@subacini balakrishnan

Are your source and target Hadoop versions the same?

Can you paste your distcp command please?


Re: Distcp job fails with EOF Exception

Yes, the versions are the same.

Here is the command:

hadoop distcp hsftp://host1:50470/user/xxxx/1.txt hdfs://host2_nameservice/user/yyyy
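
Since re-running the job eventually succeeds, the manual re-runs could be automated with a small wrapper until the root cause is found. This is a minimal sketch, not a fix for the dropped connections: `retry_cmd` is a hypothetical helper, and the attempt count and delay are arbitrary values chosen for illustration.

```shell
# Hypothetical helper: re-run a command until it succeeds or the attempts run out.
# Usage: retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_cmd() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  while true; do
    # Run the command exactly as given; stop on the first success.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

With the command above, a call might look like `retry_cmd 3 60 hadoop distcp hsftp://host1:50470/user/xxxx/1.txt hdfs://host2_nameservice/user/yyyy`. Each attempt starts the whole copy again, so this only hides the intermittent failure and does not explain it.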


Re: Distcp job fails with EOF Exception

@subacini balakrishnan did this work for you?
