Support Questions

Find answers, ask questions, and share your expertise

Exception When Migratin Data between Hadoop clusters

Contributor

Hi All,

I'm getting the following Error in between a sapcrk job is running. actually I'm trying to migrate data from Hadoop cluster-A and Hadoop cluster-B. Im migrating Around 10000 Partitions per Job, the problem here is my job is throwing Error when it crossed half way. please find the following Error and let me know how to Resolve this.

18/07/09 20:49:54 WARN TaskSetManager: Lost task 1.0 in stage 882.0 (TID 1902, DNSName, executor 1): java.net.UnknownHostException: HostName:50075: HostName at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589) at sun.net.NetworkClient.doConnect(NetworkClient.java:175) at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) at sun.net.www.http.HttpClient.New(HttpClient.java:308) at sun.net.www.http.HttpClient.New(HttpClient.java:326) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168) at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104) at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998) at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:595) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:552) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.connect(WebHdfsFileSystem.java:1710) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:620) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:472) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:502) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:498) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.read(WebHdfsFileSystem.java:1659) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.read(WebHdfsFileSystem.java:1520) at java.io.DataInputStream.readFully(DataInputStream.java:195) at java.io.DataInputStream.readFully(DataInputStream.java:169) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1915) at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1880) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1829) at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1843) at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:49) at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:64) at org.apache.spark.rdd.HadoopRDD$anon$1.liftedTree1$1(HadoopRDD.scala:251) at org.apache.spark.rdd.HadoopRDD$anon$1.<init>(HadoopRDD.scala:250) at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208) at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:94) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)

Tanx and Regards,

MJ

1 REPLY 1

New Contributor

The log says one host can't reach the other host using DNS. Are you sure there's communication at DNS level between those 2 clusters?