INFO mapreduce.Job: Task Id : attempt_1497611885470_0832_m_000000_0, Status : FAILED
Error: com.teradata.connector.common.exception.ConnectorException: java.net.SocketException: Socket is not connected
at java.net.Socket.getInputStream(Socket.java:905)
at com.teradata.connector.teradata.TeradataInternalFastloadOutputFormat.getRecordWriter(TeradataInternalFastloadOutputFormat.java:356)
at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.<init>(ConnectorOutputFormat.java:89)
at com.teradata.connector.common.ConnectorOutputFormat.getRecordWriter(ConnectorOutputFormat.java:38)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
at com.teradata.connector.teradata.TeradataInternalFastloadOutputFormat.getRecordWriter(TeradataInternalFastloadOutputFormat.java:478)
at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.<init>(ConnectorOutputFormat.java:89)
at com.teradata.connector.common.ConnectorOutputFormat.getRecordWriter(ConnectorOutputFormat.java:38)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Root cause:
The fastload protocol requires ports 8678 through 65535 to be open on every node in the HDP cluster, and on the client machine as well, whether or not the client is part of the cluster. These ports must be open for both inbound and outbound traffic. On the Teradata server, only the default port, 1025, needs to be open.
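Before rerunning the job, it can help to check from each node whether the ports involved are reachable at all. The following is a minimal sketch in Python, not part of the connector; the function name probe_port and the host names in the usage comment are hypothetical. A "timeout" result usually suggests that a firewall is dropping the traffic, while "refused" usually means the host was reached but nothing is listening on that port, which is expected for the fastload ports when no job is running.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Try a TCP connection to host:port and report what happened.

    Returns one of: "open", "refused", "timeout", "error".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout; often a firewall dropping packets.
        return "timeout"
    except OSError:
        # DNS failure, unreachable network, and similar problems.
        return "error"


# Usage (host names are placeholders, replace them with your own):
#   probe_port("teradata.example.com", 1025)   # Teradata server port
#   probe_port("worker01.example.com", 8678)   # first port of the fastload range
```

This only tests one direction from the machine it runs on, so it would need to be run from each node and from the client machine to cover the requirement described above.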
If the full range cannot be opened, the job needs to be run on a single specified port, as follows:
a) To run using only a single port within this range, it is recommended to set the port number explicitly with the property "-Dteradata.db.output.fastload.socket.port=8678" within the sqoop statement, as above.
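As an illustration of where the property goes, here is a small Python sketch that assembles a sqoop export command line with the port property placed directly after the tool name, since sqoop expects generic -D options before the tool-specific ones. The helper name build_sqoop_export and all argument values are hypothetical, and the connector-specific options that select the fastload method are deliberately left out; copy those from your existing statement.

```python
def build_sqoop_export(connect_url, table, export_dir, fastload_port=8678):
    """Build a sqoop export argument list that pins the fastload socket port.

    Only the -D property comes from the article; the other values are
    placeholders supplied by the caller.
    """
    # The article gives 8678 through 65535 as the usable range.
    if not 8678 <= fastload_port <= 65535:
        raise ValueError("fastload port must be between 8678 and 65535")
    return [
        "sqoop", "export",
        # Generic Hadoop options (-D...) must come right after the tool name.
        f"-Dteradata.db.output.fastload.socket.port={fastload_port}",
        "--connect", connect_url,
        "--table", table,
        "--export-dir", export_dir,
    ]


# Usage (all values are placeholders):
#   args = build_sqoop_export("jdbc:teradata://td-host/DATABASE=mydb",
#                             "MY_TABLE", "/user/me/export_data")
#   print(" ".join(args))
```

The chosen port still has to be open for inbound and outbound traffic on every node and on the client machine.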
Resolution:
The fastload protocol works somewhat differently from batch mode and the standard sqoop export, which is why the same command runs successfully when it uses batch mode or the standard sqoop export.
For fastload, the required ports must be opened before the sqoop job can complete successfully.