Issue:

Running the following sqoop export from a client machine can fail with the error shown after the command:

sqoop export -Dmapreduce.map.log.level=DEBUG -Dteradata.db.output.fastload.socket.port=8678 \
-Dteradata.db.output.method=internal.fastload \
--connect jdbc:teradata://mrt1.openstacklocal/Database=testdb \
--connection-manager org.apache.sqoop.teradata.TeradataConnManager --username COOXP -P \
--export-dir /user/karthick/fastload_test \
--table test_table --input-fields-terminated-by ',' \
--null-string '\N' --null-non-string '\N' --num-mappers 5
INFO mapreduce.Job: Task Id : attempt_1497611885470_0832_m_000000_0, Status : FAILED
Error: com.teradata.connector.common.exception.ConnectorException: java.net.SocketException: Socket is not connected
        at java.net.Socket.getInputStream(Socket.java:905)
        at com.teradata.connector.teradata.TeradataInternalFastloadOutputFormat.getRecordWriter(TeradataInternalFastloadOutputFormat.java:356)
        at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.<init>(ConnectorOutputFormat.java:89)
        at com.teradata.connector.common.ConnectorOutputFormat.getRecordWriter(ConnectorOutputFormat.java:38)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

        at com.teradata.connector.teradata.TeradataInternalFastloadOutputFormat.getRecordWriter(TeradataInternalFastloadOutputFormat.java:478)
        at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.<init>(ConnectorOutputFormat.java:89)
        at com.teradata.connector.common.ConnectorOutputFormat.getRecordWriter(ConnectorOutputFormat.java:38)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Root cause:

The fastload protocol requires ports 8678 through 65535 to be open on every node in the HDP cluster and on the client machine, whether or not the client is part of the cluster. These ports must be open for both inbound and outbound traffic. On the Teradata server, only the default port 1025 needs to be open.

If the full range cannot be opened, run the job on a single port within that range instead. In that case, set the port explicitly with the property "-Dteradata.db.output.fastload.socket.port=8678" in the sqoop statement, as in the command above.
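
To narrow down whether a firewall is blocking the fastload port, a quick TCP check between nodes can help. The following is a minimal sketch, not part of the Teradata connector: `check_port` is a hypothetical helper name, the host names are whatever you pass as arguments, and it assumes `bash` and the coreutils `timeout` command are available. Something must be listening on the port for the check to succeed, for example a temporary listener such as `nc -l 8678` on the target host (the exact netcat syntax varies by variant).

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if a TCP connection to host:port
# succeeds within 3 seconds, non-zero otherwise.
check_port() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP connection;
  # timeout bounds the wait when packets are silently dropped.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Hosts are taken from the command line (client machine, cluster nodes).
# 8678 is the port pinned in the sqoop command above.
for host in "$@"; do
  if check_port "$host" 8678; then
    echo "$host:8678 reachable"
  else
    echo "$host:8678 NOT reachable"
  fi
done
```

A check that hangs until the timeout usually points to a firewall dropping traffic, while an immediate failure more often means that nothing is listening on the port.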

Resolution:

The fastload protocol works differently from batch mode and from the standard sqoop export, which is why the same command succeeds when run with either of those methods.

For fastload, it is mandatory to open the ports described under Root cause before running the sqoop export; only then can the job complete successfully.
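
How the ports are opened depends on the firewall in use, which the article does not specify. As one illustration, the sketch below assumes the hosts run firewalld; `open_fastload_ports` and the `RUN` variable are hypothetical names introduced here. By default it only prints the commands, so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch only: assumes firewalld manages the host firewall.
# Covers inbound traffic; outbound is assumed to be unrestricted.
open_fastload_ports() {
  # Full fastload range by default, or a single pinned port such as 8678.
  local ports="${1:-8678-65535}"
  # RUN defaults to "echo" (dry run). Set RUN=sudo to apply the changes.
  local run="${RUN:-echo}"
  $run firewall-cmd --permanent --add-port="${ports}/tcp"
  $run firewall-cmd --reload
}

open_fastload_ports "$@"
```

Called with no argument it targets the whole 8678-65535 range; called with `8678` it opens only the port set through `teradata.db.output.fastload.socket.port`. It would need to be run on every cluster node and on the client machine.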
