Created 03-14-2018 05:09 AM
I am exporting data from HDFS to Netezza using Sqoop, with many jobs running in parallel. A few of them failed with the error below, causing the Sqoop job to fail. Can anyone please help me with this issue? I have taken care of all the permissions and ownership of the tables in Netezza.
I found a link describing a similar issue, but it doesn't have any solution.
Version details:
Hadoop 2.7.3.2
HDP 2.5.5.5-2
Sqoop 1.4.6
Parameter file
# Use direct mode instead of JDBC mode
--direct
# Use batch mode for underlying statement execution
--batch
# Connection string for the Netezza database
--connect
jdbc:netezza://mydatabase:5480/emp_db
# Username for the Netezza database
--username
myuser
# Password file for the Netezza database; this file is in HDFS
--password-file
/pass/loc/file
--input-lines-terminated-by
'\n'
--input-fields-terminated-by
'\016'
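A side note on the delimiters above: '\016' is an octal escape for the shift-out control byte (0x0E). If in doubt, the byte can be checked from a shell, for example:

# Show the byte that the '\016' field terminator resolves to (prints 0e).
printf '\016' | od -An -tx1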
Sqoop command
mapper_no=8
sqoop export --options-file ${SQOOP_PARM_FILE} \
  --table ${TAR_TABLE} \
  --export-dir ${HDFS_EXP_DIR}/${TABLE} \
  --num-mappers ${mapper_no} \
  --verbose &> ${extended_log}
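For context on the parallelism mentioned at the top: each table is exported by its own invocation of the command above. A minimal sketch of how such parallel runs might be driven (the TABLES array, LOG_DIR, and the loop are illustrative assumptions, not the actual wrapper):

#!/usr/bin/env bash
# Hypothetical driver for the parallel exports described in the question.
mapper_no=8
for TABLE in "${TABLES[@]}"; do
  TAR_TABLE=${TABLE}
  extended_log=${LOG_DIR}/${TABLE}.log
  sqoop export --options-file ${SQOOP_PARM_FILE} \
    --table ${TAR_TABLE} \
    --export-dir ${HDFS_EXP_DIR}/${TABLE} \
    --num-mappers ${mapper_no} \
    --verbose &> ${extended_log} &   # run each export in the background
done
wait   # block until every background export has finished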
Error log
18/03/13 18:08:34 INFO mapreduce.Job: Task Id : attempt_1517179256891_920980_m_000005_2, Status : FAILED
Error: java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.ipc.Client.call(Client.java:1463)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy14.delete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:585)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
    at com.sun.proxy.$Proxy15.delete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2094)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:815)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:811)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:811)
    at org.apache.sqoop.util.FileUploader.uploadFilesToDFS(FileUploader.java:58)
    at org.apache.sqoop.mapreduce.db.netezza.NetezzaExternalTableExportMapper.run(NetezzaExternalTableExportMapper.java:249)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1865)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.InterruptedException
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
    at java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1094)
    at org.apache.hadoop.ipc.Client.call(Client.java:1457)
    ... 26 more
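For anyone digging into a failure like this, the full per-attempt logs can be pulled from YARN once the job finishes. A minimal sketch (the application id is derived from the attempt id in the trace above):

# Fetch the aggregated logs for the job that owned the failed attempt.
# attempt_1517179256891_920980_m_000005_2 belongs to this application:
yarn logs -applicationId application_1517179256891_920980 > export_job.log

# Narrow down to the interrupted mapper attempts.
grep -n "InterruptedException" export_job.log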
Created 03-14-2018 04:14 PM
Can you post the sqoop command (you can mask user/pwd/host, etc.)? Firstly, I hope you have taken care of the permissions below:
1) Database access: the user needs CREATE EXTERNAL TABLE privilege in the database.
2) Table permissions for the user.
3) Write access to the HDFS target directory.
4) Also try without the "--direct" option in Sqoop, to rule out CREATE EXTERNAL TABLE permission problems while debugging (see the sketch after this list).
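A minimal sketch of those checks from the command line, assuming the nzsql client is available; emp_db and myuser are the placeholders from the question, while admin and emp_table are hypothetical names:

# 1) and 2): Netezza privileges (run as an administrative user).
nzsql -host mydatabase -d emp_db -u admin -pw '***' \
  -c "GRANT CREATE EXTERNAL TABLE TO myuser;"
nzsql -host mydatabase -d emp_db -u admin -pw '***' \
  -c "GRANT SELECT, INSERT ON emp_table TO myuser;"

# 3): confirm the job user can write to the HDFS export directory.
hdfs dfs -ls -d ${HDFS_EXP_DIR}

# 4): rerun without direct mode (remove --direct from the options file
#     first) to take CREATE EXTERNAL TABLE permissions out of the picture.
sqoop export --options-file ${SQOOP_PARM_FILE} \
  --table ${TAR_TABLE} \
  --export-dir ${HDFS_EXP_DIR}/${TABLE} \
  --num-mappers 8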
Created 03-15-2018 12:40 PM
Hi @bmasna, I have updated the question with the command and the parameters used. Yes, all the permissions are taken care of in Netezza and on the HDFS directory. I am sure of this because only a few mappers failed with this error; the other mappers succeeded and loaded their data into Netezza.
Created 03-15-2018 06:37 PM
Hi @sai harshavardhan. As you said, permissions do not seem to be the issue. Can you try the following: 1) Rerun the same job, to rule out transient network/timeout errors causing the interruption (a sketch of such a rerun is below). 2) You are already running with the --verbose option, so attach the log file (remove anything private). If you are copying across data centers, threads that run for a long time may be killed due to network problems or cross-DC traffic.
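A minimal sketch of such a rerun, reusing the variables from the original command; the retry loop and log names are illustrative only:

# Rerun the failed export a few times to rule out transient
# network/timeout interruptions, keeping the verbose log of each attempt.
for attempt in 1 2 3; do
  extended_log=retry_${TABLE}_attempt${attempt}.log
  if sqoop export --options-file ${SQOOP_PARM_FILE} \
       --table ${TAR_TABLE} \
       --export-dir ${HDFS_EXP_DIR}/${TABLE} \
       --num-mappers ${mapper_no} \
       --verbose &> ${extended_log}; then
    echo "Export succeeded on attempt ${attempt}"
    break
  fi
done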