Created on 03-26-2020 11:42 AM - edited 09-16-2022 07:36 AM
I'm trying to export a Hive table from an encrypted HDFS.
The distcp portion of the export fails. Every file segment throws an error like this one:
2020-03-26 14:36:17,124 ERROR [IPC Server handler 25 on 37768] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1585157114387_3676_m_000000_0 - exited : java.io.IOException: File copy failed: hdfs://myserver/warehouse/tablespace/tmp/hive/e18e8f44-9d0e-4f10-9490-2703bbdc84e9/_tmp_space.db/7579e917-2de7-4923-887a-df739f3a98e1/000006_0 --> hdfs://myserver/warehouse/tablespace/hive-export/username/tablename/data/000006_0
at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:263)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:220)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hdfs://myserver/warehouse/tablespace/tmp/hive/e18e8f44-9d0e-4f10-9490-2703bbdc84e9/_tmp_space.db/7579e917-2de7-4923-887a-df739f3a98e1/000006_0 to hdfs://myserver/warehouse/tablespace/hive-export/username/tablename/data/000006_0
at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:259)
... 10 more
Caused by: java.io.IOException: Checksum mismatch between hdfs://myserver/warehouse/tablespace/tmp/hive/e18e8f44-9d0e-4f10-9490-2703bbdc84e9/_tmp_space.db/7579e917-2de7-4923-887a-df739f3a98e1/000006_0 and hdfs://myserver/warehouse/tablespace/hive-export/username/tablename/data/.distcp.tmp.attempt_1585157114387_3676_m_000000_0.
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareCheckSums(RetriableFileCopyCommand.java:261)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:153)
at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:115)
at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
... 11 more
Created 03-26-2020 01:20 PM
Fixed by telling distcp to skip its checksum comparison before running the export:
hive > set distcp.options.skipcrccheck=;
hive > export ...blah, blah blah;
Success!
Thanks to Cloudera Support, M. Green. Super fast!
Created 03-26-2020 11:47 AM
I should have mentioned that this is my export command (run from beeline):
0: jdbc:hive2://myhiveserver,d> export table mydatabase.mytable to '/warehouse/tablespace/hive-export/username/tablename';
The HDFS encryption zone is on /warehouse/tablespace.
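For anyone hitting the same error, here is a rough sketch of how the fix and the export command from this thread fit together. The likely explanation, based on how HDFS transparent encryption is documented: each file in an encryption zone gets its own data encryption key, so the stored blocks of the copy differ from those of the source and the checksum comparison distcp runs after each copy cannot match. Hive appears to forward settings prefixed with distcp.options. to distcp as command-line flags, so the empty value below should amount to a bare -skipcrccheck. The table name and target path are the ones from the post above. The comments are an interpretation, not official guidance.

```sql
-- Ask Hive to pass -skipcrccheck to the distcp job it launches for the export.
-- The value is left empty because the flag takes no argument.
set distcp.options.skipcrccheck=;

-- Same export as in the original post; the target directory sits inside
-- the /warehouse/tablespace encryption zone.
export table mydatabase.mytable to '/warehouse/tablespace/hive-export/username/tablename';
```

With the check skipped, distcp no longer compares source and target checksums, so it may be worth comparing row counts after a later import as a sanity check.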