
Sqoop import does not work anymore

Contributor

Hello,

We have a 3-node Hadoop cluster on which a Sqoop import job had been working very well until a few days ago.
The external table now contains 999 files (is that a maximum number?).

This is the import command:

sqoop import -D oraoop.locations=hadop202.mickey.int -D mapred.map.max.attempts=1 -D oraoop.import.consistent.read=false -D oraoop.timestamp.string=false --connect jdbc:oracle:thin:@//CRIDB101:1521/appli --username sqoop -password '******' --table=ATLAS_STATS_20171114 --columns=APPLICATION,USERNAME,OFFICE,STAT_TYPE,STAT_KEY,STAT_INFO,TIME_STAMP,REQUESTER,DETAIL_INFO_1,DETAIL_INFO_2,DETAIL_INFO_3,DETAIL_INFO_4,OWNER,STATS_ID,DB_NAME,PARAMS --where "sqoop = 'Z'" --hcatalog-database=crim --hcatalog-table=atlas_stats_clob --num-mappers=2 --split-by=TIME_STAMP

and this is the error we get:

17/11/14 16:17:31 INFO mapreduce.Job: Job job_1510660800260_0022 failed with state FAILED due to: Job commit failed: org.apache.hive.hcatalog.common.HCatException : 2012 : Moving of data failed during commit : Could not find a unique destination path for move: file = hdfs://hadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.46847233143209766/part-m-00000 , src = hdfs://hadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.46847233143209766, dest = hdfs://hadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob

Thanks for your help.

1 ACCEPTED SOLUTION

Contributor

The external table with 999 files was the problem.


5 REPLIES

Contributor

I did an export/import into another table:

export table atlas_stats_clob to '/data/hive/export/';
import table atlas_imported from '/data/hive/export/data/';

and tried again with the same sqoop import options against the new table:

sqoop import -D oraoop.locations=hadop202.mickey.int -D mapred.map.max.attempts=1 -D oraoop.import.consistent.read=false -D oraoop.timestamp.string=false --connect jdbc:oracle:thin:@//CRIDB101:1521/appli --username sqoop -password '******' --table=ATLAS_STATS_20171114 --columns=APPLICATION,USERNAME,OFFICE,STAT_TYPE,STAT_KEY,STAT_INFO,TIME_STAMP,REQUESTER,DETAIL_INFO_1,DETAIL_INFO_2,DETAIL_INFO_3,DETAIL_INFO_4,OWNER,STATS_ID,DB_NAME,PARAMS --where "sqoop = 'Z'" --hcatalog-database=crim --hcatalog-table=atlas_imported --num-mappers=2 --split-by=TIME_STAMP

but I get the same issue. Is there a limit on the number of files for a table?

Contributor

Hello, I think the problem comes from a threshold of 1000 in FileOutputCommitterContainer.java.

Indeed, on one side I have this error:

…17/10/21 02:12:45 INFO mapreduce.Job: Job job_1505194606915_0236 failed with state FAILED due to: Job commit failed: org.apache.hive.hcatalog.common.HCatException : 2012 : Moving of data failed during commit : Could not find a unique destination path for move: file = hdfs://vpbshadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.04665097541205321/part-m-00000 , src = hdfs://vpbshadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.04665097541205321, dest = hdfs://vpbshadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob  at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.getFinalPath(FileOutputCommitterContainer.java:662)  at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.moveTaskOutputs(FileOutputCommitterContainer.java:515)…

and in FileOutputCommitterContainer.java I can see that the message

Could not find a unique destination path for move

is produced when counter = maxAppendAttempts = APPEND_COUNTER_WARN_THRESHOLD = 1000,

and on the other side I have:

hdfs dfs -ls /data/hive/crim.db/atlas_stats_clob/part-m* | wc -l

999

Is there a way to increase this threshold?
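
To make that concrete, here is a rough shell sketch of the behaviour as I understand it (illustration only, not the actual Java code; the variable names and the suffix format are invented, and the paths come from the error above):

# Illustration only: look for a free destination name, give up after 1000 attempts.
DEST=/data/hive/crim.db/atlas_stats_clob
NAME=part-m-00000
TARGET="$DEST/$NAME"
counter=0
while hdfs dfs -test -e "$TARGET"; do
  counter=$((counter + 1))
  if [ "$counter" -ge 1000 ]; then
    echo "Could not find a unique destination path for move: $TARGET" >&2
    exit 1
  fi
  TARGET="$DEST/${NAME}_$counter"   # invented suffix, just for the example
done
echo "free destination: $TARGET"

In other words, once roughly a thousand candidate names are already taken in the destination directory, the commit gives up, which would explain why my 999 part-m* files land right at that limit.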

Contributor

The external table with 999 files was the problem.

Rising Star

Could you try the command with the "--create-hcatalog-table" parameter?
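
For example, something like this (an untested sketch; "atlas_stats_new" is only a placeholder for an HCatalog table that does not exist yet, so Sqoop will create it):

sqoop import -D oraoop.locations=hadop202.mickey.int -D mapred.map.max.attempts=1 -D oraoop.import.consistent.read=false -D oraoop.timestamp.string=false --connect jdbc:oracle:thin:@//CRIDB101:1521/appli --username sqoop -password '******' --table=ATLAS_STATS_20171114 --columns=APPLICATION,USERNAME,OFFICE,STAT_TYPE,STAT_KEY,STAT_INFO,TIME_STAMP,REQUESTER,DETAIL_INFO_1,DETAIL_INFO_2,DETAIL_INFO_3,DETAIL_INFO_4,OWNER,STATS_ID,DB_NAME,PARAMS --where "sqoop = 'Z'" --hcatalog-database=crim --hcatalog-table=atlas_stats_new --create-hcatalog-table --num-mappers=2 --split-by=TIME_STAMP

Writing into a brand-new table also means the destination directory starts empty, so the name clash with the 999 existing files described above cannot occur.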

Explorer

Hi,

we recently ran into the same problem, after a few years of successful imports.
Did you find a way to overcome it?