Support Questions
Find answers, ask questions, and share your expertise

Sqoop export fails when -num-mappers is more than 1


The sqoop export command below fails when run with -num-mappers 10, but succeeds when -num-mappers is set to 1. Kindly suggest how to run the command with -num-mappers 10.

driver=com.sybase.jdbc4.jdbc.SybDriver
echo "${driver}"
jconnect=jdbc:sybase:Tds:USD01V-SYIQ003:7777/DATABASE=JFORSDEV
echo "${jconnect}"
sqoop export \
  -Dsqoop.export.statements.per.transaction=1000 \
  -verbose \
  -driver "${driver}" \
  -connect "${jconnect}" \
  -username=tableau \
  -password=Smile123 \
  -direct \
  -export-dir '/tmp/test' \
  -input-lines-terminated-by '\n' \
  -input-optionally-enclosed-by '\"' \
  -fields-terminated-by '\t' \
  -table tableau.sqoopExport_orc \
  -columns 'id,first_name,last_name,address' \
  -batch \
  -num-mappers 10 \
  ;

Sqoop export does a partial export and fails with this error:

2018-06-08 04:17:19,385 FATAL [IPC Server handler 11 on 37311] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1528413303562_0066_m_000005_0 - exited : java.sql.BatchUpdateException: JZ0BE: BatchUpdateException: Error occurred while executing batch statement: SQL Anywhere Error -210: User 'another user' has the row in 'sqoopExport_orc' locked
	at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(
	at org.apache.hadoop.mapred.MapTask.runNewMapper(
	at org.apache.hadoop.mapred.YarnChild$
	at Method)

It seems to be a concurrency problem: rows with the same primary key are being updated by two mappers in two different transactions.
Changing -Dsqoop.export.statements.per.transaction to 1 may help by shortening your transactions, so locks are not held for a long time.
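The suggested tweak amounts to committing after every statement instead of every 1000; a sketch, keeping every other option exactly as in the command posted above (same driver, connection string, credentials, and table):

```shell
# Sketch: commit each statement so row locks on the Sybase side are
# released promptly, rather than held for a 1000-statement batch.
# Only sqoop.export.statements.per.transaction changes; all other
# options are unchanged from the original command.
sqoop export \
  -Dsqoop.export.statements.per.transaction=1 \
  -verbose \
  -driver "${driver}" \
  -connect "${jconnect}" \
  -username=tableau \
  -password=Smile123 \
  -direct \
  -export-dir '/tmp/test' \
  -input-lines-terminated-by '\n' \
  -input-optionally-enclosed-by '\"' \
  -fields-terminated-by '\t' \
  -table tableau.sqoopExport_orc \
  -columns 'id,first_name,last_name,address' \
  -batch \
  -num-mappers 10 \
  ;
```

Note that generic -D properties must come immediately after the tool name (export), before the tool-specific options, as they already do here.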

Hi Ankit,

We tried setting -Dsqoop.export.statements.per.transaction to 1 as well, but it does not work.


Mamta Chawla
