Sqoop import of Avro file from Teradata to HDFS

Contributor

I am trying to copy an Avro schema file from Teradata to HDFS using Sqoop, but the import job is failing with the error below:

sqoop import --libjars "SQOOP_HOME/lib/avro-mapred-1.7.5-hadoop2.jar,SQOOP_HOME/lib/avro-mapred-1.7.4-hadoop2.jar,SQOOP_HOME/lib/paranamer-2.3.jar" --connect jdbc:teradata://xx.xx.xx.xxx/Database=xxxx --connection-manager org.apache.sqoop.teradata.TeradataConnManager --username xxx --password xxx --table xx --target-dir xx --as-avrodatafile -m 1 -- --usexview --accesslock --avroschemafile xx.avsc
INFO impl.YarnClientImpl: Submitted application application_1455051872611_0127
INFO mapreduce.Job: The url to track the job: http://teradata-sqoop-ks-re-sec-4.novalocal:8088/proxy/application_1455051872611_0127/
INFO mapreduce.Job: Running job: job_1455051872611_0127
INFO mapreduce.Job: Job job_1455051872611_0127 running in uber mode : false
INFO mapreduce.Job:  map 0% reduce 0%
INFO mapreduce.Job:  map 100% reduce 0%
INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_0, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
 INFO mapreduce.Job:  map 0% reduce 0%
 INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_1, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_2, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
INFO mapreduce.Job:  map 100% reduce 0%
INFO mapreduce.Job: Job job_1455051872611_0127 failed with state FAILED due to: Task failed task_1455051872611_0127_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
...
INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor starts at:  1455147607714
INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor ends at:  1455147607714
INFO processor.TeradataInputProcessor: the total elapsed time of input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor is: 0s
INFO teradata.TeradataSqoopImportHelper: Teradata import job completed with exit code 1
ERROR tool.ImportTool: Error during import: Import Job failed
FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
	at org.apache.avro.mapreduce.AvroKeyRecordWriter.<init>(AvroKeyRecordWriter.java:53)
	at org.apache.avro.mapreduce.AvroKeyOutputFormat$RecordWriterFactory.create(AvroKeyOutputFormat.java:78)
	at org.apache.avro.mapreduce.AvroKeyOutputFormat.getRecordWriter(AvroKeyOutputFormat.java:104)
	at com.teradata.connector.hdfs.HdfsAvroOutputFormat.getRecordWriter(HdfsAvroOutputFormat.java:49)
	at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.<init>(ConnectorOutputFormat.java:89)
	at com.teradata.connector.common.ConnectorOutputFormat.getRecordWriter(ConnectorOutputFormat.java:38)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
1 ACCEPTED SOLUTION

Contributor

Found this in the Hortonworks Teradata Connector support doc:

  • If you will run Avro jobs, download avro-mapred-1.7.4-hadoop2.jar and place it under $SQOOP_HOME/lib.

I had two versions of the Avro jars in my $SQOOP_HOME/lib. After removing all of them except avro-mapred-1.7.4-hadoop2.jar, the import succeeded.
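
For anyone hitting the same error, here is a minimal sketch of how the cleanup could be scripted. The function names and the backup directory are hypothetical, and it assumes that only the extra avro-mapred jars are the conflict, so it moves them aside rather than deleting them and leaves every other jar alone.

```shell
# Jar named in the connector doc quoted above; everything else here is illustrative.
KEEP_JAR="avro-mapred-1.7.4-hadoop2.jar"

# Print every Avro-related jar directly inside the given directory, one per line.
list_avro_jars() {
  find "$1" -maxdepth 1 -type f -name 'avro*.jar' | sort
}

# Move every avro-mapred jar other than $KEEP_JAR into a backup directory
# (hypothetical location chosen by the caller) so the change can be undone.
quarantine_other_avro_mapred() {
  lib_dir="$1"
  backup_dir="$2"
  mkdir -p "$backup_dir"
  for jar in "$lib_dir"/avro-mapred-*.jar; do
    [ -e "$jar" ] || continue            # glob matched nothing
    if [ "$(basename "$jar")" != "$KEEP_JAR" ]; then
      mv -- "$jar" "$backup_dir"/
    fi
  done
}

# Example (not run automatically):
#   list_avro_jars "$SQOOP_HOME/lib"
#   quarantine_other_avro_mapred "$SQOOP_HOME/lib" "$HOME/avro-jars-backup"
```

Running `list_avro_jars` first shows what is actually in the lib directory before anything is moved.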

Expert Contributor

@ksuresh, thanks for posting your resolution. I work on the Sqoop documentation and will look into whether we need to add a comment about removing other jars from /lib.

Expert Contributor
@ksuresh

Thanks for catching the issue in the doc. You can either remove the conflicting Avro jar files (Sqoop in HDP 2.5 and 2.6 ships with Avro 1.8.0 jar files) or add the following option to the sqoop command line:

sqoop import -Dmapreduce.job.user.classpath.first=true <rest of the arguments>
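
A small sketch of how that could be wrapped so the option always lands in the right place: generic Hadoop options such as -D have to come before Sqoop's tool-specific arguments. The wrapper name is hypothetical, and the remaining arguments are whatever you already pass to sqoop import.

```shell
# Hypothetical wrapper: runs "sqoop import" with the user-classpath-first
# option placed immediately after the tool name, ahead of all other arguments.
sqoop_import_user_classpath_first() {
  sqoop import \
    -Dmapreduce.job.user.classpath.first=true \
    "$@"
}

# Example (not run automatically), using the placeholders from the question:
#   sqoop_import_user_classpath_first --connect jdbc:teradata://xx.xx.xx.xxx/Database=xxxx \
#     --table xx --target-dir xx --as-avrodatafile -m 1
```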

Beverley

Yes, it would be good to mention this in the docs.

Master Mentor

Does this mean that with HDP 2.5+ we support the date type in Avro 1.8+? That would be awesome, Venkat.