Sqoop import of Avro file from Teradata to hdfs.

Explorer

I am trying to copy an Avro schema file from Teradata to HDFS using Sqoop, but the import job is failing with the error below:

sqoop import --libjars "SQOOP_HOME/lib/avro-mapred-1.7.5-hadoop2.jar,SQOOP_HOME/lib/avro-mapred-1.7.4-hadoop2.jar,SQOOP_HOME/lib/paranamer-2.3.jar" --connect jdbc:teradata://xx.xx.xx.xxx/Database=xxxx --connection-manager org.apache.sqoop.teradata.TeradataConnManager --username xxx --password xxx --table xx --target-dir xx --as-avrodatafile -m 1 -- --usexview --accesslock --avroschemafile xx.avsc
INFO impl.YarnClientImpl: Submitted application application_1455051872611_0127
INFO mapreduce.Job: The url to track the job: http://teradata-sqoop-ks-re-sec-4.novalocal:8088/proxy/application_1455051872611_0127/
INFO mapreduce.Job: Running job: job_1455051872611_0127
INFO mapreduce.Job: Job job_1455051872611_0127 running in uber mode : false
INFO mapreduce.Job:  map 0% reduce 0%
INFO mapreduce.Job:  map 100% reduce 0%
INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_0, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
 INFO mapreduce.Job:  map 0% reduce 0%
 INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_1, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_2, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
INFO mapreduce.Job:  map 100% reduce 0%
INFO mapreduce.Job: Job job_1455051872611_0127 failed with state FAILED due to: Task failed task_1455051872611_0127_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
...
INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor starts at:  1455147607714
INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor ends at:  1455147607714
INFO processor.TeradataInputProcessor: the total elapsed time of input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor is: 0s
INFO teradata.TeradataSqoopImportHelper: Teradata import job completed with exit code 1
ERROR tool.ImportTool: Error during import: Import Job failed
FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
	at org.apache.avro.mapreduce.AvroKeyRecordWriter.<init>(AvroKeyRecordWriter.java:53)
	at org.apache.avro.mapreduce.AvroKeyOutputFormat$RecordWriterFactory.create(AvroKeyOutputFormat.java:78)
	at org.apache.avro.mapreduce.AvroKeyOutputFormat.getRecordWriter(AvroKeyOutputFormat.java:104)
	at com.teradata.connector.hdfs.HdfsAvroOutputFormat.getRecordWriter(HdfsAvroOutputFormat.java:49)
	at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.<init>(ConnectorOutputFormat.java:89)
	at com.teradata.connector.common.ConnectorOutputFormat.getRecordWriter(ConnectorOutputFormat.java:38)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Accepted Solution

Re: Sqoop import of Avro file from Teradata to hdfs.

Explorer

I found this in the Hortonworks Teradata Connector support documentation:

  • If you will run Avro jobs, download avro-mapred-1.7.4-hadoop2.jar and place it under $SQOOP_HOME/lib.

I had two versions of the Avro jars in my $SQOOP_HOME/lib. After I removed all of them except avro-mapred-1.7.4-hadoop2.jar, the import succeeded.
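
For anyone hitting the same NoSuchMethodError, the cleanup described above could be sketched roughly as follows. This is only an illustration, not something from the connector documentation: the function names `list_avro_jars` and `quarantine_other_avro_mapred` are made up, and it assumes the conflict is between `avro-mapred-*.jar` files in a single lib directory. It moves the extra jars into a backup directory rather than deleting them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop): print the name of every Avro jar
# in the given lib directory so duplicate versions are easy to spot.
list_avro_jars() {
    lib_dir="$1"
    for jar in "$lib_dir"/avro*.jar; do
        # If nothing matches, the glob stays literal, so check it exists.
        if [ -e "$jar" ]; then
            basename "$jar"
        fi
    done
}

# Hypothetical helper: move every avro-mapred jar except the one named in
# $2 into the backup directory $3 (assumption: only avro-mapred conflicts).
quarantine_other_avro_mapred() {
    lib_dir="$1"
    keep="$2"
    backup_dir="$3"
    mkdir -p "$backup_dir"
    for jar in "$lib_dir"/avro-mapred-*.jar; do
        [ -e "$jar" ] || continue
        if [ "$(basename "$jar")" != "$keep" ]; then
            mv "$jar" "$backup_dir"/
        fi
    done
}

# Example usage (paths are assumptions; adjust to your installation):
# list_avro_jars "$SQOOP_HOME/lib"
# quarantine_other_avro_mapred "$SQOOP_HOME/lib" \
#     "avro-mapred-1.7.4-hadoop2.jar" "$HOME/sqoop-lib-backup"
```

Listing first and moving second makes it easy to confirm which versions are present before changing anything, and the backup directory lets you restore a jar if another job turns out to need it.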

Re: Sqoop import of Avro file from Teradata to hdfs.

Rising Star

@ksuresh, thanks for posting your resolution. I work on the Sqoop documentation and will look into whether we need to add a comment about removing other jars from /lib.

Re: Sqoop import of Avro file from Teradata to hdfs.

Rising Star
@ksuresh

Thanks for catching the issue in the doc. You can either remove the conflicting Avro jar files (Sqoop in HDP 2.5 and 2.6 ships with Avro 1.8.0 jar files), or add the following option to the sqoop command line so that user-supplied jars take precedence on the job classpath:

sqoop import -Dmapreduce.job.user.classpath.first=true <rest of the arguments>
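
To make the placement concrete, here is a minimal sketch of how the option could be wrapped. The function name `sqoop_import_user_first` is hypothetical, not part of Sqoop. The point is that generic Hadoop options such as `-D` go directly after the tool name, before tool-specific arguments like `--connect`.

```shell
#!/bin/sh
# Hypothetical wrapper (the name is illustrative, not part of Sqoop).
# -D is a generic Hadoop option, so it is placed directly after the tool
# name and before any tool-specific arguments passed in by the caller.
sqoop_import_user_first() {
    sqoop import -Dmapreduce.job.user.classpath.first=true "$@"
}

# Example call, reusing the placeholders from the original question:
# sqoop_import_user_first \
#     --connect jdbc:teradata://xx.xx.xx.xxx/Database=xxxx \
#     --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
#     --username xxx --password xxx --table xx --target-dir xx \
#     --as-avrodatafile -m 1
```

This leaves the jars under the Sqoop lib directory untouched, which may be preferable when other jobs depend on the Avro version that ships with the distribution.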

Beverley, yes, it would be good to mention this in the docs.

Re: Sqoop import of Avro file from Teradata to hdfs.

Mentor

Does this mean that with HDP 2.5+ we support the date type in Avro 1.8+? That would be awesome, Venkat.