Support Questions

ksuresh · ‎03-22-2017

I am trying to a copy an Avro schema file from Teradata to Hdfs using sqoop, but the import job is failing with the below error:

sqoop import --libjars "SQOOP_HOME/lib/avro-mapred-1.7.5-hadoop2.jar,SQOOP_HOME/lib/avro-mapred-1.7.4-hadoop2.jar,SQOOP_HOME/lib/paranamer-2.3.jar" --connect jdbc:teradata://xx.xx.xx.xxx/Database=xxxx --connection-manager org.apache.sqoop.teradata.TeradataConnManager --username xxx --password xxx --table xx --target-dir xx --as-avrodatafile -m 1 -- --usexview --accesslock --avroschemafile xx.avsc
INFO impl.YarnClientImpl: Submitted application application_1455051872611_0127
INFO mapreduce.Job: The url to track the job: http://teradata-sqoop-ks-re-sec-4.novalocal:8088/proxy/application_1455051872611_0127/
INFO mapreduce.Job: Running job: job_1455051872611_0127
INFO mapreduce.Job: Job job_1455051872611_0127 running in uber mode : false
INFO mapreduce.Job:  map 0% reduce 0%
INFO mapreduce.Job:  map 100% reduce 0%
INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_0, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
 INFO mapreduce.Job:  map 0% reduce 0%
 INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_1, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
INFO mapreduce.Job: Task Id : attempt_1455051872611_0127_m_000000_2, Status : FAILED
Error: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
INFO mapreduce.Job:  map 100% reduce 0%
INFO mapreduce.Job: Job job_1455051872611_0127 failed with state FAILED due to: Task failed task_1455051872611_0127_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
.

.

.
INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor starts at:  1455147607714
INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor ends at:  1455147607714
INFO processor.TeradataInputProcessor: the total elapsed time of input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByHashProcessor is: 0s
INFO teradata.TeradataSqoopImportHelper: Teradata import job completed with exit code 1
ERROR tool.ImportTool: Error during import: Import Job failed

FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
	at org.apache.avro.mapreduce.AvroKeyRecordWriter.<init>(AvroKeyRecordWriter.java:53)
	at org.apache.avro.mapreduce.AvroKeyOutputFormat$RecordWriterFactory.create(AvroKeyOutputFormat.java:78)
	at org.apache.avro.mapreduce.AvroKeyOutputFormat.getRecordWriter(AvroKeyOutputFormat.java:104)
	at com.teradata.connector.hdfs.HdfsAvroOutputFormat.getRecordWriter(HdfsAvroOutputFormat.java:49)
	at com.teradata.connector.common.ConnectorOutputFormat$ConnectorFileRecordWriter.<init>(ConnectorOutputFormat.java:89)
	at com.teradata.connector.common.ConnectorOutputFormat.getRecordWriter(ConnectorOutputFormat.java:38)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

ksuresh · ‎03-23-2017

Found this on the Hortonworks Teradata Connector support doc :

If you will run Avro jobs, download avro-mapred-1.7.4-hadoop2.jar and place it under $SQOOP_HOME/lib.

I had two versions of avro jars in my $SQOOP_HOME/lib, upon removing all others except the avro-mapred-1.7.4-hadoop2.jar, import succeeded.

View solution in original post

ksuresh · ‎03-23-2017

Found this on the Hortonworks Teradata Connector support doc :

If you will run Avro jobs, download avro-mapred-1.7.4-hadoop2.jar and place it under $SQOOP_HOME/lib.

I had two versions of avro jars in my $SQOOP_HOME/lib, upon removing all others except the avro-mapred-1.7.4-hadoop2.jar, import succeeded.

bandalora · ‎03-24-2017

@ksuresh, thanks for posting your resolution. I work on the Sqoop documentation and will look into whether we need to add a comment about removing other jars from /lib.

vranganathan · ‎03-24-2017

@ksuresh

Thanks for catching the issue in the doc. You can either remove the conflicting avro files (sqoop in HDP 2.5 and 2.6 ships with avro 1.8.0 jar files) or you can add the following to the sqoop command line and run.

sqoop import -Dmapreduce.job.user.classpath.first=true <rest of the arguments>

Beverley

Yes, it would be good to mention this in the docs.

aervits · ‎03-26-2017

Does it mean with HDP 2.5+ we support date type in Avro 1.8+? Because that would be awesome Venkat.

Cloudera Community

Support Questions

Sqoop import of Avro file from Teradata to hdfs.

HDFS to Teradata - example

Sqoop import from Teradata using hive-import into ...

Sqoop : Teradata to HDFS using PARQUET file format...

Sqoop import to avro failing - which jars to be us...

Sqoop import - null values in HDFS files replaced ...

Oozie Sqoop actions fails when importing data from...

SQoop job too slow importing data from Teradata to...

sqoop import issue

Sqoop import using last-value from a hdfs file

sqoop import all tables into HDFS