New Contributor

Tutorial exercise 1: ingesting structured data using Sqoop does not work

Hi, the following Sqoop command does not work in the QuickStart VM 5.7. Can anyone help?

 


[cloudera@quickstart ~]$ sqoop import-all-tables \
-m 1 \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--compression-codec=snappy \
--as-parquetfile \
--warehouse-dir=/user/hive/warehouse \
--hive-import

 

Here is the output:

16/05/21 01:52:17 INFO mapreduce.ImportJobBase: Transferred 3.3652 KB in 57.8114 seconds (59.6077 bytes/sec)
16/05/21 01:52:17 INFO mapreduce.ImportJobBase: Retrieved 58 records.
16/05/21 01:52:17 INFO tool.CodeGenTool: Beginning code generation
16/05/21 01:52:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/05/21 01:52:17 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/2c4f23186b50638117ee1594fae3977f/codegen_categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/05/21 01:52:19 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/2c4f23186b50638117ee1594fae3977f/codegen_categories.jar
16/05/21 01:52:19 INFO mapreduce.ImportJobBase: Beginning import of customers
16/05/21 01:52:19 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/05/21 01:52:19 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/05/21 01:52:19 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/05/21 01:52:19 WARN mapreduce.DataDrivenImportJob: Target Hive table 'customers' exists! Sqoop will append data into the existing Hive table. Consider using --hive-overwrite, if you do NOT intend to do appending.
16/05/21 01:52:20 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/21 01:52:28 INFO db.DBInputFormat: Using read commited transaction isolation
16/05/21 01:52:28 INFO mapreduce.JobSubmitter: number of splits:1
16/05/21 01:52:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463818069837_0006
16/05/21 01:52:29 INFO impl.YarnClientImpl: Submitted application application_1463818069837_0006
16/05/21 01:52:29 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1463818069837_0006/
16/05/21 01:52:29 INFO mapreduce.Job: Running job: job_1463818069837_0006
16/05/21 01:52:57 INFO mapreduce.Job: Job job_1463818069837_0006 running in uber mode : false
16/05/21 01:52:57 INFO mapreduce.Job: map 0% reduce 0%
16/05/21 01:53:13 INFO mapreduce.Job: Task Id : attempt_1463818069837_0006_m_000000_0, Status : FAILED
Error: org.kitesdk.data.DatasetOperationException: Failed to append {"customer_id": 1, "customer_fname": "Richard", "customer_lname": "Hernandez", "customer_email": "XXXXXXXXX", "customer_password": "XXXXXXXXX", "customer_street": "6303 Heather Plaza", "customer_city": "Brownsville", "customer_state": "TX", "customer_zipcode": "78521"} to ParquetAppender{path=hdfs://quickstart.cloudera:8020/tmp/default/.temp/job_1463818069837_0006/mr/attempt_1463818069837_0006_m_000000_0/.cccefc21-ec3f-4a22-94c1-c1dfc2cf088b.parquet.tmp, schema={"type":"record","name":"customers","fields":[{"name":"id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"email_preferences","type":["null",{"type":"record","name":"email_preferences","fields":[{"name":"email_format","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"frequency","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"categories","type":["null",{"type":"record","name":"categories","fields":[{"name":"promos","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null},{"name":"surveys","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null}]}],"default":null}]}],"default":null},{"name":"addresses","type":["null",{"type":"map","values":["null",{"type":"record","name":"addresses","fields":[{"name":"street_1","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"street_2","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"city","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"state","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"zip_code","type":["null","string"],"doc":"Converted from 'string'","default":null}]}]}],"doc":"Converted from 'map<string,struct<street_1:string,street_2:string,city:string,state:string,zip_code:string>>'","default":null},{"name":"orders","type":["null",{"type":"array","items":["null",{"type":"record","name":"orders","fields":[{"name":"order_id","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"order_date","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"items","type":["null",{"type":"array","items":["null",{"type":"record","name":"items","fields":[{"name":"product_id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"sku","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"price","type":["null","double"],"doc":"Converted from 'double'","default":null},{"name":"qty","type":["null","int"],"doc":"Converted from 'int'","default":null}]}]}],"doc":"Converted from 'array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>'","default":null}]}]}],"doc":"Converted from 'array<struct<order_id:string,order_date:string,items:array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>>>'","default":null}]}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1463818069837_0006_m_000000_0_-1719628224_1, ugi=cloudera (auth:SIMPLE)]], avroParquetWriter=parquet.avro.AvroParquetWriter@98a0bfa}
at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:184)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70)
at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord
at org.apache.avro.generic.GenericData.getField(GenericData.java:658)
at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:164)
at parquet.avro.AvroWriteSupport.writeRecord(AvroWriteSupport.java:149)
at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:262)
at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116)
at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324)
at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:75)
at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:36)
at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:178)
... 16 more

 

Cloudera Employee

Re: Tutorial exercise 1: ingesting structured data using Sqoop does not work

Try adding the --hive-overwrite flag. The error you posted indicates the import is
failing because of a previous failed attempt that hasn't been cleaned up: the target
Hive table 'customers' already exists (see the WARN line about appending), and the
schema shown in the exception doesn't match the flat retail_db rows Sqoop is now
trying to append, which is what produces the ClassCastException. There was likely
some other failure the first time around, but there's no telling what it was from
this specific log.
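
For reference, a minimal sketch of the same command with the suggested flag added
(everything else kept exactly as in the original post):

[cloudera@quickstart ~]$ sqoop import-all-tables \
-m 1 \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--compression-codec=snappy \
--as-parquetfile \
--warehouse-dir=/user/hive/warehouse \
--hive-overwrite \
--hive-import

If the overwrite still collides with the leftover table, one possible cleanup (an
assumption on my part, not something confirmed in this thread) is to drop the
conflicting 'customers' table in Hive and remove its leftover warehouse directory
(e.g. hdfs dfs -rm -r /user/hive/warehouse/customers) before re-running the import.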