Exercise 1: ERROR tool.ImportAllTablesTool: Error during import: Import job failed!

New Contributor

Hi all, I am a complete noob in these technologies. I tried to import the tables from MySQL in Exercise 1 but got an error.

The command:

sqoop import-all-tables \
    -m 1 \
    --connect jdbc:mysql://quickstart:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import

The output:

Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/04/16 05:09:52 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.0
16/04/16 05:09:52 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/16 05:09:52 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/16 05:09:52 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/16 05:09:52 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
16/04/16 05:09:52 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
16/04/16 05:09:52 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
16/04/16 05:09:52 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
16/04/16 05:09:52 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
16/04/16 05:09:52 WARN tool.BaseSqoopTool: case that you will detect any issues.
16/04/16 05:09:53 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/04/16 05:09:54 INFO tool.CodeGenTool: Beginning code generation
16/04/16 05:09:54 INFO tool.CodeGenTool: Will generate java class as codegen_categories
16/04/16 05:09:54 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:09:54 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:09:54 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
^C[cloudera@quickstart ~]$ ls /var/lib/sqoop
mysql-connector-java.jar
[cloudera@quickstart ~]$ sqoop import-all-tables     -m 1     --connect jdbc:mysql://quickstart:3306/retail_db     --username=retail_dba     --password=cloudera     --compression-codec=snappy     --as-parquetfile     --warehouse-dir=/user/hive/warehouse     --hive-import
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/04/16 05:11:43 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.0
16/04/16 05:11:43 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/16 05:11:43 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/16 05:11:43 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/16 05:11:43 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
16/04/16 05:11:43 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
16/04/16 05:11:43 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
16/04/16 05:11:43 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
16/04/16 05:11:43 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
16/04/16 05:11:43 WARN tool.BaseSqoopTool: case that you will detect any issues.
16/04/16 05:11:44 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/04/16 05:11:45 INFO tool.CodeGenTool: Beginning code generation
16/04/16 05:11:45 INFO tool.CodeGenTool: Will generate java class as codegen_categories
16/04/16 05:11:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:11:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:11:45 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/04/16 05:11:52 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.jar
16/04/16 05:11:52 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/04/16 05:11:52 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/04/16 05:11:52 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/04/16 05:11:52 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/04/16 05:11:52 INFO mapreduce.ImportJobBase: Beginning import of categories
16/04/16 05:11:52 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/04/16 05:11:53 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/04/16 05:11:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:11:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:11:59 INFO hive.metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
16/04/16 05:11:59 INFO hive.metastore: Opened a connection to metastore, current connections: 1
16/04/16 05:11:59 INFO hive.metastore: Connected to metastore.
16/04/16 05:12:00 WARN mapreduce.DataDrivenImportJob: Target Hive table 'categories' exists! Sqoop will append data into the existing Hive table. Consider using --hive-overwrite, if you do NOT intend to do appending.
16/04/16 05:12:02 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/04/16 05:12:03 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/16 05:12:21 INFO db.DBInputFormat: Using read commited transaction isolation
16/04/16 05:12:21 INFO mapreduce.JobSubmitter: number of splits:1
16/04/16 05:12:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460806549439_0001
16/04/16 05:12:25 INFO impl.YarnClientImpl: Submitted application application_1460806549439_0001
16/04/16 05:12:25 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1460806549439_0001/
16/04/16 05:12:25 INFO mapreduce.Job: Running job: job_1460806549439_0001
16/04/16 05:13:14 INFO mapreduce.Job: Job job_1460806549439_0001 running in uber mode : false
16/04/16 05:13:14 INFO mapreduce.Job:  map 0% reduce 0%
16/04/16 05:13:43 INFO mapreduce.Job:  map 100% reduce 0%
16/04/16 05:13:45 INFO mapreduce.Job: Job job_1460806549439_0001 completed successfully
16/04/16 05:13:46 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=205148
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=5685
		HDFS: Number of bytes written=3446
		HDFS: Number of read operations=48
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=10
	Job Counters 
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=21529
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=21529
		Total vcore-seconds taken by all map tasks=21529
		Total megabyte-seconds taken by all map tasks=22045696
	Map-Reduce Framework
		Map input records=58
		Map output records=58
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=586
		CPU time spent (ms)=4920
		Physical memory (bytes) snapshot=171204608
		Virtual memory (bytes) snapshot=1522245632
		Total committed heap usage (bytes)=60751872
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0
16/04/16 05:13:46 INFO mapreduce.ImportJobBase: Transferred 3.3652 KB in 103.0853 seconds (33.4286 bytes/sec)
16/04/16 05:13:46 INFO mapreduce.ImportJobBase: Retrieved 58 records.
16/04/16 05:13:46 INFO tool.CodeGenTool: Beginning code generation
16/04/16 05:13:46 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/04/16 05:13:46 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/04/16 05:13:49 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.jar
16/04/16 05:13:49 INFO mapreduce.ImportJobBase: Beginning import of customers
16/04/16 05:13:49 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/04/16 05:13:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/04/16 05:13:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/04/16 05:13:50 WARN mapreduce.DataDrivenImportJob: Target Hive table 'customers' exists! Sqoop will append data into the existing Hive table. Consider using --hive-overwrite, if you do NOT intend to do appending.
16/04/16 05:13:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/16 05:14:07 INFO db.DBInputFormat: Using read commited transaction isolation
16/04/16 05:14:07 INFO mapreduce.JobSubmitter: number of splits:1
16/04/16 05:14:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460806549439_0002
16/04/16 05:14:09 INFO impl.YarnClientImpl: Submitted application application_1460806549439_0002
16/04/16 05:14:09 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1460806549439_0002/
16/04/16 05:14:09 INFO mapreduce.Job: Running job: job_1460806549439_0002
16/04/16 05:14:48 INFO mapreduce.Job: Job job_1460806549439_0002 running in uber mode : false
16/04/16 05:14:48 INFO mapreduce.Job:  map 0% reduce 0%
16/04/16 05:15:21 INFO mapreduce.Job: Task Id : attempt_1460806549439_0002_m_000000_0, Status : FAILED
Error: org.kitesdk.data.DatasetOperationException: Failed to append {"customer_id": 1, "customer_fname": "Richard", "customer_lname": "Hernandez", "customer_email": "XXXXXXXXX", "customer_password": "XXXXXXXXX", "customer_street": "6303 Heather Plaza", "customer_city": "Brownsville", "customer_state": "TX", "customer_zipcode": "78521"} to ParquetAppender{path=hdfs://quickstart.cloudera:8020/tmp/default/.temp/job_1460806549439_0002/mr/attempt_1460806549439_0002_m_000000_0/.29f12956-fca1-43a0-b212-28879220a322.parquet.tmp, schema={"type":"record","name":"customers","fields":[{"name":"id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"email_preferences","type":["null",{"type":"record","name":"email_preferences","fields":[{"name":"email_format","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"frequency","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"categories","type":["null",{"type":"record","name":"categories","fields":[{"name":"promos","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null},{"name":"surveys","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null}]}],"default":null}]}],"default":null},{"name":"addresses","type":["null",{"type":"map","values":["null",{"type":"record","name":"addresses","fields":[{"name":"street_1","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"street_2","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"city","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"state","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"zip_code","type":["null","string"],"doc":"Converted from 'string'","default":null}]}]}],"doc":"Converted from 
'map<string,struct<street_1:string,street_2:string,city:string,state:string,zip_code:string>>'","default":null},{"name":"orders","type":["null",{"type":"array","items":["null",{"type":"record","name":"orders","fields":[{"name":"order_id","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"order_date","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"items","type":["null",{"type":"array","items":["null",{"type":"record","name":"items","fields":[{"name":"product_id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"sku","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"price","type":["null","double"],"doc":"Converted from 'double'","default":null},{"name":"qty","type":["null","int"],"doc":"Converted from 'int'","default":null}]}]}],"doc":"Converted from 'array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>'","default":null}]}]}],"doc":"Converted from 'array<struct<order_id:string,order_date:string,items:array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>>>'","default":null}]}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1460806549439_0002_m_000000_0_373030920_1, ugi=cloudera (auth:SIMPLE)]], avroParquetWriter=parquet.avro.AvroParquetWriter@1210e8f1}
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:184)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord
	at org.apache.avro.generic.GenericData.getField(GenericData.java:658)
	at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:164)
	at parquet.avro.AvroWriteSupport.writeRecord(AvroWriteSupport.java:149)
	at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:262)
	at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
	at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
	at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116)
	at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:75)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:36)
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:178)
	... 16 more

16/04/16 05:15:46 INFO mapreduce.Job: Task Id : attempt_1460806549439_0002_m_000000_1, Status : FAILED
Error: org.kitesdk.data.DatasetOperationException: Failed to append {"customer_id": 1, "customer_fname": "Richard", "customer_lname": "Hernandez", "customer_email": "XXXXXXXXX", "customer_password": "XXXXXXXXX", "customer_street": "6303 Heather Plaza", "customer_city": "Brownsville", "customer_state": "TX", "customer_zipcode": "78521"} to ParquetAppender{path=hdfs://quickstart.cloudera:8020/tmp/default/.temp/job_1460806549439_0002/mr/attempt_1460806549439_0002_m_000000_1/.6568bad7-f7d6-4ae9-86cd-4c1c30b5e3bc.parquet.tmp, schema={"type":"record","name":"customers","fields":[{"name":"id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"email_preferences","type":["null",{"type":"record","name":"email_preferences","fields":[{"name":"email_format","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"frequency","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"categories","type":["null",{"type":"record","name":"categories","fields":[{"name":"promos","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null},{"name":"surveys","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null}]}],"default":null}]}],"default":null},{"name":"addresses","type":["null",{"type":"map","values":["null",{"type":"record","name":"addresses","fields":[{"name":"street_1","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"street_2","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"city","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"state","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"zip_code","type":["null","string"],"doc":"Converted from 'string'","default":null}]}]}],"doc":"Converted from 
'map<string,struct<street_1:string,street_2:string,city:string,state:string,zip_code:string>>'","default":null},{"name":"orders","type":["null",{"type":"array","items":["null",{"type":"record","name":"orders","fields":[{"name":"order_id","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"order_date","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"items","type":["null",{"type":"array","items":["null",{"type":"record","name":"items","fields":[{"name":"product_id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"sku","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"price","type":["null","double"],"doc":"Converted from 'double'","default":null},{"name":"qty","type":["null","int"],"doc":"Converted from 'int'","default":null}]}]}],"doc":"Converted from 'array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>'","default":null}]}]}],"doc":"Converted from 'array<struct<order_id:string,order_date:string,items:array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>>>'","default":null}]}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1460806549439_0002_m_000000_1_-158232601_1, ugi=cloudera (auth:SIMPLE)]], avroParquetWriter=parquet.avro.AvroParquetWriter@7ccb7779}
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:184)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord
	at org.apache.avro.generic.GenericData.getField(GenericData.java:658)
	at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:164)
	at parquet.avro.AvroWriteSupport.writeRecord(AvroWriteSupport.java:149)
	at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:262)
	at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
	at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
	at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116)
	at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:75)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:36)
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:178)
	... 16 more

16/04/16 05:16:08 INFO mapreduce.Job: Task Id : attempt_1460806549439_0002_m_000000_2, Status : FAILED
Error: org.kitesdk.data.DatasetOperationException: Failed to append {"customer_id": 1, "customer_fname": "Richard", "customer_lname": "Hernandez", "customer_email": "XXXXXXXXX", "customer_password": "XXXXXXXXX", "customer_street": "6303 Heather Plaza", "customer_city": "Brownsville", "customer_state": "TX", "customer_zipcode": "78521"} to ParquetAppender{path=hdfs://quickstart.cloudera:8020/tmp/default/.temp/job_1460806549439_0002/mr/attempt_1460806549439_0002_m_000000_2/.b9e12c70-c233-4207-98f6-aab3203e6d51.parquet.tmp, schema={"type":"record","name":"customers","fields":[{"name":"id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"email_preferences","type":["null",{"type":"record","name":"email_preferences","fields":[{"name":"email_format","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"frequency","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"categories","type":["null",{"type":"record","name":"categories","fields":[{"name":"promos","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null},{"name":"surveys","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null}]}],"default":null}]}],"default":null},{"name":"addresses","type":["null",{"type":"map","values":["null",{"type":"record","name":"addresses","fields":[{"name":"street_1","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"street_2","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"city","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"state","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"zip_code","type":["null","string"],"doc":"Converted from 'string'","default":null}]}]}],"doc":"Converted from 
'map<string,struct<street_1:string,street_2:string,city:string,state:string,zip_code:string>>'","default":null},{"name":"orders","type":["null",{"type":"array","items":["null",{"type":"record","name":"orders","fields":[{"name":"order_id","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"order_date","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"items","type":["null",{"type":"array","items":["null",{"type":"record","name":"items","fields":[{"name":"product_id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"sku","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"price","type":["null","double"],"doc":"Converted from 'double'","default":null},{"name":"qty","type":["null","int"],"doc":"Converted from 'int'","default":null}]}]}],"doc":"Converted from 'array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>'","default":null}]}]}],"doc":"Converted from 'array<struct<order_id:string,order_date:string,items:array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>>>'","default":null}]}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1460806549439_0002_m_000000_2_-1299212121_1, ugi=cloudera (auth:SIMPLE)]], avroParquetWriter=parquet.avro.AvroParquetWriter@2218c2e5}
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:184)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord
	at org.apache.avro.generic.GenericData.getField(GenericData.java:658)
	at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:164)
	at parquet.avro.AvroWriteSupport.writeRecord(AvroWriteSupport.java:149)
	at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:262)
	at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
	at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
	at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116)
	at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:75)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:36)
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:178)
	... 16 more

16/04/16 05:16:35 INFO mapreduce.Job:  map 100% reduce 0%
16/04/16 05:16:37 INFO mapreduce.Job: Job job_1460806549439_0002 failed with state FAILED due to: Task failed task_1460806549439_0002_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

16/04/16 05:16:37 INFO mapreduce.Job: Counters: 8
	Job Counters 
		Failed map tasks=4
		Launched map tasks=4
		Other local map tasks=4
		Total time spent by all maps in occupied slots (ms)=92115
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=92115
		Total vcore-seconds taken by all map tasks=92115
		Total megabyte-seconds taken by all map tasks=94325760
16/04/16 05:16:37 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/04/16 05:16:37 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 164.838 seconds (0 bytes/sec)
16/04/16 05:16:37 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
16/04/16 05:16:37 INFO mapreduce.ImportJobBase: Retrieved 0 records.
16/04/16 05:16:37 ERROR tool.ImportAllTablesTool: Error during import: Import job failed!
1 ACCEPTED SOLUTION

New Contributor

I reimported the VirtualBox appliance and now it works.

3 REPLIES

Contributor

Delete the old VM and then start a new one. It worked for me.

New Contributor
I wanted to say "now it works".
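
A possible reading of the log in the question, offered as a guess rather than a confirmed diagnosis: Sqoop warns that the Hive table 'customers' already exists and that it will append to it, and the schema printed in the failure has nested fields (email_preferences, addresses, orders) that do not match the flat MySQL row (customer_id, customer_fname, ...). That would be consistent with a fresh VM, which has no leftover tables, importing cleanly. Below is a small shell sketch for spotting such pre-existing tables in a saved Sqoop log. The function name existing_hive_tables and the file name sqoop.log are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop): print each Hive table that Sqoop
# reported as already existing, based on the WARN line seen in the log above.
existing_hive_tables() {
  # $1 is the path to a file holding the saved Sqoop console output.
  grep -o "Target Hive table '[^']*' exists" "$1" | cut -d "'" -f 2 | sort -u
}

# Usage (assumes the Sqoop output was saved to a file named sqoop.log):
#   existing_hive_tables sqoop.log
```

If a table is listed here on what should be a first import, looking at it in Hive (for example with DESCRIBE customers;) before re-running may be quicker than reimporting the whole VM. The warning itself also mentions --hive-overwrite; whether that resolves this particular schema mismatch was not tested in this thread.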