Created on 04-16-2016 05:40 AM - edited 09-16-2022 03:14 AM
Hi all, I am a complete noob with these technologies. I am trying to import the tables from MySQL in Exercise 1, but I get an error.
The command:
sqoop import-all-tables \
    -m 1 \
    --connect jdbc:mysql://quickstart:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import
The output:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/04/16 05:09:52 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.0
16/04/16 05:09:52 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/16 05:09:52 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/16 05:09:52 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/16 05:09:52 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
16/04/16 05:09:52 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
16/04/16 05:09:52 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
16/04/16 05:09:52 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
16/04/16 05:09:52 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
16/04/16 05:09:52 WARN tool.BaseSqoopTool: case that you will detect any issues.
16/04/16 05:09:53 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/04/16 05:09:54 INFO tool.CodeGenTool: Beginning code generation
16/04/16 05:09:54 INFO tool.CodeGenTool: Will generate java class as codegen_categories
16/04/16 05:09:54 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:09:54 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/04/16 05:09:54 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
^C
[cloudera@quickstart ~]$ ls /var/lib/sqoop
mysql-connector-java.jar
[cloudera@quickstart ~]$ sqoop import-all-tables -m 1 --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --compression-codec=snappy --as-parquetfile --warehouse-dir=/user/hive/warehouse --hive-import
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/04/16 05:11:43 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.0
16/04/16 05:11:43 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/16 05:11:43 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/16 05:11:43 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/16 05:11:43 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
16/04/16 05:11:43 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
16/04/16 05:11:43 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
16/04/16 05:11:43 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
16/04/16 05:11:43 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
16/04/16 05:11:43 WARN tool.BaseSqoopTool: case that you will detect any issues.
16/04/16 05:11:44 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/04/16 05:11:45 INFO tool.CodeGenTool: Beginning code generation 16/04/16 05:11:45 INFO tool.CodeGenTool: Will generate java class as codegen_categories 16/04/16 05:11:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1 16/04/16 05:11:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1 16/04/16 05:11:45 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce Note: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 16/04/16 05:11:52 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.jar 16/04/16 05:11:52 WARN manager.MySQLManager: It looks like you are importing from mysql. 16/04/16 05:11:52 WARN manager.MySQLManager: This transfer can be faster! Use the --direct 16/04/16 05:11:52 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path. 16/04/16 05:11:52 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql) 16/04/16 05:11:52 INFO mapreduce.ImportJobBase: Beginning import of categories 16/04/16 05:11:52 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 16/04/16 05:11:53 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 16/04/16 05:11:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1 16/04/16 05:11:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1 16/04/16 05:11:59 INFO hive.metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083 16/04/16 05:11:59 INFO hive.metastore: Opened a connection to metastore, current connections: 1 16/04/16 05:11:59 INFO hive.metastore: Connected to metastore. 16/04/16 05:12:00 WARN mapreduce.DataDrivenImportJob: Target Hive table 'categories' exists! Sqoop will append data into the existing Hive table. Consider using --hive-overwrite, if you do NOT intend to do appending. 16/04/16 05:12:02 INFO Configuration.deprecation: mapred.map.tasks is deprecated. 
Instead, use mapreduce.job.maps 16/04/16 05:12:03 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 16/04/16 05:12:21 INFO db.DBInputFormat: Using read commited transaction isolation 16/04/16 05:12:21 INFO mapreduce.JobSubmitter: number of splits:1 16/04/16 05:12:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460806549439_0001 16/04/16 05:12:25 INFO impl.YarnClientImpl: Submitted application application_1460806549439_0001 16/04/16 05:12:25 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1460806549439_0001/ 16/04/16 05:12:25 INFO mapreduce.Job: Running job: job_1460806549439_0001 16/04/16 05:13:14 INFO mapreduce.Job: Job job_1460806549439_0001 running in uber mode : false 16/04/16 05:13:14 INFO mapreduce.Job: map 0% reduce 0% 16/04/16 05:13:43 INFO mapreduce.Job: map 100% reduce 0% 16/04/16 05:13:45 INFO mapreduce.Job: Job job_1460806549439_0001 completed successfully 16/04/16 05:13:46 INFO mapreduce.Job: Counters: 30 File System Counters FILE: Number of bytes read=0 FILE: Number of bytes written=205148 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=5685 HDFS: Number of bytes written=3446 HDFS: Number of read operations=48 HDFS: Number of large read operations=0 HDFS: Number of write operations=10 Job Counters Launched map tasks=1 Other local map tasks=1 Total time spent by all maps in occupied slots (ms)=21529 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=21529 Total vcore-seconds taken by all map tasks=21529 Total megabyte-seconds taken by all map tasks=22045696 Map-Reduce Framework Map input records=58 Map output records=58 Input split bytes=87 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=586 CPU time spent (ms)=4920 Physical memory (bytes) snapshot=171204608 Virtual memory (bytes) snapshot=1522245632 Total committed heap usage (bytes)=60751872 File Input Format Counters Bytes Read=0 File Output Format Counters Bytes Written=0 16/04/16 05:13:46 INFO mapreduce.ImportJobBase: Transferred 3.3652 KB in 103.0853 seconds (33.4286 bytes/sec) 16/04/16 05:13:46 INFO mapreduce.ImportJobBase: Retrieved 58 records. 16/04/16 05:13:46 INFO tool.CodeGenTool: Beginning code generation 16/04/16 05:13:46 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1 16/04/16 05:13:46 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce Note: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 16/04/16 05:13:49 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/5c05ec7d71902ec0786ad83933e2419b/codegen_categories.jar 16/04/16 05:13:49 INFO mapreduce.ImportJobBase: Beginning import of customers 16/04/16 05:13:49 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 16/04/16 05:13:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1 16/04/16 05:13:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1 16/04/16 05:13:50 WARN mapreduce.DataDrivenImportJob: Target Hive table 'customers' exists! Sqoop will append data into the existing Hive table. Consider using --hive-overwrite, if you do NOT intend to do appending. 
16/04/16 05:13:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 16/04/16 05:14:07 INFO db.DBInputFormat: Using read commited transaction isolation 16/04/16 05:14:07 INFO mapreduce.JobSubmitter: number of splits:1 16/04/16 05:14:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460806549439_0002 16/04/16 05:14:09 INFO impl.YarnClientImpl: Submitted application application_1460806549439_0002 16/04/16 05:14:09 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1460806549439_0002/ 16/04/16 05:14:09 INFO mapreduce.Job: Running job: job_1460806549439_0002 16/04/16 05:14:48 INFO mapreduce.Job: Job job_1460806549439_0002 running in uber mode : false 16/04/16 05:14:48 INFO mapreduce.Job: map 0% reduce 0% 16/04/16 05:15:21 INFO mapreduce.Job: Task Id : attempt_1460806549439_0002_m_000000_0, Status : FAILED Error: org.kitesdk.data.DatasetOperationException: Failed to append {"customer_id": 1, "customer_fname": "Richard", "customer_lname": "Hernandez", "customer_email": "XXXXXXXXX", "customer_password": "XXXXXXXXX", "customer_street": "6303 Heather Plaza", "customer_city": "Brownsville", "customer_state": "TX", "customer_zipcode": "78521"} to ParquetAppender{path=hdfs://quickstart.cloudera:8020/tmp/default/.temp/job_1460806549439_0002/mr/attempt_1460806549439_0002_m_000000_0/.29f12956-fca1-43a0-b212-28879220a322.parquet.tmp, schema={"type":"record","name":"customers","fields":[{"name":"id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"email_preferences","type":["null",{"type":"record","name":"email_preferences","fields":[{"name":"email_format","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"frequency","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"categories","type":["null",{"type":"record","name":"categories","fields":[{"name":"promos","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null},{"name":"surveys","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null}]}],"default":null}]}],"default":null},{"name":"addresses","type":["null",{"type":"map","values":["null",{"type":"record","name":"addresses","fields":[{"name":"street_1","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"street_2","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"city","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"state","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"zip_code","type":["null","string"],"doc":"Converted from 'string'","default":null}]}]}],"doc":"Converted from 'map<string,struct<street_1:string,street_2:string,city:string,state:string,zip_code:string>>'","default":null},{"name":"orders","type":["null",{"type":"array","items":["null",{"type":"record","name":"orders","fields":[{"name":"order_id","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"order_date","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"items","type":["null",{"type":"array","items":["null",{"type":"record","name":"items","fields":[{"name":"product_id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"sku","type":["null","string"],"doc":"Converted from 
'string'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"price","type":["null","double"],"doc":"Converted from 'double'","default":null},{"name":"qty","type":["null","int"],"doc":"Converted from 'int'","default":null}]}]}],"doc":"Converted from 'array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>'","default":null}]}]}],"doc":"Converted from 'array<struct<order_id:string,order_date:string,items:array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>>>'","default":null}]}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1460806549439_0002_m_000000_0_373030920_1, ugi=cloudera (auth:SIMPLE)]], avroParquetWriter=parquet.avro.AvroParquetWriter@1210e8f1} at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:184) at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325) at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70) at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord at org.apache.avro.generic.GenericData.getField(GenericData.java:658) at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:164) at parquet.avro.AvroWriteSupport.writeRecord(AvroWriteSupport.java:149) at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:262) at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167) at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142) at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116) at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324) at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:75) at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:36) at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:178) ... 
16 more 16/04/16 05:15:46 INFO mapreduce.Job: Task Id : attempt_1460806549439_0002_m_000000_1, Status : FAILED Error: org.kitesdk.data.DatasetOperationException: Failed to append {"customer_id": 1, "customer_fname": "Richard", "customer_lname": "Hernandez", "customer_email": "XXXXXXXXX", "customer_password": "XXXXXXXXX", "customer_street": "6303 Heather Plaza", "customer_city": "Brownsville", "customer_state": "TX", "customer_zipcode": "78521"} to ParquetAppender{path=hdfs://quickstart.cloudera:8020/tmp/default/.temp/job_1460806549439_0002/mr/attempt_1460806549439_0002_m_000000_1/.6568bad7-f7d6-4ae9-86cd-4c1c30b5e3bc.parquet.tmp, schema={"type":"record","name":"customers","fields":[{"name":"id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"email_preferences","type":["null",{"type":"record","name":"email_preferences","fields":[{"name":"email_format","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"frequency","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"categories","type":["null",{"type":"record","name":"categories","fields":[{"name":"promos","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null},{"name":"surveys","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null}]}],"default":null}]}],"default":null},{"name":"addresses","type":["null",{"type":"map","values":["null",{"type":"record","name":"addresses","fields":[{"name":"street_1","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"street_2","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"city","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"state","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"zip_code","type":["null","string"],"doc":"Converted from 'string'","default":null}]}]}],"doc":"Converted from 'map<string,struct<street_1:string,street_2:string,city:string,state:string,zip_code:string>>'","default":null},{"name":"orders","type":["null",{"type":"array","items":["null",{"type":"record","name":"orders","fields":[{"name":"order_id","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"order_date","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"items","type":["null",{"type":"array","items":["null",{"type":"record","name":"items","fields":[{"name":"product_id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"sku","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"price","type":["null","double"],"doc":"Converted from 'double'","default":null},{"name":"qty","type":["null","int"],"doc":"Converted from 'int'","default":null}]}]}],"doc":"Converted from 'array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>'","default":null}]}]}],"doc":"Converted from 'array<struct<order_id:string,order_date:string,items:array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>>>'","default":null}]}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1460806549439_0002_m_000000_1_-158232601_1, ugi=cloudera (auth:SIMPLE)]], avroParquetWriter=parquet.avro.AvroParquetWriter@7ccb7779} at 
org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:184) at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325) at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70) at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord at org.apache.avro.generic.GenericData.getField(GenericData.java:658) at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:164) at parquet.avro.AvroWriteSupport.writeRecord(AvroWriteSupport.java:149) at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:262) at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167) at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142) at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116) at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324) at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:75) at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:36) at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:178) ... 
16 more 16/04/16 05:16:08 INFO mapreduce.Job: Task Id : attempt_1460806549439_0002_m_000000_2, Status : FAILED Error: org.kitesdk.data.DatasetOperationException: Failed to append {"customer_id": 1, "customer_fname": "Richard", "customer_lname": "Hernandez", "customer_email": "XXXXXXXXX", "customer_password": "XXXXXXXXX", "customer_street": "6303 Heather Plaza", "customer_city": "Brownsville", "customer_state": "TX", "customer_zipcode": "78521"} to ParquetAppender{path=hdfs://quickstart.cloudera:8020/tmp/default/.temp/job_1460806549439_0002/mr/attempt_1460806549439_0002_m_000000_2/.b9e12c70-c233-4207-98f6-aab3203e6d51.parquet.tmp, schema={"type":"record","name":"customers","fields":[{"name":"id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"email_preferences","type":["null",{"type":"record","name":"email_preferences","fields":[{"name":"email_format","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"frequency","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"categories","type":["null",{"type":"record","name":"categories","fields":[{"name":"promos","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null},{"name":"surveys","type":["null","boolean"],"doc":"Converted from 'boolean'","default":null}]}],"default":null}]}],"default":null},{"name":"addresses","type":["null",{"type":"map","values":["null",{"type":"record","name":"addresses","fields":[{"name":"street_1","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"street_2","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"city","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"state","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"zip_code","type":["null","string"],"doc":"Converted from 'string'","default":null}]}]}],"doc":"Converted from 'map<string,struct<street_1:string,street_2:string,city:string,state:string,zip_code:string>>'","default":null},{"name":"orders","type":["null",{"type":"array","items":["null",{"type":"record","name":"orders","fields":[{"name":"order_id","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"order_date","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"items","type":["null",{"type":"array","items":["null",{"type":"record","name":"items","fields":[{"name":"product_id","type":["null","int"],"doc":"Converted from 'int'","default":null},{"name":"sku","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"name","type":["null","string"],"doc":"Converted from 'string'","default":null},{"name":"price","type":["null","double"],"doc":"Converted from 'double'","default":null},{"name":"qty","type":["null","int"],"doc":"Converted from 'int'","default":null}]}]}],"doc":"Converted from 'array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>'","default":null}]}]}],"doc":"Converted from 'array<struct<order_id:string,order_date:string,items:array<struct<product_id:int,sku:string,name:string,price:double,qty:int>>>>'","default":null}]}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1460806549439_0002_m_000000_2_-1299212121_1, ugi=cloudera (auth:SIMPLE)]], avroParquetWriter=parquet.avro.AvroParquetWriter@2218c2e5} at 
org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:184) at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325) at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70) at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord at org.apache.avro.generic.GenericData.getField(GenericData.java:658) at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:164) at parquet.avro.AvroWriteSupport.writeRecord(AvroWriteSupport.java:149) at parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:262) at parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167) at parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142) at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:116) at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:324) at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:75) at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:36) at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:178) ... 16 more 16/04/16 05:16:35 INFO mapreduce.Job: map 100% reduce 0% 16/04/16 05:16:37 INFO mapreduce.Job: Job job_1460806549439_0002 failed with state FAILED due to: Task failed task_1460806549439_0002_m_000000 Job failed as tasks failed. failedMaps:1 failedReduces:0 16/04/16 05:16:37 INFO mapreduce.Job: Counters: 8 Job Counters Failed map tasks=4 Launched map tasks=4 Other local map tasks=4 Total time spent by all maps in occupied slots (ms)=92115 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=92115 Total vcore-seconds taken by all map tasks=92115 Total megabyte-seconds taken by all map tasks=94325760 16/04/16 05:16:37 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead 16/04/16 05:16:37 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 164.838 seconds (0 bytes/sec) 16/04/16 05:16:37 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead 16/04/16 05:16:37 INFO mapreduce.ImportJobBase: Retrieved 0 records. 16/04/16 05:16:37 ERROR tool.ImportAllTablesTool: Error during import: Import job failed!
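In short: categories imports fine, but the customers import keeps failing with "ClassCastException: java.lang.String cannot be cast to org.apache.avro.generic.IndexedRecord", and the log also warns that "Target Hive table 'customers' exists". I am guessing the customers table already registered in Hive has a different (nested) schema than the flat MySQL table, so Sqoop cannot append to it. If it helps to confirm, this is roughly how I would check what is already in the metastore (just a guess on my part, assuming the hive shell works on the quickstart VM):

hive -e "SHOW TABLES;"
hive -e "DESCRIBE FORMATTED customers;"

Any idea what I should do?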
Created on 04-18-2016 01:37 AM - last edited on 05-11-2016 03:58 PM by cjervis
Reimported the VirtualBox appliance and now it works.
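In case it saves someone a few clicks: the appliance can also be re-imported from the command line with VBoxManage. The file name below is just a placeholder for whatever OVA you downloaded from Cloudera:

VBoxManage import cloudera-quickstart-vm-virtualbox.ova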
Created 05-05-2016 11:43 AM
Delete the old VM and then start a new one. It worked for me.
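If you would rather not rebuild the whole VM, it might also be enough to drop the pre-existing Hive tables that the import collides with (the log warns "Target Hive table 'customers' exists!") and then re-run the import. This is only a rough sketch I have not verified on this image; any other pre-existing retail tables would need the same treatment, and the HDFS paths are assumptions based on the --warehouse-dir used above:

hive -e "DROP TABLE IF EXISTS categories; DROP TABLE IF EXISTS customers;"
hadoop fs -rm -r /user/hive/warehouse/categories /user/hive/warehouse/customers
sqoop import-all-tables \
    -m 1 \
    --connect jdbc:mysql://quickstart:3306/retail_db \
    --username=retail_dba \
    -P \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import

(I used -P instead of --password, as the log itself suggests, so the password is prompted for rather than sitting in the shell history.)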