New Contributor
Posts: 1
Registered: ‎05-16-2016

Checksum error: /tmp/hadoop-yarn/staging/cloudera/....jars/kite-data-

Hi All,

I am working through the tutorial and got stuck at the step that imports the SQL tables with Sqoop: the import fails with a checksum error.

I am using cloudera-quickstart-vm-5.4, running on VirtualBox on my Linux machine. I get the following error:

[cloudera@quickstart ~]$ sqoop import-all-tables -m 1 --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --compression-codec=snappy --as-avrodatafile --warehouse-dir=/user/hive/warehouse
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/05/16 10:21:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.4.2
16/05/16 10:21:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/05/16 10:21:17 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/05/16 10:21:17 INFO tool.CodeGenTool: Beginning code generation
16/05/16 10:21:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/05/16 10:21:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/05/16 10:21:17 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/05/16 10:21:20 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/categories.jar
16/05/16 10:21:20 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/05/16 10:21:20 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/05/16 10:21:20 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/05/16 10:21:20 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/05/16 10:21:20 INFO mapreduce.ImportJobBase: Beginning import of categories
16/05/16 10:21:20 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/05/16 10:21:21 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/05/16 10:21:22 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/05/16 10:21:22 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/sqoop_import_categories.avsc
16/05/16 10:21:22 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/05/16 10:21:23 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/16 10:21:25 INFO db.DBInputFormat: Using read commited transaction isolation
16/05/16 10:21:25 INFO mapreduce.JobSubmitter: number of splits:1
16/05/16 10:21:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463331548251_0010
16/05/16 10:21:25 INFO impl.YarnClientImpl: Submitted application application_1463331548251_0010
16/05/16 10:21:25 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1463331548251_0010/
16/05/16 10:21:25 INFO mapreduce.Job: Running job: job_1463331548251_0010
16/05/16 10:21:36 INFO mapreduce.Job: Job job_1463331548251_0010 running in uber mode : false
16/05/16 10:21:36 INFO mapreduce.Job:  map 0% reduce 0%
16/05/16 10:21:44 INFO mapreduce.Job:  map 100% reduce 0%
16/05/16 10:21:45 INFO mapreduce.Job: Job job_1463331548251_0010 completed successfully
16/05/16 10:21:45 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=135505
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=1344
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=6181
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=6181
		Total vcore-seconds taken by all map tasks=6181
		Total megabyte-seconds taken by all map tasks=6329344
	Map-Reduce Framework
		Map input records=58
		Map output records=58
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=105
		CPU time spent (ms)=1260
		Physical memory (bytes) snapshot=125665280
		Virtual memory (bytes) snapshot=1508417536
		Total committed heap usage (bytes)=60751872
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=1344
16/05/16 10:21:45 INFO mapreduce.ImportJobBase: Transferred 1.3125 KB in 22.9802 seconds (58.4851 bytes/sec)
16/05/16 10:21:45 INFO mapreduce.ImportJobBase: Retrieved 58 records.
16/05/16 10:21:45 INFO tool.CodeGenTool: Beginning code generation
16/05/16 10:21:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/05/16 10:21:45 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/customers.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/05/16 10:21:46 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/customers.jar
16/05/16 10:21:46 INFO mapreduce.ImportJobBase: Beginning import of customers
16/05/16 10:21:46 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/05/16 10:21:46 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customers` AS t LIMIT 1
16/05/16 10:21:46 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/sqoop_import_customers.avsc
16/05/16 10:21:47 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/16 10:21:48 INFO db.DBInputFormat: Using read commited transaction isolation
16/05/16 10:21:48 INFO mapreduce.JobSubmitter: number of splits:1
16/05/16 10:21:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463331548251_0011
16/05/16 10:21:48 INFO impl.YarnClientImpl: Submitted application application_1463331548251_0011
16/05/16 10:21:48 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1463331548251_0011/
16/05/16 10:21:48 INFO mapreduce.Job: Running job: job_1463331548251_0011
16/05/16 10:21:58 INFO mapreduce.Job: Job job_1463331548251_0011 running in uber mode : false
16/05/16 10:21:58 INFO mapreduce.Job:  map 0% reduce 0%
16/05/16 10:22:10 INFO mapreduce.Job:  map 100% reduce 0%
16/05/16 10:22:10 INFO mapreduce.Job: Job job_1463331548251_0011 completed successfully
16/05/16 10:22:10 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=136178
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=470392
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=9160
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=9160
		Total vcore-seconds taken by all map tasks=9160
		Total megabyte-seconds taken by all map tasks=9379840
	Map-Reduce Framework
		Map input records=12435
		Map output records=12435
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=139
		CPU time spent (ms)=3290
		Physical memory (bytes) snapshot=134823936
		Virtual memory (bytes) snapshot=1509306368
		Total committed heap usage (bytes)=60751872
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=470392
16/05/16 10:22:10 INFO mapreduce.ImportJobBase: Transferred 459.3672 KB in 23.258 seconds (19.7509 KB/sec)
16/05/16 10:22:10 INFO mapreduce.ImportJobBase: Retrieved 12435 records.
16/05/16 10:22:10 INFO tool.CodeGenTool: Beginning code generation
16/05/16 10:22:10 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
16/05/16 10:22:10 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/departments.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/05/16 10:22:11 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/departments.jar
16/05/16 10:22:11 INFO mapreduce.ImportJobBase: Beginning import of departments
16/05/16 10:22:11 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/05/16 10:22:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
16/05/16 10:22:11 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/sqoop_import_departments.avsc
16/05/16 10:22:11 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/16 10:22:13 INFO db.DBInputFormat: Using read commited transaction isolation
16/05/16 10:22:13 INFO mapreduce.JobSubmitter: number of splits:1
16/05/16 10:22:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463331548251_0012
16/05/16 10:22:13 INFO impl.YarnClientImpl: Submitted application application_1463331548251_0012
16/05/16 10:22:13 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1463331548251_0012/
16/05/16 10:22:13 INFO mapreduce.Job: Running job: job_1463331548251_0012
16/05/16 10:22:26 INFO mapreduce.Job: Job job_1463331548251_0012 running in uber mode : false
16/05/16 10:22:26 INFO mapreduce.Job:  map 0% reduce 0%
16/05/16 10:22:38 INFO mapreduce.Job:  map 100% reduce 0%
16/05/16 10:22:39 INFO mapreduce.Job: Job job_1463331548251_0012 completed successfully
16/05/16 10:22:39 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=135393
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=458
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=9364
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=9364
		Total vcore-seconds taken by all map tasks=9364
		Total megabyte-seconds taken by all map tasks=9588736
	Map-Reduce Framework
		Map input records=6
		Map output records=6
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=144
		CPU time spent (ms)=1810
		Physical memory (bytes) snapshot=127283200
		Virtual memory (bytes) snapshot=1508417536
		Total committed heap usage (bytes)=60751872
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=458
16/05/16 10:22:39 INFO mapreduce.ImportJobBase: Transferred 458 bytes in 28.0973 seconds (16.3005 bytes/sec)
16/05/16 10:22:39 INFO mapreduce.ImportJobBase: Retrieved 6 records.
16/05/16 10:22:39 INFO tool.CodeGenTool: Beginning code generation
16/05/16 10:22:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `order_items` AS t LIMIT 1
16/05/16 10:22:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/order_items.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/05/16 10:22:41 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/order_items.jar
16/05/16 10:22:41 INFO mapreduce.ImportJobBase: Beginning import of order_items
16/05/16 10:22:41 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
16/05/16 10:22:41 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `order_items` AS t LIMIT 1
16/05/16 10:22:41 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-cloudera/compile/f6888c0a0e4f1dedd7fbe9fa78d0e074/sqoop_import_order_items.avsc
16/05/16 10:22:41 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/16 10:22:43 INFO db.DBInputFormat: Using read commited transaction isolation
16/05/16 10:22:43 INFO mapreduce.JobSubmitter: number of splits:1
16/05/16 10:22:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463331548251_0013
16/05/16 10:22:43 INFO impl.YarnClientImpl: Submitted application application_1463331548251_0013
16/05/16 10:22:43 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1463331548251_0013/
16/05/16 10:22:43 INFO mapreduce.Job: Running job: job_1463331548251_0013
16/05/16 10:22:51 INFO mapreduce.Job: Job job_1463331548251_0013 running in uber mode : false
16/05/16 10:22:51 INFO mapreduce.Job:  map 0% reduce 0%
16/05/16 10:22:51 INFO mapreduce.Job: Job job_1463331548251_0013 failed with state FAILED due to: Application application_1463331548251_0013 failed 2 times due to AM Container for appattempt_1463331548251_0013_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://quickstart.cloudera:8088/proxy/application_1463331548251_0013/Then, click on links to logs of each attempt.
Diagnostics: Checksum error: /tmp/hadoop-yarn/staging/cloudera/.staging/job_1463331548251_0013/libjars/kite-data-hive.jar at 1531392 exp: 1648069466 got: 1557948288
org.apache.hadoop.fs.ChecksumException: Checksum error: /tmp/hadoop-yarn/staging/cloudera/.staging/job_1463331548251_0013/libjars/kite-data-hive.jar at 1531392 exp: 1648069466 got: 1557948288
	at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)
	at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)
	at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)
	at org.apache.hadoop.hdfs.BlockReaderLocal.fillBuffer(BlockReaderLocal.java:355)
	at org.apache.hadoop.hdfs.BlockReaderLocal.fillDataBuf(BlockReaderLocal.java:464)
	at org.apache.hadoop.hdfs.BlockReaderLocal.readWithBounceBuffer(BlockReaderLocal.java:595)
	at org.apache.hadoop.hdfs.BlockReaderLocal.read(BlockReaderLocal.java:559)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:740)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:796)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:856)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:897)
	at java.io.DataInputStream.read(DataInputStream.java:100)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:91)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
	at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Failing this attempt. Failing the application.
16/05/16 10:22:51 INFO mapreduce.Job: Counters: 0
16/05/16 10:22:51 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/05/16 10:22:51 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 10.1984 seconds (0 bytes/sec)
16/05/16 10:22:51 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
16/05/16 10:22:51 INFO mapreduce.ImportJobBase: Retrieved 0 records.
16/05/16 10:22:51 ERROR tool.ImportAllTablesTool: Error during import: Import job failed!

 

Any help would be appreciated.

Thanks,

Lucas

New Contributor
Posts: 1
Registered: ‎03-11-2017

Re: Checksum error: /tmp/hadoop-yarn/staging/cloudera/....jars/kite-data-

You have to restart the YARN (MR2 Included) service. After the restart, the import should run successfully.
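
If the VM is not managed through Cloudera Manager, the restart might look like the sketch below. The service names are an assumption based on the usual CDH 5 package names and are not confirmed anywhere in this thread, so check what is actually installed under /etc/init.d first. The `DRY_RUN` variable and the `restart_yarn` function are hypothetical helpers for this example: by default the script only prints the commands, and `DRY_RUN=0` runs them.

```shell
#!/bin/sh
# Assumed init script names for a package-based CDH 5 install
# (an assumption, not taken from the thread; verify under /etc/init.d).
YARN_SERVICES="hadoop-yarn-resourcemanager hadoop-yarn-nodemanager hadoop-mapreduce-historyserver"

# Restart each YARN/MR2 daemon in turn.
# With DRY_RUN unset or set to 1, only print the commands that would run.
restart_yarn() {
    for svc in $YARN_SERVICES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo service $svc restart"
        else
            sudo service "$svc" restart
        fi
    done
}

restart_yarn
```

When the VM is managed through Cloudera Manager, restarting the YARN (MR2 Included) service from the Cloudera Manager UI is the equivalent step. After that, re-run the sqoop command from the original post.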