Teragen Failing on HDC with 1TB of data

I am testing TeraGen on HDC, and it fails when generating 1 TB of data. The storage target is S3. With a smaller data set (e.g. 1 GB) it works. The error is java.io.IOException: No space left on device.

I know I am not capped on S3 storage, since I ran the exact same test on EMR against the same bucket, with the same instance types and the same number of nodes.

Here is the error:

2017-02-05 06:22:59,959 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2017-02-05 06:23:00,026 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2017-02-05 06:23:00,026 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2017-02-05 06:23:00,036 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2017-02-05 06:23:00,036 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1486274161724_0003, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@8458f04)
2017-02-05 06:23:00,216 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2017-02-05 06:23:00,410 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /hadoopfs/fs1/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003,/hadoopfs/fs2/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003,/hadoopfs/fs3/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003,/hadoopfs/fs4/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003
2017-02-05 06:23:00,626 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2017-02-05 06:23:01,023 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2017-02-05 06:23:01,023 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2017-02-05 06:23:02,188 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
2017-02-05 06:23:02,427 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: org.apache.hadoop.examples.terasort.TeraGen$RangeInputFormat$RangeInputSplit@40712ee9
2017-02-05 06:28:51,040 INFO [main] org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector@bfc14b9
java.io.IOException: No space left on device
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:326)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
	at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
	at org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:99)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
	at org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.close(TeraOutputFormat.java:80)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
	at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2019)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:797)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
2017-02-05 06:28:51,042 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: No space left on device
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:326)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at org.apache.hadoop.fs.s3a.S3AOutputStream.write(S3AOutputStream.java:140)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:73)
	at org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:60)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:230)
	at org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:203)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)


2017-02-05 06:28:51,044 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2017-02-05 06:28:51,721 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete s3a://xxxxx/data/sandbox/poc/teragen/1T-terasort-input/_temporary/1/_temporary/attempt_1486274161724_0003_m_000002_0
2017-02-05 06:28:51,825 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2017-02-05 06:28:51,825 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2017-02-05 06:28:51,825 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.

I am using the s3a connector.

Your feedback is appreciated.

1 Reply

Argh, HCC formatting..

Even though S3 is the destination for TeraGen, MapReduce still creates intermediate data, and that relies on local storage. You ran out of NodeManager local disk space. The local directories the task was using are listed in this line of your log:

INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /hadoopfs/fs1/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003,/hadoopfs/fs2/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003,/hadoopfs/fs3/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003,/hadoopfs/fs4/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003
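
To check this on a worker node, something like the following rough sketch may help. It pulls the directories out of the log line above, reports the free space on each, and estimates how much each map task has to write. The stack trace in the question shows S3AOutputStream writing through java.io.FileOutputStream, which suggests the output is staged on local disk before upload, so the per-mapper output size is a reasonable number to compare against. The function names are made up for illustration, the number of map tasks is an assumption (the thread does not say how many were used), and 1 TB is taken as 10^10 TeraGen rows of 100 bytes each.

```python
import os
import shutil

ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows

LOCAL_DIR_MARKER = "mapreduce.cluster.local.dir for child:"


def parse_local_dirs(log_line):
    """Return the comma-separated directories that follow the marker in a YarnChild log line."""
    _, _, tail = log_line.partition(LOCAL_DIR_MARKER)
    return [d.strip() for d in tail.split(",") if d.strip()]


def existing_ancestor(path):
    """Walk up to the nearest existing directory (appcache dirs are removed after the job)."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def free_bytes(path):
    """Free bytes on the filesystem that holds (or would hold) the given path."""
    return shutil.disk_usage(existing_ancestor(path)).free


def bytes_per_mapper(total_rows, num_maps):
    """Output size of one map task, assuming rows are split evenly (rounded up)."""
    rows = -(-total_rows // num_maps)  # ceiling division
    return rows * ROW_BYTES


if __name__ == "__main__":
    line = (
        "INFO [main] org.apache.hadoop.mapred.YarnChild: "
        "mapreduce.cluster.local.dir for child: "
        "/hadoopfs/fs1/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003,"
        "/hadoopfs/fs2/yarn/nodemanager/usercache/hdfs/appcache/application_1486274161724_0003"
    )
    num_maps = 2  # assumption for illustration only; use the value the job actually ran with
    needed = bytes_per_mapper(10**10, num_maps)  # 500_000_000_000 bytes with 2 maps
    for d in parse_local_dirs(line):
        print(f"{d}: free={free_bytes(d)} bytes, one mapper writes ~{needed} bytes")
```

If the per-mapper figure is larger than the free space on any one of these disks, running the job with more map tasks (so each one writes less) or giving the nodes larger local volumes would be the first things to try.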