Support Questions

I am getting the exception below when using distcp to copy HDFS data to S3 with the s3a protocol.

Super Collaborator

15/12/30 08:55:10 INFO mapreduce.Job: Task Id : attempt_1451465507406_0001_m_000001_2, Status : FAILED
Error: java.lang.IllegalArgumentException
    at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1307)
    at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1230)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:274)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.tools.mapred.CopyMapper.setup(CopyMapper.java:112)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

1 ACCEPTED SOLUTION

Accepted Solutions

Re: I am getting the exception below when using distcp to copy HDFS data to S3 with the s3a protocol.

According to the stack trace, there was an IllegalArgumentException while trying to create a ThreadPoolExecutor. This is the relevant source code from the S3AFileSystem class:

    int maxThreads = conf.getInt(MAX_THREADS, DEFAULT_MAX_THREADS);
    int coreThreads = conf.getInt(CORE_THREADS, DEFAULT_CORE_THREADS);
    if (maxThreads == 0) {
      maxThreads = Runtime.getRuntime().availableProcessors() * 8;
    }
    if (coreThreads == 0) {
      coreThreads = Runtime.getRuntime().availableProcessors() * 8;
    }
    long keepAliveTime = conf.getLong(KEEPALIVE_TIME, DEFAULT_KEEPALIVE_TIME);
    LinkedBlockingQueue<Runnable> workQueue =
      new LinkedBlockingQueue<>(maxThreads *
        conf.getInt(MAX_TOTAL_TASKS, DEFAULT_MAX_TOTAL_TASKS));
    threadPoolExecutor = new ThreadPoolExecutor(
        coreThreads,
        maxThreads,
        keepAliveTime,
        TimeUnit.SECONDS,
        workQueue,
        newDaemonThreadFactory("s3a-transfer-shared-"));
    threadPoolExecutor.allowCoreThreadTimeOut(true);

The various arguments passed to the ThreadPoolExecutor are pulled from Hadoop configuration, such as the core-site.xml file. The defaults for these are defined in core-default.xml:

<property>
  <name>fs.s3a.threads.max</name>
  <value>256</value>
  <description> Maximum number of concurrent active (part)uploads,
    which each use a thread from the threadpool.</description>
</property>
<property>
  <name>fs.s3a.threads.core</name>
  <value>15</value>
  <description>Number of core threads in the threadpool.</description>
</property>
<property>
  <name>fs.s3a.threads.keepalivetime</name>
  <value>60</value>
  <description>Number of seconds a thread can be idle before being
    terminated.</description>
</property>
<property>
  <name>fs.s3a.max.total.tasks</name>
  <value>1000</value>
  <description>Number of (part)uploads allowed to the queue before
    blocking additional uploads.</description>
</property>

Is it possible that you have overridden one of these configuration properties to an invalid value, such as a negative number?
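To make the failure mode concrete, here is a minimal standalone sketch (not taken from the Hadoop code base; the class name and the specific values are hypothetical stand-ins for the fs.s3a.* properties above). It shows how an override that leaves fs.s3a.threads.max smaller than fs.s3a.threads.core, for example, surfaces as an IllegalArgumentException from the ThreadPoolExecutor constructor, matching the stack trace in the question:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class S3AThreadPoolDemo {
  public static void main(String[] args) {
    // Hypothetical stand-ins for the four fs.s3a.* properties listed above.
    int coreThreads = 15;      // fs.s3a.threads.core (default)
    int maxThreads = 10;       // fs.s3a.threads.max overridden below the core size
    long keepAliveTime = 60L;  // fs.s3a.threads.keepalivetime, in seconds
    int maxTotalTasks = 1000;  // fs.s3a.max.total.tasks (default)

    try {
      // Mirrors the S3AFileSystem.initialize() code quoted above.
      LinkedBlockingQueue<Runnable> workQueue =
          new LinkedBlockingQueue<>(maxThreads * maxTotalTasks);
      ThreadPoolExecutor pool = new ThreadPoolExecutor(
          coreThreads, maxThreads, keepAliveTime, TimeUnit.SECONDS, workQueue);
      pool.shutdown();
    } catch (IllegalArgumentException e) {
      // ThreadPoolExecutor rejects coreThreads < 0, maxThreads <= 0,
      // maxThreads < coreThreads, and a negative keepAliveTime;
      // LinkedBlockingQueue rejects a capacity <= 0.
      System.out.println("Pool creation failed: " + e);
    }
  }
}

In the same way, any override that makes fs.s3a.threads.max non-positive or smaller than the effective fs.s3a.threads.core, sets fs.s3a.threads.keepalivetime negative, or pushes the computed queue capacity (maxThreads * fs.s3a.max.total.tasks) to zero or below would trigger this exception, so those four properties are worth double-checking in your core-site.xml and in any -D overrides on the distcp command line.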


3 REPLIES

Re: I am getting the exception below when using distcp to copy HDFS data to S3 with the s3a protocol.

Mentor

Can you provide the command you executed? You can follow the advice in this thread: https://community.hortonworks.com/questions/7165/how-to-copy-hdfs-file-to-aws-s3-bucket-hadoop-dist....

Re: I am getting the exception below when using distcp to copy HDFS data to S3 with the s3a protocol.

Super Collaborator

Yes, it's working now. I had set fs.s3a.max.total.tasks to 10, which is why it was throwing the exception.

If possible, could you please reply to the query below?

I have a total of 6 TB (3 x 2 TB) of hard drives in each node, and HDFS is using 5 TB.

I need to upload 5 TB of data into an S3 bucket.

I am using the s3a client and I am getting a "No Space Left in Device" error.