
Too many open files error

Hi,

I am running Debian Wheezy with Cloudera 5.3.2.

I have a problem with temporary shuffle data in Spark. I have set spark.shuffle.manager to "SORT". In /etc/security/limits.conf I set the following values:

*               soft    nofile  1000000
*               hard    nofile  1000000

In spark-env.sh I set ulimit -n 1000000, and I added "session required  pam_limits.so" to /etc/pam.d/common-session.
 
I've restarted the Spark service, but it keeps crashing with "Too many open files".

How can I resolve this? I'm running Spark 1.2.0 on Cloudera 5.3.2. The stack trace:
 
java.io.FileNotFoundException: /tmp/spark-local-20150330115312-37a7/2f/temp_shuffle_c4ba5bce-c516-4a2a-9e40-56121eb84a8c (Too many open files)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:360)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:355)
        at scala.Array$.fill(Array.scala:267)
        at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:355)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
15/03/30 11:54:18 WARN TaskSetManager: Lost task 22.0 in stage 3.0 (TID 27, localhost): java.io.FileNotFoundException: /tmp/spark-local-20150330115312-37a7/2f/temp_shuffle_c4ba5bce-c516-4a2a-9e40-56121eb84a8c (Too many open files)
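
For what it's worth, here is a minimal check (my sketch; it assumes an active SparkContext named sc and Linux procfs) to confirm which open-file limit the executor JVMs actually end up with, since limits.conf changes do not always reach daemon-started processes:

import scala.io.Source

// Run one task on an executor and read its /proc/self/limits (Linux procfs)
// to report the "Max open files" value the executor JVM actually has.
// Assumes an active SparkContext named sc.
val executorLimit = sc.parallelize(Seq(1), 1).map { _ =>
  Source.fromFile("/proc/self/limits").getLines()
    .find(_.startsWith("Max open files"))
    .getOrElse("'Max open files' line not found")
}.first()
println(executorLimit)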
 
 
 
Thanks!