
Too many open files error


Hi.

 

I'm running Debian Wheezy with Cloudera 5.3.2.

 

I have a problem with temp data in Spark. I have set spark.shuffle.manager to "SORT" (see the sketch after these steps). In /etc/security/limits.conf I set the following values:

*               soft    nofile  1000000
*               hard    nofile  1000000

In spark-env.sh I set ulimit -n 1000000.
I've put "session required pam_limits.so" into /etc/pam.d/common-session.
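
In case it helps, here is a minimal sketch of how the sort shuffle manager can be selected programmatically; spark.shuffle.manager is the real Spark property ("sort" is already the default in Spark 1.2), but the app name, master, and toy job below are just placeholders to illustrate, not my actual job:

import org.apache.spark.{SparkConf, SparkContext}

// Select the sort-based shuffle manager explicitly (equivalent to
// setting spark.shuffle.manager=sort in spark-defaults.conf).
object ShuffleConfigExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("shuffle-config-example")
      .setMaster("local[2]")
      .set("spark.shuffle.manager", "sort")
    val sc = new SparkContext(conf)
    try {
      // A toy shuffle to exercise the sort shuffle write path.
      sc.parallelize(1 to 1000)
        .map(i => (i % 10, 1))
        .reduceByKey(_ + _)
        .collect()
        .foreach(println)
    } finally {
      sc.stop()
    }
  }
}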
 
I've restarted the Spark service, but it keeps crashing with "Too many open files".
 
How can I resolve this? I'm running Spark 1.2.0 on Cloudera 5.3.2; the full stack trace is below.
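
To verify whether the raised limit actually reaches the JVM (limits.conf may not apply to daemons started from init scripts), something like this minimal sketch can be run on the executor side; it assumes a HotSpot/OpenJDK JVM on Linux, and FdLimitCheck is just a placeholder name:

import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

// Print the file-descriptor counts the current JVM actually sees; if "max"
// is still the old value (e.g. 1024), the ulimit change never reached
// this process.
object FdLimitCheck {
  def main(args: Array[String]): Unit = {
    ManagementFactory.getOperatingSystemMXBean match {
      case os: UnixOperatingSystemMXBean =>
        println(s"open file descriptors: ${os.getOpenFileDescriptorCount}")
        println(s"max file descriptors:  ${os.getMaxFileDescriptorCount}")
      case _ =>
        println("File-descriptor counts are not exposed on this platform.")
    }
  }
}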
 
java.io.FileNotFoundException: /tmp/spark-local-20150330115312-37a7/2f/temp_shuffle_c4ba5bce-c516-4a2a-9e40-56121eb84a8c (Too many open files)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:360)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:355)
        at scala.Array$.fill(Array.scala:267)
        at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:355)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
15/03/30 11:54:18 WARN TaskSetManager: Lost task 22.0 in stage 3.0 (TID 27, localhost): java.io.FileNotFoundException: /tmp/spark-local-20150330115312-37a7/2f/temp_shuffle_c4ba5bce-c516-4a2a-9e40-56121eb84a8c (Too many open files)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:360)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$spillToPartitionFiles$1.apply(ExternalSorter.scala:355)
        at scala.Array$.fill(Array.scala:267)
        at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:355)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
 
 
 
Thanks!!!!!