04-16-2018 06:54 AM
Hello and good morning,
we have a problem when submitting Spark jobs: the last two tasks are never processed and the job hangs. The only thing that helps is to kill the application.
Executors (from the Spark UI):

Executor ID          | driver              | 1
Address              | 192.168.1.134:37341 | worker_12:43376
Status               | Active              | Active
RDD Blocks           | 2                   | 2
Storage Memory       | 48.2 KB / 4.4 GB    | 48.2 KB / 9 GB
Disk Used            | 0.0 B               | 0.0 B
Cores                | 0                   | 2
Active Tasks         | 0                   | 2
Failed Tasks         | 0                   | 0
Complete Tasks       | 0                   | 20
Total Tasks          | 0                   | 22
Task Time (GC Time)  | 0 ms (0 ms)         | 3 s (58 ms)
Input                | 0.0 B               | 920 KB
Shuffle Read         | 0.0 B               | 0.0 B
Shuffle Write        | 0.0 B                | 152 B
In the thread dump I found the following inconsistency: it seems that the thread with ID 63 is waiting for the one with ID 71 (see the sketch after the dump). Can you see why thread 71 can't finish its work?
Thread ID:    71
Thread Name:  Executor task launch worker for task 502
Thread State: RUNNABLE
Thread Locks: Lock(java.util.concurrent.ThreadPoolExecutor$Worker@1010216063}), Monitor(java.lang.UNIXProcess$ProcessPipeInputStream@752994465}), Monitor(org.apache.spark.api.python.PythonWorkerFactory@143199750}), Monitor(org.apache.spark.SparkEnv@687340918})

java.io.FileInputStream.readBytes(Native Method)
java.io.FileInputStream.read(FileInputStream.java:255)
java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
java.io.BufferedInputStream.read(BufferedInputStream.java:265) => holding Monitor(java.lang.UNIXProcess$ProcessPipeInputStream@752994465})
java.io.DataInputStream.readInt(DataInputStream.java:387)
org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:181) => holding Monitor(org.apache.spark.api.python.PythonWorkerFactory@143199750})
org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:103) => holding Monitor(org.apache.spark.api.python.PythonWorkerFactory@143199750})
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:79)
org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:118) => holding Monitor(org.apache.spark.SparkEnv@687340918})
org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
org.apache.spark.scheduler.Task.run(Task.scala:108)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

Thread ID:    63
Thread Name:  Executor task launch worker for task 524
Thread State: BLOCKED
Thread Locks: Blocked by Thread Some(71) Lock(org.apache.spark.SparkEnv@687340918}), Lock(java.util.concurrent.ThreadPoolExecutor$Worker@2128788466})

org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:116)
org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:128)
org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
org.apache.spark.scheduler.Task.run(Task.scala:108)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
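If I read the dump correctly, this is not a classic two-way deadlock: thread 71 entered SparkEnv.createPythonWorker (taking the SparkEnv monitor) and then PythonWorkerFactory.startDaemon, where it sits RUNNABLE in a native read on the Python daemon's stdout, apparently waiting for the daemon to report its port; thread 63 then queues up BLOCKED on the same SparkEnv monitor. A minimal sketch of that locking pattern as I understand it (all class names are placeholders, not Spark's real code):

object PythonWorkerHangSketch {

  class PythonWorkerFactoryLike {
    def create(): Unit = synchronized { // Monitor(PythonWorkerFactory@...) in the dump
      // In the dump this is startDaemon(): fork the pyspark daemon, then
      // block in DataInputStream.readInt() until the daemon reports its
      // port. The real thread is RUNNABLE inside a native read; here the
      // never-returning read is simulated with an endless sleep.
      Thread.sleep(Long.MaxValue)
    }
  }

  class SparkEnvLike {
    private val factory = new PythonWorkerFactoryLike
    def createPythonWorker(): Unit = synchronized { // Monitor(SparkEnv@...) in the dump
      factory.create()
    }
  }

  def main(args: Array[String]): Unit = {
    val env = new SparkEnvLike
    def spawn(name: String)(body: => Unit): Unit =
      new Thread(new Runnable { def run(): Unit = body }, name).start()

    spawn("task 502") { env.createPythonWorker() } // takes both monitors, hangs in the "read"
    Thread.sleep(100)                              // let the first thread grab the locks
    spawn("task 524") { env.createPythonWorker() } // stays BLOCKED on the env monitor
    // A jstack of this process shows the same picture as our dump:
    // "task 502" holding the monitor, "task 524" BLOCKED on it.
  }
}

So as long as the daemon never answers, every further task that needs a Python worker will pile up behind thread 71, which would match the "last two tasks are never processed" symptom.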
The file spark-003.txt contains the last ~200 lines of the job log.
If any further logs / dumps are needed, I will try to provide and post them.
Many thanks in advance for the support.
Best regards,
René
Labels:
Apache Spark