Created on 12-21-2015 11:59 PM - edited 09-16-2022 02:54 AM
Hi Team,
I am seeing the errors below with Spark SQL.
15/12/22 06:36:17 ERROR YarnScheduler: Lost executor 5 on xxxxx0075.us2.oraclecloud.com: remote Akka client disassociated
15/12/22 06:36:17 INFO TaskSetManager: Re-queueing tasks for 5 from TaskSet 1.0
15/12/22 06:36:17 WARN TaskSetManager: Lost task 33.0 in stage 1.0 (TID 32, xxxxx0075.us2.oraclecloud.com): ExecutorLostFailure (executor 5 lost)
15/12/22 06:36:17 WARN TaskSetManager: Lost task 11.0 in stage 1.0 (TID 2, xxxxx0075.us2.oraclecloud.com): ExecutorLostFailure (executor 5 lost)
15/12/22 06:36:17 WARN TaskSetManager: Lost task 30.0 in stage 1.0 (TID 22, xxxxx0075.us2.oraclecloud.com): ExecutorLostFailure (executor 5 lost)
15/12/22 06:36:17 WARN TaskSetManager: Lost task 78.0 in stage 1.0 (TID 52, xxxxx0075.us2.oraclecloud.com): ExecutorLostFailure (executor 5 lost)
15/12/22 06:36:17 WARN TaskSetManager: Lost task 24.0 in stage 1.0 (TID 12, xxxxx0075.us2.oraclecloud.com): ExecutorLostFailure (executor 5 lost)
15/12/22 06:36:17 WARN TaskSetManager: Lost task 47.0 in stage 1.0 (TID 42, xxxxx0075.us2.oraclecloud.com): ExecutorLostFailure (executor 5 lost)
15/12/22 06:36:17 INFO DAGScheduler: Executor lost: 5 (epoch 2)
15/12/22 06:36:17 INFO BlockManagerMasterActor: Trying to remove executor 5 from BlockManagerMaster.
15/12/22 06:36:17 INFO BlockManagerMaster: Removed 5 successfully in removeExecutor
15/12/22 06:36:20 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@xxxxx0084.us2.oraclecloud.com:39580/user/Executor#309269734] with ID 11
15/12/22 06:36:20 INFO TaskSetManager: Starting task 11.1 in stage 1.0 (TID 70, xxxxx0084.us2.oraclecloud.com, NODE_LOCAL, 1385 bytes)
15/12/22 06:36:20 INFO TaskSetManager: Starting task 28.1 in stage 1.0 (TID 71, xxxxx0084.us2.oraclecloud.com, NODE_LOCAL, 1386 bytes)
15/12/22 06:36:20 INFO TaskSetManager: Starting task 15.1 in stage 1.0 (TID 72, xxxxx0084.us2.oraclecloud.com, NODE_LOCAL, 1386 bytes)
15/12/22 06:36:20 INFO TaskSetManager: Starting task 37.1 in stage 1.0 (TID 73, xxxxx0084.us2.oraclecloud.com, NODE_LOCAL, 1386 bytes)
15/12/22 06:36:20 INFO TaskSetManager: Starting task 25.1 in stage 1.0 (TID 74, xxxxx0084.us2.oraclecloud.com, NODE_LOCAL, 1386 bytes)
15/12/22 06:36:20 INFO TaskSetManager: Starting task 82.1 in stage 1.0 (TID 75, xxxxx0084.us2.oraclecloud.com, NODE_LOCAL, 1386 bytes)
15/12/22 06:36:20 INFO BlockManagerMasterActor: Registering block manager xxxxx0084.us2.oraclecloud.com:46844 with 5.2 GB RAM, BlockManagerId(11, xxxxx0084.us2.oraclecloud.com, 46844)
15/12/22 06:36:20 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on xxxxx0084.us2.oraclecloud.com:46844 (size: 3.7 KB, free: 5.2 GB)
15/12/22 06:36:22 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on xxxxx0084.us2.oraclecloud.com:46844 (size: 28.7 KB, free: 5.2 GB)
15/12/22 06:36:27 INFO TaskSetManager: Starting task 104.0 in stage 1.0 (TID 76, xxxxx0080.us2.oraclecloud.com, NODE_LOCAL, 1386 bytes)
15/12/22 06:36:27 WARN TaskSetManager: Lost task 85.0 in stage 1.0 (TID 65, xxxxx0080.us2.oraclecloud.com): java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
at parquet.io.api.Binary$ByteArraySliceBackedBinary.toStringUsingUTF8(Binary.java:91)
at org.apache.spark.sql.parquet.CatalystPrimitiveStringConverter.addBinary(ParquetConverter.scala:478)
at parquet.column.impl.ColumnReaderImpl$2$6.writeValue(ColumnReaderImpl.java:318)
at parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:365)
at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:405)
at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:206)
at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:143)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7.apply(Aggregate.scala:152)
at org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7.apply(Aggregate.scala:147)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
15/12/22 06:36:28 ERROR YarnScheduler: Lost executor 6 on xxxxx0080.us2.oraclecloud.com: remote Akka client disassociated
Spark version: 1.3.0
CDH version: 5.4.0
Thanks
Kishore
Created 04-01-2017 11:02 AM
@TheKishore432 Hi, were you able to solve the issue?
Created 04-02-2017 09:20 PM
Hi Fawze,
When I originally ran the query, the table had more than 1K partitions, so we split it into smaller tables and ran the same query against each of them, and the issue was resolved. Since Spark can't keep all of the table's content in memory at once, it was failing with the GC error.
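For reference, a minimal sketch of that approach in Spark 1.3 Scala. The table name events and the partition column dt are assumptions for illustration only, not our real schema:

// Hypothetical sketch: aggregate one partition at a time instead of
// scanning all 1K+ partitions in a single query.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("per-partition-agg"))
val sqlContext = new HiveContext(sc)

// Illustrative partition values; in practice these come from the
// table's partitioning scheme.
val days = Seq("2015-12-20", "2015-12-21", "2015-12-22")

val perDay = days.map { d =>
  // Filtering on the partition column keeps each scan small enough
  // to fit in executor memory.
  sqlContext.sql(s"SELECT dt, COUNT(*) AS cnt FROM events WHERE dt = '$d' GROUP BY dt")
}

// unionAll is the Spark 1.x DataFrame union (renamed union in 2.0).
val combined = perDay.reduce(_ unionAll _)
combined.collect().foreach(println)

Each query then scans only a single partition, so no one task has to materialize the whole table's column data at once, which is what was triggering the GC overhead limit.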
Thanks
Kishore
Created 04-02-2017 09:24 PM