Created 12-27-2017 09:30 AM
I am using HDP 2.6 and spark 2.1.
The failure happens while running select * from <table> through beeline against the Spark Thrift Server on port 10016.
Connection string: !connect jdbc:hive2://xxx.xxxx.xxxx.xxx:10016
Running the query produces the errors below.
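For context, the beeline session looks roughly like this (the host, database and table names are placeholders, not the real values):

beeline
!connect jdbc:hive2://<thrift-server-host>:10016
select * from <database>.<table>;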
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (, executor 7): TaskResultLost (result lost from block manager) Driver stacktrace: (state=,code=0)
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 4 tasks (1230.8 MB) is bigger than spark.driver.maxResultSize (1024.0 MB) (state=,code=0)
Error: java.lang.OutOfMemoryError: GC overhead limit exceeded (state=,code=0)
WARN TransportChannelHandler: Exception in connection from ip-192-168-181-26.ca-central-1.compute.internal/192.168.181.26:42866
java.lang.OutOfMemoryError: Direct buffer memory
  at java.nio.Bits.reserveMemory(Bits.java:693)
  at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
  at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:711)
  at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:700)
  at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:237)
  at io.netty.buffer.PoolArena.allocate(PoolArena.java:221)
  at io.netty.buffer.PoolArena.allocate(PoolArena.java:141)
  at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:296)
  at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:177)
  at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:168)
  at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:129)
  at io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:117)
  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
  at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
  at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
  at java.lang.Thread.run(Thread.java:745)
17/12/27 04:18:26 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from ip-192-168-181-26.ca-central-1.compute.internal/192.168.181.26:42866 is closed
17/12/27 04:18:26 ERROR RetryingBlockFetcher: Failed to fetch block taskresult_8, and will not retry (0 retries)
java.lang.OutOfMemoryError: Direct buffer memory
  at java.nio.Bits.reserveMemory(Bits.java:693)
  at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
  at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:711)
  at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:700)
  at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:237)
  at io.netty.buffer.PoolArena.allocate(PoolArena.java:221)
  at io.netty.buffer.PoolArena.allocate(PoolArena.java:141)
  at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:296)
  at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:177)
  at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:168)
  at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:129)
  at io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:117)
  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
  at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
  at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
  at java.lang.Thread.run(Thread.java:745)
17/12/27 04:18:26 WARN BlockManager: Failed to fetch block after 1 fetch failures. Most recent failure cause:
org.apache.spark.SparkException: Exception thrown in awaitResult:
  at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
  at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:104)
  at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:593)
  at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:82)
  at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
  at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
  at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1963)
  at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:62)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: Boxed Error
  at scala.concurrent.impl.Promise$.resolver(Promise.scala:55)
  at scala.concurrent.impl.Promise$.scala$concurrent$impl$Promise$$resolveTry(Promise.scala:47)
  at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:244)
  at scala.concurrent.Promise$class.complete(Promise.scala:55)
  at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:153)
  at scala.concurrent.Promise$class.failure(Promise.scala:104)
  at scala.concurrent.impl.Promise$DefaultPromise.failure(Promise.scala:153)
  at org.apache.spark.network.BlockTransferService$$anon$1.onBlockFetchFailure(BlockTransferService.scala:95)
  at org.apache.spark.network.shuffle.RetryingBlockFetcher$RetryingBlockFetchListener.onBlockFetchFailure(RetryingBlockFetcher.java:231)
  at org.apache.spark.network.shuffle.OneForOneBlockFetcher.failRemainingBlocks(OneForOneBlockFetcher.java:123)
  at org.apache.spark.network.shuffle.OneForOneBlockFetcher.access$300(OneForOneBlockFetcher.java:43)
  at org.apache.spark.network.shuffle.OneForOneBlockFetcher$ChunkCallback.onFailure(OneForOneBlockFetcher.java:79)
  at org.apache.spark.network.client.TransportResponseHandler.failOutstandingRequests(TransportResponseHandler.java:107)
  at org.apache.spark.network.client.TransportResponseHandler.exceptionCaught(TransportResponseHandler.java:138)
  at org.apache.spark.network.server.TransportChannelHandler.exceptionCaught(TransportChannelHandler.java:81)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:260)
  at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:252)
  at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:260)
  at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:252)
  at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:260)
  at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:252)
  at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
  at org.apache.spark.network.util.TransportFrameDecoder.exceptionCaught(TransportFrameDecoder.java:190)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:260)
  at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:252)
  at io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:79)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:260)
  at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:252)
  at io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1261)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
  at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:260)
  at io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:899)
  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:87)
  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
  at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
  at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
  ... 1 more
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
  at java.nio.Bits.reserveMemory(Bits.java:693)
  at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
  at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:711)
  at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:700)
  at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:237)
  at io.netty.buffer.PoolArena.allocate(PoolArena.java:221)
  at io.netty.buffer.PoolArena.allocate(PoolArena.java:141)
  at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:296)
  at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:177)
  at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:168)
  at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:129)
  at io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:117)
I need help urgently. Thanks in advance.
Created 12-27-2017 09:33 AM
I have increased the driver memory to 40 GB and the executor memory to 20 GB. I have also added
spark.driver.maxResultSize 3g
in the Spark conf.
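For the record, the three settings together look like this in spark-defaults.conf style (exactly which config the Thrift Server reads them from, e.g. Ambari's spark2-thrift-sparkconf, is an assumption on my side):

spark.driver.memory 40g
spark.executor.memory 20g
spark.driver.maxResultSize 3g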
Created 12-27-2017 09:44 AM
How many executors are in place? Also, are you firing the query in spark-sql directly? What is the size of the table you are fetching? Try increasing the number of partitions manually instead of letting Spark decide it; the number of partitions can be chosen from the table size that has to be split across the executors, as sketched below.
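One rough way to size the partitions by hand (a sketch for spark-shell; the 128 MB-per-partition target, the table size and the table name are illustrative assumptions, not values from this thread):

// Scala, e.g. in spark-shell
val tableSizeBytes = 8L * 1024 * 1024 * 1024    // approximate table size on disk
val targetPartitionBytes = 128L * 1024 * 1024   // aim for roughly 128 MB per partition
val numPartitions = math.max(1, (tableSizeBytes / targetPartitionBytes).toInt)
val df = spark.table("<db>.<table>").repartition(numPartitions)
// any action on df now runs over smaller, more evenly sized tasks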
Set spark.driver.maxResultSize in any one of the following ways:
set by SparkConf: conf.set("spark.driver.maxResultSize", "3g")
set by spark-defaults.conf: spark.driver.maxResultSize 3g
set when calling spark-submit: --conf spark.driver.maxResultSize=3g
I believe the above property should work. I can see that you have already increased the driver memory; if that is the case, ignore the driver-memory suggestion.
Created 12-27-2017 09:57 AM
The table size is 8.2 GB. The values of all the settings are below; I have added these properties in spark2-env:
SPARK_EXECUTOR_INSTANCES="3"
SPARK_EXECUTOR_CORES="3"
SPARK_EXECUTOR_MEMORY="20G"
SPARK_DRIVER_MEMORY="40G".
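For reference, if you prefer property files over spark2-env, the standard spark-defaults.conf equivalents of the above would be roughly as follows (the mapping is my assumption, not something confirmed in this thread):

spark.executor.instances 3
spark.executor.cores 3
spark.executor.memory 20g
spark.driver.memory 40g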
How do I increase the number of partitions for a Spark Thrift Server (JDBC) connection?
Created 12-27-2017 10:04 AM
I have added this property in the Spark conf.
Created 01-16-2018 06:57 AM
I have resolved the issue by increasing spark_daemon_memory to 20 GB.
Is that the right approach? If not, please help me with the right approach.
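For reference, spark_daemon_memory is the Ambari-level setting that, as far as I understand (this mapping is an assumption, not confirmed in the thread), gets exported as SPARK_DAEMON_MEMORY in spark-env.sh, i.e. the heap given to the Thrift Server daemon process itself:

SPARK_DAEMON_MEMORY="20g"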