Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Getting an error from spark-sql


I am using HDP 2.6, and I am trying to fetch data from Phoenix.

I have tried the following links:

https://community.hortonworks.com/questions/60413/hbase-master-and-regionserver-goes-down-citing-lea...

https://community.hortonworks.com/questions/41122/during-an-import-of-hbase-using-importtsv-hdfs-is....

I am getting the following error.

ERROR SparkSQLDriver: Failed in [select * from rasdb.dim_account]
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.spark.sql.types.Decimal$.createUnsafe(Decimal.scala:456)
    at org.apache.spark.sql.types.Decimal.createUnsafe(Decimal.scala)
    at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:404)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toScalaImpl(CatalystTypeConverters.scala:324)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toScalaImpl(CatalystTypeConverters.scala:304)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toScala(CatalystTypeConverters.scala:111)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toScala(CatalystTypeConverters.scala:264)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toScala(CatalystTypeConverters.scala:231)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$anonfun$createToScalaConverter$2.apply(CatalystTypeConverters.scala:396)
    at org.apache.spark.sql.execution.SparkPlan$anonfun$executeCollectPublic$1.apply(SparkPlan.scala:298)
    at org.apache.spark.sql.execution.SparkPlan$anonfun$executeCollectPublic$1.apply(SparkPlan.scala:298)
    at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:298)
    at org.apache.spark.sql.execution.QueryExecution$anonfun$hiveResultString$4.apply(QueryExecution.scala:139)
    at org.apache.spark.sql.execution.QueryExecution$anonfun$hiveResultString$4.apply(QueryExecution.scala:138)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
    at org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:138)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:335)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:247)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:751)
17/12/12 05:14:49 INFO ClientCnxn: Client session timed out, have not heard from server in 48847ms for sessionid 0x2604a1e0a641d82, closing socket connection and attempting reconnect
17/12/12 05:14:49 INFO ClientCnxn: Client session timed out, have not heard from server in 47326ms for sessionid 0x3604a1e09c01d5f, closing socket connection and attempting reconnect
17/12/12 05:14:49 INFO ClientCnxn: Client session timed out, have not heard from server in 55591ms for sessionid 0x2604a1e0a641d85, closing socket connection and attempting reconnect
17/12/12 05:14:49 INFO ContextCleaner: Cleaned accumulator 49
17/12/12 05:14:49 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 192.168.180.54:44265 in memory (size: 33.4 KB, free: 409.5 MB)
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.spark.sql.types.Decimal$.createUnsafe(Decimal.scala:456)
    at org.apache.spark.sql.types.Decimal.createUnsafe(Decimal.scala)
    at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:404)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toScalaImpl(CatalystTypeConverters.scala:324)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toScalaImpl(CatalystTypeConverters.scala:304)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toScala(CatalystTypeConverters.scala:111)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toScala(CatalystTypeConverters.scala:264)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toScala(CatalystTypeConverters.scala:231)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$anonfun$createToScalaConverter$2.apply(CatalystTypeConverters.scala:396)
    at org.apache.spark.sql.execution.SparkPlan$anonfun$executeCollectPublic$1.apply(SparkPlan.scala:298)
    at org.apache.spark.sql.execution.SparkPlan$anonfun$executeCollectPublic$1.apply(SparkPlan.scala:298)
    at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:298)
    at org.apache.spark.sql.execution.QueryExecution$anonfun$hiveResultString$4.apply(QueryExecution.scala:139)
    at org.apache.spark.sql.execution.QueryExecution$anonfun$hiveResultString$4.apply(QueryExecution.scala:138)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
    at org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:138)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:335)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:247)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:751)
spark-sql>
17/12/12 05:14:49 WARN NettyRpcEnv: Ignored failure: java.util.concurrent.TimeoutException: Cannot receive any reply in 10 seconds
17/12/12 05:14:49 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@5dd1e183,BlockManagerId(driver xxx, 44265, None))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$createRpcTimeoutException(RpcTimeout.scala:48)
    at org.apache.spark.rpc.RpcTimeout$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
    at org.apache.spark.rpc.RpcTimeout$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$reportHeartBeat(Executor.scala:689)
    at org.apache.spark.executor.Executor$anon$1$anonfun$run$1.apply$mcV$sp(Executor.scala:718)
    at org.apache.spark.executor.Executor$anon$1$anonfun$run$1.apply(Executor.scala:718)
    at org.apache.spark.executor.Executor$anon$1$anonfun$run$1.apply(Executor.scala:718)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1963)
    at org.apache.spark.executor.Executor$anon$1.run(Executor.scala:718)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
    ... 14 more
17/12/12 05:14:49 INFO ClientCnxn: Opening socket connection to server ip-192-168-180-21.ca-central-1.compute.internal/192.168.180.21:2181. Will not attempt to authenticate using SASL (unknown error)
17/12/12 05:14:49 INFO ClientCnxn: Socket connection established, initiating session, client: /192.168.180.54:53080, server: ip-192-168-180-21.ca-central-1.compute.internal/192.168.180.21:2181
17/12/12 05:14:49 INFO ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x3604a1e09c01d5f has expired, closing socket connection
17/12/12 05:14:49 WARN ConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.phoenix.shaded.org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:634)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:566)
    at org.apache.hadoop.hbase.zookeeper.PendingWatcher.process(PendingWatcher.java:40)
    at org.apache.phoenix.shaded.org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534)
    at org.apache.phoenix.shaded.org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
17/12/12 05:14:49 INFO ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3604a1e09c01d5f
17/12/12 05:14:49 INFO ClientCnxn: EventThread shut down
17/12/12 05:14:50 INFO ClientCnxn: Opening socket connection to server ip-xxxca-central-1.compute.internal/xxx:2181. Will not attempt to authenticate using SASL (unknown error)
17/12/12 05:14:50 INFO ClientCnxn: Socket connection established, initiating session, client: /xxx:53088, server: ip-xxx.ca-central-1.compute.internal/xxxx:2181
17/12/12 05:14:50 INFO ClientCnxn: Session establishment complete on server ip-xxx.ca-central-1.compute.internal/xxx:2181, sessionid = 0x2604a1e0a641d85, negotiated timeout = 60000
17/12/12 05:14:50 INFO ClientCnxn: Opening socket connection to server ip-192-168-181-26.ca-central-1.compute.internal/192.168.181.26:2181. Will not attempt to authenticate using SASL (unknown error)
17/12/12 05:14:50 INFO ClientCnxn: Socket connection established, initiating session, client: /192.168.180.54:59278, server: ip-xxx.ca-central-1.compute.internal/xxx:2181
17/12/12 05:14:50 INFO ClientCnxn: Session establishment complete on server ip-xxxx.ca-central-1.compute.internal/xxx:2181, sessionid = 0x2604a1e0a641d82, negotiated timeout = 60000

1 ACCEPTED SOLUTION


@Sandeep Nemuri

I have resolved this issue by increasing spark_daemon_memory in the Spark configuration (Advanced spark2-env).


6 REPLIES


@Ashnee Sharma What is your driver memory?

java.lang.OutOfMemoryError: GC overhead limit exceeded

Try increasing the driver memory according to the data size.
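For the spark-sql CLI, driver memory has to be set when the shell is launched, because the driver JVM is already running by the time a session-level `SET` could take effect. A minimal sketch (the 9g value is illustrative, and the query is the one from this thread, not a recommendation):

```shell
# Launch spark-sql with a larger driver heap; --driver-memory must be
# given at launch time (or via spark.driver.memory in spark-defaults.conf),
# since it sizes the JVM that collects the query results.
spark-sql --driver-memory 9g \
  -e "select * from rasdb.dim_account"
```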


@Sandeep Nemuri

I have given 9 GB for driver memory.

spark.executor.memory=9216m

@Sandeep Nemuri

Also getting an error while connecting through beeline. The connection is established successfully, but when I run the query I get the following error.

Error: java.lang.OutOfMemoryError: GC overhead limit exceeded (state=,code=0)

Added the following properties:

spark.yarn.am.memory=1g

spark.yarn.am.cores=1
Please help to resolve the issue.
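One general Spark-on-YARN detail worth noting here (not something confirmed in this thread): in YARN client mode, which is how the spark-sql shell and the Thrift Server driver typically run, `spark.yarn.am.memory` and `spark.yarn.am.cores` size only the YARN application master container, not the driver JVM where this collect-side OOM occurs. A hedged sketch of the driver-side knobs instead (the 8g/4g values are illustrative, not tuned):

```shell
# spark.yarn.am.* only affects the AM container in client mode.
# The OOM happens in the driver, which is sized by spark.driver.memory;
# spark.driver.maxResultSize additionally caps the total size of
# collected results before they blow up the driver heap.
spark-sql \
  --conf spark.driver.memory=8g \
  --conf spark.driver.maxResultSize=4g
```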


@Ashnee Sharma: Do we know the data size of this table? "select * from rasdb.dim_account" brings the complete table data back to the driver, so we need to make sure the table data fits in driver memory.
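A cheap way to gauge that before pulling everything to the driver is to aggregate or limit on the cluster side, so only a small result comes back. Illustrative spark-sql invocations using the table name from this thread:

```shell
# COUNT(*) runs distributed on the executors and returns one row,
# so it will not OOM the driver the way a full collect can.
spark-sql -e "SELECT COUNT(*) FROM rasdb.dim_account"

# If only a sample is needed, LIMIT keeps the collected result small.
spark-sql -e "SELECT * FROM rasdb.dim_account LIMIT 100"
```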


@Sandeep Nemuri

There are around 3,338,700 records in this table.


@Sandeep Nemuri

I have resolved this issue by increasing spark_daemon_memory in the Spark configuration (Advanced spark2-env).