
Getting "GC overhead limit exceeded" error from spark-sql


I am using HDP 2.6 and I am trying to fetch data from Phoenix.

I have tried the following links:

https://community.hortonworks.com/questions/60413/hbase-master-and-regionserver-goes-down-citing-lea...

https://community.hortonworks.com/questions/41122/during-an-import-of-hbase-using-importtsv-hdfs-is....

I am getting the following error:

ERROR SparkSQLDriver: Failed in [select * from rasdb.dim_account]
java.lang.OutOfMemoryError: GC overhead limit exceeded
	at org.apache.spark.sql.types.Decimal$.createUnsafe(Decimal.scala:456)
	at org.apache.spark.sql.types.Decimal.createUnsafe(Decimal.scala)
	at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:404)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toScalaImpl(CatalystTypeConverters.scala:324)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toScalaImpl(CatalystTypeConverters.scala:304)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toScala(CatalystTypeConverters.scala:111)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toScala(CatalystTypeConverters.scala:264)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toScala(CatalystTypeConverters.scala:231)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToScalaConverter$2.apply(CatalystTypeConverters.scala:396)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollectPublic$1.apply(SparkPlan.scala:298)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeCollectPublic$1.apply(SparkPlan.scala:298)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
	at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:298)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:139)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$hiveResultString$4.apply(QueryExecution.scala:138)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
	at org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:138)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:335)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:247)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
17/12/12 05:14:49 INFO ClientCnxn: Client session timed out, have not heard from server in 48847ms for sessionid 0x2604a1e0a641d82, closing socket connection and attempting reconnect
17/12/12 05:14:49 INFO ClientCnxn: Client session timed out, have not heard from server in 47326ms for sessionid 0x3604a1e09c01d5f, closing socket connection and attempting reconnect
17/12/12 05:14:49 INFO ClientCnxn: Client session timed out, have not heard from server in 55591ms for sessionid 0x2604a1e0a641d85, closing socket connection and attempting reconnect
17/12/12 05:14:49 INFO ContextCleaner: Cleaned accumulator 49
17/12/12 05:14:49 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 192.168.180.54:44265 in memory (size: 33.4 KB, free: 409.5 MB)
java.lang.OutOfMemoryError: GC overhead limit exceeded
	at org.apache.spark.sql.types.Decimal$.createUnsafe(Decimal.scala:456)
	... (same stack trace as above)
spark-sql> 17/12/12 05:14:49 WARN NettyRpcEnv: Ignored failure: java.util.concurrent.TimeoutException: Cannot receive any reply in 10 seconds
17/12/12 05:14:49 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@5dd1e183,BlockManagerId(driver xxx, 44265, None))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
	at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:689)
	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:718)
	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:718)
	at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:718)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1963)
	at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:718)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
	at scala.concurrent.Await$.result(package.scala:190)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
	... 14 more
17/12/12 05:14:49 INFO ClientCnxn: Opening socket connection to server ip-192-168-180-21.ca-central-1.compute.internal/192.168.180.21:2181. Will not attempt to authenticate using SASL (unknown error)
17/12/12 05:14:49 INFO ClientCnxn: Socket connection established, initiating session, client: /192.168.180.54:53080, server: ip-192-168-180-21.ca-central-1.compute.internal/192.168.180.21:2181
17/12/12 05:14:49 INFO ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x3604a1e09c01d5f has expired, closing socket connection
17/12/12 05:14:49 WARN ConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.phoenix.shaded.org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:634)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:566)
	at org.apache.hadoop.hbase.zookeeper.PendingWatcher.process(PendingWatcher.java:40)
	at org.apache.phoenix.shaded.org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:534)
	at org.apache.phoenix.shaded.org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
17/12/12 05:14:49 INFO ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3604a1e09c01d5f
17/12/12 05:14:49 INFO ClientCnxn: EventThread shut down
17/12/12 05:14:50 INFO ClientCnxn: Opening socket connection to server ip-xxx.ca-central-1.compute.internal/xxx:2181. Will not attempt to authenticate using SASL (unknown error)
17/12/12 05:14:50 INFO ClientCnxn: Socket connection established, initiating session, client: /xxx:53088, server: ip-xxx.ca-central-1.compute.internal/xxxx:2181
17/12/12 05:14:50 INFO ClientCnxn: Session establishment complete on server ip-xxx.ca-central-1.compute.internal/xxx:2181, sessionid = 0x2604a1e0a641d85, negotiated timeout = 60000
17/12/12 05:14:50 INFO ClientCnxn: Opening socket connection to server ip-192-168-181-26.ca-central-1.compute.internal/192.168.181.26:2181. Will not attempt to authenticate using SASL (unknown error)
17/12/12 05:14:50 INFO ClientCnxn: Socket connection established, initiating session, client: /192.168.180.54:59278, server: ip-xxx.ca-central-1.compute.internal/xxx:2181
17/12/12 05:14:50 INFO ClientCnxn: Session establishment complete on server ip-xxxx.ca-central-1.compute.internal/xxx:2181, sessionid = 0x2604a1e0a641d82, negotiated timeout = 60000

1 ACCEPTED SOLUTION


@Sandeep Nemuri

I have resolved this issue by increasing spark_daemon_memory in the Spark configuration, under Advanced spark2-env.
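
For anyone landing here later: on Ambari-managed HDP, spark_daemon_memory (Spark2 > Configs > Advanced spark2-env) sets the SPARK_DAEMON_MEMORY variable in spark-env, which sizes daemon JVMs such as the Spark Thrift Server that beeline connects to. A minimal sketch, with an illustrative value rather than a recommendation:

# Advanced spark2-env template (spark-env.sh)
export SPARK_DAEMON_MEMORY=4g   # heap for Spark daemons, including the Thrift Server; size to your result sets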


6 REPLIES


@Ashnee Sharma What is your driver memory?

java.lang.OutOfMemoryError: GC overhead 

Try increasing the Driver memory according to the data size.
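
A minimal sketch of the relevant settings (the values are assumptions for illustration, not tuned recommendations):

# spark-defaults.conf, or pass --driver-memory 8g on the spark-sql command line
spark.driver.memory=8g
# optional guard: abort a too-large collect instead of grinding the driver into GC
spark.driver.maxResultSize=4g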


@Sandeep Nemuri

I have given 9 GB for driver memory.

spark.executor.memory=9216m
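
Note that spark.executor.memory sizes the executors, not the driver, and the stack trace above shows the OOM in the driver-side collect (executeCollectPublic). The driver heap is a separate property; a sketch, with an assumed value:

# spark-defaults.conf
spark.driver.memory=9g   # driver heap; spark.executor.memory does not affect this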

@Sandeep Nemuri

I am also getting an error while connecting through beeline. The connection is established successfully, but when I run the query I get the following error:

Error: java.lang.OutOfMemoryError: GC overhead limit exceeded (state=,code=0)

I added the following properties:

spark.yarn.am.memory=1g

spark.yarn.am.cores=1
Please help me resolve the issue.
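
One note on those two properties: in yarn-client mode, which the Thrift Server that beeline connects to typically uses, spark.yarn.am.memory and spark.yarn.am.cores size only the YARN Application Master, not the JVM that actually collects the rows. A sketch of the settings that govern the collecting JVM instead, assuming an HDP Thrift Server setup and illustrative values:

# spark-env.sh (Advanced spark2-env): heap of the Thrift Server daemon itself
export SPARK_DAEMON_MEMORY=4g
# spark-defaults.conf: driver heap for applications
spark.driver.memory=8g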


@Ashnee Sharma Do we know the data size of this table? "select * from rasdb.dim_account" brings the complete data to the driver, so we need to make sure the table data fits into driver memory.
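
For example, a LIMIT keeps the driver from materializing every row (a sketch; 100 is an arbitrary sample size):

select * from rasdb.dim_account limit 100;

If the full table is really needed, write it out with CREATE TABLE ... AS SELECT or an INSERT so the work stays on the executors instead of being collected to the driver.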


@Sandeep Nemuri

There are around 3,338,700 records in this table.
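
Rough arithmetic on that (assuming, say, ~1 KB per row once the decimal-heavy columns in the trace become JVM objects): 3,338,700 rows × 1 KB is roughly 3.2 GB of raw data, and the row-by-row Catalyst-to-Scala conversion in executeCollectPublic can multiply that several times over in object overhead, so collecting the full table on the driver can exhaust even a multi-GB heap.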


@Sandeep Nemuri

I have resolved this issue by increasing spark_daemon_memory in the Spark configuration, under Advanced spark2-env.