How to configure spark.network.timeout for Spark on Ambari

I'm running Spark and my app suddenly died. I checked the log and found this error:

17/08/15 12:29:40 ERROR TransportChannelHandler: Connection to /192.168.xx.109:44271 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
17/08/15 12:29:40 WARN NettyRpcEndpointRef: Error sending message [message = RetrieveSparkProps] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
	at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
	at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:172)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:67)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:157)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:259)
	at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
	at scala.concurrent.Await$.result(package.scala:107)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	... 12 more
17/08/15 12:29:43 ERROR TransportClient: Failed to send RPC 8631131244922754830 to hdp05.xxx.local/192.168.xx.109:44271: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException


This means spark.network.timeout is still at its default value (120s): https://spark.apache.org/docs/1.6.3/configuration.html#networking

So I want to increase spark.network.timeout to 800s (higher than the default). I could not find this property in the Ambari UI, so I added it under Spark > Configs > Custom spark-defaults > Add Property.

I can see that Ambari created the property and added it to spark-defaults.conf.
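
One way to double-check that the property really ended up in the file is to parse it and print the value. This is only a minimal sketch: it assumes whitespace-separated "key value" lines, and the function name read_spark_defaults and the sample contents are hypothetical, not taken from my cluster.

```python
def read_spark_defaults(text):
    """Parse spark-defaults.conf style text into a dict (whitespace-separated pairs assumed)."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        props[key] = value
    return props


# Hypothetical file contents, for illustration only
sample = """
# Custom spark-defaults added through Ambari
spark.network.timeout 800s
spark.eventLog.enabled true
"""

props = read_spark_defaults(sample)
print(props.get("spark.network.timeout", "not set, so the default applies"))
```

On a real node, the text would come from reading the spark-defaults.conf file that the job actually picks up, which may differ from the one Ambari shows.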


But when I run the Spark app, I still get this error:

ERROR TransportChannelHandler: Connection to /192.168.xx.109:44271 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.

It seems the setting spark.network.timeout = 800s is not being applied when Spark runs.

Has anyone had the same problem? If anyone has a solution, please help.

Thanks

Re: How to configure spark.network.timeout for Spark on Ambari

The value must be in "ms", so try 800000.
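
For reference, 800000 is just 800 seconds expressed in milliseconds, the same way the 120000 ms in the log corresponds to 120 seconds. A small sketch of that conversion, where the helper name to_millis is hypothetical and only the "s" and "ms" suffixes are handled:

```python
def to_millis(value):
    """Convert a string like '800s' or '120000ms' to integer milliseconds."""
    text = value.strip().lower()
    # Check 'ms' before 's', because 'ms' also ends with 's'
    if text.endswith("ms"):
        return int(text[:-2])
    if text.endswith("s"):
        return int(text[:-1]) * 1000
    raise ValueError("unsupported time value: " + value)


print(to_millis("800s"))
print(to_millis("120000ms"))
```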
