Member since
09-12-2016
39
Posts
5
Kudos Received
0
Solutions
10-05-2017
02:54 AM
@bkosaraju I think my issue not relevant with your link I am running Spark 1.6.3 on HDP 2.6
... View more
10-03-2017
02:32 AM
Hi All,
I running Spark 1 Thrift Server as Proxy.
But it not running as long as I expected.
Every two days, Spark will die likely cronjob. And sometime I can not access to Spark Thrift Server url.
I often get this ERROR. <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 Server Error</title>
</head>
<body><h2>HTTP ERROR 500</h2>
<p>Problem accessing /jobs/. Reason:
<pre> Server Error</pre></p><h3>Caused by:</h3><pre>java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.StringBuilder.toString(StringBuilder.java:407)
at scala.collection.mutable.StringBuilder.toString(StringBuilder.scala:427)
at scala.xml.Node.buildString(Node.scala:161)
at scala.xml.Node.toString(Node.scala:166)
at org.apache.spark.ui.JettyUtils$$anonfun$htmlResponderToServlet$1.apply(JettyUtils.scala:55)
at org.apache.spark.ui.JettyUtils$$anonfun$htmlResponderToServlet$1.apply(JettyUtils.scala:55)
at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:83)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.spark-project.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1507)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:179)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
at org.spark-project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle(Server.java:370)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext(HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
</pre>
<hr /><i><small>Powered by Jetty://</small></i><br/>
</body>
</html>
I can see the reason is java.lang.OutOfMemoryError: Java heap space. But how can I fix that ?
... View more
Labels:
- Labels:
-
Apache Spark
09-22-2017
10:06 AM
Hi All, I research a lot but not clearly about Share resource between Yarn queues My question is. I have two queue: - Dev: Capacity: 65% Max Capacity: 100% User limit factor: 1 - Test: Capacity: 35% Max Capacity: 100% User limit factor: 1 I want If job on queue Test not using all resource, job on Dev can using free resource on Test. And it do the same with job on queue Test if queue Dev have free resource.
... View more
Labels:
- Labels:
-
Apache YARN
-
Cloudera Manager
08-16-2017
01:38 AM
I'm running Spark and my app suddenly dead. I check log and find this problem is 17/08/15 12:29:40 ERROR TransportChannelHandler: Connection to /192.168.xx.109:44271 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
17/08/15 12:29:40 WARN NettyRpcEndpointRef: Error sending message [message = RetrieveSparkProps] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:172)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:67)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:157)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:259)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 12 more
17/08/15 12:29:43 ERROR TransportClient: Failed to send RPC 8631131244922754830 to hdp05.xxx.local/192.168.xx.109:44271: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
It mean spark.network.timeout is configure by default (120s) https://spark.apache.org/docs/1.6.3/configuration.html#networking So I want to increase spark.network.timeout = 800s (higher value than default). I can not find this line on Ambari UI, so I added it to : Spark > Configs > Custom spark-defaults > Add Property. I see it create and add this configure to spark-defaults.conf But when I running Spark app, I still have this ERROR ERROR TransportChannelHandler: Connection to /192.168.xx.109:44271 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong. It seem this config spark.network.timeout = 800s is not apply to Spark for running. So anyone have the same problem, anyone have solution for that please support me. Thanks
... View more
Labels:
- Labels:
-
Apache Spark
08-03-2017
01:30 AM
So I need use third party service like Haproxy to running Thrift server load balancing on HDP
... View more
07-31-2017
02:49 AM
@youngick kim I refer that link suggest, but I can not see any guide to solve my issue. Best answer saild : So for now... no load balancing for STS if the cluster is kerberized, otherwise haproxy, httpd +mod_jk or any other load balancer will probably do the work. P/S: Currently, my cluster do not enable kerberrized
... View more
07-28-2017
09:50 AM
Currently I running only one Thrift server on my cluster. I see if I have many client connections, Spark run very slow. I find the solution for this and I see the doc: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/install-spark-over-ambari.html It said: Deploying the Thrift server on multiple hosts increases scalability of the Thrift server; the number of hosts should take into consideration the cluster capacity allocated to Spark. So If I deploy more Thrift server on my cluster, Can I handle multiple concurrent client connections ? If not, how to I can handle this issue ?. And other question, Can I running two Spark apps with different configure on each?
... View more
Labels:
- Labels:
-
Apache Spark
03-02-2017
08:49 AM
@Jay SenSharma /usr/hdp/current/zookeeper-client/bin/zookeeper-client -server <ZK1>:2181,<ZK2>:2181
that command work with all ZK server
... View more
03-02-2017
08:48 AM
@Deepak Sharma Thank you, that command is worked. I see in other node that I don't install ZK server and ZK client. I can use use command without any warning /usr/hdp/current/zookeeper-client/bin/zookeeper-client So Do you know why it happen ?
... View more
03-02-2017
08:14 AM
In my HDP cluster, I install 3 zookeeper-servers on 3 nodes, other nodes I just only install zookeeper-client.
When I stand on Zookeeper-Server nodes, I can run zookeeper-client command OK.
But when I stand on Zookeeper-Client nodes, I run zookeeper-client and can not connect to zookeeper servers 2017-03-01 23:19:57,076 - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@23ab930d
Welcome to ZooKeeper!
2017-03-01 23:19:57,112 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2017-03-01 23:19:57,205 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
[zk: localhost:2181(CONNECTING) 0] 2017-03-01 23:19:58,312 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-03-01 23:19:58,314 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2017-03-01 23:19:59,415 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-03-01 23:19:59,416 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2017-03-01 23:20:00,517 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-03-01 23:20:00,518 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
So anyone can have solution for this problem
... View more
Labels: