Member since: 09-12-2016
Posts: 39
Kudos Received: 5
Solutions: 0
10-05-2017
02:54 AM
@bkosaraju I don't think my issue is related to your link. I am running Spark 1.6.3 on HDP 2.6.
10-03-2017
02:32 AM
Hi All,
I am running the Spark 1 Thrift Server behind a proxy, but it does not stay up as long as I expect.
Roughly every two days Spark dies, as if killed by a cron job, and sometimes I cannot access the Spark Thrift Server URL.
I often get this error:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 500 Server Error</title>
</head>
<body><h2>HTTP ERROR 500</h2>
<p>Problem accessing /jobs/. Reason:
<pre> Server Error</pre></p><h3>Caused by:</h3><pre>java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.StringBuilder.toString(StringBuilder.java:407)
at scala.collection.mutable.StringBuilder.toString(StringBuilder.scala:427)
at scala.xml.Node.buildString(Node.scala:161)
at scala.xml.Node.toString(Node.scala:166)
at org.apache.spark.ui.JettyUtils$$anonfun$htmlResponderToServlet$1.apply(JettyUtils.scala:55)
at org.apache.spark.ui.JettyUtils$$anonfun$htmlResponderToServlet$1.apply(JettyUtils.scala:55)
at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:83)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.spark-project.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1507)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:179)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1478)
at org.spark-project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle(Server.java:370)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext(HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
</pre>
<hr /><i><small>Powered by Jetty://</small></i><br/>
</body>
</html>
I can see the cause is java.lang.OutOfMemoryError: Java heap space, but how can I fix it?
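A likely fix — assuming the heap that fills up belongs to the Thrift Server driver, which also serves this web UI — is to raise the driver heap and bound how much UI history Spark keeps. A minimal sketch for spark-defaults.conf (or the Thrift Server's spark-thrift-sparkconf in Ambari); the values are illustrative, not tuned:
spark.driver.memory      4g     # larger heap for the Thrift Server driver (value illustrative)
spark.ui.retainedJobs    200    # keep fewer finished jobs in the UI to bound heap growth
spark.ui.retainedStages  200    # same idea for stages
The Thrift Server must be restarted for these to take effect.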
Labels:
- Apache Spark
09-22-2017
10:06 AM
Hi All, I have researched a lot but am still not clear on how resources are shared between YARN queues. My question: I have two queues:
- Dev: Capacity: 65%, Max Capacity: 100%, User limit factor: 1
- Test: Capacity: 35%, Max Capacity: 100%, User limit factor: 1
I want jobs on Dev to be able to use Test's free resources whenever Test is not using all of its capacity, and likewise for jobs on Test when Dev has free resources.
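With the Capacity Scheduler, that elasticity already follows from Max Capacity being 100% — each queue may borrow the other's idle resources up to its maximum capacity. The usual catch is that User limit factor: 1 caps a single user at the queue's configured capacity (65% or 35%), so the factor may need raising. A sketch of the relevant capacity-scheduler properties, assuming both queues sit directly under root:
yarn.scheduler.capacity.root.Dev.capacity=65
yarn.scheduler.capacity.root.Dev.maximum-capacity=100
yarn.scheduler.capacity.root.Dev.user-limit-factor=2
yarn.scheduler.capacity.root.Test.capacity=35
yarn.scheduler.capacity.root.Test.maximum-capacity=100
yarn.scheduler.capacity.root.Test.user-limit-factor=2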
Labels:
- Apache YARN
- Cloudera Manager
08-16-2017
01:38 AM
I'm running Spark and my app suddenly died. I checked the logs and found this problem:
17/08/15 12:29:40 ERROR TransportChannelHandler: Connection to /192.168.xx.109:44271 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
17/08/15 12:29:40 WARN NettyRpcEndpointRef: Error sending message [message = RetrieveSparkProps] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:172)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:67)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:157)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:259)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
... 12 more
17/08/15 12:29:43 ERROR TransportClient: Failed to send RPC 8631131244922754830 to hdp05.xxx.local/192.168.xx.109:44271: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
This means spark.network.timeout is at its default of 120s: https://spark.apache.org/docs/1.6.3/configuration.html#networking
So I want to increase spark.network.timeout to 800s (well above the default). I could not find this setting in the Ambari UI, so I added it under Spark > Configs > Custom spark-defaults > Add Property, and I can see it has been added to spark-defaults.conf.
But when I run my Spark app, I still get the same error:
ERROR TransportChannelHandler: Connection to /192.168.xx.109:44271 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
It seems spark.network.timeout = 800s is not being applied to the running app. If anyone has hit the same problem or has a solution, please help. Thanks
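One thing worth checking — a guess, not something the logs confirm: spark-defaults.conf is read only when an application launches, so anything already running keeps the old value, and a client submitting from a host with a stale copy of the config will too. Passing the setting explicitly at submit time rules that out:
spark-submit \
  --conf spark.network.timeout=800s \
  --conf spark.executor.heartbeatInterval=60s \
  <your usual application jar and arguments>
Note that spark.executor.heartbeatInterval must stay well below spark.network.timeout.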
Labels:
- Apache Spark
08-09-2017
02:02 AM
@Bob Hardaway @Sonu Sahi @Eyad Garelnabi Do you have any ideas about this? Thanks
08-03-2017
10:09 AM
I see your question is the same as my case: https://community.hortonworks.com/questions/118802/run-spark-thrift-servers-with-different-yarn-queue.html But how do you run the Spark Thrift Server on a different queue? Right now I'm starting the Spark Thrift Server from the Ambari UI.
08-03-2017
01:30 AM
So I need to use a third-party service like HAProxy to load-balance Thrift Servers on HDP.
08-02-2017
04:18 AM
My cluster is HDP 2.6 with 10 nodes. I want to run two Spark2 Thrift Servers on different queues on my cluster. But in the Spark2 configuration I see only one option, spark.yarn.queue, which applies to both Thrift Servers, so both end up running in the same queue. Does anyone have a solution that would let me run two or more Thrift Servers, each on a different YARN queue? Thanks.
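One workaround — a sketch, not an Ambari-managed setup, so the second instance would have to be started and monitored by hand — is to launch the extra Thrift Server from the command line with its own queue and port (the path is the usual HDP 2.6 layout, and the queue name and port are assumptions; pick a port the Ambari-managed instance is not using):
/usr/hdp/current/spark2-thriftserver/sbin/start-thriftserver.sh \
  --master yarn \
  --conf spark.yarn.queue=Test \
  --hiveconf hive.server2.thrift.port=10017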
Labels:
- Apache Spark
- Apache YARN
- Cloudera Manager
07-31-2017
02:49 AM
@youngick kim I read the link you suggested, but I cannot see any guide there that solves my issue. The best answer said: "So for now... no load balancing for STS if the cluster is kerberized, otherwise haproxy, httpd + mod_jk or any other load balancer will probably do the work." P/S: Currently, Kerberos is not enabled on my cluster.
07-28-2017
09:50 AM
Currently I am running only one Thrift Server on my cluster, and I see that with many client connections Spark runs very slowly. Looking for a solution, I found this doc: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/install-spark-over-ambari.html It says: "Deploying the Thrift server on multiple hosts increases scalability of the Thrift server; the number of hosts should take into consideration the cluster capacity allocated to Spark." So if I deploy more Thrift Servers on my cluster, can I handle more concurrent client connections? If not, how can I handle this issue? And another question: can I run two Spark apps, each with a different configuration?
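Multiple Thrift Server instances only help if clients are actually spread across them, for example behind a TCP load balancer. A minimal haproxy.cfg sketch — host names and ports are assumptions, and per the other thread this approach does not work on a kerberized cluster:
listen spark-thrift
    bind *:10015
    mode tcp
    balance roundrobin
    server sts1 hdp01.xxx.local:10015 check
    server sts2 hdp02.xxx.local:10015 check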
Labels:
- Apache Spark
07-28-2017
03:06 AM
Can anyone help me?
07-27-2017
07:00 AM
I use Spark2 on YARN, with partitioned ORC files cached in memory, and I am testing Spark under concurrent requests. With a single user request Spark is very fast, finishing in 2-3s. But when I add even a few concurrent user requests, Spark becomes very slow, taking 10-15s per request. I need to be able to handle many more requests than this. Does anyone have documentation or practical experience with this?
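One lever worth trying — a sketch; whether it helps depends on where the time actually goes — is Spark's FAIR scheduler, so concurrent queries share executors instead of queuing FIFO behind each other. In spark-defaults.conf (the pool file path is illustrative, and the allocation file itself is optional):
spark.scheduler.mode               FAIR
spark.scheduler.allocation.file    /etc/spark2/conf/fairscheduler.xml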
Labels:
- Apache Spark
03-02-2017
08:49 AM
@Jay SenSharma /usr/hdp/current/zookeeper-client/bin/zookeeper-client -server <ZK1>:2181,<ZK2>:2181
That command works against all the ZK servers.
03-02-2017
08:48 AM
@Deepak Sharma Thank you, that command worked. I also see that on another node, where I installed neither the ZK server nor the ZK client, I can run /usr/hdp/current/zookeeper-client/bin/zookeeper-client without any warning. Do you know why that happens?
03-02-2017
08:14 AM
In my HDP cluster I installed 3 ZooKeeper servers on 3 nodes; on the other nodes I installed only the ZooKeeper client.
On the ZooKeeper server nodes, the zookeeper-client command works fine.
But on the client-only nodes, zookeeper-client cannot connect to the ZooKeeper servers:
2017-03-01 23:19:57,076 - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@23ab930d
Welcome to ZooKeeper!
2017-03-01 23:19:57,112 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2017-03-01 23:19:57,205 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
[zk: localhost:2181(CONNECTING) 0] 2017-03-01 23:19:58,312 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-03-01 23:19:58,314 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2017-03-01 23:19:59,415 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-03-01 23:19:59,416 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2017-03-01 23:20:00,517 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-03-01 23:20:00,518 - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1146] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
Does anyone have a solution for this problem?
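The log shows the client defaulting to connectString=localhost:2181, which only works on nodes that themselves run a ZooKeeper server. Passing the server list explicitly avoids this (host names assumed):
/usr/hdp/current/zookeeper-client/bin/zookeeper-client -server zk1.xxx.local:2181,zk2.xxx.local:2181,zk3.xxx.local:2181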
01-09-2017
06:05 AM
@mqureshi Thank you for your answers. I want to ask one more question: if I change the setting only in the Ambari UI, is that equivalent to using the setrep command? Or do I need to change it in the Ambari UI before using setrep?
01-09-2017
04:41 AM
I have two questions about the dfs.replication parameter:
1. I know the default block replication factor is 3. If I configure dfs.replication=1, does that affect cluster performance?
2. I have a lot of data written with dfs.replication=1, and now I am changing the configuration to dfs.replication=3. Will my existing data be replicated automatically, or do I have to rewrite it for the new replication factor to take effect? I need to be sure, because my data is very important.
P/S: Any best practices for configuring dfs.replication?
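For context on question 2: dfs.replication is a client-side default applied when a file is written, so existing files keep their old factor until it is changed explicitly. A sketch (the path is illustrative):
hdfs dfs -setrep -w 3 /path/to/important/data
The -w flag makes the command wait until the re-replication actually completes.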
Tags:
- Hadoop Core
- HDFS
Labels:
- Apache Hadoop
12-21-2016
06:36 AM
@Rajkumar Singh I found this doc from Hortonworks: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_yarn-resource-management/content/setting_application_limits.html and another one from the Apache Hadoop docs. I know that if I change yarn.scheduler.capacity.maximum-am-resource-percent, then Max Application Master Resources changes. But I want to know which settings the calculation depends on. For example, is it true that if I set yarn.scheduler.capacity.maximum-am-resource-percent = 1 (100%), then Max Application Master Resources equals yarn.nodemanager.resource.memory-mb?
12-21-2016
05:37 AM
1 Kudo
@Rajkumar Singh
So how can I determine Max Application Master Resources? By default, Configured Max Application Master Limit is set to 20%. If I increase that figure, Max Application Master Resources increases. But how exactly is Max Application Master Resources calculated?
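For reference, a worked example with assumed numbers — with the Capacity Scheduler, Max Application Master Resources is roughly the queue's absolute capacity times the AM limit:
max AM resources ≈ (cluster memory × queue capacity) × maximum-am-resource-percent
e.g. 10 nodes × 32768 MB (yarn.nodemanager.resource.memory-mb) × 100% queue × 0.2 = 65536 MB reserved for ApplicationMasters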
12-21-2016
03:17 AM
1 Kudo
When I run YARN apps, some of them often sit in the queue (state ACCEPTED) even though, as you can see, I still have free resources. So why are the apps queued?
Labels:
- Cloudera Manager
12-19-2016
05:46 AM
@Sunile Manjee Do you have any documents related to this (CPU scheduling)?
12-19-2016
05:30 AM
@Timothy Spann The YARN app is not that big; I run Apache Kylin and build cubes. While running, the YARN app respects my memory settings. But for CPU, even when I set a limit, it always goes over it: I set the CPU limit to 80%, but when the app runs it reaches 100%, even 200%.
12-19-2016
02:51 AM
1 Kudo
In my cluster, when I run a YARN app, the CPU on every node gets very high, and I want to prevent this. I configured yarn.nodemanager.resource.percentage-physical-cpu-limit=80 (%), but the nodes still run out of CPU (sometimes it hits 200%, so I cannot even SSH in), and the high CPU also forces the HBase RegionServers to stop. I need to limit the CPU usage of the nodes, so that running a YARN app cannot exhaust their CPU. How can I do that? Thanks
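A likely cause, consistent with the replies in this thread: percentage-physical-cpu-limit is only enforced when CGroups-based CPU isolation is enabled — without it, YARN schedules vcores but never caps actual CPU use. A yarn-site sketch of the properties involved (values illustrative; the LinuxContainerExecutor needs its own setup beyond this):
yarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
yarn.nodemanager.linux-container-executor.resources-handler.class=org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler
yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage=true
yarn.nodemanager.resource.percentage-physical-cpu-limit=80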
Labels:
- Apache YARN
11-29-2016
07:55 AM
Hi All, below is my default queue (which uses 100% of the YARN resources). I am wondering: how is Max Application Master Resources calculated?
Labels:
- Apache YARN
10-31-2016
05:29 AM
@mqureshi Thanks for your answer. So performance-wise, are two nodes the same as one node if the total hardware is the same (and likewise if I split it across 5 VM nodes)? Can you suggest a small cluster design for 3 or 5 nodes? I want to use this in production. I found the typical cluster reference in the documentation:
Masters -- HDFS NameNode, YARN ResourceManager, and HBase Master.
Slaves -- HDFS DataNodes, YARN NodeManagers, and HBase RegionServers
10-31-2016
04:04 AM
I want to deploy HDP on two servers, each with 32 GB of RAM. What is the best layout when installing HDP on two servers? What should be assigned to the master, and what to the slaves? And in practice, is it better to use one server with 64 GB of RAM or two servers with 32 GB each? Thanks.
Labels:
- Hortonworks Data Platform (HDP)