Member since: 05-20-2019 · 6 Posts · 1 Kudos Received · 0 Solutions
11-14-2019 04:09 AM
Hi, is it possible to create a Virtual Private Cluster using the Cloudera Manager REST API?
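For reference, a minimal sketch of what such a call might look like via curl. This assumes Cloudera Manager 6.2+ (API v32 or later, where compute clusters and data contexts were introduced) and a data context named my-data-context already created from the base cluster; the host, credentials, version, and names are placeholders, and the exact ApiCluster field names should be verified against the API docs for your CM version:

# Hypothetical example -- create a compute cluster attached to an existing data context
curl -u admin:admin -X POST \
  -H "Content-Type: application/json" \
  -d '{"items": [{
        "name": "compute1",
        "displayName": "Compute Cluster 1",
        "fullVersion": "6.2.0",
        "clusterType": "COMPUTE_CLUSTER",
        "dataContextRefs": [{"name": "my-data-context"}]
      }]}' \
  "http://cm-host:7180/api/v32/clusters"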
Labels: Cloudera Manager
11-04-2019 03:23 AM
UPD: When I submit the job in cluster mode, I get the same executor metrics as in the Spark History Server, but the JVM metrics are still absent.
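One thing that might be worth trying here (a hedged sketch, not verified on HDP 3.1.4) is registering the JvmSource explicitly for each instance instead of relying on the * wildcard, in case the wildcard is not applied to the executor instances:

--conf "spark.metrics.conf.driver.source.jvm.class"="org.apache.spark.metrics.source.JvmSource"
--conf "spark.metrics.conf.executor.source.jvm.class"="org.apache.spark.metrics.source.JvmSource"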
11-03-2019 10:10 AM · 1 Kudo
I'm using Spark on YARN with:
- Ambari 2.7.4
- HDP Standalone 3.1.4
- Spark 2.3.2
- Hadoop 3.1.1
- Graphite on Docker (latest)

I was trying to get Spark metrics with the Graphite sink, following this tutorial. The Advanced spark2-metrics-properties in Ambari are:

driver.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
executor.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
worker.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
master.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=ap-test-m.c.gcp-ps.internal
*.sink.graphite.port=2003
*.sink.graphite.protocol=tcp
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
*.sink.graphite.prefix=app-test
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource

Spark submit:

export HADOOP_CONF_DIR=/usr/hdp/3.1.4.0-315/hadoop/conf/
spark-submit --class com.Main --master yarn --deploy-mode client --driver-memory 1g --executor-memory 10g --num-executors 2 --executor-cores 2 spark-app.jar /data

As a result, I'm only getting driver metrics. I also tried adding a metrics.properties file to the spark-submit command together with the global Spark metrics properties, but that didn't help. Finally, I tried passing the configuration via --conf on spark-submit (and, equivalently, in the Java SparkConf):

--conf "spark.metrics.conf.driver.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "spark.metrics.conf.executor.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "spark.metrics.conf.worker.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "spark.metrics.conf.master.sink.graphite.class"="org.apache.spark.metrics.sink.GraphiteSink"
--conf "spark.metrics.conf.*.sink.graphite.host"="host"
--conf "spark.metrics.conf.*.sink.graphite.port"=2003
--conf "spark.metrics.conf.*.sink.graphite.period"=10
--conf "spark.metrics.conf.*.sink.graphite.unit"=seconds
--conf "spark.metrics.conf.*.sink.graphite.prefix"="app-test"
--conf "spark.metrics.conf.*.source.jvm.class"="org.apache.spark.metrics.source.JvmSource"

But that didn't help either. I tried submitting jobs in both client and cluster modes. CSVSink also gives only driver metrics.
06-12-2018 07:21 PM
Hi,
I'm trying to run this simple Spring Boot YARN app, launching it from Windows so that it is deployed on the Hortonworks Sandbox HDP 2.5.
I simply run the original Spring Boot YARN app. The jars start deploying to the sandbox, but then I get this error:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /app/gs-yarn-basic/gs-yarn-basic-container-0.1.0.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
All services are running, the datanode is up, and there is enough memory.
I know the problem is in the network configuration of the sandbox: by default its IP is 172.17.0.2, while my network is 192.168.0.x.
When I set fsUri: hdfs://localhost:9000 in my app according to this article, the client tries to talk to the datanode, but the datanode is running on 172.17.0.2, so files cannot be copied from my host machine. That's why I followed these instructions, and my sandbox now has the IP 192.168.0.x. But when I run my app, it says it cannot connect to the ResourceManager:

Client: Retrying connect to server: sandbox.hortonworks.com/192.168.0.x:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

What other configuration should I change?
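Two things that might be worth checking (hedged suggestions, not verified against HDP 2.5). First, whether the ResourceManager port is reachable from the host at all, since the sandbox VM/Docker image may not forward 8032 by default. Second, whether the HDFS client can be told to address datanodes by hostname rather than by the internal 172.17.0.2 address:

# Check that the sandbox actually exposes the ResourceManager port to the host
# (an assumption -- inspect the VM/Docker port mappings if this fails):
nc -vz sandbox.hortonworks.com 8032

# dfs.client.use.datanode.hostname=true is a standard client-side HDFS property
# that makes the client connect to datanodes by the hostname the namenode
# reports instead of its internal IP. It belongs in the client's hdfs-site.xml;
# how to pass it through the Spring YARN app's Hadoop configuration should be
# checked against the Spring for Apache Hadoop reference.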
Labels: Hortonworks Data Platform (HDP)