Member since: 06-23-2016
Posts: 12
Kudos Received: 2
Solutions: 0
08-10-2016
07:51 PM
Thanks for the reply - I have tried something similar to this, and I also tried not sending the "start" call until I saw that the "stop" process had completed in the Ambari UI. The weird thing is that the response to the start call isn't an error message like you had predicted - it's just an empty response. It doesn't matter whether I don't wait at all or wait 5 minutes; the response is empty. Is there any API call to "restart" so I don't have to manage the timing of stopping, waiting, and starting?

EDIT: It looks like (from here) clients can't be put into any state other than INSTALLED. I'm going to try to use the Restart API as defined here.

Here is the "start" command and its (empty) response:

$ curl --retry 5 --fail -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Starting SPARK_CLIENT"}, "HostRoles": {"state": "STARTED"}}' http://headnodehost:8080/api/v1/clusters/sparkucigraph39/hosts/wn3-sparku.cyzc0onq2zqudhbbi3tes4j3sd.bx.internal.cloudapp.net/host_components/SPARK_CLIENT
HTTP/1.1 200 OK
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
User: sparkadm
Set-Cookie: AMBARISESSIONID=13hxfu6ocnm4oumibnfjle633;Path=/;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 0
Server: Jetty(8.1.17.v20150415)
For comparison, here is the "stop" command and its response:

$ curl --retry 5 --fail -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stopping SPARK_CLIENT"}, "HostRoles": {"state": "INSTALLED"}}' http://headnodehost:8080/api/v1/clusters/sparkucigraph39/hosts/wn3-sparku.cyzc0onq2zqudhbbi3tes4j3sd.bx.internal.cloudapp.net/host_components/SPARK_CLIENT
HTTP/1.1 202 Accepted
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
User: sparkadm
Set-Cookie: AMBARISESSIONID=4aelvrdeyd571rpff061lfifd;Path=/;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Vary: Accept-Encoding, User-Agent
Content-Length: 148
Server: Jetty(8.1.17.v20150415)
{
  "href" : "http://headnodehost:8080/api/v1/clusters/sparkucigraph39/requests/87",
  "Requests" : {
    "id" : 87,
    "status" : "Accepted"
  }
}
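EDIT 2: for reference, here is roughly the restart request I'm planning to try next, based on my reading of the Restart API. I haven't verified the exact payload yet, so treat this as a sketch rather than a known-good call:

# sketch - payload shape based on the Restart API docs, not yet verified
$ curl --retry 5 --fail -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X POST -d '{"RequestInfo": {"command": "RESTART", "context": "Restart SPARK_CLIENT", "operation_level": {"level": "HOST_COMPONENT", "cluster_name": "sparkucigraph39"}}, "Requests/resource_filters": [{"service_name": "SPARK", "component_name": "SPARK_CLIENT", "hosts": "wn3-sparku.cyzc0onq2zqudhbbi3tes4j3sd.bx.internal.cloudapp.net"}]}' http://headnodehost:8080/api/v1/clusters/sparkucigraph39/requests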
08-10-2016
06:59 PM
1 Kudo
The script only stops the services; it doesn't start them back up. See the attached image. I can also attach the curl output for the starting/stopping calls from the script if that would be helpful. The curl requests to stop the services seem to work perfectly, but the responses to the calls that start the services don't even have a body (Content-Length: 0).

capture1.png
08-10-2016
05:31 PM
Is there a simple cURL request to restart all services on a cluster (or, preferably, just the ones with stale configurations)? I tried to follow the answer at this similarly-named post [1], but the calls only stopped the services, and the calls to start them up again did not have any effect. If there is a way to simply restart the services, that would be ideal.

Note: this entire process runs in a bash script, so I would prefer answers where the whole restart can be easily automated (i.e. ones that don't require me to use the Ambari UI or do arduous parsing of a curl JSON response). Thanks for any and all help!

[1] https://community.hortonworks.com/questions/29439/ambari-api-to-restart-all-the-services-with-stale.html
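For reference, the general shape of the stop/start calls I tried (adapted from that post) was per-service state changes along these lines. CLUSTER and SERVICE are placeholders, and the payloads are from memory, so they may not match that post exactly ("stop" means setting the state to INSTALLED, "start" means setting it to STARTED):

# CLUSTER and SERVICE are placeholders; payloads reconstructed from memory
$ curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Stop SERVICE"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://headnodehost:8080/api/v1/clusters/CLUSTER/services/SERVICE
$ curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Start SERVICE"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://headnodehost:8080/api/v1/clusters/CLUSTER/services/SERVICE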
Labels: Apache Ambari
08-05-2016
10:37 PM
1 Kudo
Hi, I'm writing a script that will set up the necessary classpath on a Spark cluster managed by Ambari. Originally, I had a script that just edited /etc/spark/conf/spark-defaults.conf directly. However, as discussed here [1], that was a bad solution, since Ambari regularly overwrites that file with whatever configuration is registered with Ambari. Therefore, I am now using the configs.sh script, as described here [2], to make these adjustments. I am looking for a way to determine the AMBARI_HOST and CLUSTER_NAME arguments via the command line. Here is what I have so far:

#!/usr/bin/env bash
ambari_host= # TODO: some command to find this out
cluster_name= # TODO: some command to find this out
config_types=("spark.executor.extraClassPath" "spark.driver.extraClassPath")
phoenix_jar="$(ls /usr/hdp/current/phoenix-client/phoenix-*-client-spark.jar)"
sqljdbc_jar="$(ls /usr/share/java/sqljdbc*.jar)"
jars="$phoenix_jar:$sqljdbc_jar"
for config_type in "${config_types[@]}"; do
/var/lib/ambari-server/resources/scripts/configs.sh \
set "$ambari_host" "$cluster_name" spark-defaults "$config_type" "$jars"
done
I can elaborate on anything that's unclear. Thanks in advance!

Edit: thank you everyone for the answers! I wish I could accept all of them. For posterity, here's what I ended up with:

#!/usr/bin/env bash
ambari_user="$1"
ambari_password="$2"
ambari_port=8080
ambari_host="$(/opt/hostname_scripts/hostname.sh)"
cluster_name="$(curl -u ${ambari_user}:${ambari_password} -i -H 'X-Requested-By: ambari' http://$ambari_host:$ambari_port/api/v1/clusters | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p')"
config_types=("spark.executor.extraClassPath" "spark.driver.extraClassPath")
phoenix_jar="$(ls /usr/hdp/current/phoenix-client/phoenix-*-client-spark.jar)"
sqljdbc_jar="$(ls /usr/share/java/sqljdbc*.jar)"
jars="$phoenix_jar:$sqljdbc_jar"
for config_type in "${config_types[@]}"; do
/var/lib/ambari-server/resources/scripts/configs.sh -u "$ambari_user" -p "$ambari_password" -port "$ambari_port" set "$ambari_host" "$cluster_name" spark-defaults "$config_type" "$jars" > /dev/null
done

This [3] was also a valuable resource.

[1] https://community.hortonworks.com/questions/43587/spark-defulatsconf-constantly-being-overwritten.html
[2] https://cwiki.apache.org/confluence/display/AMBARI/Modify+configurations#Modifyconfigurations-Editconfigurationusingconfigs.sh
[3] http://lecluster.delaurent.com/one-shot-backup-all-config-files-with-ambari-api/
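In case it's useful, this is roughly how I invoke the script from my provisioning steps. The filename set_spark_classpath.sh is just what I happen to call the file above; the Ambari admin user and password are passed straight through as the two arguments:

# set_spark_classpath.sh is a placeholder name for the script above
$ bash set_spark_classpath.sh admin "$PASSWORD"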
Labels: Apache Ambari, Apache Spark
07-05-2016
11:58 PM
Thank you!
07-05-2016
11:46 PM
I launch my cluster from the command line (via Azure and an ARM template), and ideally I want it to be set up automatically with the right classpath. Is there a way of automating this process (rather than going through the Ambari UI)?
07-05-2016
11:01 PM
I added a jar to my classpath by adding the extraClassPath options to spark-defaults.conf. Specifically, these two lines:

spark.executor.extraClassPath /usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-client-spark.jar:/usr/share/java/sqljdbc41.jar
spark.driver.extraClassPath /usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-client-spark.jar:/usr/share/java/sqljdbc41.jar
I wasn't sure which spark-defaults.conf to edit (here are all of them):

$ find / -name spark-defaults*.conf* 2> /dev/null
/usr/hdp/2.4.2.0-258/etc/spark/conf/spark-defaults.conf.template
/usr/hdp/2.4.2.0-258/etc/spark/conf/spark-defaults.conf
/etc/spark/conf.backup/spark-defaults.conf.template
/etc/spark/conf.backup/spark-defaults.conf
/etc/spark/2.4.2.0-258/0/spark-defaults.conf.template
/etc/spark/2.4.2.0-258/0/spark-defaults.conf

After some trial and error, my jar only ran successfully when I added the extraClassPath options to /etc/spark/2.4.2.0-258/0/spark-defaults.conf. Editing the others did not resolve the "class not found" exceptions. However, it seems that my edits to this file are constantly being erased/overwritten. When I tried to run the jar this morning, the lines I had added to spark-defaults.conf were no longer there, and at the top of the file was "# Generated by Apache Ambari. Mon Jul 4 00:06:17 2016". When I re-added the two lines, everything worked as it did yesterday. Is there another file I should be editing instead of (or in addition to) /etc/spark/2.4.2.0-258/0/spark-defaults.conf?
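Side note: as far as I understand, /etc/spark/conf on HDP is a chain of symlinks that ultimately points at one of the files above, so a quick way to see which copy is actually live should be something like this (paths are from my cluster; yours may differ):

# check which spark-defaults.conf the symlink chain actually resolves to
$ ls -l /etc/spark/conf
$ readlink -f /etc/spark/conf/spark-defaults.conf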
Labels: Apache Spark
07-01-2016
05:14 PM
Thanks for the reply! I'm having trouble finding where to set the min/max container sizes. Any insight on how to do that?
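In case it helps narrow down what I'm asking: I'm assuming the settings in question are yarn.scheduler.minimum-allocation-mb and yarn.scheduler.maximum-allocation-mb in yarn-site, so the current values can be checked on a node with something like this (the path is from my HDP install and may differ elsewhere):

# assumes the min/max container sizes are the yarn.scheduler.*-allocation-mb properties
$ grep -B1 -A1 'allocation-mb' /etc/hadoop/conf/yarn-site.xml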
06-30-2016
11:33 PM
Hi, I am trying to submit a job to Spark via Tinkerpop 3.2.0, but I keep running into this exception:

16/06/30 23:26:09 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1467158618360_0050_01_000002 on host: 192.168.2.23. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1467158618360_0050_01_000002
Exit code: 1

From my research, it seems that this is signaling low memory, but I have allocated a lot of memory. Below are all the relevant (I think) configurations:

spark.master=yarn-client
spark.app.id=gremlin
spark.ui.port=4051
spark.yarn.appMasterEnv.CLASSPATH=$CLASSPATH:/usr/hdp/current/hadoop-mapreduce-client/*:/usr/hdp/current/hadoop-mapreduce-client/lib/*
spark.executor.extraJavaOptions=-Dhdp.version=2.4.2.0-258
spark.executor.instances=4
spark.executor.memory=1g
spark.driver.memory=1g
spark.executor.userClassPathFirst=true
spark.storage.memoryFraction=0.4
spark.shuffle.memoryFraction=0.4
spark.yarn.executor.memoryOverhead=4096
I read that this could be caused by a Java version issue. Though the cluster came with Java 1.7, I had to install and use Java 1.8 instead (Tinkerpop requires 1.8). Is this the cause of the exception, and if so, is there any way around it? I would appreciate any help. Thanks!
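If the full container logs would help diagnose this, I can pull them with something like the following (the application id is taken from the failed container id above, and YARN log aggregation has to be enabled for this to return anything):

# application id derived from the failed container id; requires YARN log aggregation
$ yarn logs -applicationId application_1467158618360_0050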
Labels: Apache Spark, Apache YARN
06-28-2016
07:28 PM
Thanks for the reply. One issue I'm having is that I'm not sure how to specify that the job should be distributed to the existing Spark cluster (rather than just run on the VM).