Member since: 06-23-2016
Posts: 12
Kudos Received: 2
Solutions: 0
08-10-2016
07:51 PM
Thanks for the reply - I have tried something similar to this, and I also tried not sending the "start" call until I saw that the "stop" process had completed in the Ambari UI. The weird thing is that the response to the start call isn't an error message like you had predicted - it's just an empty response. It doesn't matter whether I don't wait at all or wait 5 minutes; the response is empty. Is there any API call to "restart" so I don't have to manage the timing of stopping, waiting, and starting?

EDIT: It looks like (from here) clients can't be put into any state other than INSTALLED. I'm going to try to use the Restart API as defined here.

Here is the "start" command and its (empty) response:

$ curl --retry 5 --fail -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Starting SPARK_CLIENT"}, "HostRoles": {"state": "STARTED"}}' http://headnodehost:8080/api/v1/clusters/sparkucigraph39/hosts/wn3-sparku.cyzc0onq2zqudhbbi3tes4j3sd.bx.internal.cloudapp.net/host_components/SPARK_CLIENT
HTTP/1.1 200 OK
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
User: sparkadm
Set-Cookie: AMBARISESSIONID=13hxfu6ocnm4oumibnfjle633;Path=/;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 0
Server: Jetty(8.1.17.v20150415)
For comparison, here is the "stop" command and its response:

$ curl --retry 5 --fail -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stopping SPARK_CLIENT"}, "HostRoles": {"state": "INSTALLED"}}' http://headnodehost:8080/api/v1/clusters/sparkucigraph39/hosts/wn3-sparku.cyzc0onq2zqudhbbi3tes4j3sd.bx.internal.cloudapp.net/host_components/SPARK_CLIENT
HTTP/1.1 202 Accepted
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
User: sparkadm
Set-Cookie: AMBARISESSIONID=4aelvrdeyd571rpff061lfifd;Path=/;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Vary: Accept-Encoding, User-Agent
Content-Length: 148
Server: Jetty(8.1.17.v20150415)
{
  "href" : "http://headnodehost:8080/api/v1/clusters/sparkucigraph39/requests/87",
  "Requests" : {
    "id" : 87,
    "status" : "Accepted"
  }
}
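EDIT 2: for reference, here is roughly the restart request I'm planning to try next, based on my reading of the Restart API. I haven't verified the exact payload yet, so treat this as a sketch rather than a known-good call:

# sketch - payload shape based on the Restart API docs, not yet verified
$ curl --retry 5 --fail -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X POST -d '{"RequestInfo": {"command": "RESTART", "context": "Restart SPARK_CLIENT", "operation_level": {"level": "HOST_COMPONENT", "cluster_name": "sparkucigraph39"}}, "Requests/resource_filters": [{"service_name": "SPARK", "component_name": "SPARK_CLIENT", "hosts": "wn3-sparku.cyzc0onq2zqudhbbi3tes4j3sd.bx.internal.cloudapp.net"}]}' http://headnodehost:8080/api/v1/clusters/sparkucigraph39/requests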
08-10-2016
06:59 PM
1 Kudo
The script only stops the services; it doesn't start them back up. See the attached image. I can also attach the curl output for the starting/stopping calls from the script if that would be helpful. The curl requests to stop the services seem to work perfectly, but the responses to the calls that start the services don't even have a body (Content-Length: 0).

capture1.png
08-10-2016
05:31 PM
Is there a simple cURL request to restart all services on a cluster (or, preferably, just the ones with stale configurations)? I tried to follow the answer at this similarly-named post [1], but the calls only stopped the services, and the calls to start them up again did not have any effect. If there is a way to simply restart the services, that would be ideal.

Note: this entire process runs in a bash script, so I would prefer answers where the whole restart can be easily automated (i.e. ones that don't require me to use the Ambari UI or do arduous parsing of a curl JSON response). Thanks for any and all help!

[1] https://community.hortonworks.com/questions/29439/ambari-api-to-restart-all-the-services-with-stale.html
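For reference, the general shape of the stop/start calls I tried (adapted from that post) was per-service state changes along these lines. CLUSTER and SERVICE are placeholders, and the payloads are from memory, so they may not match that post exactly ("stop" means setting the state to INSTALLED, "start" means setting it to STARTED):

# CLUSTER and SERVICE are placeholders; payloads reconstructed from memory
$ curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Stop SERVICE"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://headnodehost:8080/api/v1/clusters/CLUSTER/services/SERVICE
$ curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Start SERVICE"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://headnodehost:8080/api/v1/clusters/CLUSTER/services/SERVICE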
Labels: Apache Ambari
08-05-2016
10:37 PM
1 Kudo
Hi, I'm writing a script that will set up the necessary classpath on a Spark cluster managed by Ambari. Originally, I had a script that just edited /etc/spark/conf/spark-defaults.conf directly. However, as discussed here [1], that was a bad solution, since Ambari regularly overwrites that file with whatever configuration is registered with Ambari. Therefore, I am now using the configs.sh script, as described here [2], to make these adjustments. I am looking for a way to determine the AMBARI_HOST and CLUSTER_NAME arguments via the command line. Here is what I have so far:

#!/usr/bin/env bash
ambari_host= # TODO: some command to find this out
cluster_name= # TODO: some command to find this out
config_types=("spark.executor.extraClassPath" "spark.driver.extraClassPath")
phoenix_jar="$(ls /usr/hdp/current/phoenix-client/phoenix-*-client-spark.jar)"
sqljdbc_jar="$(ls /usr/share/java/sqljdbc*.jar)"
jars="$phoenix_jar:$sqljdbc_jar"
for config_type in "${config_types[@]}"; do
/var/lib/ambari-server/resources/scripts/configs.sh \
set "$ambari_host" "$cluster_name" spark-defaults "$config_type" "$jars"
done
I can elaborate on anything that's unclear. Thanks in advance!

Edit: thank you everyone for the answers! I wish I could accept all of them. For posterity, here's what I ended up with:

#!/usr/bin/env bash
ambari_user="$1"
ambari_password="$2"
ambari_port=8080
ambari_host="$(/opt/hostname_scripts/hostname.sh)"
cluster_name="$(curl -u ${ambari_user}:${ambari_password} -i -H 'X-Requested-By: ambari' http://$ambari_host:$ambari_port/api/v1/clusters | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p')"
config_types=("spark.executor.extraClassPath" "spark.driver.extraClassPath")
phoenix_jar="$(ls /usr/hdp/current/phoenix-client/phoenix-*-client-spark.jar)"
sqljdbc_jar="$(ls /usr/share/java/sqljdbc*.jar)"
jars="$phoenix_jar:$sqljdbc_jar"
for config_type in "${config_types[@]}"; do
/var/lib/ambari-server/resources/scripts/configs.sh -u "$ambari_user" -p "$ambari_password" -port "$ambari_port" set "$ambari_host" "$cluster_name" spark-defaults "$config_type" "$jars" > /dev/null
done

This [3] was also a valuable resource.

[1] https://community.hortonworks.com/questions/43587/spark-defulatsconf-constantly-being-overwritten.html
[2] https://cwiki.apache.org/confluence/display/AMBARI/Modify+configurations#Modifyconfigurations-Editconfigurationusingconfigs.sh
[3] http://lecluster.delaurent.com/one-shot-backup-all-config-files-with-ambari-api/
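In case it's useful, this is roughly how I invoke the script from my provisioning steps. The filename set_spark_classpath.sh is just what I happen to call the file above; the Ambari admin user and password are passed straight through as the two arguments:

# set_spark_classpath.sh is a placeholder name for the script above
$ bash set_spark_classpath.sh admin "$PASSWORD"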
Labels: Apache Ambari, Apache Spark
07-05-2016
11:58 PM
Thank you!
07-05-2016
11:46 PM
I launch my cluster from the command line (via Azure and an ARM template), and ideally I want it to be set up automatically with the right classpath. Is there a way of automating this process (rather than going through the Ambari UI)?
07-05-2016
11:01 PM
I added a jar to my classpath by adding the extraClassPath options to spark-defaults.conf. Specifically, these two lines:

spark.executor.extraClassPath /usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-client-spark.jar:/usr/share/java/sqljdbc41.jar
spark.driver.extraClassPath /usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-client-spark.jar:/usr/share/java/sqljdbc41.jar
I wasn't sure which spark-defaults.conf to edit (here are all of them):

$ find / -name spark-defaults*.conf* 2> /dev/null
/usr/hdp/2.4.2.0-258/etc/spark/conf/spark-defaults.conf.template
/usr/hdp/2.4.2.0-258/etc/spark/conf/spark-defaults.conf
/etc/spark/conf.backup/spark-defaults.conf.template
/etc/spark/conf.backup/spark-defaults.conf
/etc/spark/2.4.2.0-258/0/spark-defaults.conf.template
/etc/spark/2.4.2.0-258/0/spark-defaults.conf

After some trial and error, my jar only ran successfully when I added the extraClassPath options to /etc/spark/2.4.2.0-258/0/spark-defaults.conf. Editing the others did not resolve the "class not found" exceptions. However, it seems that my edits to this file are constantly being erased/overwritten. When I tried to run the jar this morning, the lines I had added to spark-defaults.conf were no longer there, and at the top of the file was "# Generated by Apache Ambari. Mon Jul 4 00:06:17 2016". When I re-added the two lines, everything worked as it did yesterday. Is there another file I should be editing instead of (or in addition to) /etc/spark/2.4.2.0-258/0/spark-defaults.conf?
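Side note: as far as I understand, /etc/spark/conf on HDP is a chain of symlinks that ultimately points at one of the files above, so a quick way to see which copy is actually live should be something like this (paths are from my cluster; yours may differ):

# check which spark-defaults.conf the symlink chain actually resolves to
$ ls -l /etc/spark/conf
$ readlink -f /etc/spark/conf/spark-defaults.conf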
Labels: Apache Spark
07-01-2016
05:14 PM
Thanks for the reply! I'm having trouble finding where to set the min/max container sizes. Any insight on how to do that?
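In case it helps narrow down what I'm asking: I'm assuming the settings in question are yarn.scheduler.minimum-allocation-mb and yarn.scheduler.maximum-allocation-mb in yarn-site, so the current values can be checked on a node with something like this (the path is from my HDP install and may differ elsewhere):

# assumes the min/max container sizes are the yarn.scheduler.*-allocation-mb properties
$ grep -B1 -A1 'allocation-mb' /etc/hadoop/conf/yarn-site.xml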
06-30-2016
11:33 PM
Hi, I am trying to submit a job to Spark via Tinkerpop 3.2.0, but I keep running into this exception:

16/06/30 23:26:09 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1467158618360_0050_01_000002 on host: 192.168.2.23. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1467158618360_0050_01_000002
Exit code: 1

From my research, it seems that this is signaling low memory, but I have allocated a lot of memory. Below are all the relevant (I think) configurations:

spark.master=yarn-client
spark.app.id=gremlin
spark.ui.port=4051
spark.yarn.appMasterEnv.CLASSPATH=$CLASSPATH:/usr/hdp/current/hadoop-mapreduce-client/*:/usr/hdp/current/hadoop-mapreduce-client/lib/*
spark.executor.extraJavaOptions=-Dhdp.version=2.4.2.0-258
spark.executor.instances=4
spark.executor.memory=1g
spark.driver.memory=1g
spark.executor.userClassPathFirst=true
spark.storage.memoryFraction=0.4
spark.shuffle.memoryFraction=0.4
spark.yarn.executor.memoryOverhead=4096
I read that this could be caused by a Java version issue. Though the cluster came with Java 1.7, I had to install and use Java 1.8 instead (Tinkerpop requires 1.8). Is this the cause of the exception, and if so, is there any way around it? I would appreciate any help. Thanks!
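If the full container logs would help diagnose this, I can pull them with something like the following (the application id is taken from the failed container id above, and YARN log aggregation has to be enabled for this to return anything):

# application id derived from the failed container id; requires YARN log aggregation
$ yarn logs -applicationId application_1467158618360_0050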
Labels: Apache Spark, Apache YARN
06-28-2016
07:28 PM
Thanks for the reply. One issue I'm having is that I'm not sure how to specify that the job should be distributed to the existing Spark cluster (rather than just run on the VM).