Created 08-30-2017 10:39 AM
I have provisioned an Azure HDInsight Hadoop [3.6] and Spark [2.1] cluster and made some configuration changes to add custom attributes [core-site.xml, mapred-site.xml, spark-env.sh, spark-defaults.sh].
After making those changes, I restart the services with the functions below, in the order listed after them:
stop_service() {
    if [ -z "$1" ]; then
        echo "[`date`] [${USER}] Need service name to stop service"
        exit 1
    fi
    SERVICENAME=$1
    echo "[`date`] [${USER}] Stopping $SERVICENAME"
    if [[ $SERVICENAME =~ SPARK.* ]]; then
        curl -u $USERID:$PASSWD -sS -H "X-Requested-By: ambari" -X PUT \
            -d '{"RequestInfo":{"context":"_PARSE_.STOP.$SERVICENAME","operation_level":{"level":"SERVICE","cluster_name":"CLUSTERNAME","service_name":"$SERVICENAME"}},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
            "https://$CLUSTERNAME.azurehdinsight.net/api/v1/clusters/$CLUSTERNAME/services/$SERVICENAME"
    else
        curl -u $USERID:$PASSWD -i -H 'X-Requested-By: ambari' -X PUT \
            -d '{"RequestInfo": {"context" :"Stop $SERVICENAME via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' \
            http://$ACTIVEAMBARIHOST:$PORT/api/v1/clusters/$CLUSTERNAME/services/$SERVICENAME
    fi
    sleep 10
}

start_service() {
    if [ -z "$1" ]; then
        echo "[`date`] [${USER}] Need service name to start service"
        exit 1
    fi
    sleep 10
    SERVICENAME=$1
    echo "[`date`] [${USER}] Starting $SERVICENAME"
    if [[ $SERVICENAME =~ SPARK.* ]]; then
        startResult=$(curl -u $USERID:$PASSWD -sS -H "X-Requested-By: ambari" -X PUT \
            -d '{"RequestInfo":{"context":"_PARSE_.STOP.$SERVICENAME","operation_level":{"level":"SERVICE","cluster_name":"CLUSTERNAME","service_name":"$SERVICENAME"}},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
            "https://$CLUSTERNAME.azurehdinsight.net/api/v1/clusters/$CLUSTERNAME/services/$SERVICENAME")
    else
        startResult=$(curl -u $USERID:$PASSWD -i -H 'X-Requested-By: ambari' -X PUT \
            -d '{"RequestInfo": {"context" :"Start $SERVICENAME via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' \
            http://$ACTIVEAMBARIHOST:$PORT/api/v1/clusters/$CLUSTERNAME/services/$SERVICENAME)
    fi
    if ([[ $startResult == *"500 Server Error"* ]] || [[ $startResult == *"400 Bad Request"* ]]) || [[ $startResult == *"internal system exception occurred"* ]]; then
        sleep 60
        echo "[`date`] [${USER}] Retry starting $SERVICENAME"
        if [[ $SERVICENAME =~ SPARK.* ]]; then
            startResult=$(curl -u $USERID:$PASSWD -sS -H "X-Requested-By: ambari" -X PUT \
                -d '{"RequestInfo":{"context":"_PARSE_.STOP.$SERVICENAME","operation_level":{"level":"SERVICE","cluster_name":"CLUSTERNAME","service_name":"$SERVICENAME"}},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
                "https://$CLUSTERNAME.azurehdinsight.net/api/v1/clusters/$CLUSTERNAME/services/$SERVICENAME")
        else
            startResult=$(curl -u $USERID:$PASSWD -i -H 'X-Requested-By: ambari' -X PUT \
                -d '{"RequestInfo": {"context" :"Start $SERVICENAME via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' \
                http://$ACTIVEAMBARIHOST:$PORT/api/v1/clusters/$CLUSTERNAME/services/$SERVICENAME)
        fi
    fi
    echo "$startResult"
}
Note: I have to use "RequestInfo":{"context":"_PARSE_.STOP.$SERVICENAME" .... to stop the SPARK2 service; otherwise the Spark Thrift Server still shows as stale and needs a restart.
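One detail in the functions above that may be worth checking: the JSON bodies are wrapped in single quotes, so the shell sends `$SERVICENAME` and `CLUSTERNAME` as literal text rather than their values. A minimal sketch of building the body with the values substituted is shown here. The helper name `build_state_payload` is hypothetical (it is not part of the original script), and the `START` verb in the context string is an assumption that mirrors the `_PARSE_.STOP.` form quoted in the note.

```shell
# Hypothetical helper (not in the original script): build the state-change
# body with the cluster and service names actually substituted.
# $1 = cluster name, $2 = service name, $3 = target state (INSTALLED or STARTED)
build_state_payload() {
    verb=STOP
    # Assumption: a START verb mirrors the _PARSE_.STOP. context from the note.
    if [ "$3" = "STARTED" ]; then verb=START; fi
    printf '{"RequestInfo":{"context":"_PARSE_.%s.%s","operation_level":{"level":"SERVICE","cluster_name":"%s","service_name":"%s"}},"Body":{"ServiceInfo":{"state":"%s"}}}\n' \
        "$verb" "$2" "$1" "$2" "$3"
}

# Example: print the body that would be sent to stop SPARK2 on a cluster
# named "mycluster" (a placeholder name).
build_state_payload mycluster SPARK2 INSTALLED
```

The output can then be passed to curl with `-d "$(build_state_payload "$CLUSTERNAME" "$SERVICENAME" INSTALLED)"`, using double quotes so the substitution takes place.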
[Spark maintenance mode is On]
stop_service SPARK2
stop_service OOZIE
stop_service HIVE
stop_service MAPREDUCE2
stop_service YARN
stop_service HDFS
start_service HDFS
start_service YARN
start_service MAPREDUCE2
start_service HIVE
start_service OOZIE
start_service SPARK2
[Spark maintenance mode is Off]
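The sequence above stops the services in one order and starts them in the reverse order. As a sketch, the same ordering can be generated from a single list so the two halves cannot drift apart. The function name `plan_restart` is made up for illustration, and it only prints the calls (a dry run) rather than contacting a cluster.

```shell
# Services in stop order, as listed above; start order is the reverse.
SERVICES="SPARK2 OOZIE HIVE MAPREDUCE2 YARN HDFS"

# Hypothetical dry-run helper: print the planned calls instead of running
# them, so the ordering can be checked without touching a cluster.
plan_restart() {
    reversed=""
    for svc in $SERVICES; do
        echo "stop_service $svc"
        reversed="$svc $reversed"   # prepend to build the reverse order
    done
    for svc in $reversed; do
        echo "start_service $svc"
    done
}

plan_restart
```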
After all these restarts, the services usually come up successfully, but sometimes they go into an unknown state and I see a yellow question mark in the Ambari UI. [Stop and start both return 200 instead of 202 (Accepted).]
But if I run the same script again while provisioning another cluster of the same type, it works.
Why is there this kind of inconsistency, and what is the best way to restart a service after making configuration changes?
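On the 200 versus 202 observation: as far as the Ambari REST API is generally understood to behave, a state-change PUT answers 202 Accepted when it queues a new request, and 200 OK when there was nothing to change (for example, the service was already in the requested state). That reading is an assumption here, not something confirmed in this thread. A rough sketch of branching on the status code follows; the function name `classify_put_status` is made up for illustration, and the curl wiring is left commented out because it needs a live cluster.

```shell
# Hypothetical helper: map the HTTP status of a state-change PUT to a label.
classify_put_status() {
    case "$1" in
        202)   echo "accepted" ;;  # request queued; progress can be polled
        200)   echo "no-op" ;;     # assumption: nothing needed to change
        4*|5*) echo "error" ;;     # client or server error
        *)     echo "unknown" ;;
    esac
}

# Example wiring (commented out; requires a reachable cluster):
# code=$(curl -u "$USERID:$PASSWD" -sS -o /tmp/ambari_resp.json -w '%{http_code}' \
#     -H 'X-Requested-By: ambari' -X PUT -d "$payload" \
#     "https://$CLUSTERNAME.azurehdinsight.net/api/v1/clusters/$CLUSTERNAME/services/$SERVICENAME")
# classify_put_status "$code"

classify_put_status 202
classify_put_status 200
```

Checking the numeric code this way avoids matching on strings such as "500 Server Error" in the response body, which only appear when curl is run with `-i`.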
Created 08-30-2017 11:31 PM
@kalyanasish chanda, after making config changes, you can call the API directly to restart stale components.
See the post below on how to restart stale components.
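The referenced post is not reproduced in this thread, so the following is only a sketch of the request commonly used for this with Ambari 2.x, recalled from general Ambari usage rather than taken from the answer. It posts a RESTART command filtered on `HostRoles/stale_configs=true`. The helper name `build_restart_stale_payload` is hypothetical, and the payload fields should be verified against the Ambari version on the cluster before relying on them.

```shell
# Hypothetical helper: emit the request body that asks Ambari to restart
# every host component whose configuration is marked stale.
# Assumption: this body shape matches the Ambari version in use.
build_restart_stale_payload() {
    printf '%s\n' '{"RequestInfo":{"command":"RESTART","context":"Restart all required services","operation_level":"host_component"},"Requests/resource_filters":[{"hosts_predicate":"HostRoles/stale_configs=true"}]}'
}

# Example call (commented out; requires a reachable cluster):
# curl -u "$USERID:$PASSWD" -sS -H 'X-Requested-By: ambari' -X POST \
#     -d "$(build_restart_stale_payload)" \
#     "https://$CLUSTERNAME.azurehdinsight.net/api/v1/clusters/$CLUSTERNAME/requests"

build_restart_stale_payload
```

Compared with stopping and starting whole services in sequence, a request of this kind only touches components that Ambari itself reports as stale.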