Member since
03-14-2016
4721
Posts
1111
Kudos Received
874
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2729 | 04-27-2020 03:48 AM |
| | 5287 | 04-26-2020 06:18 PM |
| | 4456 | 04-26-2020 06:05 PM |
| | 3583 | 04-13-2020 08:53 PM |
| | 5383 | 03-31-2020 02:10 AM |
09-04-2017
03:59 PM
@uri ben-ari The mentioned API call will work only on Ambari 2.5.x and above. The services / components should not be in maintenance mode, otherwise recovery_enabled might not take effect. Also, if you want to see the difference, try changing the following part of the PUT request from true to false and re-run the API call: "ServiceComponentInfo":{"recovery_enabled":"true"}
to
"ServiceComponentInfo":{"recovery_enabled":"false"} .
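For example (a minimal sketch, assuming the same Sandbox cluster, admin:admin credentials and the DATANODE component used elsewhere in this thread), disabling auto start again for a single component could look like:
curl -H "X-Requested-By: ambari" -u admin:admin -X PUT -d '{"RequestInfo":{"query":"ServiceComponentInfo/component_name.in(DATANODE)"},"ServiceComponentInfo":{"recovery_enabled":"false"}}' http://localhost:8080/api/v1/clusters/Sandbox/components?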
09-04-2017
03:55 PM
@John Wright Based on your output it looks like you have a proxy configured. In that case you should also set the proxy inside your "/etc/yum.conf" file, something like the following, so that yum also starts using the same proxy. # cat /etc/yum.conf
[main]
proxy=http://x.x.x.x:3128
cachedir=/var/cache/yum/$basearch/$releasever
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
installonly_limit=3
. Then perform a yum clean and try again. # yum clean all .
09-04-2017
03:47 PM
@uri ben-ari For various components (of different services), auto start can be enabled as follows: curl -H "X-Requested-By: ambari" -u admin:admin -X PUT -d '{"RequestInfo":{"query":"ServiceComponentInfo/component_name.in(APP_TIMELINE_SERVER,NODEMANAGER,RESOURCEMANAGER,ATLAS_SERVER,DATANODE,NAMENODE,NFS_GATEWAY,SECONDARY_NAMENODE,DRPC_SERVER,NIMBUS,STORM_UI_SERVER,SUPERVISOR,FALCON_SERVER,FLUME_HANDLER,HBASE_MASTER,HBASE_REGIONSERVER,HISTORYSERVER,HIVE_METASTORE,HIVE_SERVER,WEBHCAT_SERVER,INFRA_SOLR,KAFKA_BROKER,KNOX_GATEWAY,LIVY2_SERVER,SPARK2_JOBHISTORYSERVER,SPARK2_THRIFTSERVER,LIVY_SERVER,SPARK_JOBHISTORYSERVER,SPARK_THRIFTSERVER,LOGSEARCH_LOGFEEDER,LOGSEARCH_SERVER,METRICS_COLLECTOR,METRICS_GRAFANA,METRICS_MONITOR,OOZIE_SERVER,RANGER_ADMIN,RANGER_TAGSYNC,RANGER_USERSYNC,ZEPPELIN_MASTER,ZOOKEEPER_SERVER)"},"ServiceComponentInfo":{"recovery_enabled":"true"}}' http://localhost:8080/api/v1/clusters/Sandbox/components? . After executing the above API call, you can see the following kind of entries inside your "/var/log/ambari-server/ambari-server.log", which show how auto start is enabled for these services & components:
04 Sep 2017 15:46:44,016 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HDFS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,016 INFO [ambari-client-thread-27] AbstractResourceProvider:544 - Operations cannot be applied to component NFS_GATEWAY because service HDFS is in the maintenance state of ON
04 Sep 2017 15:46:44,016 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HIVE, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,016 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: HIVE_METASTORE, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,017 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HDFS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,017 INFO [ambari-client-thread-27] AbstractResourceProvider:544 - Operations cannot be applied to component DATANODE because service HDFS is in the maintenance state of ON
04 Sep 2017 15:46:44,017 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=LOGSEARCH, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,017 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: LOGSEARCH_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,017 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=RANGER, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,018 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: RANGER_USERSYNC, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,018 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=YARN, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,018 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: RESOURCEMANAGER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,018 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=YARN, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,018 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: NODEMANAGER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,019 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=OOZIE, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,019 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: OOZIE_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,019 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HDFS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,019 INFO [ambari-client-thread-27] AbstractResourceProvider:544 - Operations cannot be applied to component NAMENODE because service HDFS is in the maintenance state of ON
04 Sep 2017 15:46:44,019 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=AMBARI_METRICS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,020 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: METRICS_MONITOR, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,020 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=SPARK2, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,020 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: SPARK2_THRIFTSERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,020 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=STORM, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,020 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: DRPC_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,021 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=FLUME, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,021 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: FLUME_HANDLER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,021 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=SPARK2, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,021 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: LIVY2_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,022 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=LOGSEARCH, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,022 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: LOGSEARCH_LOGFEEDER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,022 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=KAFKA, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,022 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: KAFKA_BROKER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,023 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=MAPREDUCE2, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,023 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: HISTORYSERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,023 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=ZOOKEEPER, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,023 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: ZOOKEEPER_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,023 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HIVE, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,024 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: WEBHCAT_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,024 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HDFS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,024 INFO [ambari-client-thread-27] AbstractResourceProvider:544 - Operations cannot be applied to component SECONDARY_NAMENODE because service HDFS is in the maintenance state of ON
04 Sep 2017 15:46:44,024 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=FALCON, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,025 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: FALCON_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,025 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=STORM, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,025 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: STORM_UI_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,025 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=STORM, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,026 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: SUPERVISOR, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,026 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=AMBARI_METRICS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,026 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: METRICS_GRAFANA, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,026 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=YARN, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,027 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: APP_TIMELINE_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,027 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=SPARK, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,027 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: SPARK_JOBHISTORYSERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,028 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=SPARK2, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,029 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: SPARK2_JOBHISTORYSERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,030 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HBASE, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,031 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: HBASE_MASTER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,031 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=RANGER, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,031 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: RANGER_ADMIN, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,032 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=AMBARI_INFRA, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,032 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: INFRA_SOLR, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,032 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HIVE, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,032 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: HIVE_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,033 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=SPARK, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,033 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: LIVY_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,033 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=KNOX, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,033 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: KNOX_GATEWAY, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,034 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=ATLAS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,034 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: ATLAS_SERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,034 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=ZEPPELIN, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,049 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: ZEPPELIN_MASTER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,049 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=RANGER, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,050 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: RANGER_TAGSYNC, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,050 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=STORM, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,050 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: NIMBUS, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,050 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=HBASE, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,050 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: HBASE_REGIONSERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,051 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=SPARK, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,051 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: SPARK_THRIFTSERVER, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,051 INFO [ambari-client-thread-27] AbstractResourceProvider:516 - Received a updateComponent request: [clusterName=Sandbox, serviceName=AMBARI_METRICS, componentName=Sandbox, desiredState=null, recoveryEnabled=true, componentCategory=null]
04 Sep 2017 15:46:44,051 INFO [ambari-client-thread-27] AbstractResourceProvider:559 - Component: METRICS_COLLECTOR, oldRecoveryEnabled: false, newRecoveryEnabled true
04 Sep 2017 15:46:44,161 INFO [ambari-client-thread-27] AmbariManagementControllerImpl:2597 - Created 0 stages
04 Sep 2017 15:46:44,683 INFO [qtp-ambari-agent-39] HeartBeatHandler:289 - Recovery configuration set to RecoveryConfig{, type=AUTO_START, maxCount=6, windowInMinutes=60, retryGap=5, maxLifetimeCount=1024, components=INFRA_SOLR,INFRA_SOLR_CLIENT,METRICS_MONITOR,METRICS_COLLECTOR,METRICS_GRAFANA,ATLAS_CLIENT,ATLAS_SERVER,FALCON_CLIENT,FALCON_SERVER,FLUME_HANDLER,HBASE_CLIENT,HBASE_MASTER,HBASE_REGIONSERVER,HIVE_METASTORE,HIVE_SERVER,HIVE_CLIENT,WEBHCAT_SERVER,KAFKA_BROKER,KNOX_GATEWAY,LOGSEARCH_LOGFEEDER,LOGSEARCH_SERVER,HISTORYSERVER,MAPREDUCE2_CLIENT,OOZIE_SERVER,OOZIE_CLIENT,PIG,RANGER_ADMIN,RANGER_TAGSYNC,RANGER_USERSYNC,SLIDER,SPARK_CLIENT,SPARK_JOBHISTORYSERVER,LIVY_SERVER,LIVY2_SERVER,SPARK2_THRIFTSERVER,SPARK2_CLIENT,SPARK2_JOBHISTORYSERVER,SQOOP,SUPERVISOR,NIMBUS,DRPC_SERVER,STORM_UI_SERVER,TEZ_CLIENT,NODEMANAGER,YARN_CLIENT,APP_TIMELINE_SERVER,RESOURCEMANAGER,ZEPPELIN_MASTER,ZOOKEEPER_SERVER,ZOOKEEPER_CLIENT, recoveryTimestamp=1504540004674} .
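To verify the result (a sketch, again assuming the Sandbox cluster and admin:admin credentials), the same components endpoint can be queried with a GET to list the current recovery_enabled flag for every component:
# curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://localhost:8080/api/v1/clusters/Sandbox/components?fields=ServiceComponentInfo/recovery_enabled"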
09-04-2017
03:32 PM
@John Wright If only Ambari is not able to get the repomd.xml, it might be sending the request via some proxy. It is better to check whether Ambari is mistakenly using a proxy. Usually the proxyHost/proxyPort information for Ambari is defined inside the "/var/lib/ambari-server/ambari-env.sh" file, so it is worth checking whether someone has added a proxy setting there by mistake. # grep 'proxy' /var/lib/ambari-server/ambari-env.sh . https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-administration/content/ch_setting_up_an_internet_proxy_server_for_ambari.html Also # grep 'proxy' /etc/yum.conf
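If a proxy genuinely is required, the linked documentation describes passing the proxy settings to Ambari through AMBARI_JVM_ARGS in "/var/lib/ambari-server/ambari-env.sh"; a minimal sketch (the proxy host and port are placeholders for your environment):
export AMBARI_JVM_ARGS="$AMBARI_JVM_ARGS -Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=3128"
Then restart the Ambari server: # ambari-server restart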
09-04-2017
03:20 PM
@John Wright It looks like the HWX repo is not accessible from the Ambari server: http://public-repo-1.hortonworks.com/HDP/centos7-ppc/2.x/updates/2.6.2.0/repodata/repomd.xml: [Errno 14] curl#7 - "Failed connect to public-repo-1.hortonworks.com:80; Operation now in progress" . That can happen if the Ambari server host is configured to pass requests to external resources via some proxy server. So can you please check if you have any proxy configured, or if the request needs to pass through a proxy on your network? Please check the following files for proxy configurations: # grep 'proxy' /etc/yum.conf
# grep 'proxy' ~/.bash_profile
# grep 'proxy' ~/.profile
# grep 'proxy' /var/lib/ambari-server/ambari-env.sh
. (for example, a line like: export http_proxy=http://localhost:80 ) . Are you able to do a "wget" or "curl" from the Ambari server host to the repo? wget http://public-repo-1.hortonworks.com/HDP/centos7-ppc/2.x/updates/2.6.2.0/repodata/repomd.xml .
09-04-2017
09:26 AM
@Kishore Kumar Please make sure that all the cluster nodes have the correct FQDN configured, and that all the hosts return the correct FQDN (hostname) when you run the following command: # hostname -f . Also try starting the NameNode manually to see if it works (a sketch of the manual start command follows), as mentioned here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_administration/content/starting_hdp_services.html .
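A sketch of the manual NameNode start (the hdfs user and HDP paths below are the usual defaults but may vary with your HDP version; the linked doc has the exact commands for your release):
# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode"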
09-04-2017
08:13 AM
@Rajendra Manjunath You are getting the error because of the following Cause: Caused by: java.io.IOException: No FileSystem for scheme: webhdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92) . So please check whether you have the property "dfs.webhdfs.enabled" set to "true" in your hdfs-site.xml, or whether it is currently set to "false" as shown below: <property>
<name>dfs.webhdfs.enabled</name>
<value>false</value>
<final>true</final>
</property> Also, can you please check whether "webhdfs" is working fine at your end or not? Syntax: curl "http://$NAMENODE_HOSTNAME:50070/webhdfs/v1/user?op=LISTSTATUS" Example: curl "http://erie1.example.com:50070/webhdfs/v1/user?op=LISTSTATUS" . Also please let us know which version of Ambari you are using.
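A quick way (a sketch; run it on a node that has the HDFS client configs deployed) to confirm the effective value of that property is:
# hdfs getconf -confKey dfs.webhdfs.enabled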
09-02-2017
07:19 PM
22 Kudos
Ambari is the heart of any HDP cluster. It provides the features of provisioning, managing, monitoring and securing Hadoop / HDP clusters. It is a Java program which interacts with a database to read the cluster details and runs on an embedded Jetty server. Many times we find issues with Ambari server performance.
Ambari UI operations sometimes respond slowly, or startup might take a long time, if the server is not properly tuned. So, in order to troubleshoot Ambari server performance related issues, we should look at some data/stats and tuning parameters to make the Ambari server perform better. In this article we will talk about some very basic tuning parameters and performance related troubleshooting.
What information is needed?
When we notice that the Ambari server is responding slowly, we should first look at the following details:
1). The number of hosts added to the Ambari cluster, so that we can tune the Ambari agent thread pools accordingly.
2). The number of concurrent users (or view users) who access the Ambari server at a time, so that we can tune the Ambari thread pools accordingly.
3). The age of the Ambari cluster. If the cluster is very old, it is possible that operational logs and alert histories are consuming a large amount of the database, which might be causing Ambari DB queries to respond slowly.
4). The Ambari database health and its geographic location relative to the Ambari server, to isolate any network delays.
5). Ambari server memory related tuning parameters, to see if the Ambari heap is set correctly.
6). For Ambari UI slowness, we should check for network proxy issues, i.e. whether any network proxies sit between the client and the Ambari server machine, or for general network slowness.
7). Whether the Ambari users are synced with AD or an external LDAP, and whether the communication between the server and the AD/LDAP is good.
8). The resource availability on the Ambari host, such as the available free memory, and whether any other service/component running on the Ambari server host is consuming excessive Memory/CPU/IO.
.
How to Troubleshoot?
Usually we start by checking the Ambari server memory settings, the host level resource availability (like Memory/CPU/IO) and the thread dumps, to see where threads are stuck or taking a long time to execute certain API/database calls.
.
Check-1). We will check the ambari-server log to see if there are any repeated warning or error messages.
.
Check-2). First we should check whether the ambari-server host has enough free memory and CPU available, as well as the list of open files (to see if any are leaking) and the netstat output, to find out if there are any CLOSE_WAIT or TIME_WAIT sockets. We can check that by running the following commands on the Ambari server host (a socket-state counting sketch follows the commands).
Example:
# free -m
# top
# lsof -p $AMBARI_PID
# netstat -tnlpa | grep $AMBARI_PID
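To get a quick per-state count of the sockets held by the Ambari process (a sketch; $AMBARI_PID is the Ambari server PID, for example from /var/run/ambari-server/ambari-server.pid):
# netstat -tnpa | grep $AMBARI_PID | awk '{print $6}' | sort | uniq -c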
.
Check-3). If we see that enough free memory and CPU cycles are available, then we can check whether the thread dump shows any stuck/blocked threads, or whether the thread activity is normal.
In order to do that we can collect ambari-server thread dumps. We can refer to the following article to learn how to collect the ambari-server thread dumps. We can use JVM utilities such as "$JAVA_HOME/bin/jcmd" or "$JAVA_HOME/bin/jstack" to do so.
https://community.hortonworks.com/articles/72319/how-to-collect-threaddump-using-jcmd-and-analyse-i.html
It is always recommended to collect at least 5-6 thread dumps, with an interval of around 10 seconds between dumps. This gives us a detailed idea about the thread activity over a period of time. The thread dumps should be collected while we are seeing the slow response from the Ambari server, otherwise they will only show normal behavior.
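A small sketch of collecting 6 thread dumps 10 seconds apart (the JDK path is only an example; $AMBARI_PID is the Ambari server PID):
# for i in 1 2 3 4 5 6; do /usr/jdk64/jdk1.8.0_112/bin/jstack -l $AMBARI_PID > /tmp/ambari_threaddump_$i.txt; sleep 10; done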
.
Check-4). Sometimes we may encounter an OutOfMemoryError in the ambari-server log, as following, which indicates that the Ambari server heap size is not tuned properly or needs to be increased:
Exception in thread "qtp-ambari-agent-91" java.lang.OutOfMemoryError: Java heap space
There are some recommendations available for ambari server heap tuning based on the cluster size as part of the doc that can be used for heap tuning: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-administration/content/ch_tuning_ambari_performance.html
.
We should also check the current memory utilization statistics of the ambari server. We can use the JVM utility "jmap" for the same.
Example:
/usr/jdk64/jdk1.8.0_112/bin/jmap -heap $AMBARI_SERVER_PID
Output:
# /usr/jdk64/jdk1.8.0_112/bin/jmap -heap `cat /var/run/ambari-server/ambari-server.pid`
Attaching to process ID 673, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.112-b15
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 2147483648 (2048.0MB)
NewSize = 134217728 (128.0MB)
MaxNewSize = 536870912 (512.0MB)
OldSize = 402653184 (384.0MB)
NewRatio = 3
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
New Generation (Eden + 1 Survivor Space):
capacity = 120848384 (115.25MB)
used = 78420056 (74.78719329833984MB)
free = 42428328 (40.462806701660156MB)
64.89127401157471% used
Eden Space:
capacity = 107479040 (102.5MB)
used = 72431960 (69.07649993896484MB)
free = 35047080 (33.423500061035156MB)
67.39170725752668% used
From Space:
capacity = 13369344 (12.75MB)
used = 5988096 (5.710693359375MB)
free = 7381248 (7.039306640625MB)
44.7897518382353% used
To Space:
capacity = 13369344 (12.75MB)
used = 0 (0.0MB)
free = 13369344 (12.75MB)
0.0% used
concurrent mark-sweep generation:
capacity = 402653184 (384.0MB)
used = 87617376 (83.55844116210938MB)
free = 315035808 (300.4415588378906MB)
21.760010719299316% used
37359 interned Strings occupying 3641736 bytes.
.
If the used heap is high and reaching the max heap, then we can try increasing the ambari-server memory by editing the "/var/lib/ambari-server/ambari-env.sh" file and increasing the heap memory (e.g. -Xmx4g) inside the property "AMBARI_JVM_ARGS", something as following:
# grep 'AMBARI_JVM_ARGS' /var/lib/ambari-server/ambari-env.sh
export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms4g -Xmx4g -XX:MaxPermSize=128m -Djava.security.auth.login.config=$ROOT/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false'
.
Check-5). If we want to monitor heap and garbage collection details over a period of time, then we can also enable garbage collection logging for the Ambari server by adding the GC log options in the "ambari-env.sh" file as following:
# grep 'AMBARI_JVM_ARGS' /var/lib/ambari-server/ambari-env.sh
export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms512m -Xmx2048m -XX:MaxPermSize=128m -Djava.security.auth.login.config=$ROOT/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Xloggc:/var/log/ambari-server/ambari-server_gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps'
.
.
Ambari JVM/Database Monitoring using Grafana
Check-6). From Ambari 2.5 onward, we can also check the Ambari performance statistics related to the Ambari JVM and database. For more information on this please refer to: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-operations/content/grafana_ambari_component_dashboards.html
http://$GRAFANA_HOST:3000/dashboard/db/ambari-server-jvm
http://$GRAFANA_HOST:3000/dashboard/db/ambari-server-database
.
If the ambari server metrics are not enabled then we can enable it. To enable Ambari Server metrics, make sure the following config file exists during Ambari Server start/restart - "/etc/ambari-server/conf/metrics.properties".
Currently, only 2 metric sources have been implemented - JVM Metric Source and Database Metric Source. To add / remove a metric source to be tracked the following config needs to be modified in the metrics.properties file.
metric.sources=jvm,database
Example:
# grep 'metric.sources' /etc/ambari-server/conf/metrics.properties
metric.sources=jvm,database
.
NOTE: Please do not forget to add the following line inside the "ambari.properties" file.
# grep 'profiler' /etc/ambari-server/conf/ambari.properties
server.persistence.properties.eclipselink.profiler=org.apache.ambari.server.metrics.system.impl.AmbariPerformanceMonitor
.
.
Ambari Thread Pool Tuning
Check-7). If the cluster size is large then we should also tune the "agent.threadpool.size.max" property inside the "/etc/ambari-server/conf/ambari.properties" file.
"agent.threadpool.size.max" : this property sets the max number of threads used to process heartbeats from Ambari agents. The default value for this property is "25". It basically indicates the size of the Jetty connection pool used for handling incoming Ambari Agent requests.
# grep 'agent.threadpool.size.max' /etc/ambari-server/conf/ambari.properties
agent.threadpool.size.max=50
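To apply a changed value (a sketch), edit the property in ambari.properties and restart the Ambari server:
# vi /etc/ambari-server/conf/ambari.properties (set agent.threadpool.size.max=50)
# ambari-server restart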
.
.
Check-8). If inside our Ambari server we have some views (like the Hive/File View, etc.) which are accessed by many concurrent users, or if many users access the Ambari UI concurrently or make Ambari REST API calls, then in such cases we should also increase the "client.threadpool.size.max" property value (default value is 25) inside "/etc/ambari-server/conf/ambari.properties".
"client.threadpool.size.max" : The size of the Jetty connection pool used for handling incoming REST API requests. This should be large enough to handle requests from both web browsers and embedded Views.
# grep 'client.threadpool.size.max' /etc/ambari-server/conf/ambari.properties
client.threadpool.size.max=100
If the client thread pool size is not set properly then while accessing ambari UI or making Ambari API calls we might see the following kind of response:
{
status: 503,
message: "There are no available threads to handle view requests"
}
.
.
Ambari Connection Pool Tuning
Check-9). We can also add the following properties to adjust the JDBC connection pool settings for large clusters like above 100 nodes or based on need:
server.jdbc.connection-pool.acquisition-size=5
server.jdbc.connection-pool.max-age=0
server.jdbc.connection-pool.max-idle-time=14400
server.jdbc.connection-pool.max-idle-time-excess=0
server.jdbc.connection-pool.idle-test-interval=7200
- If using MySQL as the Ambari database, in your MySQL configuration increase wait_timeout and interactive_timeout to 8 hours (28800) and the max connections from 32 to 128 (a sketch of the corresponding my.cnf settings follows these notes).
- It is critical that the Ambari configuration for "server.jdbc.connection-pool.max-idle-time" and "server.jdbc.connection-pool.idle-test-interval" is lower than the MySQL "wait_timeout" and "interactive_timeout" set on the MySQL side. If you choose to decrease these timeout values, adjust "server.jdbc.connection-pool.max-idle-time" and "server.jdbc.connection-pool.idle-test-interval" down accordingly in the Ambari configuration so that they remain less than "wait_timeout" and "interactive_timeout".
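A minimal sketch of the corresponding MySQL settings (the my.cnf location and your existing values may differ; restart MySQL after changing them):
[mysqld]
wait_timeout=28800
interactive_timeout=28800
max_connections=128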
.
.
Ambari Cache Tuning
Check-10). If the cluster size is more than 200 nodes then tuning the cache sometimes helps. For that we calculate the new, larger cache size using the following relationship, where <cluster_size> is the number of nodes in the cluster:
ecCacheSizeValue=60*<cluster_size>
To apply that value, on the Ambari Server host, in /etc/ambari-server/conf/ambari.properties, add the following property and value. For example, if the cluster has 500 nodes then we can set it to 60*500=30000:
server.ecCacheSize=30000
.
.
Ambari Alert Related Tuning
Check-11). Setting "alerts.cache.enabled": if the value of this property is set to "true", then alerts processed by the "AlertReceivedListener" will not write alert data to the database on every event. Instead, data like timestamps and text will be kept in a cache and flushed to the database periodically. The default value is "false". Alert caching was experimental around the Ambari 2.2.2 version.
We can enable the alerts cache and then monitor it for a few days to see its effect. We will need to add this parameter to "/etc/ambari-server/conf/ambari.properties". Some other properties related to alert caching & the alert execution scheduler are as following.
Example:
alerts.cache.enabled=true
alerts.cache.size=100000
alerts.execution.scheduler.threadpool.size.core=4
alerts.execution.scheduler.threadpool.size.max=8
"alerts.cache.size" defines the size of the alert cache, which is by default set to "50000" when alerts.cache.enabled is turned on.
"alerts.execution.scheduler.threadpool.size.core" defines the core number of threads used to process incoming alert events. The value should be increased as the size of the cluster increases.
"alerts.execution.scheduler.threadpool.size.max" defines the maximum number of threads which will handle published alert events. The default value is "2".
.
.
Ambari API Response Time Check
Check-12). During Ambari slowness we can try running the following curl calls (which fetch the cluster details) to see how long it takes to get the cluster details. This gives us some idea of whether the cluster JSON response is slow to generate or too large.
# time curl -i -u admin:admin -H 'X-Requested-By: ambari' -X GET http://amb25101.example.com:8080/api/v1/clusters/plain_cluster
real 0m20.234s
user 0m0.009s
sys 0m0.017s
# time curl -i -u admin:admin -H 'X-Requested-By: ambari' -X GET http://amb25101.example.com:8080/api/v1/clusters/plain_cluster?fields=Clusters/desired_configs
# time curl -i -u admin:admin -H 'X-Requested-By: ambari' -X GET http://amb25101.example.com:8080/api/v1/clusters/plain_cluster?fields=Clusters/health_report,Clusters/total_hosts,alerts_summary_hosts
"user" means user-space: the number of CPU seconds spent doing work in the JVM code. User is the amount of CPU time spent in user-mode code (outside the kernel) within the process. This is only the actual CPU time used in executing the process; other processes, and time the process spends blocked, do not count towards this figure.
"sys" means kernel-space: the number of CPU seconds spent doing work in the kernel. Sys is the amount of CPU time spent in the kernel within the process, i.e. CPU time spent in system calls, as opposed to library code, which still runs in user-space. Like 'user', this is only CPU time used by the process.
"real" means "wall clock" time. This is all elapsed time, including time slices used by other processes and time the process spends blocked (for example while waiting for I/O to complete).
For example, ["user=3.00 sys=0.05 real=1.00"] means there was
>>> 50ms of kernel work,
>>> 3s of JVM work, and
>>> overall it took 1 second of wall clock time.
.
.
Ambari Database Query Logging
Check-13). In some cases it is useful to enable database query logging to find out how the queries are getting executed and how many times each query is executed.
We can set the "server.jdbc.properties.loglevel=2" property inside the "/etc/ambari-server/conf/ambari.properties" file and restart the Ambari server, which will start writing the JDBC queries to the "/var/log/ambari-server/ambari-server.out" file.
# grep 'server.jdbc.properties.loglevel' /etc/ambari-server/conf/ambari.properties
server.jdbc.properties.loglevel=2
.
Example output of logged queries from ambari-server.out
# grep 'SELECT alert_' ambari-server.out
16:17:19.432 (3) FE=> Parse(stmt=null,query="SELECT alert_id, alert_definition_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name FROM alert_history WHERE (alert_id = $1)",oids={20})
16:17:19.439 (6) FE=> Parse(stmt=null,query="SELECT alert_id, alert_definition_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name FROM alert_history WHERE (alert_id = $1)",oids={20})
16:26:38.424 (3) FE=> Parse(stmt=null,query="SELECT t1.alert_id AS a1, t1.definition_id AS a2, t1.firmness AS a3, t1.history_id AS a4, t1.latest_text AS a5, t1.latest_timestamp AS a6, t1.maintenance_state AS a7, t1.occurrences AS a8, t1.original_timestamp AS a9 FROM alert_history t0, alert_definition t2, alert_current t1 WHERE ((((t0.cluster_id = $1) AND (t2.definition_name = $2)) AND (t0.host_name = $3)) AND ((t0.alert_id = t1.history_id) AND (t2.definition_id = t0.alert_definition_id))) LIMIT $4 OFFSET $5",oids={20,1043,1043,23,23})
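To get a rough idea of which tables are queried most often, the logged statements can be aggregated; a sketch (the grep pattern is only an example and assumes lowercase table names as in the output above):
# grep -o 'FROM [a-z_]*' /var/log/ambari-server/ambari-server.out | sort | uniq -c | sort -rn | head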
.
.
Ambari Database Query/Performance Monitor
Check-14). In some cases it is also useful to enable the "QueryMonitor" and "PerformanceMonitor" statistics. The "QueryMonitor" is used to measure query executions and cache hits, which can be useful for performance analysis in a complex system. For batch writing, the batch-writing.size value is the number of statements to batch (default: 100).
Instead of "QueryMonitor" we can also use the native EclipseLink "PerformanceMonitor" to count how many queries are actually hitting the DB. The performance monitor and query monitor can be enabled in Ambari through "/etc/ambari-server/conf/ambari.properties" using the below properties:
Example:
server.persistence.properties.eclipselink.profiler=PerformanceMonitor
server.persistence.properties.eclipselink.jdbc.batch-writing.size=25
server.persistence.properties.eclipselink.profiler=QueryMonitor
In order to know more about how to use them properly, we can refer to the following article: https://community.hortonworks.com/articles/73269/how-to-analyze-the-ambari-servers-db-activity-perf.html
.
.
Ambari Database Cleanup / Purge
Check-15). In some old clusters we see that there are lots of old "alert_history" or alert notification data entries present in the database, which causes slowness, as over time these entries grow considerably in the database. So the DB dump size also grows and the DB queries can respond slowly. We can use the following command to perform some DB cleanup.
# ambari-server db-cleanup -d 2016-09-30 --cluster-name=MyCluster
For more details on this refer to: https://community.hortonworks.com/articles/134958/ambari-database-cleanup-speed-up.html
https://issues.apache.org/jira/browse/AMBARI-20687
The db-cleanup works well from Ambari 2.5.0/2.5.1 (in Ambari 2.4 there were some issues reported).
.
From Ambari 2.5.2 Onwards: From Ambari 2.5.2 onwards the name of this operation is changed to "db-purge-history", and apart from the alert related tables it also considers other tables like host_role_command and execution_commands, among others.
# ambari-server db-purge-history --cluster-name Prod --from-date 2017-08-01
See: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-administration/content/purging-ambari-server-history.html
The "db-purge-history" command will analyze the following tables in the Ambari Server database and remove those rows that can be deleted and whose create date is older than the --from-date specified when the command is run (a row-count check sketch follows the table list below).
.
AlertCurrent
AlertNotice
ExecutionCommand
HostRoleCommand
Request
RequestOperationLevel
RequestResourceFilter
RoleSuccessCriteria
Stage
TopologyHostRequest
TopologyHostTask
TopologyLogicalTask
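Before and after purging, it can be useful to compare the row counts of the largest history tables; a sketch for a MySQL-backed Ambari database (the database name, user and exact table names may differ in your setup):
# mysql -u ambari -p ambari -e "SELECT COUNT(*) FROM alert_history; SELECT COUNT(*) FROM host_role_command; SELECT COUNT(*) FROM execution_command;"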
.
.
08-31-2017
12:21 PM
1 Kudo
@Anurag Mishra Yes, the "Download All Client Configs" feature was introduced in Ambari 2.5. In Ambari 2.4 and previous versions you will need to go to the individual service and then, from the "Service Actions" dropdown, choose "Download Client Configs". Like Ambari UI --> HDFS --> Service Actions --> "Download Client Configs".
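Alternatively (a sketch, assuming a cluster named Sandbox and the HDFS_CLIENT component; it uses the same format=client_config_tar parameter shown in the next reply), a single service's client configs can also be pulled through the API:
# curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://localhost:8080/api/v1/clusters/Sandbox/services/HDFS/components/HDFS_CLIENT?format=client_config_tar" -o /tmp/hdfs_client_configs.tar.gz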
08-31-2017
08:48 AM
@Anurag Mishra Please run the following command once again: # curl -u admin:admin -H "X-Requested-By: ambari" -X GET http://localhost:8080/api/v1/clusters/Sandbox/components?format=client_config_tar -o /tmp/All_Configs/new_cluster_configs.tar.gz Then check : # file /tmp/All_Configs/new_cluster_configs.tar.gz .