Member since
09-29-2015
140
Posts
87
Kudos Received
9
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3350 | 07-20-2016 08:09 AM |
 | 1141 | 07-19-2016 07:16 AM |
 | 1813 | 06-30-2016 02:00 AM |
 | 12199 | 05-09-2016 06:00 AM |
 | 3462 | 04-11-2016 05:59 PM |
03-05-2016
09:01 PM
2 Kudos
@mallikarjunarao m As suggested in the document referenced by @Neeraj Sabharwal, it is important to take care of the Ambari Metrics System (AMS) memory requirements as well, in case it is going to be installed on the same system as the Ambari Server. In that case I would recommend planning for at least 16-32 GB for AMS alone, and more than 16 GB if the cluster is going to run HBase.
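As a minimal sketch (not from the original post), you could first check how much memory the host actually has and how much the Ambari Server process is already using before reserving 16-32 GB for AMS; the pid-file path below is the usual Ambari default and is an assumption here:

```bash
# Sketch only: check available memory and current ambari-server footprint.
free -g                                        # total / used / free memory in GB
ps -o rss= -p "$(cat /var/run/ambari-server/ambari-server.pid)" | \
  awk '{printf "ambari-server RSS: %.1f GB\n", $1/1024/1024}'
```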
03-04-2016
05:26 PM
1 Kudo
@Örjan Lundberg Please run the following in your Ambari metadata database: select * from host_version; This will show the hosts where installation has failed and where installation needs to be retried.
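A hedged sketch of narrowing that output down (assuming a MySQL-backed Ambari database named `ambari`, and that hosts needing a retry show a state other than CURRENT or INSTALLED; adjust credentials and state values to your environment):

```bash
# Sketch only: list host_version rows that are not yet healthy.
mysql -u ambari -p ambari -e "
  SELECT hv.host_id, hv.repo_version_id, hv.state
  FROM   host_version hv
  WHERE  hv.state NOT IN ('CURRENT', 'INSTALLED');"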
03-03-2016
04:37 PM
4 Kudos
1. If using Ambari versions before 2.1.2, disable HBase per-region metrics.
On the Ambari Server host, edit the following files under /var/lib/ambari-server/resources/:
common-services/HBASE/0.96.0.2.0/package/templates/HBASE/hadoop-metrics2-hbase.properties-GANGLIA-MASTER.j2
common-services/HBASE/0.96.0.2.0/package/templates/HBASE/hadoop-metrics2-hbase.properties-GANGLIA-RS.j2
and add the following lines at the end, before '{% endif %}':
*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
hbase.*.source.filter.exclude=*Regions*
Then do a rolling restart of the HBase RegionServers.
Note: This does not disable RegionServer metrics. It only disables the per-region / per-table metrics collected at the region level. This is disabled by default from Ambari 2.1.2 onwards.
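A hedged shell sketch of step 1, assuming GNU awk is available, that the two filter lines should go just before the last '{% endif %}' as described above, and using the template paths quoted in this step (the backup suffix is an assumption):

```bash
# Sketch: insert the filter lines before the last '{% endif %}' in each template.
cd /var/lib/ambari-server/resources/common-services/HBASE/0.96.0.2.0/package/templates/HBASE
for f in hadoop-metrics2-hbase.properties-GANGLIA-MASTER.j2 \
         hadoop-metrics2-hbase.properties-GANGLIA-RS.j2; do
  cp "$f" "$f.bak"                              # keep a backup of the original template
  awk '
    { lines[NR] = $0 }
    /% endif %/ { last = NR }                   # remember the last {% endif %} line
    END {
      for (i = 1; i <= NR; i++) {
        if (i == last) {
          print "*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter"
          print "hbase.*.source.filter.exclude=*Regions*"
        }
        print lines[i]
      }
    }' "$f.bak" > "$f"
done
```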
2. Tune AMS configs:
Find out the heap available on the AMS Collector host, and change the following settings based on the available memory:
ams-hbase-env :: hbase_master_heapsize = 8192m (or 16384m if available)
ams-hbase-env :: hbase_master_xmn_size = 1024m
ams-hbase-env :: regionserver_xmn_size = 1024m
ams-hbase-site :: phoenix.query.spoolThresholdBytes = 25165824 (24 MB, up from 12 MB)
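These values can be set from the Ambari UI (Ambari Metrics > Configs). As a hedged alternative sketch, the configs.sh helper bundled with the Ambari Server can set them from the command line; the credentials, host name and cluster name below are placeholders, and you should verify the script's options on your Ambari version:

```bash
# Sketch only: replace admin/admin, ambari.example.com and MyCluster with real values.
CFG=/var/lib/ambari-server/resources/scripts/configs.sh
$CFG -u admin -p admin set ambari.example.com MyCluster ams-hbase-env hbase_master_heapsize 8192m
$CFG -u admin -p admin set ambari.example.com MyCluster ams-hbase-env hbase_master_xmn_size 1024m
$CFG -u admin -p admin set ambari.example.com MyCluster ams-hbase-env regionserver_xmn_size 1024m
$CFG -u admin -p admin set ambari.example.com MyCluster ams-hbase-site phoenix.query.spoolThresholdBytes 25165824
```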
3. AMS data storage
If using embedded mode, change the write paths for ams-hbase-site :: hbase.rootdir and ams-hbase-site :: hbase.tmp.dir so that they are placed on the fastest possible disk. It is also better to keep hbase.tmp.dir in a location different from hbase.rootdir.
After completing the above, stop AMS from Ambari. Once stopped, ensure that the processes are really gone by doing a ps aux | grep ams. If the processes are still around, kill them and clean up the /var/run/ambari-metrics-collector/*.pid files. Then restart the AMS services from Ambari.
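A hedged sketch of that stop / verify / clean-up sequence (the pid-file path is the one from the post; the grep pattern is an assumption):

```bash
# Sketch: after stopping AMS from the Ambari UI, make sure nothing is left behind.
ps aux | grep -i ams | grep -v grep            # should return nothing if AMS is fully stopped
# if a collector/monitor process is still running, kill it by PID, e.g. kill <pid>
rm -f /var/run/ambari-metrics-collector/*.pid  # clear stale pid files
# then start the Ambari Metrics service again from the Ambari UI
```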
03-02-2016
04:05 PM
2 Kudos
@Stefan Kupstaitis-Dunkler
You could query the Ambari DB table "clusters", for example:
mysql> select * from clusters;
+------------+-------------+--------------+--------------+--------------------+---------------+-----------------------+------------------+
| cluster_id | resource_id | cluster_info | cluster_name | provisioning_state | security_type | desired_cluster_state | desired_stack_id |
+------------+-------------+--------------+--------------+--------------------+---------------+-----------------------+------------------+
| 2 | 4 | | Test | INSTALLED | NONE | | 3 |
+------------+-------------+--------------+--------------+--------------------+---------------+-----------------------+------------------+
1 row in set (0.01 sec)
Alternatively, use curl to GET http://<ambari-server>:8080/api/v1/clusters/C1/ and process the JSON output.
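A hedged curl sketch (the admin/admin credentials and the python one-liner are assumptions; the endpoint is the clusters API mentioned above):

```bash
# Sketch: list the clusters known to Ambari and print their cluster_name fields.
curl -s -u admin:admin http://<ambari-server>:8080/api/v1/clusters | \
  python -c 'import json,sys; d=json.load(sys.stdin); print("\n".join(i["Clusters"]["cluster_name"] for i in d["items"]))'
```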
03-02-2016
01:52 PM
3 Kudos
With recent improvements in Ambari, upgrades can be done easily using either Rolling Upgrade or Express Upgrade. But there are times when an upgrade or downgrade gets stuck, either because all the precautions were not followed or due to product issues. When an upgrade gets stuck, it is typically left in an "Upgrade Paused" status, or no status is shown at all. At this point, care needs to be taken to ensure that the ambari-server is not restarted without consulting Technical Support. The current status of the upgrade can be checked using multiple methods:
1. Ambari log files
2. Ambari API URLs
3. Ambari database
1. Ambari log files
Review the Ambari log files for any errors or exceptions while the upgrade is in progress.
2. Use Ambari API URLs to find the failures
http://<ambari-server>:8080/api/v1/clusters/c1/upgrades
For example: http://vcert1.novalocal:8080/api/v1/clusters/VCertify/upgrades
This shows all the upgrade or downgrade attempts. For example, if this is an upgrade failure, identify the latest upgrade attempt number and query:
http://vcert1.novalocal:8080/api/v1/clusters/VCertify/upgrades/119?fields=upgrade_groups/upgrade_items/UpgradeItem/status,upgrade_groups/upgrade_items/UpgradeItem/context,upgrade_groups/UpgradeGroup/title
This lists all the actions taken as part of upgrade attempt 119. Review the output to identify the JSON items whose status is not 'COMPLETED'. This gives a clue as to which items failed to reach COMPLETED status, and you can troubleshoot from there.
3. Ambari database
Note: Care has to be taken while using the Ambari DB. It is mandatory to back up the database before doing any upgrade or downgrade. The following tables in the Ambari database are a good place to start troubleshooting (a hedged query sketch that joins them to spot stragglers follows the table outputs below):
repo_version - contains all the repo versions installed in the system
mysql> select repo_version_id, stack_id, version, display_name from repo_version;
+-----------------+----------+--------------+------------------+
| repo_version_id | stack_id | version | display_name |
+-----------------+----------+--------------+------------------+
| 1 | 4 | 2.3.0.0-2557 | HDP-2.3.0.0-2557 |
| 2 | 4 | 2.3.2.0-2950 | HDP-2.3.2.0-2950 |
| 51 | 4 | 2.3.4.0-3485 | HDP-2.3.4.0 |
+-----------------+----------+--------------+------------------+
3 rows in set (0.00 sec)
cluster_version - contains the current versions in the cluster (installed / current / upgrading, etc.)
mysql> select * from cluster_version;
+----+-----------------+------------+-------------+---------------+---------------+------------+
| id | repo_version_id | cluster_id | state | start_time | end_time | user_name |
+----+-----------------+------------+-------------+---------------+---------------+------------+
| 1 | 1 | 2 | OUT_OF_SYNC | 1448369111902 | 1448369112183 | _anonymous |
| 2 | 2 | 2 | UPGRADING | 1448521029573 | 1452063126443 | admin |
| 51 | 51 | 2 | CURRENT | 1450860003969 | 1451397592558 | admin |
+----+-----------------+------------+-------------+---------------+---------------+------------+
3 rows in set (0.00 sec)
host_version - contains the details about the versions installed on a given host
mysql> select * from host_version;
+----+-----------------+---------+-------------+
| id | repo_version_id | host_id | state |
+----+-----------------+---------+-------------+
| 1 | 1 | 4 | OUT_OF_SYNC |
| 2 | 1 | 1 | OUT_OF_SYNC |
| 3 | 1 | 2 | OUT_OF_SYNC |
| 4 | 1 | 3 | OUT_OF_SYNC |
| 5 | 2 | 1 | UPGRADED |
| 6 | 2 | 3 | UPGRADED |
| 7 | 2 | 2 | UPGRADED |
| 8 | 2 | 4 | OUT_OF_SYNC |
| 51 | 51 | 3 | CURRENT |
| 52 | 51 | 2 | CURRENT |
| 53 | 51 | 4 | CURRENT |
| 54 | 51 | 1 | CURRENT |
+----+-----------------+---------+-------------+
12 rows in set (0.05 sec)
hostcomponentstate - shows the current version / state of a given component or service
mysql> select * from hostcomponentstate;
+-----+------------+------------------------+--------------+------------------+---------------+---------+----------------+---------------+----------------+
| id | cluster_id | component_name | version | current_stack_id | current_state | host_id | service_name | upgrade_state | security_state |
+-----+------------+------------------------+--------------+------------------+---------------+---------+----------------+---------------+----------------+
| 2 | 2 | NAMENODE | 2.3.4.0-3485 | 4 | STARTED | 4 | HDFS | NONE | UNSECURED |
| 3 | 2 | HISTORYSERVER | 2.3.4.0-3485 | 4 | STARTED | 1 | MAPREDUCE2 | NONE | UNSECURED |
| 4 | 2 | APP_TIMELINE_SERVER | 2.3.4.0-3485 | 4 | STARTED | 4 | YARN | NONE | UNSECURED |
| 5 | 2 | RESOURCEMANAGER | 2.3.4.0-3485 | 4 | STARTED | 4 | YARN | NONE | UNSECURED |
| 6 | 2 | WEBHCAT_SERVER | 2.3.4.0-3485 | 4 | INSTALLED | 2 | HIVE | NONE | UNSECURED |
| 8 | 2 | HIVE_SERVER | 2.3.4.0-3485 | 4 | INSTALLED | 2 | HIVE | NONE | UNSECURED |
Reviewing the above tables will give you an idea about the current state of the upgrade or downgrade. Further troubleshooting depends on the current state of the Ambari upgrade or downgrade, but the above should give a fair clue for troubleshooting the issues.
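As the hedged sketch promised above (MySQL syntax, assuming the same schema as the example outputs; the target repo_version_id 51 and version string 2.3.4.0-3485 come from those outputs and should be replaced with your own values):

```bash
# Sketch only: adjust DB credentials/name and the target version identifiers.
mysql -u ambari -p ambari -e "
  -- hosts whose host_version row is not CURRENT for the target repo version
  SELECT hv.host_id, rv.display_name, hv.state
  FROM   host_version hv
  JOIN   repo_version rv ON rv.repo_version_id = hv.repo_version_id
  WHERE  hv.repo_version_id = 51 AND hv.state <> 'CURRENT';

  -- components whose reported version does not match the target version
  SELECT component_name, host_id, version, current_state
  FROM   hostcomponentstate
  WHERE  version <> '2.3.4.0-3485';"
```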
02-29-2016
05:26 PM
How about a hard refresh of the Ambari Server / dashboard?
02-29-2016
05:25 PM
1 Kudo
@Harini Yadav One location to check the components for a given product is, for example: /var/lib/ambari-server/resources/common-services/SPARK/1.4.1.2.3/metainfo.xml. In this file, look for the <name> entries below <components>, for example:
<component>
<name>SPARK_THRIFTSERVER</name>
There is no specific documentation from Hortonworks for this. Generic documentation is available at the Ambari CWiki.
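A hedged sketch of pulling the component names out of such a metainfo.xml without opening it by hand (the grep pattern is an assumption that component names are upper case; the path is the one from the answer):

```bash
# Sketch: list the upper-case <name> entries declared in the SPARK service definition.
grep -o '<name>[A-Z_0-9]*</name>' \
  /var/lib/ambari-server/resources/common-services/SPARK/1.4.1.2.3/metainfo.xml | \
  sed -e 's/<[^>]*>//g' | sort -u
```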
02-29-2016
04:51 PM
1 Kudo
@Jagdish Saripella Which version of HDP are you trying to install using these Ambari versions? For the given version, could you please list /var/lib/ambari-server/resources/stacks/HDP/version/services?
02-29-2016
03:09 PM
1) This is going to be tricky to handle. Ambari primarily depends on the host FQDN for all operations. To change it, the Ambari DB might have to be updated, and an Ambari DB backup needs to be completed before making any changes. IP addresses don't matter. This needs to be tested thoroughly in a test environment.
2) All the port numbers required by HDP need to be left open in iptables. If the Ambari services are still going to run as root, then there would be no change.
One option would be to set up the second cluster afresh and use distcp to copy data from the existing cluster to the new one.
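If the fresh-cluster route is taken, a hedged distcp sketch (the NameNode endpoints and path are placeholders, not values from the original post):

```bash
# Sketch: run from the new cluster; copy data over from the existing cluster.
hadoop distcp \
  hdfs://old-namenode.example.com:8020/apps/mydata \
  hdfs://new-namenode.example.com:8020/apps/mydata
```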
02-19-2016
11:03 PM
10 Kudos
The Ambari Server typically gets to know about service availability from the Ambari Agent, using the '*.pid' files created under /var/run. The following covers a couple of scenarios to troubleshoot.
Scenario 1: Ambari Agent is not communicating properly with the Ambari Server
If all the services are shown as down on a given node, then it is most likely an Ambari Agent issue. The following steps can be used to troubleshoot Ambari Agent issues:
# ambari-agent status
Found ambari-agent PID: 19715
ambari-agent running.
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
Check whether that PID indeed exists by doing a ps -ef. In case the PID doesn't exist, also run ps -ef | grep 'ambari_agent' to see if a stale process is around. For example:
# ps -ef | grep "ambari_agent"
root 18626 13528 0 04:45 pts/0 00:00:00 grep ambari_agent
root 19707 1 0 Feb17 ? 00:00:00 /usr/bin/python2 /usr/lib/python2.6/site-packages/ambari_agent/AmbariAgent.py start
root 19715 19707 1 Feb17 ? 00:28:01 /usr/bin/python2 /usr/lib/python2.6/site-packages/ambari_agent/main.py start
If the agent process ID and /var/run/ambari-agent/ambari-agent.pid match, then there is probably no issue with the agent process itself. In case there is a mismatch, kill all the stray Ambari Agent processes, remove /var/run/ambari-agent/ambari-agent.pid, and restart the agent. Once restarted, verify that the services look healthy in the Ambari dashboard.
At this point, also review /var/log/ambari-agent/ambari-agent.log and ambari-agent.out to see if there have been issues while starting the process itself. One possible issue is /var/lib/ambari-agent/data/structured-out-status.json. Cat this file to review the content. Typical content looks like the following:
cat structured-out-status.json
{"processes": [], "securityState": "UNKNOWN"}
Compare the content with the same file on another node which is working fine. Stop ambari-agent, move this file aside, and restart ambari-agent.
Scenario 2: Ambari Agent is good, but the HDP services are still shown as down
If only a few services are shown as down, then it could be because the /var/run/PRODUCT/product.pid file does not match the process running on the node. For example, if the HiveServer2 service is shown as not up in Ambari while Hive is actually working fine, check the following files:
# cd /var/run/hive
# ls -lrt
-rw-r--r-- 1 hive hadoop 6 Feb 17 07:15 hive.pid
-rw-r--r-- 1 hive hadoop 6 Feb 17 07:16 hive-server.pid
Check the content of these files. For example:
# cat hive-server.pid
31342
# ps -ef | grep 31342
hive 31342 1 0 Feb17 ? 00:14:36 /usr/jdk64/jdk1.7.0_67/bin/java -Xmx1024m -Dhdp.version=2.2.9.0-3393 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.2.9.0-3393 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.2.9.0-3393/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/2.2.9.0-3393/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx1024m -XX:MaxPermSize=512m -Xmx1437m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/2.2.9.0-3393/hive/lib/hive-service-0.14.0.2.2.9.0-3393.jar org.apache.hive.service.server.HiveServer2 --hiveconf hive.aux.jars.path=file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar -hiveconf hive.metastore.uris= -hiveconf hive.log.file=hiveserver2.log -hiveconf hive.log.dir=/var/log/hive
If the content of hive-server.pid and the PID of the running HiveServer2 process don't match, then Ambari won't report the status correctly. Ensure that these files have the correct ownership and permissions. For example, the PID files for Hive should be owned by hive:hadoop with permissions 644. In this situation, correct the ownership and permissions and update the file with the correct PID of the Hive process. This will ensure that Ambari shows the status correctly. Care should be taken while doing the above: ensure that this is the only HiveServer2 process running on the system and that HiveServer2 is indeed working fine. If there are multiple HiveServer2 processes, then some of them could be stray processes which need to be killed. After this, if possible, also restart the affected services and verify that their status is shown correctly.
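A hedged sketch of the scenario 2 check (the pid-file path and the hive:hadoop / 644 expectations are the ones stated above; the process-match test via ps and pgrep is an assumption about a quick way to verify it):

```bash
# Sketch: compare the recorded PID with the running HiveServer2 process.
PIDFILE=/var/run/hive/hive-server.pid
PID=$(cat "$PIDFILE")
if ps -p "$PID" -o args= | grep -q HiveServer2; then
  echo "PID file matches a running HiveServer2 process"
else
  echo "Mismatch: find the real HiveServer2 PID and update $PIDFILE"
  pgrep -f 'org.apache.hive.service.server.HiveServer2'
fi
# make sure ownership and permissions are what Ambari expects
chown hive:hadoop "$PIDFILE" && chmod 644 "$PIDFILE"
```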