Member since 12-01-2016 | 25 Posts | 1 Kudos Received | 1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1751 | 03-26-2017 03:54 AM |
08-04-2017 12:38 PM
@Jay SenSharma Thanks for getting back on this. The Ambari Agent details are below:
]$ ambari-agent --version
2.2.1.0
]$ rpm -qa|grep ambari-agent
ambari-agent-2.2.1.0-161.x86_64
It does seem that the issue described in the Jira is relevant to the one that occurred. So far this issue has occurred only once, but migrating looks like a good option to avoid it in future. Also, I had indicated that NameNode CPU WIO was N/A; after a few hours I am able to see the metric on the Dashboard.
12-13-2017 02:57 PM
@mqureshi Do you think that switching to G1GC would help in this scenario?
06-29-2017 11:21 PM
Hi @ssathish, I looked at the link you posted and decided to delete the file.
CAUTION:
A few hours later, for reasons we could not determine, there were inconsistencies in the cluster. One of the data nodes (D5), where the cleanup was done, had corruption in the way containers were processed. Some jobs for which containers were launched on D5 ran to completion successfully, while other jobs failed with a "Vertex failed" error. We could not find any errors in the RM log, DataNode log, or NodeManager log. We had to remove D5 from the cluster and reinstall the NodeManager to set things right.
05-29-2017 06:17 AM
@mqureshi
The cluster currently has only one active NameNode.
Is there a better way to find out the 'Active Node'?
I used the following as well, but it does not distinguish:
dh01 ~]$ curl --user admin:admin http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/host_components?HostRoles/component_name=NAMENODE&metrics/dfs/FSNamesystem/HAState=active
[1] 16533
-bash: metrics/dfs/FSNamesystem/HAState=active: No such file or directory
[ayguha@dh01 ~]$ {
  "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/host_components?HostRoles/component_name=NAMENODE",
  "items" : [
    {
      "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh01.int.belong.com.au/host_components/NAMENODE",
      "HostRoles" : {
        "cluster_name" : "belong1",
        "component_name" : "NAMENODE",
        "host_name" : "dh01.int.belong.com.au"
      },
      "host" : {
        "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh01.int.belong.com.au"
      }
    },
    {
      "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh02.int.belong.com.au/host_components/NAMENODE",
      "HostRoles" : {
        "cluster_name" : "belong1",
        "component_name" : "NAMENODE",
        "host_name" : "dh02.int.belong.com.au"
      },
      "host" : {
        "href" : "http://dh01.int.belong.com.au:8080/api/v1/clusters/belong1/hosts/dh02.int.belong.com.au"
      }
    }
  ]
}
Also hdfs-site.xml does not have the property dfs.namenode.rpc-address.
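A note on the transcript above: the "[1] 16533" and "No such file or directory" lines suggest the shell treated the unquoted "&" in the URL as its background operator, so curl only received the part of the URL before it and the HAState predicate was never sent. A minimal sketch of a quoted form follows, assuming the same host, port, cluster name and credentials as in the post; the function name active_namenode_url is made up for illustration. Whether the HAState predicate then filters as intended depends on the Ambari version and on the metric being available, so this only addresses the quoting.

```shell
# Assumption: values copied from the post; adjust for your own cluster.
AMBARI_URL='http://dh01.int.belong.com.au:8080'
CLUSTER='belong1'

# Build the query URL. The single-quoted pieces keep '?' and '&' literal,
# so the shell passes the whole string through as one word.
active_namenode_url() {
  printf '%s/api/v1/clusters/%s/host_components?%s&%s\n' \
    "$AMBARI_URL" "$CLUSTER" \
    'HostRoles/component_name=NAMENODE' \
    'metrics/dfs/FSNamesystem/HAState=active'
}

# Show the URL that would be requested.
active_namenode_url
```

To send the request, pass the result to curl inside double quotes: curl --user admin:admin "$(active_namenode_url)"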
03-26-2017 03:54 AM
1 Kudo
Hi @Jay SenSharma,
Thanks for your input.
I removed the other repo files so that only "ambari.repo", "HDP.repo" and "HDP-UTILS.repo" remain, and ran the following:
sudo python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py --silent --skip=users
yum clean all
ls -ltr /etc/yum.repos.d/
But I still could not get around the following error that was there previously:
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install ambari-metrics-monitor' returned 1. ERROR with rpm_check_debug vs depsolve:
libkadm5clnt_mit.so.8()(64bit) is needed by krb5-workstation-1.10.3-65.el6.x86_64
libkadm5clnt_mit.so.8(kadm5clnt_mit_8_MIT)(64bit) is needed by krb5-workstation-1.10.3-65.el6.x86_64
libkadm5srv_mit.so.8()(64bit) is needed by krb5-workstation-1.10.3-65.el6.x86_64
libkadm5srv_mit.so.8(kadm5srv_mit_8_MIT)(64bit) is needed by krb5-workstation-1.10.3-65.el6.x86_64
You could try running: rpm -Va --nofiles --nodigest
Your transaction was saved, rerun it with:
yum load-transaction /tmp/yum_save_tx-2017-03-25-06-305zHb5j.yumtx
After some googling I did the following on all hosts, then tried the Cluster Install Wizard from Ambari again for the Metrics Collector and ZooKeeper installation on all EC2 nodes.
It worked !! 🙂
[ec2-user@ip-172-31-5-78 ~]$ yum install libkadm5
Loaded plugins: amazon-id, rhui-lb, security
Repo rhui-REGION-client-config-server-6 forced skip_if_unavailable=True due to: /etc/pki/rhui/cdn.redhat.com-chain.crt
Repo rhui-REGION-client-config-server-6 forced skip_if_unavailable=True due to: /etc/pki/rhui/product/rhui-client-config-server-6.crt
Repo rhui-REGION-client-config-server-6 forced skip_if_unavailable=True due to: /etc/pki/rhui/rhui-client-config-server-6.key
Repo rhui-REGION-rhel-server-releases forced skip_if_unavailable=True due to: /etc/pki/rhui/cdn.redhat.com-chain.crt
Repo rhui-REGION-rhel-server-releases forced skip_if_unavailable=True due to: /etc/pki/rhui/product/content-rhel6.crt
Repo rhui-REGION-rhel-server-releases forced skip_if_unavailable=True due to: /etc/pki/rhui/content-rhel6.key
Repo rhui-REGION-rhel-server-releases-optional forced skip_if_unavailable=True due to: /etc/pki/rhui/cdn.redhat.com-chain.crt
Repo rhui-REGION-rhel-server-releases-optional forced skip_if_unavailable=True due to: /etc/pki/rhui/product/content-rhel6.crt
Repo rhui-REGION-rhel-server-releases-optional forced skip_if_unavailable=True due to: /etc/pki/rhui/content-rhel6.key
Repo rhui-REGION-rhel-server-rh-common forced skip_if_unavailable=True due to: /etc/pki/rhui/cdn.redhat.com-chain.crt
Repo rhui-REGION-rhel-server-rh-common forced skip_if_unavailable=True due to: /etc/pki/rhui/product/content-rhel6.crt
Repo rhui-REGION-rhel-server-rh-common forced skip_if_unavailable=True due to: /etc/pki/rhui/content-rhel6.key
You need to be root to perform this command.
[ec2-user@ip-172-31-5-78 ~]$ sudo yum install libkadm5
Loaded plugins: amazon-id, rhui-lb, security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package libkadm5.x86_64 0:1.10.3-65.el6 will be installed
--> Processing Dependency: krb5-libs(x86-64) = 1.10.3-65.el6 for package: libkadm5-1.10.3-65.el6.x86_64
--> Running transaction check
---> Package krb5-libs.x86_64 0:1.10.3-15.el6_5.1 will be updated
--> Processing Dependency: krb5-libs = 1.10.3-15.el6_5.1 for package: krb5-workstation-1.10.3-15.el6_5.1.x86_64
---> Package krb5-libs.x86_64 0:1.10.3-65.el6 will be an update
--> Running transaction check
---> Package krb5-workstation.x86_64 0:1.10.3-15.el6_5.1 will be updated
---> Package krb5-workstation.x86_64 0:1.10.3-65.el6 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
====================================================================================================
 Package             Arch      Version          Repository                          Size
====================================================================================================
Installing:
 libkadm5            x86_64    1.10.3-65.el6    rhui-REGION-rhel-server-releases    143 k
Updating for dependencies:
 krb5-libs           x86_64    1.10.3-65.el6    rhui-REGION-rhel-server-releases    675 k
 krb5-workstation    x86_64    1.10.3-65.el6    rhui-REGION-rhel-server-releases    814 k
Transaction Summary
====================================================================================================
Install 1 Package(s)
Upgrade 2 Package(s)
Total size: 1.6 M
Total download size: 143 k
Is this ok [y/N]: y
Downloading Packages:
libkadm5-1.10.3-65.el6.x86_64.rpm | 143 kB 00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Updating : krb5-libs-1.10.3-65.el6.x86_64 1/5
Installing : libkadm5-1.10.3-65.el6.x86_64 2/5
Updating : krb5-workstation-1.10.3-65.el6.x86_64 3/5
Cleanup : krb5-workstation-1.10.3-15.el6_5.1.x86_64 4/5
Cleanup : krb5-libs-1.10.3-15.el6_5.1.x86_64 5/5
Verifying : krb5-libs-1.10.3-65.el6.x86_64 1/5
Verifying : libkadm5-1.10.3-65.el6.x86_64 2/5
Verifying : krb5-workstation-1.10.3-65.el6.x86_64 3/5
Verifying : krb5-libs-1.10.3-15.el6_5.1.x86_64 4/5
Verifying : krb5-workstation-1.10.3-15.el6_5.1.x86_64 5/5
Installed:
libkadm5.x86_64 0:1.10.3-65.el6
Dependency Updated:
krb5-libs.x86_64 0:1.10.3-65.el6 krb5-workstation.x86_64 0:1.10.3-65.el6
Complete!
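The depsolve error quoted earlier in this post names shared-library capabilities rather than package names, which is why it took some searching to land on libkadm5. A minimal sketch, assuming yum's error text is available on standard input, pulls out the unique capabilities so each one can be looked up; the function name missing_caps is made up for illustration, and the two sample lines are copied from the error in this post.

```shell
# Print each unique capability named in yum "X is needed by Y" lines.
# Reads yum's error output on stdin; lines without that phrase are ignored.
missing_caps() {
  sed -n 's/^[[:space:]]*\(.*\) is needed by .*$/\1/p' | sort -u
}

# Sample input: two of the lines from the error in this post.
printf '%s\n' \
  'libkadm5clnt_mit.so.8()(64bit) is needed by krb5-workstation-1.10.3-65.el6.x86_64' \
  'libkadm5srv_mit.so.8()(64bit) is needed by krb5-workstation-1.10.3-65.el6.x86_64' |
  missing_caps
```

Each printed capability can then be passed, in quotes, to yum provides to see which package supplies it in the enabled repositories. Which package shows up will depend on the repositories configured on the host.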