About Eric_Periard

Eric_Periard · ‎07-11-2016

Additionally, I configured HA following https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_Ambari_Users_Guide/content/_how_to_configure_namenode_high_availability.html and https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hadoop-ha/content/ha-nn-config-cluster.html. Still no cigar...

Eric_Periard · ‎07-11-2016

So sshfence never seems to have worked for me with failover activated. I enabled sshfence, made an hdfs user on Ambari, generated a key with ssh-keygen for passwordless sessions, and manually tested said passwordless SSH connection... everything seems to have been set, yet it's still not working. Whenever one of my NameNodes failed over in the backend, Ambari falsely reported both as active. So I went with the default and used the following script, as it's the only way I can get my primary NN to stay active and the secondary as standby. Essentially the script pings the NNs every minute, and if they respond it checks their current status and forces them into an active:standby state. Theoretically, having the SNN as active and the NN as standby should be fine; however, Ambari never reports the status correctly, and unless I force-transition the nodes it doesn't report them as active:standby and hdfs://ClusterName fails to work... If someone has a better solution I'd love to hear it. For those wondering, I'm running Ambari 2.2.2.0 and HDP 2.4.2.0 in a CentOS 6 x64 environment. Additionally, the documentation implies creating a user and writing an SSH script to run the fencing approach... what I don't get is the point of running said script if the NN complains that "failover is activated... you cannot manually failover the nodes" or something along those lines. There's something I'm definitely missing. Anyhow, the solution above has been working for me, but it doesn't feel clean, and I'd like to know how the community handles HA and what scripting approach you use...
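For readers following along, here is a minimal sketch of the kind of watchdog described in this post. It is not the poster's actual script, which is not reproduced here. It assumes the service IDs nn1 (preferred active) and nn2 used elsewhere in this thread; the function names and the HAADMIN override are hypothetical. One deliberate substitution: instead of a forced manual transition, it calls hdfs haadmin -failover, which, as far as I understand the Hadoop HA documentation, asks the ZKFCs for a graceful failover when automatic failover is enabled.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; NOT the script from the post.
# HAADMIN can be overridden (e.g. with a stub) for dry runs and testing.
HAADMIN="${HAADMIN:-hdfs haadmin}"

# Print the HA state of one NameNode service ID ("active" or "standby"),
# or "unknown" if the query fails or returns nothing.
nn_state() {
    state=$($HAADMIN -getServiceState "$1" 2>/dev/null) || state=unknown
    echo "${state:-unknown}"
}

# Decide what to do from the states of the preferred and the other NameNode.
# Prints: ok (nothing to do), swap (roles are reversed), skip (anything else).
decide_action() {
    case "$1:$2" in
        active:standby) echo ok ;;
        standby:active) echo swap ;;
        *)              echo skip ;;
    esac
}

# Fail back to the preferred NameNode only when the pair is cleanly reversed.
# Assumption: "-failover <from> <to>" is acceptable here; the post itself
# describes forcing the transition instead.
enforce_preferred() {
    preferred=$1
    other=$2
    action=$(decide_action "$(nn_state "$preferred")" "$(nn_state "$other")")
    case "$action" in
        swap) $HAADMIN -failover "$other" "$preferred" ;;
        ok)   : ;;
        *)    echo "not a clean active/standby pair; leaving it alone" >&2 ;;
    esac
}
```

To use it, one would append a call such as enforce_preferred nn1 nn2 and schedule the script from cron once a minute. Skipping any state other than a clean reversed pair is a design choice: if a NameNode is down or both claim to be active, an automated transition is more likely to hurt than help.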

Eric_Periard · ‎06-29-2016

So in the end, knowing my config was fine, I added stack.upgrade.bypass.prechecks=true to /etc/ambari-server/conf/ambari.properties and chose to disregard the warning. The upgrade went fine and all tests are green. Essentially we went from HDP 2.1 back then to the bleeding edge, and some steps had to be done manually. So somehow after a few tries we succeeded, but most likely left some artifacts behind... I'm still interested to find out where this entry is located and where I could clean it up. Thankfully we're getting professional services soon and building a brand new pro-level cluster with the help of some Hortonworks engineers, so there won't be any weird or unknown configuration choices.
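A small sketch of how that property could be added without creating duplicate lines if the step is repeated. The helper name set_property is hypothetical; only the property name and the file path come from the post. It assumes a plain key=value properties file and that the Ambari server is restarted afterwards so the setting is picked up.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a Java-style properties file.
# Replaces an existing "key=" line in place, or appends one if none exists.
set_property() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # index(...) == 1 matches only lines that start with the literal "key=",
    # so dots in the key are not treated as regex wildcards.
    awk -v k="$key" -v v="$value" '
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For the case in this post, the call would be set_property /etc/ambari-server/conf/ambari.properties stack.upgrade.bypass.prechecks true, ideally after taking a backup copy of the file. Writing back with cat rather than mv keeps the original file's ownership and permissions.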

Eric_Periard · ‎06-29-2016

"Further, change the status of host_role_command with id 1 to COMPLETED" How can this be done manually?

Eric_Periard · ‎06-29-2016

Unfortunately I ran all the queries in the post and I don't see any abnormalities...

Eric_Periard · ‎06-28-2016

then if I run the ambari-server set-current.... command I get the following: ERROR: Exiting with exit code 1. REASON: Error during setting current version. Http status code - 500. { "status" : 500, "message" : "org.apache.ambari.server.controller.spi.SystemException: Finalization failed. More details: \nSTDOUT: Begin finalizing the upgrade of cluster Timbit to version 2.4.2.0-258\n\nSTDERR: The following 516 host component(s) have not been upgraded to version 2.4.2.0-258. Please install and upgrade the Stack Version on those hosts and try again.\nHost components:\nPIG on host dn7.HugeData.lab\nPIG on host dn3.HugeData.lab\nPIG on host dn26.HugeData.lab\nPIG on host dn9.HugeData.lab\nPIG on host dn22.HugeData.lab\nPIG on host dn27.HugeData.lab\nPIG on host dn8.HugeData.lab\nPIG on host dn6.HugeData.lab\nPIG on host dn5.HugeData.lab\nPIG on host dn19.HugeData.lab\nPIG on host snn.HugeData.lab\nPIG on host dn18.HugeData.lab\nPIG on host dn21.HugeData.lab\nPIG on host dn17.HugeData.lab\nPIG on host dn23.HugeData.lab\nPIG on host dn25.HugeData.lab\nPIG on host dn1.HugeData.lab\nPIG on host dn15.HugeData.lab\nPIG on host dn14.HugeData.lab\nPIG on host esn2.HugeData.lab\nPIG on host esn.HugeData.lab\nPIG on host dn16.HugeData.lab\nPIG on host dn10.HugeData.lab\nPIG on host dn2.HugeData.lab\nPIG on host dn4.HugeData.lab\nPIG on host dn12.HugeData.lab\nPIG on host nn.HugeData.lab\nPIG on host dn11.HugeData.lab\nPIG on host dn28.HugeData.lab\nPIG on host dn20.HugeData.lab\nSPARK_JOBHISTORYSERVER on host esn2.HugeData.lab\nSPARK_CLIENT on host dn7.HugeData.lab\nSPARK_CLIENT on host dn3.HugeData.lab\nSPARK_CLIENT on host dn26.HugeData.lab\nSPARK_CLIENT on host dn9.HugeData.lab\nSPARK_CLIENT on host dn22.HugeData.lab\nSPARK_CLIENT on host ...

Eric_Periard · ‎06-28-2016

Finalize upgrade successful for nn.HugeData.lab/x.x.x.40:8020 Finalize upgrade successful for snn.HugeData.lab/x.x.x.41:8020 I then re-run the check and it still fails ://

Eric_Periard · ‎06-28-2016

So I've upgraded Ambari to the bleeding-edge 2.2.2.0 today and was about to roll out HDP 2.4.2.0-258, and I am stumped at the pre-check script after all the HDP-2.4.2.0 packages have been successfully installed across the board. Upgrade to HDP-2.4.2.0 Requirements You must meet these requirements before you can proceed. A previous upgrade did not complete. Reason: Upgrade attempt (id: 1, request id: 2,681, from version: 2.2.6.0-2800, to version: 2.4.0.0-169) did not complete task with id 17,829 since its state is FAILED instead of COMPLETED. Please ensure that you called: ambari-server set-current --cluster-name=$CLUSTERNAME --version-display-name=$VERSION_NAME Further, change the status of host_role_command with id 1 to COMPLETED Failed on: HugeData I ran the command as instructed: ambari-server set-current --cluster-name=HugeData --version-display-name=HDP-2.4.2.0 To no avail... I am stumped at this point and not sure where to look to change that manually in the backend. As far as I am concerned, we had been running 2.4.0.0-169 without any issues (except for the NN failover) for about a month... According to the error above, we missed something in the 2.2.x to 2.4.x upgrade... I'm sure there's a value I can edit to mark it as successful, but I am not sure which one right now. Your input would be much appreciated 🙂

Eric_Periard · ‎06-16-2016

Good to know that I'm not going crazy then.... I have a feeling this is related? https://issues.apache.org/jira/browse/AMBARI-15235 Mentioned as fixed somewhat in 2.2.2? I'm still on 2.2.1.1.

Eric_Periard · ‎06-16-2016

Greetings! So it seems that my configuration is wrong OR Ambari 2.2.2.1 has a refresh issue? Basically I'm running a cluster with a high-availability NameNode configuration. For some unknown reason, when NN fails, SNN becomes the active node as expected and NN goes into standby once the service is restarted. I can confirm the failover is successful by running hdfs haadmin -getServiceState nn1 and hdfs haadmin -getServiceState nn2. From that point nn1 reports Standby and nn2 reports Active, respectively. The funky part, however, is that in Ambari both NameNodes are marked as Active even though the backend failed over, so Ambari should report NN as Standby and SNN as Active. The DFS can still be written to by simply using the typical hdfs dfs -put test.log <path>/test.log. Now, to force Ambari to refresh the status I run the following command: echo N | hdfs haadmin -transitionToStandby --forcemanual nn2 and then essentially nn2 is marked as Standby, nn1 becomes Active, Ambari refreshes to display NN as Active and SNN as Standby, and the world is happy... So from a SysAdmin perspective I can write data to the filesystem and I'm happy and consider that an Ambari bug; however, for my programmer colleague it causes havoc, as he can't write/read/modify the file system from Java/API/hdfs://url. Is this a known issue? Expected behaviour? And last but not least, what defines the hdfs://url value? Is there an additional parameter to add to that url to refresh? Thanks! Eric
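On the question of what defines the hdfs://url value: in a typical HA setup, as far as I know, the logical URI comes from fs.defaultFS, which names a nameservice listed in dfs.nameservices, and clients locate the active NameNode through the failover proxy provider configured for that nameservice. The sketch below prints those settings with hdfs getconf so they can be compared with what a Java client actually loads. The function name and the HDFS override are hypothetical, and whether a mismatch here explains the problem in this post is only a guess.

```shell
#!/bin/sh
# Hypothetical diagnostic sketch: print the client-side settings that
# normally determine the logical hdfs:// URI in an HA cluster.
# HDFS can be overridden (e.g. with a stub) for testing.
HDFS="${HDFS:-hdfs}"

show_ha_client_config() {
    default_fs=$($HDFS getconf -confKey fs.defaultFS) || return 1
    nameservices=$($HDFS getconf -confKey dfs.nameservices) || return 1
    echo "fs.defaultFS=$default_fs"
    echo "dfs.nameservices=$nameservices"
    # dfs.nameservices may hold a comma-separated list; inspect each entry.
    for ns in $(echo "$nameservices" | tr ',' ' '); do
        echo "dfs.ha.namenodes.$ns=$($HDFS getconf -confKey "dfs.ha.namenodes.$ns")"
        echo "dfs.client.failover.proxy.provider.$ns=$($HDFS getconf -confKey "dfs.client.failover.proxy.provider.$ns")"
    done
}
```

Running show_ha_client_config on a cluster node and then on the machine where the Java code runs would show whether the client sees the same nameservice definition, or is pointing at a single NameNode host instead of the logical name.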

Last Visited	‎03-14-2017 07:09 PM

Member Since	‎06-08-2016 12:31 PM
Posts	33
Kudos received	10

Cloudera Community

Re: HDP Upgrade 2.4.0.0 to 2.4.2.0 Failing a pre-c...

Re: SSH Fence not working?

SSH Fence not working?

HDP Upgrade 2.4.0.0 to 2.4.2.0 Failing a pre-check

Re: NameNode HA Ambari Display Issue

NameNode HA Ambari Display Issue