Created 06-28-2016 07:26 PM
So I've upgraded Ambari to the bleeding-edge 2.2.2.0 today and I was about to roll out HDP 2.4.2.0-258, and I am stumped at the pre-check script after all the HDP-2.4.2.0 packages have been successfully installed across the board.
Upgrade to HDP-2.4.2.0
Requirements
You must meet these requirements before you can proceed.
A previous upgrade did not complete. Reason: Upgrade attempt (id: 1, request id: 2,681, from version: 2.2.6.0-2800, to version: 2.4.0.0-169) did not complete task with id 17,829 since its state is FAILED instead of COMPLETED. Please ensure that you called: ambari-server set-current --cluster-name=$CLUSTERNAME --version-display-name=$VERSION_NAME Further, change the status of host_role_command with id 1 to COMPLETED
Failed on: HugeData
I ran the command as instructed:
ambari-server set-current --cluster-name=HugeData --version-display-name=HDP-2.4.2.0
To no avail... I am stumped at this point and not sure where to look to change that manually in the backend. As far as I am concerned, we had been running 2.4.0.0-169 without any issues (except for the NN failover) for about a month...
According to the error above we missed something in the 2.2.x to 2.4.x upgrade. I'm sure there's a value I can edit to mark it as successful, but I am not sure where right now.
Your input would be much appreciated 🙂
Created 06-29-2016 09:32 PM
So in the end, knowing my config was fine, I added stack.upgrade.bypass.prechecks=true to /etc/ambari-server/conf/ambari.properties and chose to disregard the warning. The upgrade went fine and all tests are green. Essentially we went from HDP 2.1 back then to the bleeding edge and some steps had to be done manually. So somehow after a few tries we succeeded, but most likely left some artifacts behind...
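For the record, this is roughly what that boils down to on the Ambari server host (just a sketch; the property file path is the stock one, adjust if yours lives elsewhere):
# append the bypass flag and restart Ambari so it is picked up
echo "stack.upgrade.bypass.prechecks=true" >> /etc/ambari-server/conf/ambari.properties
ambari-server restart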
I'm still interested to find out where this entry is located and where I could clean it up.
Thankfully we're getting professional services soon and building a brand new pro-level cluster with the help of some Hortonworks engineers, so there won't be any weird or unknown configuration choices.
Created 06-28-2016 07:45 PM
Did you finalize the upgrade from 2.2.6.0-2800 to 2.4.0.0-169? You can look at the directory on your NameNode where your fsimage file and the edits are stored and see if it is still keeping info for both the current and the old version.
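A quick way to check (assuming the NameNode metadata directory is /hadoop/hdfs/namenode; verify the real path against dfs.namenode.name.dir in hdfs-site.xml):
# an upgrade that was never finalized leaves a "previous" directory next to "current"
ls -l /hadoop/hdfs/namenode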
The command to finalize the upgrade is: hdfs dfsadmin -finalizeUpgrade
Hope this helps.
Created 06-28-2016 07:47 PM
Finalize upgrade successful for nn.HugeData.lab/x.x.x.40:8020
Finalize upgrade successful for snn.HugeData.lab/x.x.x.41:8020
I then re-ran the check and it still fails :/
Created 06-28-2016 08:12 PM
Then if I run the ambari-server set-current.... command, I get the following:
ERROR: Exiting with exit code 1.
REASON: Error during setting current version. Http status code - 500.
{
"status" : 500,
"message" : "org.apache.ambari.server.controller.spi.SystemException: Finalization failed. More details: \nSTDOUT: Begin finalizing the upgrade of cluster Timbit to version 2.4.2.0-258\n\nSTDERR: The following 516 host component(s) have not been upgraded to version 2.4.2.0-258. Please install and upgrade the Stack Version on those hosts and try again.\nHost components:\nPIG on host dn7.HugeData.lab\nPIG on host dn3.HugeData.lab\nPIG on host dn26.HugeData.lab\nPIG on host dn9.HugeData.lab\nPIG on host dn22.HugeData.lab\nPIG on host dn27.HugeData.lab\nPIG on host dn8.HugeData.lab\nPIG on host dn6.HugeData.lab\nPIG on host dn5.HugeData.lab\nPIG on host dn19.HugeData.lab\nPIG on host snn.HugeData.lab\nPIG on host dn18.HugeData.lab\nPIG on host dn21.HugeData.lab\nPIG on host dn17.HugeData.lab\nPIG on host dn23.HugeData.lab\nPIG on host dn25.HugeData.lab\nPIG on host dn1.HugeData.lab\nPIG on host dn15.HugeData.lab\nPIG on host dn14.HugeData.lab\nPIG on host esn2.HugeData.lab\nPIG on host esn.HugeData.lab\nPIG on host dn16.HugeData.lab\nPIG on host dn10.HugeData.lab\nPIG on host dn2.HugeData.lab\nPIG on host dn4.HugeData.lab\nPIG on host dn12.HugeData.lab\nPIG on host nn.HugeData.lab\nPIG on host dn11.HugeData.lab\nPIG on host dn28.HugeData.lab\nPIG on host dn20.HugeData.lab\nSPARK_JOBHISTORYSERVER on host esn2.HugeData.lab\nSPARK_CLIENT on host dn7.HugeData.lab\nSPARK_CLIENT on host dn3.HugeData.lab\nSPARK_CLIENT on host dn26.HugeData.lab\nSPARK_CLIENT on host dn9.HugeData.lab\nSPARK_CLIENT on host dn22.HugeData.lab\nSPARK_CLIENT on host
...
Created 06-29-2016 05:39 AM
Please use this article and see if the previous upgrade went through cleanly: Using Ambari to check when upgrade / downgrade seem to be stuck
If everything looks good for the hosts / versions [i.e. there shouldn't be any in the INSTALL_FAILED state], then the finalization could help - else those need to be fixed first.
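If you want to check straight in the Ambari database instead, something along these lines should show it (a rough sketch, assuming the default embedded Postgres that ambari-server sets up, i.e. database and user "ambari"; the table layout can vary a bit between Ambari releases):
sudo -u postgres psql ambari <<'SQL'
-- one row per host per registered repo version; anything not CURRENT/INSTALLED
-- (e.g. INSTALL_FAILED or OUT_OF_SYNC) needs to be fixed before finalizing
SELECT rv.version, hv.state, count(*)
FROM host_version hv
JOIN repo_version rv ON rv.repo_version_id = hv.repo_version_id
GROUP BY rv.version, hv.state
ORDER BY rv.version;
SQL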
Created 06-29-2016 06:55 PM
Unfortunately I ran all the queries in the post and I don't see any abnormalities...
Created 06-29-2016 07:20 PM
"Further, change the status of host_role_command with id 1 to COMPLETED"
How can this be done manually?
Created 07-20-2016 06:02 PM
In the ambari DB you would do:
update host_role_command set status = 'COMPLETED' where request_id = 1;
Also, instead of the other options you took, you could have set task 17,829 to COMPLETED with almost the same command:
update host_role_command set status = 'COMPLETED' where task_id = 17829;
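If you are not used to poking around in the Ambari database, the whole thing can be run from the Ambari server host roughly like this (a sketch assuming the default embedded Postgres install, i.e. database "ambari"; check the javax.jdo settings in /etc/ambari-server/conf/ambari.properties if your setup differs, and stop ambari-server before editing):
sudo -u postgres psql ambari <<'SQL'
-- look at the failed task before touching anything
SELECT task_id, request_id, role, status FROM host_role_command WHERE task_id = 17829;
-- then mark it completed and restart ambari-server afterwards
UPDATE host_role_command SET status = 'COMPLETED' WHERE task_id = 17829;
SQL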
Created 07-26-2016 12:50 PM
Interesting. I'm no DB guru, still learning my way around Postgres, but I will check it out, as I'm sure the value is there and will nag me far into the future.
What would be the exact query to do that, really?