We have HDP 2.2 in production and another cluster on HDP 2.3, and we now plan to upgrade both clusters to version 2.4. Our HDP is Kerberized and set up with YARN, HDFS, Solr, ZooKeeper, Hive with Tez, Sqoop, Ambari, Oozie, Knox, Ranger, and TDE.
Thanks in advance
Hi @Smart Solutions,
I am not sure if your production cluster can afford downtime, but if it can, you could simply do an express upgrade. This will gracefully shut down every service, upgrade all the components at once, and then bring the entire cluster back up. If that isn't an option, then you will need to use Ambari's rolling upgrade functionality.
I would recommend upgrading your non-production cluster to HDP 2.4 first, then testing all your production workloads on it to ensure nothing is broken. If everything works out fine, I would then follow our documentation to upgrade from HDP 2.2 to HDP 2.4, found here.
As for risks, it's hard to determine them without testing on your non-production cluster first.
Hi @Smart Solutions,
Note that Ambari 2.2.2 and HDP 2.4.2 are available as of today, so you may want to upgrade to this version directly. This new Ambari release adds Grafana support for rich dashboards and other new features described here. The release notes for HDP 2.4.2 are here.
As you can see in the upgrade documentation, you have the choice between rolling and express upgrade for both migrations (2.2 -> 2.4.X and 2.3 -> 2.4.X). If you want no cluster downtime, go for the rolling upgrade and make sure that you go through all the prerequisites listed in the doc. The express upgrade was introduced for large clusters, where the rolling upgrade may take a long time. By large clusters I mean hundreds or thousands of nodes.
If you are under support, synchronise with the support team and keep them informed that you are upgrading.
As for risks, they are hard to identify in advance and depend on your applications, your cluster state, etc. Ambari has been optimised to reduce risks, but there is no such thing as zero risk in this kind of operation. This is why we highly recommend backing up your databases and doing an HDFS checkpoint. Again, go through the upgrade doc and understand all the requirements and prerequisites.
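For the HDFS checkpoint, a minimal sketch of the usual manual sequence (enter safe mode, save the namespace, optionally leave safe mode) could look like the following. It assumes the `hdfs` CLI is on the PATH and that it runs as the HDFS superuser with a valid Kerberos ticket. The helper names `checkpoint_commands` and `run_checkpoint` are hypothetical, and whether to leave safe mode afterwards is an assumption that depends on where you are in the upgrade, so treat the upgrade doc as the authoritative procedure.

```python
import subprocess
from typing import List


def checkpoint_commands(leave_safemode: bool = True) -> List[List[str]]:
    """Return the `hdfs dfsadmin` calls for a manual namespace checkpoint."""
    commands = [
        # Block writes so the namespace is stable while it is saved.
        ["hdfs", "dfsadmin", "-safemode", "enter"],
        # Write the current namespace to a fresh fsimage.
        ["hdfs", "dfsadmin", "-saveNamespace"],
    ]
    if leave_safemode:
        # Assumption: the cluster should accept writes again afterwards.
        commands.append(["hdfs", "dfsadmin", "-safemode", "leave"])
    return commands


def run_checkpoint(dry_run: bool = True, leave_safemode: bool = True) -> List[str]:
    """Print each command; execute it only when dry_run is False."""
    lines = []
    for cmd in checkpoint_commands(leave_safemode):
        line = " ".join(cmd)
        print(line)
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)
        lines.append(line)
    return lines


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_checkpoint(dry_run=True)
```

Running it as is only prints the commands; pass `dry_run=False` on a NameNode host once you have confirmed the sequence matches the prerequisites in the upgrade doc.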
It's also important to have a healthy cluster before upgrading.
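One small part of "healthy" that can be checked before starting is the HDFS file system itself. The sketch below runs `hdfs fsck` and looks for the HEALTHY summary line. It assumes the `hdfs` CLI is on the PATH and that the report ends with a line of the form `The filesystem under path '/' is HEALTHY`, which is what fsck typically prints but may vary between versions. The function names are hypothetical, and this does not replace reviewing Ambari alerts and service checks.

```python
import re
import subprocess


def fsck_is_healthy(report: str) -> bool:
    """Return True if an fsck report contains the HEALTHY summary line."""
    pattern = r"^The filesystem under path '.*' is HEALTHY\s*$"
    return re.search(pattern, report, re.MULTILINE) is not None


def run_fsck(path: str = "/") -> str:
    """Run `hdfs fsck <path>` and return its standard output."""
    # No check=True: the report is judged by its summary line instead
    # of relying on the exit code.
    result = subprocess.run(
        ["hdfs", "fsck", path], capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    report = run_fsck("/")
    print("HDFS healthy" if fsck_is_healthy(report) else "HDFS needs attention")
```

If the check fails, the full fsck report lists the under-replicated, missing, or corrupt blocks that should be resolved before the upgrade.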