● Prepare an Upgrade Log to keep track of any upgrade
issues and their workarounds
● [Optional but recommended] Prepare one or more Lab (virtual) clusters and
install the current HDP stack and Ambari. Use these clusters for mock upgrades
and rollbacks to troubleshoot any upgrade issues.
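One possible shape for the Upgrade Log is a small CSV file with one row per issue. The sketch below is only an illustration: the field names, the cluster labels, and the helper names `append_entry` and `open_issues` are assumptions, not part of any HDP or Ambari tooling.

```python
import csv
from dataclasses import dataclass, asdict, fields
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class UpgradeLogEntry:
    """One issue seen during a lab, test, or production upgrade."""
    cluster: str         # e.g. "lab", "dev", "prod" (hypothetical labels)
    step: str            # upgrade step where the issue appeared
    issue: str           # what went wrong
    workaround: str      # what fixed it, or "" if still open
    logged_at: str = ""  # UTC timestamp, filled in automatically if empty


def append_entry(log_path: Path, entry: UpgradeLogEntry) -> None:
    """Append one entry to the CSV log, writing the header on first use."""
    if not entry.logged_at:
        entry.logged_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=[f.name for f in fields(UpgradeLogEntry)])
        if is_new:
            writer.writeheader()
        writer.writerow(asdict(entry))


def open_issues(log_path: Path) -> list[dict]:
    """Return the entries that have no workaround recorded yet."""
    with log_path.open(newline="", encoding="utf-8") as fh:
        return [row for row in csv.DictReader(fh) if not row["workaround"].strip()]
```

Keeping the log in a flat file makes it easy to review open issues from the lab and test upgrades before booking the production upgrade.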
Upgrading Clusters
● To upgrade a single cluster use the Upgrade Procedure
given below.
● [Optional but recommended] Mock (lab) cluster upgrade:
Attempt an upgrade on a Lab cluster. Some steps of the Upgrade Procedure can be
skipped in order to concentrate on critical parts.
● Test (Dev) cluster upgrade: Before upgrading an important production cluster,
it is strongly recommended to attempt the upgrade first on a test cluster
(e.g. Dev) that resembles production: it runs the current versions of HDP and
Ambari and has a similar topology, components, and configuration, but on a
smaller number of nodes.
● Log every issue encountered during lab and test upgrades, together with its
workaround, so that downtime during the main cluster upgrade is minimized.
● Main (Production) cluster upgrade
● Book the upgrade date and time in advance
● Estimate cluster downtime based on the results of the test upgrade. Note
that, regardless of preparation and test upgrades, some new issues should
still be expected.
● Inform all interested parties
● Confirm that Support is on standby
● Do the upgrade
A Single Cluster Upgrade Procedure
Prepare the Cluster for the Upgrade
● Run identified validation applications before the upgrade,
and record results and execution times for each of them
● Get ready for the upgrade: resolve any errors, alerts, and warnings on the
cluster
● Check the state of the HDFS filesystem and finalize any previous HDFS
upgrade if it is not already finalized
● Capture the HDFS status and save the HDFS namespace
● Back up the NameNode metadata and all databases supporting the cluster
(Ambari, Hive metastore, Oozie, Ranger, Hue)
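The HDFS status capture and namespace save can be scripted so that the same steps run identically on lab, test, and production clusters. The sketch below uses standard `hdfs fsck` and `hdfs dfsadmin` subcommands; the output file names and the helper names are assumptions, and the exact command set for a given release should be taken from the official HDP upgrade document. The commands are expected to run as the HDFS superuser.

```python
import subprocess
from pathlib import Path
from typing import Optional

# (argv, output file name or None when the output does not need to be kept)
Command = tuple[list[str], Optional[str]]


def preupgrade_hdfs_commands() -> list[Command]:
    """Commands that capture HDFS state and then save the namespace."""
    return [
        (["hdfs", "fsck", "/", "-files", "-blocks", "-locations"], "dfs-old-fsck.log"),
        (["hdfs", "dfsadmin", "-report"], "dfs-old-report.log"),
        (["hdfs", "dfs", "-ls", "-R", "/"], "dfs-old-lsr.log"),
        # saveNamespace requires the NameNode to be in safe mode first.
        (["hdfs", "dfsadmin", "-safemode", "enter"], None),
        (["hdfs", "dfsadmin", "-saveNamespace"], None),
    ]


def run_commands(commands: list[Command], out_dir: Path, runner=subprocess.run) -> None:
    """Run each command in order, stopping at the first failure.

    `runner` is injectable so the sequence can be dry-run or tested
    without a live cluster.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    for argv, out_name in commands:
        result = runner(argv, capture_output=True, text=True, check=True)
        if out_name is not None:
            (out_dir / out_name).write_text(result.stdout, encoding="utf-8")
```

On a cluster node this would be invoked as `run_commands(preupgrade_hdfs_commands(), Path("hdfs-pre-upgrade"))`, where the directory name is a placeholder. The saved reports give a baseline to compare against after the upgrade.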
Perform Upgrade
● Execute the cluster upgrade following the official HDP upgrade document
● Review new properties, paying particular attention to changed property
values, changed property names, and new meanings of existing properties (if
any)
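The property review can be supported by a simple comparison of configuration snapshots taken before and after the upgrade. The sketch below assumes each snapshot has already been loaded into a plain dictionary of property name to value (how it is exported is left open); `diff_properties` and the property names in the example are hypothetical.

```python
def diff_properties(before: dict[str, str], after: dict[str, str]) -> dict[str, dict]:
    """Group property differences between two configuration snapshots."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Placeholder snapshots; real ones would hold the cluster's actual properties.
old_conf = {"example.prop.a": "1", "example.prop.b": "2", "example.prop.c": "3"}
new_conf = {"example.prop.a": "1", "example.prop.b": "5", "example.prop.d": "3"}

report = diff_properties(old_conf, new_conf)
for section in ("added", "removed", "changed"):
    for name in sorted(report[section]):
        print(section, name, report[section][name])
```

A renamed property shows up as one removed entry plus one added entry, so those two groups are worth reading side by side. A property whose meaning changed while its name and value stayed the same cannot be detected this way and still needs a manual review against the release notes.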
Post Upgrade Validation
● Run the smoke test for each service and troubleshoot any issues
● If any validation application fails, or its execution time is much longer
than before the upgrade, review and adjust cluster properties and rerun the
validation applications until they are stable and run no slower than before
the upgrade
● Record in the Upgrade Log any issues encountered and
workarounds.
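Comparing validation runs before and after the upgrade is easier when results and execution times are recorded in the same form both times. The sketch below is one way to do that; the function names and the 1.25 slowdown threshold are assumptions, and the `noop` command at the end is only a placeholder for a real validation application.

```python
import subprocess
import sys
import time


def run_validation(name: str, argv: list[str]) -> dict:
    """Run one validation application and record its result and wall-clock time."""
    start = time.perf_counter()
    completed = subprocess.run(argv, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return {"name": name, "ok": completed.returncode == 0, "seconds": elapsed}


def find_regressions(before: dict[str, dict], after: dict[str, dict],
                     max_slowdown: float = 1.25) -> list[str]:
    """Names of validations that are missing, failing, or too slow after the upgrade."""
    flagged = []
    for name, old in before.items():
        new = after.get(name)
        if new is None or not new["ok"]:
            flagged.append(name)  # not rerun, or failed after the upgrade
        elif old["ok"] and new["seconds"] > old["seconds"] * max_slowdown:
            flagged.append(name)  # slower than the allowed threshold
    return sorted(flagged)


# Placeholder validation: a Python process that does nothing.
baseline = {"noop": run_validation("noop", [sys.executable, "-c", "pass"])}
print(baseline["noop"]["ok"])
```

The baseline would be collected before the upgrade and stored alongside the Upgrade Log. After each round of property adjustments, the validations are rerun and `find_regressions` is called again until it returns an empty list.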
Final Steps
● Install any new HDP components that were not used before the upgrade, run
the smoke test for each of them, and troubleshoot any issues
● Finalize HDFS upgrade
● Configure HA for selected components (such as NameNode, ResourceManager,
HiveServer2, HBase, Oozie)
● Perform an Ambari takeover of HDP components not previously managed by
Ambari
● Enable Kerberos security: the KDC and existing principals and keytabs can
be reused; add principals for new components