
I plan an upgrade from HDP 2.3.4 to HDP 2.4.0. What is a good checklist to perform this upgrade? Are there any reasons for concern?

1 ACCEPTED SOLUTION

Re: I plan an upgrade from HDP 2.3.4 to HDP 2.4.0. What is a good checklist to perform this upgrade? Are there any reasons for concern?

@Anil Bagga, you can follow the upgrade documentation that @ssathish specified; however, I would like to emphasize a few important tasks that should be on your checklist. They may look like no-brainers, but unhealthy installations are quite often upgraded, and when issues occur it is much more difficult to debug what happened. As such, I always recommend checking your current installation and identifying and addressing issues before the upgrade. The purpose is to confirm that the cluster is healthy and will experience minimal service disruption during the upgrade.

  • Use the Ambari Web UI to ensure that all services in the cluster are running.
  • For each service in the cluster, run the Service Check from the service’s Service Actions menu to confirm that the service is operational. Service checks are used extensively during a rolling upgrade, so if they fail when run manually, they will likely fail during the upgrade too.
  • For each service, use the Stop and Start buttons in the Ambari Web UI to verify that the service can be stopped and started. Services are repeatedly stopped and started during the upgrade; if they fail when initiated manually, they will likely fail during the upgrade too.
  • Understand and, as necessary, remediate any Ambari alerts.
  • Ensure that you have an up-to-date backup of any supporting databases, including Hive, Ranger, Oozie, and any others.
  • Enable HDFS, YARN, HBase, and Hive HA to minimize service disruption.
  • Ensure that each cluster node has at least 2.5 GB of disk space available for each HDP version. The target installation directory is /usr/hdp/<version>.
  • Operationally, be aware that new service user accounts might be created to support new software projects that were not installed as part of the earlier HDP release. For example, these new user accounts might need to be added to an LDAP server or created as Kerberos principals.
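To complement the manual UI checks above, the state of every service can also be pulled from the Ambari REST API and inspected in a script. Below is a minimal sketch, assuming the standard Ambari API v1 response shape for `GET /api/v1/clusters/<cluster>/services?fields=ServiceInfo/state`; the sample payload, service names, and states are made up for illustration:

```python
import json

# Hypothetical sample of an Ambari REST API v1 response for
#   GET /api/v1/clusters/<cluster>/services?fields=ServiceInfo/state
# The items/ServiceInfo/state structure follows the Ambari API;
# the service names and states here are invented for illustration.
SAMPLE_RESPONSE = json.dumps({
    "items": [
        {"ServiceInfo": {"service_name": "HDFS",  "state": "STARTED"}},
        {"ServiceInfo": {"service_name": "YARN",  "state": "STARTED"}},
        {"ServiceInfo": {"service_name": "OOZIE", "state": "INSTALLED"}},
    ]
})

def unhealthy_services(response_body):
    """Return the names of services that are not in the STARTED state."""
    payload = json.loads(response_body)
    return [item["ServiceInfo"]["service_name"]
            for item in payload.get("items", [])
            if item["ServiceInfo"].get("state") != "STARTED"]

if __name__ == "__main__":
    print(unhealthy_services(SAMPLE_RESPONSE))  # ['OOZIE']
```

In practice you would fetch the payload with an authenticated HTTP GET against your Ambari server and fail your pre-upgrade gate if the returned list is non-empty.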

Another extremely important thing is to understand the known issues with your existing version and the workarounds you have in place, as well as the known issues of the new release; if your existing issues are not fixed in the new release, plan how you would port the fixes.

Last but not least: test, test, test, and finalize your upgrade only when you are convinced that you have done everything necessary to reduce risk. Until you finalize, you can always roll back; after that, rollback is much more difficult.

As you know, with recent versions of Ambari you can do either a Rolling or an Express Upgrade. It depends on your business requirements: an Express Upgrade can be done during a maintenance window, while a Rolling Upgrade of a large cluster can take significant time. The current release of Ambari performs the upgrade sequentially, one node at a time; there is no parallelism.
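To see why the sequential, one-node-at-a-time behavior matters for planning, a back-of-the-envelope estimate is useful. The per-node duration below is a hypothetical planning assumption, not a benchmark; time a few of your own nodes first and substitute the measured value:

```python
# Rough Rolling Upgrade duration estimate, given that Ambari upgrades
# nodes sequentially with no parallelism: total time grows linearly
# with cluster size. minutes_per_node is a made-up planning figure.

def rolling_upgrade_estimate_hours(node_count, minutes_per_node=10):
    """Sequential upgrade: node_count * per-node time, in hours."""
    return node_count * minutes_per_node / 60.0

for nodes in (10, 100, 500):
    print(f"{nodes:>4} nodes -> ~{rolling_upgrade_estimate_hours(nodes):.1f} h")
```

Even at an optimistic 10 minutes per node, a 500-node cluster works out to several days of rolling upgrade, which is usually the deciding factor between Rolling and Express.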

Good luck!

