This is a purely planning step. The expected deliverable is an
Gather all details about the existing environment to plan the upgrade path and
the associated upgrade tasks.
1) Determine Upgrade Path
Based on the current and target versions of the HDP stack, and
whether Ambari is used, select the supported upgrade guide from the
Hortonworks documentation site. Identify key requirements, such as whether
NameNode HA or other HA needs to be disabled, or whether security needs to be
disabled.
First group: Industry-standard benchmarks such as TeraGen and
TeraSort, TestDFSIO, Hive TPC-DS, and HBase performance tests. At a minimum,
run TeraGen and TeraSort, with multiple mappers for TeraGen and multiple
reducers for TeraSort.
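
The minimum TeraGen and TeraSort run can be sketched as follows. This is only an illustration under stated assumptions: the examples jar path, the HDFS directories, the row count, and the mapper and reducer counts are placeholders to adapt to the cluster, and the helper names `teragen_cmd` and `terasort_cmd` are hypothetical. The sketch only builds and prints the command lines; it does not run them.

```python
import shlex

# Assumption: location of the MapReduce examples jar. It differs between
# HDP versions and install layouts, so check the actual path on the cluster.
EXAMPLES_JAR = "/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar"


def teragen_cmd(rows, out_dir, maps, jar=EXAMPLES_JAR):
    """Build a TeraGen command that generates `rows` rows using `maps` mappers."""
    return ["hadoop", "jar", jar, "teragen",
            f"-Dmapreduce.job.maps={maps}", str(rows), out_dir]


def terasort_cmd(in_dir, out_dir, reduces, jar=EXAMPLES_JAR):
    """Build a TeraSort command that sorts `in_dir` using `reduces` reducers."""
    return ["hadoop", "jar", jar, "terasort",
            f"-Dmapreduce.job.reduces={reduces}", in_dir, out_dir]


# Placeholder values: 10 million rows, 8 mappers, 4 reducers.
commands = [
    teragen_cmd(10_000_000, "/benchmarks/teragen", maps=8),
    terasort_cmd("/benchmarks/teragen", "/benchmarks/terasort", reduces=4),
]
for cmd in commands:
    print(shlex.join(cmd))
```

To execute the commands rather than print them, each list could be passed to `subprocess.run(cmd, check=True)` on a node with the Hadoop client installed. Recording the elapsed time of the same commands before and after the upgrade gives a simple baseline for comparison.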
Second group (optional): User-defined validation.
Identify representative applications (together with their input
data) that are used most often. Be sure to include at least one for
every Hadoop component in use, such as MapReduce, Hive, Pig, HBase, Oozie,
Storm, Kafka, and others.
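
One way to check that every component in use has at least one validation application is a small coverage check like the one below. It is a sketch only: the application names and the component inventory are hypothetical examples, and `uncovered_components` is an illustrative helper, not part of any Hadoop tooling.

```python
def uncovered_components(components_in_use, validation_apps):
    """Return the components in use that no validation application exercises."""
    covered = set()
    for app_components in validation_apps.values():
        covered.update(app_components)
    return sorted(set(components_in_use) - covered)


# Hypothetical inventory: components deployed on the cluster.
components_in_use = ["MapReduce", "Hive", "HBase", "Oozie"]

# Hypothetical validation applications and the components each one touches.
validation_apps = {
    "nightly_sales_etl": ["Hive", "Oozie"],
    "clickstream_rollup": ["MapReduce"],
}

print(uncovered_components(components_in_use, validation_apps))  # prints ['HBase']
```

An empty result means every listed component is exercised by at least one application; any name that remains still needs a representative application and input data.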
4) Finalize Project Management Items
Scope: Identify the clusters to be upgraded and the components
to upgrade or newly install (if any).
HR: Staff the upgrade teams. Some validation applications can also be run by developers.
Time: Identify upgrade tasks, timeline, and task owners.
QA: Carefully identify validation tasks.
Risk: Estimate downtime for each cluster upgrade.
Resources: Prepare the cluster on which the upgrade will be tested (e.g., Dev). When upgrading production clusters,
it is strongly recommended to attempt the upgrade first on a test cluster.
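
As a rough aid for the Time and Risk items, the task list can be kept in a simple structure and the downtime estimated from it. The sketch below assumes that tasks run one after another and that each task is marked by hand as requiring downtime or not; the task names, owners, and durations are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UpgradeTask:
    name: str
    owner: str
    minutes: int          # estimated duration
    needs_downtime: bool  # True if the cluster is unavailable during the task


def estimated_downtime_minutes(tasks):
    """Sum the durations of tasks that require downtime (assumes sequential execution)."""
    return sum(t.minutes for t in tasks if t.needs_downtime)


# Hypothetical plan for one cluster.
tasks = [
    UpgradeTask("Back up metadata", "ops", 60, needs_downtime=False),
    UpgradeTask("Stop services and upgrade stack", "ops", 180, needs_downtime=True),
    UpgradeTask("Run validation applications", "dev", 90, needs_downtime=True),
]

print(estimated_downtime_minutes(tasks))  # prints 270
```

Timings measured during the trial upgrade on the test cluster are a better input for these estimates than guesses.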