
Upgrade or Fresh Install

Explorer

Hi All,

We have a 10-node cluster, and only a few teams are using it at the moment. Which do you suggest: a fresh install or an upgrade, and why? Could you please explain the pain points of each? What is the best practice here?

That is, a clean installation of the OS, HDP, and HDF, versus an upgrade of HDP and HDF. If we go with a fresh install, we will back up all the data to another machine and reinstall everything.
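
For reference, bulk-copying HDFS data to another cluster is typically done with DistCp; here is a minimal sketch, assuming hadoop is on the PATH (the NameNode addresses and paths are placeholders, not from this thread):

    # Minimal sketch: back up HDFS data to a second cluster with DistCp.
    # ASSUMPTIONS: `hadoop` is on PATH; the source/destination addresses
    # and paths below are placeholders for your environment.
    import subprocess

    SRC = "hdfs://active-nn:8020/data"         # placeholder source path
    DST = "hdfs://backup-nn:8020/backup/data"  # placeholder backup target

    # -update copies only new/changed files; -p preserves file attributes.
    subprocess.run(["hadoop", "distcp", "-update", "-p", SRC, DST], check=True)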

1 ACCEPTED SOLUTION

Master Mentor

@Lenu K

Your question is rather broad; for a small cluster it all depends on the manpower at hand. For HDF, remember to back up the flow files. Here is what immediately comes to mind.
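
A minimal sketch of that flow-file backup, assuming a default HDF layout — the conf path is a placeholder, and the authoritative location of flow.xml.gz is whatever nifi.flow.configuration.file points to in nifi.properties:

    # Minimal sketch: archive the NiFi conf directory (flow.xml.gz,
    # templates, properties) before touching the cluster.
    # ASSUMPTION: the path below is a placeholder; verify it against
    # nifi.flow.configuration.file in your nifi.properties.
    import shutil
    import time

    NIFI_CONF = "/usr/hdf/current/nifi/conf"  # placeholder, verify locally
    backup_name = "nifi-conf-backup-" + time.strftime("%Y%m%d-%H%M%S")

    # Produces nifi-conf-backup-<timestamp>.tar.gz in the working directory.
    shutil.make_archive(backup_name, "gztar", NIFI_CONF)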

Fresh install pros and cons

  • Better planned.
  • You get a clean installation, properly configured, with lessons learned from the current cluster setup applied.
  • Straightforward, with no upgrade surprises.
  • You lose existing customizations.

Upgrade pros and cons

  • Must be planned properly, with documented steps.
  • Expect technical surprises and challenges.
  • Arrange support for D-day if you do not already have a support contract.
  • Challenges mold you into a better hadoopist!
  • See the Mandatory Post-Upgrade Tasks documentation.

Best practices

  • Verify that the file system you selected is supported by Hortonworks (HWX).
  • Pre-create all the required databases (e.g., for Ambari, Hive Metastore, Ranger, Oozie).
  • Back up your cluster before either of the above.
  • Plan for at least NameNode/ResourceManager HA (the NameNode is the brain, so allocate generous memory), and you MUST run at least 3 ZooKeeper servers.
  • Disk planning is important; prefer SSDs over SCSI drives where it matters.
  • Restrict access to the cluster to the edge node ONLY.
  • Kerberize the cluster.
  • Configure SSL. Consider SSDs for ZooKeeper, HBase, and the OS; Hive can also use SSD acceleration for temp tables by exposing the SSDs via HDFS.
  • Plan the data center network well (backup lines).
  • Size your nodes' memory and storage properly.
  • Be aware that Kafka and Storm are memory intensive; plan accordingly if performance is a must.
  • Delegate authorization to Ranger.
  • Test upgrade procedures for new versions of existing components.
  • Execute performance tests of custom-built applications.
  • Allow end users to perform user acceptance testing.
  • Execute integration tests where custom-built applications communicate with third-party software.
  • Experiment with new software that is beta quality and may not yet be ready for production use.
  • Execute security penetration tests (typically done by an external company).
  • Let application developers modify configuration parameters and restart services on short notice.
  • Maintain a mirror image of the production environment that can be activated in case of natural disaster or other unforeseen events.
  • Execute regression tests that compare the outputs of new application code with existing code running in production (see the sketch after this list).
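
For the last point, a minimal sketch of the regression idea — diff the output of the candidate code against what production produced (the file names and plain-text format are assumptions for illustration):

    # Minimal sketch: compare new application output against the output
    # of the code currently running in production.
    # ASSUMPTION: both jobs wrote results to local text files; the paths
    # below are placeholders.
    import difflib
    from pathlib import Path

    prod_lines = Path("prod_output.txt").read_text().splitlines()
    cand_lines = Path("new_output.txt").read_text().splitlines()

    diff = list(difflib.unified_diff(prod_lines, cand_lines,
                                     fromfile="production",
                                     tofile="candidate", lineterm=""))
    if diff:
        print("\n".join(diff))
        raise SystemExit("Regression: outputs differ")
    print("Outputs match; the candidate passes this regression check.")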

HTH
