Support Questions

Find answers, ask questions, and share your expertise

In-place upgrade from an Apache Hadoop 2.7.1 deployment to HDP 2.x with Ambari

Explorer

All,

I inherited a multi-host deployment of Apache Hadoop 2.7.1 that is used primarily for HDFS storage; it was essentially downloaded and installed straight from the open-source site. Is there a way to gracefully convert this deployment in place to the equivalent HDP 2.x release managed by Ambari?

Any help will be much appreciated.

1 ACCEPTED SOLUTION


Hi @Andy Max,

This should certainly be doable and relatively straightforward. Before touching the real cluster, I would recommend standing up a small sandbox environment that mimics the current one, testing the process there, and developing a concrete playbook. The rough steps I would recommend you try are:

  • Stop all services on the existing cluster.
  • Use Apache Ambari to install a "dummy" HDP 2.4.x cluster on the current cluster: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/ch_Getting_Rea...
    • Install a barebones cluster with only the HDFS, ZooKeeper, etc. services.
    • Make sure that the HDFS namenode and datanode directories are dummy directories that do not point to your existing data and namenode directories.
  • Stop all services via Ambari.
  • In Ambari, change the data and namenode directories for HDFS to point to your old directories.
  • Start the services back up and verify that the data is available.
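The directory-repointing step above can be sketched with Ambari's bundled `configs.sh` helper. This is only a sketch: the Ambari host, cluster name, credentials, and directory paths below are placeholders, and the exact script location assumes an Ambari 2.2.x install. Double-check the property values against your existing `hdfs-site.xml` before applying them.

```shell
# Sketch only -- host, cluster name, credentials, and paths are
# assumptions; substitute your own values.

# 1. Stop the existing (non-Ambari) Hadoop services first.
sudo -u hdfs /usr/local/hadoop/sbin/stop-dfs.sh

# 2. After installing the dummy HDP cluster and stopping its services
#    in Ambari, repoint HDFS at the original directories using the
#    configs.sh helper shipped with ambari-server:
CONFIGS=/var/lib/ambari-server/resources/scripts/configs.sh

$CONFIGS -u admin -p admin set ambari-host.example.com mycluster \
  hdfs-site dfs.namenode.name.dir /data/hadoop/hdfs/namenode

$CONFIGS -u admin -p admin set ambari-host.example.com mycluster \
  hdfs-site dfs.datanode.data.dir /data/hadoop/hdfs/data

# 3. Start HDFS from the Ambari UI, then verify the data is visible:
sudo -u hdfs hdfs dfs -ls /
```

The same changes can also be made by hand in the Ambari UI under HDFS > Configs, which may be safer the first time since you can review every property before saving.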

This should work smoothly with HDP 2.4 because HDP 2.4 also includes Apache Hadoop 2.7.1, so the HDFS filesystem layout version is identical.
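One quick sanity check on that version claim is to inspect the on-disk metadata before and after the switch. The namenode directory path below is an assumption; use whatever `dfs.namenode.name.dir` points to in your deployment.

```shell
# Sketch: confirm the on-disk HDFS layout version matches.
# The namenode directory path is an assumption -- use your own.
cat /data/hadoop/hdfs/namenode/current/VERSION

# Look for the layoutVersion line in the output. Since both your
# Apache Hadoop 2.7.1 install and HDP 2.4 ship the same Hadoop
# version, the value should be unchanged after the migration.
```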

This all assumes that you are only using HDFS. It could all get a bit hairier if you have Hive tables sitting on top with some metadata that needs to be migrated.

Cheers,

Brandon


2 REPLIES


Explorer

Thanks @Brandon Wilson. Will give this a shot and let you know.