Introduction

Apache Phoenix is a popular solution for low-latency OLTP and operational analytics on top of HBase. Hortonworks Data Platform (HDP) and Cloudera Data Platform (CDP) are the most common platforms on which Phoenix runs against HBase.

Nowadays, many customers choose to migrate to Cloudera Data Platform to better manage their Hadoop clusters and adopt the latest big data solutions.

This article discusses how to migrate Phoenix data and index tables to the newer CDP Private Cloud Base.

Environment

Source cluster: HDP 2.6.5, HDP 3.1.5

Target cluster: CDP PvC Base 7.1.5, 7.1.6, 7.1.7

Migration steps

The SYSTEM tables are created automatically the first time phoenix-sqlline starts; they hold the metadata of all Phoenix tables. For Phoenix data and index tables to be visible on the target cluster, the SYSTEM tables must be migrated from the source cluster as well.
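Before migrating, it can help to confirm which Phoenix SYSTEM tables exist on the source cluster. The sketch below only prints the HBase shell command so you can review it before running it on a cluster node; the non-interactive `-n` flag is an assumption that holds on recent HBase versions.

```shell
# Build and print the non-interactive HBase shell invocation that lists
# all Phoenix SYSTEM tables. Run the printed command on a cluster node.
# Note: `hbase shell -n` (non-interactive mode) is assumed to be
# available; on older HBase releases, run the command interactively.
LIST_CMD='list "SYSTEM.*"'
echo "echo '$LIST_CMD' | hbase shell -n"
```

The same pattern works for data and index tables; substitute the appropriate table-name pattern in `LIST_CMD`.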

  1. Stop the Phoenix service on the CDP cluster
    You can stop the service in Cloudera Manager > Services > Phoenix Service > Stop
  2. Drop the SYSTEM tables on the CDP cluster (from HBase)
    In the HBase shell, disable and drop all the SYSTEM tables.
    hbase:006:0> disable_all "SYSTEM.*"
    hbase:006:0> drop_all "SYSTEM.*"
  3. Copy the system, data, and index tables to the CDP cluster
    There are many ways to copy data between HBase clusters; I recommend using snapshots so that the table schema is preserved.
    Source HBase:
    1. Take snapshots of all SYSTEM tables and data tables (repeat for each table)
      hbase(main):020:0> snapshot "SYSTEM.CATALOG","CATALOG_snap"
    2. ExportSnapshot to the target cluster
      sudo -u hdfs hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot CATALOG_snap -copy-to hdfs://Target_Active_NameNode:8020/hbase -mappers 16 -bandwidth 200
      Your HBase root directory may differ; check the hbase.rootdir setting in the HBase configuration in Cloudera Manager.
    3. On the target cluster, the exported files may be owned by the user who ran the MapReduce job, so change the owner back to the default hbase:hbase
      sudo -u hdfs hdfs dfs -chown -R hbase:hbase /hbase
    4. In the HBase shell, use clone_snapshot to create the new tables
      clone_snapshot "CATALOG_snap","SYSTEM.CATALOG"
      After completing the above steps, you should have all the SYSTEM, data, and index tables in your target HBase. For example, the following tables were copied from an HDP 2.6.5 cluster and created in CDP.
      hbase:013:0> list
      TABLE
      SYSTEM.CATALOG
      SYSTEM.FUNCTION
      SYSTEM.SEQUENCE
      SYSTEM.STATS
      TEST
  4. Start the Phoenix service, enter phoenix-sqlline, and check that you can query the tables.

  5. (Optional) If namespace mapping was already enabled on HDP, also set phoenix.schema.isNamespaceMappingEnabled to true on the CDP cluster in both the client and service hbase-site.xml, and restart the Phoenix service.
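Putting the snapshot, export, chown, and clone steps together, the whole flow can be sketched as one review-before-run script. The table list, the target NameNode address, and the /hbase path are illustrative placeholders taken from the examples above; adjust them to your clusters. The script only prints each command so you can run the source-side and target-side pieces on the correct cluster:

```shell
#!/bin/sh
# Sketch of the snapshot-based migration (steps 1-4 above) as one script.
# It only PRINTS each command so you can review it first; nothing is
# executed against a cluster. Table names and paths are placeholders.
TARGET_NN="hdfs://Target_Active_NameNode:8020"
TABLES="SYSTEM.CATALOG SYSTEM.FUNCTION SYSTEM.SEQUENCE SYSTEM.STATS TEST"

for t in $TABLES; do
  snap="${t#SYSTEM.}_snap"   # e.g. SYSTEM.CATALOG -> CATALOG_snap
  # On the source cluster: take a snapshot of each table, then export it.
  echo "echo 'snapshot \"$t\",\"$snap\"' | hbase shell -n"
  echo "sudo -u hdfs hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot $snap -copy-to $TARGET_NN/hbase -mappers 16 -bandwidth 200"
done

# On the target cluster: fix ownership once, then clone every snapshot.
echo "sudo -u hdfs hdfs dfs -chown -R hbase:hbase /hbase"
for t in $TABLES; do
  snap="${t#SYSTEM.}_snap"
  echo "echo 'clone_snapshot \"$snap\",\"$t\"' | hbase shell -n"
done
```

The `_snap` suffix convention mirrors the CATALOG_snap example above; any unique snapshot name works, as long as the same name is used on export and clone.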

Known Bug in the Migration Process

Starting from Phoenix 5.1.0 / CDP 7.1.6, there is a bug during the automatic upgrade of the SYSTEM tables. The fix will be included in a future CDP release; in the meantime, customers should raise a case with Cloudera Support to apply a hotfix on top of CDP 7.1.6/7.1.7.

Refer to PHOENIX-6534.

Disclaimer

This article does not cover every version of HDP and CDP, nor every scenario; it focuses on the popular or latest versions. If you follow the steps but fail or hit a new issue, feel free to ask in the Community or raise a case with Cloudera Support.
