Introduction 

This article describes how to back up Atlas data on HDP3 and restore it on CDP during an HDP3 to CDP migration.

Steps to Backup on HDP3

Run the following commands on the Atlas server host of HDP3. 
  • Get the metrics from the Atlas API (the same metrics are also available in the Atlas UI)

 

curl -k -g -X GET -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" "https://atlas_host:21443/api/atlas/admin/metrics" > atlas_metrics.json

 

  • Extract the type names listed under entityActive in atlas_metrics.json and save them as a list, one type per line. A sample is shown below.

 

# cat metics_types.list
hive_db_ddl
hive_table
hive_db
hbase_namespace
hive_process
hive_storagedesc
hdfs_path
hbase_table
hive_column_lineage
hbase_column_family
hive_column
hive_process_execution
hive_table_ddl
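One way to build this list without hand-editing is sketched below. It assumes the metrics response contains a flat entityActive object of "type name": count pairs; the function name list_active_types is illustrative and not part of Atlas.

```shell
# Sketch: print the type names found under "entityActive" in an Atlas metrics file.
# Assumption: entityActive is a flat JSON object such as {"hive_table": 12, "hive_db": 3}.
list_active_types() {
  metrics_file="$1"
  # Join the file into one line, isolate the entityActive object,
  # then keep only the quoted keys that are followed by a numeric count.
  tr -d '\n' < "$metrics_file" \
    | grep -o '"entityActive"[[:space:]]*:[[:space:]]*{[^}]*}' \
    | grep -o '"[^"]*"[[:space:]]*:[[:space:]]*[0-9]' \
    | cut -d'"' -f2
}

# Usage: list_active_types atlas_metrics.json > metics_types.list
```

This text-based extraction is only a convenience; where a JSON parser such as jq is installed, parsing the file properly is the more robust choice.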

 

  • Export API: script to export all entities of each type and save them as zip files

 

# The loop reads metics_types.list from /tmp/atlas_backup, so place the list there first.
mkdir -p /tmp/atlas_backup
cd /tmp/atlas_backup
for t in $(cat metics_types.list)
do
  mkdir -p "$t"
  # Export every entity of type $t (fetchType "full") into one zip per type.
  curl -k -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d "{\"itemsToExport\": [{\"typeName\": \"$t\"}], \"options\": {\"matchType\": \"forType\", \"fetchType\": \"full\"}}" "https://atlas_host:21443/api/atlas/admin/export" > "$t/Atlas-$t.zip"
done

 

Note: Unzip one of the zip files and check its contents. Expect to see .json files containing entity information.
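Before unzipping anything, a quick sanity check can flag exports that failed outright, for example when an error response was saved under a .zip name. The sketch below only tests for the ZIP signature "PK" at the start of each file; the function names are illustrative, and the check does not replace opening an archive and inspecting its contents.

```shell
# Returns 0 when the file begins with the ZIP signature "PK", non-zero otherwise.
looks_like_zip() {
  [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Print every export under the backup directory that does not look like a ZIP archive.
report_bad_exports() {
  backup_dir="$1"
  for f in "$backup_dir"/*/*.zip; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    looks_like_zip "$f" || echo "not a zip archive: $f"
  done
}

# Usage: report_bad_exports /tmp/atlas_backup
```

No output means every file at least starts like a ZIP archive.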
 

Steps to Import on CDP

  • Remediation steps
    • Unzip the archives in the backup directory /tmp/atlas_backup and extract the .json files. Expect to see .json files containing entity information.
    • Replace the HDP3 Atlas cluster_name in the .json files with the CDP Atlas cluster_name. Note: in CDP the default value of cluster_name is 'cm'.
    • Replace the HDFS namespace URI, e.g. hdfs://HDFSNamespace:8020/, with the value used in CDP.
    • Replace any other patterns that differ in CDP, e.g. @cluster_name.
    • Zip the edited .json files again, because the import script reads the .zip files.
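The replacements above can be sketched with find and sed. This is a minimal example under stated assumptions: GNU sed (for -i), cluster names appearing as an @name suffix right before a closing quote, and placeholder values passed in by the caller. The function name remediate_json and the sample values in the usage line are hypothetical.

```shell
# Sketch: rewrite the cluster suffix and the HDFS URI prefix in every extracted .json file.
# Arguments: directory, old cluster name, new cluster name, old HDFS URI, new HDFS URI.
# The old values are used as sed patterns, so a "." in them matches any character.
remediate_json() {
  dir="$1"; old_cluster="$2"; new_cluster="$3"; old_fs="$4"; new_fs="$5"
  find "$dir" -type f -name '*.json' -exec sed -i \
    -e "s|@${old_cluster}\"|@${new_cluster}\"|g" \
    -e "s|${old_fs}|${new_fs}|g" {} +
}

# Usage (values are placeholders):
# remediate_json /tmp/atlas_backup hdp3 cm "hdfs://HDFSNamespace:8020" "hdfs://CDPNamespace:8020"
```

Review a few edited files by hand before importing, since other attributes may also carry the old cluster name.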
  • Import API: script to import all entities from the zip files

 

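# importfile reads each archive from the Atlas server's filesystem,
# so the zip files must exist under /tmp/atlas_backup on the CDP Atlas host.
cd /tmp/atlas_backup
for t in /tmp/atlas_backup/*/*.zip
do
  curl -ivk -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d "{\"options\": {\"fileName\": \"$t\"}}" "https://atlas_host:21443/api/atlas/admin/importfile"
done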

 

  • Get the metrics from the Atlas API again after the import

 

curl -k -g -X GET -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" "https://atlas_host:21443/api/atlas/admin/metrics" > atlas_metrics_final.json

 

  • Compare atlas_metrics.json with atlas_metrics_final.json to confirm that the entity counts per type match
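The comparison can be narrowed to the per-type active entity counts, which is usually the part of interest after an import. The sketch below assumes both files contain a flat entityActive object of "type name": count pairs; active_counts and compare_metrics are illustrative names, not Atlas tooling.

```shell
# Print "type:count" lines, sorted, from the entityActive object of a metrics file.
active_counts() {
  tr -d '\n' < "$1" \
    | grep -o '"entityActive"[[:space:]]*:[[:space:]]*{[^}]*}' \
    | grep -o '"[^"]*"[[:space:]]*:[[:space:]]*[0-9][0-9]*' \
    | tr -d '" \t' \
    | sort
}

# Diff the active counts of two metrics files; exit status 0 means they match.
compare_metrics() {
  before=$(mktemp)
  after=$(mktemp)
  active_counts "$1" > "$before"
  active_counts "$2" > "$after"
  diff "$before" "$after"
  rc=$?
  rm -f "$before" "$after"
  return $rc
}

# Usage: compare_metrics atlas_metrics.json atlas_metrics_final.json
```

Lines prefixed with "<" come from the first file and lines prefixed with ">" from the second, so a type whose count changed shows up once on each side.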