New Contributor
Posts: 3
Registered: 08-21-2018

Need script to automate DistCp job to migrate data from Production to QA daily

Hi,

I want to copy data daily from production to QA. Could you please provide a script that keeps working when the namenode changes in production and QA?

Thank you.

Posts: 519
Topics: 14
Kudos: 90
Solutions: 45
Registered: 09-02-2016

Re: Need script to automate DistCp job to migrate data from Production to QA daily

@public

 

Are you using Cloudera Manager? If so, and your prod and QA clusters are on the same CM version, you can try the steps below (a rough command-line equivalent follows the list):

1. CM -> Backup menu -> Peers (add peer) - one-time setup
2. CM -> Backup menu -> Replication Schedules -> Create Schedule (as many as you want)
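For reference, a replication schedule essentially runs DistCp under the hood. If you end up scripting it yourself instead of using BDR, a roughly equivalent manual invocation could look like the sketch below; the paths, nameservice names, map count and bandwidth are placeholders, not values from your clusters:

hadoop distcp \
  -update -skipcrccheck \
  -m 20 -bandwidth 100 \
  -strategy dynamic \
  hdfs://prodNameservice/data/source/path \
  hdfs://qaNameservice/data/target/path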

Expert Contributor
Posts: 338
Registered: 01-25-2017

Re: Need script to automate DistCp job to migrate data from Production to QA daily


Hi @public, do you have an HA namenode?

Are you using Cloudera? Which version?

@saranvisa I assume these options are available only in the Enterprise version.

I will help and provide you with a script.
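In the meantime, here is a minimal sketch of what such a wrapper might look like. It assumes both clusters are HA and can be addressed by their nameservice IDs (prodNameservice / qaNameservice, the paths, keytab and schedule below are all placeholders), so the job is not tied to whichever namenode happens to be active. You can then schedule it daily with cron:

#!/bin/bash
# daily_distcp.sh - sketch of a daily prod -> QA DistCp (placeholders, untested)

# Use HA nameservice IDs instead of namenode hostnames, so a namenode
# failover in prod or QA does not break the copy.
SRC="hdfs://prodNameservice/data/analysis/teradata"
DST="hdfs://qaNameservice/data/analysis/teradata"
LOG_DIR="/var/log/distcp"
mkdir -p "$LOG_DIR"

# On a Kerberized cluster, obtain a ticket first (principal and keytab are placeholders).
kinit -kt /etc/security/keytabs/hdfs.keytab hdfs/$(hostname -f)

hadoop distcp \
  -update -skipcrccheck \
  -m 20 -bandwidth 100 \
  -strategy dynamic \
  "$SRC" "$DST" > "$LOG_DIR/distcp_$(date +%F).log" 2>&1

# Example cron entry to run it every day at 01:00:
# 0 1 * * * /opt/scripts/daily_distcp.sh

Note that the client configuration on the host running the script needs both nameservices defined in hdfs-site.xml (dfs.nameservices and the related failover properties); otherwise the remote cluster has to be addressed by its active namenode hostname, which is exactly what breaks on failover.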

New Contributor
Posts: 3
Registered: 08-21-2018

Re: Need script to automate DistCp job to migrate data from Production to QA daily

Yes, we have HA and we are using CDH 5.12.1.

New Contributor
Posts: 3
Registered: 08-21-2018

Re: Need script to automate DistCp job to migrate data from Production to QA daily

Hi,

I followed your steps, but it is failing with the error below:

 

$> dr/distcp.sh ["-bandwidth","100","-i","-m","20","-prbugpa","-skipAclErr","-skipcrccheck","-skiplistingcrccheck","-update","-proxyuser","guc","-log","/user/PROXY_USER_PLACEHOLDER/.cm/distcp/2018-11-28_16856","-sourceconf","source-client-conf","-sourceprincipal","hdfs/cdhnameqa1l.home.eat.brinker.org@HOME.EAT.BRINKER.ORG","-sourcetktcache","source.tgt","-useSnapshots","distcp-39-1373113265","-ignoreSnapshotFailures","-strategy","dynamic","-filters","exclusion-filter.list","-scheduleId","39","-scheduleName","Teradata","/data/analysis/teradata/","/data/analysis/chekdata"]

 

Current working directory: /run/cloudera-scm-agent/process/5637-hdfs-distcp-7884b73f
Launching one-off process: /usr/lib64/cmf/service/dr/distcp.sh -bandwidth 100 -i -m 20 -prbugpa -skipAclErr -update -proxyuser guc -log /user/PROXY_USER_PLACEHOLDER/.cm/distcp/2018-11-28_16855 -sourceconf source-client-conf -sourceprincipal hdfs/cdhnameqa1l.home.eat.brinker.org@HOME.EAT.BRINKER.ORG -sourcetktcache source.tgt -useSnapshots distcp-38--383162628 -ignoreSnapshotFailures -strategy dynamic -filters exclusion-filter.list -scheduleId 38 -scheduleName Test /data/analysis/chekdata/check_summary_bgem_dgem /data/analysis/chekdata/
Wed Nov 28 06:08:09 CST 2018
Running on: cdhdatadv2l.home.eat.brinker.org (10.154.92.7)
JAVA_HOME=/usr/java/jdk1.8.0_144
using /usr/java/jdk1.8.0_144 as JAVA_HOME
using 5 as CDH_VERSION
using /run/cloudera-scm-agent/process/5637-hdfs-distcp-7884b73f as CONF_DIR
/bin/kinit
using hdfs/cdhnamedv2l.home.eat.brinker.org@HOME.EAT.BRINKER.ORG as Kerberos principal
using /run/cloudera-scm-agent/process/5637-hdfs-distcp-7884b73f/krb5cc_993 as Kerberos ticket cache
using /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop-mapreduce as CDH_MR2_HOME

 

 
