how to copy data from local hdfs to another hdfs in AWS using NIFI

I have a local HDP cluster on my local machine, with NiFi installed as well. On AWS I installed another HDP cluster, which is not kerberized. My problem is how to copy all my data from the local cluster to the cluster in AWS using NiFi. Can I use PutHDFS? How can I configure it for AWS? I would be thankful if someone could help.

7 REPLIES

Re: how to copy data from local hdfs to another hdfs in AWS using NIFI

Super Guru

@Chokri Ben Necib

Read the data from HDFS using the local NiFi install, then send it to a NiFi instance installed in AWS using the site-to-site protocol. Here is a link to the documentation on site-to-site configuration.

https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.1/bk_user-guide/content/configure-site-to-sit...
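As a rough illustration of what the receiving NiFi instance in AWS needs, here is a minimal Python sketch that renders the site-to-site entries for its nifi.properties file. The property names are the ones I recall from NiFi 1.x and should be checked against the documentation linked above; the host name and port are hypothetical placeholders. With `secure` set to false the data travels unencrypted, so for traffic over the public internet you would probably want the secure variant with keystores configured.

```python
from typing import Dict


def site_to_site_properties(
    host: str,
    socket_port: int,
    secure: bool = False,
    http_enabled: bool = True,
) -> Dict[str, str]:
    """Build the site-to-site input settings for the receiving NiFi node.

    Property names are assumed from NiFi 1.x nifi.properties; verify them
    against the documentation for your version.
    """
    if not 1 <= socket_port <= 65535:
        raise ValueError(f"invalid port: {socket_port}")
    return {
        "nifi.remote.input.host": host,
        "nifi.remote.input.secure": str(secure).lower(),
        "nifi.remote.input.socket.port": str(socket_port),
        "nifi.remote.input.http.enabled": str(http_enabled).lower(),
    }


def render_properties(props: Dict[str, str]) -> str:
    """Render the settings as key=value lines for nifi.properties."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Hypothetical public host name of the AWS edge node and a free port.
    print(render_properties(site_to_site_properties("aws-edge-node.example.com", 10000)))
```

On the local side, the flow would then read from HDFS and send to a Remote Process Group pointing at the AWS instance, while the AWS flow would receive on an Input Port and write with PutHDFS.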

If NiFi is not an option, DistCp can be used. DistCp is widely used for copying data between clusters when NiFi is not used.
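To make the DistCp option concrete, here is a minimal Python sketch that assembles a `hadoop distcp` command line for two HDFS URIs. The NameNode host names, port and paths are hypothetical placeholders, and the helper function is only for illustration. `-update` and `-m` are standard DistCp options. This assumes the NameNode and DataNode ports of the AWS cluster are reachable from the cluster that runs the job.

```python
import shlex
from typing import List, Optional


def build_distcp_command(
    source_uri: str,
    target_uri: str,
    update: bool = True,
    max_maps: Optional[int] = None,
) -> List[str]:
    """Return the argv list for a `hadoop distcp` run between two HDFS URIs."""
    for uri in (source_uri, target_uri):
        if not uri.startswith("hdfs://"):
            raise ValueError(f"expected an hdfs:// URI, got {uri!r}")
    command = ["hadoop", "distcp"]
    if update:
        # Copy only files that are missing or differ on the target.
        command.append("-update")
    if max_maps is not None:
        # Cap the number of simultaneous map tasks (parallel copies).
        command += ["-m", str(max_maps)]
    command += [source_uri, target_uri]
    return command


if __name__ == "__main__":
    # Hypothetical NameNode addresses for the local and the AWS cluster.
    cmd = build_distcp_command(
        "hdfs://local-namenode:8020/data",
        "hdfs://aws-namenode:8020/data",
        max_maps=10,
    )
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to run it.
    print(shlex.join(cmd))
```

Building the command as a list keeps the arguments unambiguous and avoids shell quoting problems if a path contains spaces.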

For security keys, please see whether you can use the following method. The document shows it for S3, but I am wondering whether you might be able to use the same method for your keys as well.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_cloud-data-access/content/s3-credential-...

Re: how to copy data from local hdfs to another hdfs in AWS using NIFI

@mqureshi thanks for the idea. At the moment, NiFi is not installed on the AWS cluster (edge node), only locally. Does that need to be done first? I think it could work with the GetHDFS and PutHDFS processors of the NiFi instance on the local cluster!

Re: how to copy data from local hdfs to another hdfs in AWS using NIFI

Super Guru

@Chokri Ben Necib

You need to install NiFi both on AWS and in the local data center from which you will be moving data. You cannot put data into a remote HDFS cluster. Even if it works (it shouldn't, if for nothing else then at least for security reasons), it would be ridiculously slow.

Re: how to copy data from local hdfs to another hdfs in AWS using NIFI

OK. Installing NiFi on AWS would take a lot of time. Is there another way without using NiFi? DistCp is also a tool for copying data between two clusters, but I am not sure that it works for an AWS cluster.

Re: how to copy data from local hdfs to another hdfs in AWS using NIFI

Super Guru

I just updated my answer. Yes, DistCp can be used.

Re: how to copy data from local hdfs to another hdfs in AWS using NIFI

Unfortunately, DistCp does not work for the AWS cluster. I could not add the AWS credentials (the xxxx.pem file) to the command.

Re: how to copy data from local hdfs to another hdfs in AWS using NIFI

Super Guru

@Chokri Ben Necib

Please see my updated answer. Not sure if it will help, but it might work.