Uploading to HDFS from an external EC2 instance


I'm trying to upload files into an HDP cluster running on AWS from a separate EC2 instance. What is the best practice to achieve this?

I can make the EC2 instance authenticate with the HDP DataNode over SSH, but I would rather upload the file directly to HDFS using the CLI, and I'm not sure how to do this.


I think copying directly should be OK, assuming your EC2 instance is in the same VPC as your HDP cluster and you don't have any access problems. If Kerberos is enabled, authenticate first and then copy with hdfs dfs -put <file> hdfs://<namenode>:<port>/folder/.
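
To make the command above concrete, here is a rough sketch of how the upload could be wrapped in a small script. The host name namenode.example.internal, port 8020, the file ./data.csv, and the helper names hdfs_uri and upload_to_hdfs are all placeholders I made up for illustration; substitute your own NameNode address and paths. It also assumes the Hadoop client and the cluster's configuration files are installed on the EC2 instance.

```shell
#!/bin/sh
# Sketch only: host, port, and paths below are hypothetical placeholders.

# Build the full HDFS URI for a target directory.
# $1 = NameNode host, $2 = NameNode RPC port, $3 = absolute HDFS directory
hdfs_uri() {
    printf 'hdfs://%s:%s%s\n' "$1" "$2" "$3"
}

# Upload a local file to HDFS.
# $1 = local file, $2 = HDFS URI
# With DRY_RUN=1 the command is only printed, not executed.
upload_to_hdfs() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "hdfs dfs -put $1 $2"
    else
        hdfs dfs -put "$1" "$2"
    fi
}

# On a Kerberos-enabled cluster, run `kinit <principal>` before uploading.

# Dry run: print the command that would be executed.
DRY_RUN=1
upload_to_hdfs ./data.csv "$(hdfs_uri namenode.example.internal 8020 /folder/)"
```

Unset DRY_RUN (or set it to 0) to perform the actual upload once the printed command looks right. If the copy hangs or times out, the security groups between the EC2 instance and the cluster are a likely place to look, since the client needs to reach both the NameNode and the DataNodes.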