Uploading to HDFS from an external EC2 instance


I'm trying to upload files into an HDP cluster running on AWS from a separate EC2 instance. What is the best practice to achieve this?

I can make the EC2 instance authenticate with the HDP datanode using ssh, but I would rather upload the file directly to HDFS using the CLI. I'm not sure how to do this.


I think copying directly should be fine, assuming your EC2 instance is in the same VPC as your HDP cluster and has no network access problems. If Kerberos is enabled, authenticate first, then copy with hdfs dfs -put <file> hdfs://<namenode>:<port>/folder/.
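A minimal sketch of those steps as a shell script, assuming the Hadoop client is installed on the EC2 instance and that the instance can reach the NameNode and the DataNodes. The hostname, port, keytab path, and principal are placeholders I made up for illustration. 8020 is a common NameNode RPC port, but check fs.defaultFS in your cluster's core-site.xml. The DRY_RUN switch and the helper function names are also hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: upload a local file to HDFS from a remote client machine.
# All values here are placeholders (assumptions), not taken from a real cluster.
NAMENODE_HOST="${NAMENODE_HOST:-namenode.example.internal}"  # hypothetical host
NAMENODE_PORT="${NAMENODE_PORT:-8020}"                       # assumed RPC port

# Run a command, or only print it when DRY_RUN=1 (handy for checking the URI).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build the fully qualified HDFS URI for a target path.
hdfs_uri() {
  printf 'hdfs://%s:%s%s\n' "$NAMENODE_HOST" "$NAMENODE_PORT" "$1"
}

# Only needed on a Kerberos-enabled cluster: get a ticket from a keytab.
kerberos_login() {
  run kinit -kt "$1" "$2"
}

# Copy a local file into an HDFS directory.
upload_to_hdfs() {
  run hdfs dfs -put "$1" "$(hdfs_uri "$2")"
}
```

For example, after sourcing the script, DRY_RUN=1 upload_to_hdfs /tmp/data.csv /folder/ prints the command it would run, and dropping DRY_RUN=1 performs the actual upload. On a Kerberos-enabled cluster you would call kerberos_login with your keytab and principal first.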