
Is it possible to import data from AWS S3 to Hadoop using Sqoop?

New Contributor

I have a requirement to load data from AWS S3 to HDFS on an incremental basis on the CDH 5.8 platform. The link below says it is possible with Sqoop. Please guide me on this.

 

https://sqoop.apache.org/docs/1.99.7/user/examples/S3Import.html

2 REPLIES

Re: Is it possible to import data from AWS S3 to Hadoop using Sqoop?

Champion
I have never used Sqoop for this, but if there is a connector it should work.

If the data is not large, I would use the hdfs command instead, as it is less complex.

hdfs dfs -cp s3a://bucket/path/to/object hdfs:///hdfs/path
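
For the incremental part of the requirement, a plain copy re-transfers everything each time, so DistCp with -update may be a better fit. The following is only a rough, untested sketch: the bucket and HDFS paths are the same placeholders as above, the build_copy_cmd helper and the MODE/RUN variables are made up for illustration, and it assumes the S3A credentials (fs.s3a.access.key and fs.s3a.secret.key) are already configured on the cluster.

```shell
#!/usr/bin/env bash
# Sketch only: SRC and DST are placeholders, not real locations.
SRC="${SRC:-s3a://bucket/path/to/object}"
DST="${DST:-hdfs:///hdfs/path}"

# Hypothetical helper: prints the copy command for the requested mode.
build_copy_cmd() {
  if [ "$1" = "incremental" ]; then
    # distcp -update copies files that are missing or differ at the destination.
    echo "hadoop distcp -update $SRC $DST"
  else
    # One-off full copy, as in the command above.
    echo "hdfs dfs -cp $SRC $DST"
  fi
}

cmd="$(build_copy_cmd "${MODE:-incremental}")"
echo "$cmd"

# Nothing is executed unless RUN=1 is set explicitly.
if [ "${RUN:-0}" = "1" ]; then
  $cmd
fi
```

By default the script only prints the command it would run, so it can be reviewed before being scheduled (for example from cron) with RUN=1.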

What is your specific question about the link you provided? Have you tried it, or is there a portion of it you don't understand?

Re: Is it possible to import data from AWS S3 to Hadoop using Sqoop?

New Contributor
Hi,
I got a customer request to use Sqoop to pull data from S3 to HDFS on an incremental basis. As far as I know, Sqoop is only used for importing and exporting data between an RDBMS and Hadoop.

However, the Sqoop documentation says that importing data from S3 to HDFS is possible, so I need your guidance on this. As you mentioned in your response, which connector do I need here?

When I try to create a link with the HDFS connector (create link -c hdfs-connector) via the Sqoop CLI, as described in the Sqoop2 docs, it always gives me a connection refused error, even though my Sqoop2 server is up and running fine. When I run "show connector", it displays the available connectors (hdfs-connector and generic-jdbc-connector). I don't know why I am getting the connection refused error.


Thanks
Govind