New Contributor
Posts: 2
Registered: ‎06-23-2017

Is it possible to import data from AWS S3 to Hadoop using SQOOP?

I have a requirement to load data from AWS S3 into HDFS incrementally on the CDH 5.8 platform. The link below says it is possible. Please guide me on this.

 

https://sqoop.apache.org/docs/1.99.7/user/examples/S3Import.html

Posts: 642
Topics: 3
Kudos: 105
Solutions: 67
Registered: ‎08-16-2016

Re: Is it possible to import data from AWS S3 to Hadoop using SQOOP?

I have never used Sqoop for this, but if there is a connector it should work.

If the data is not large, I would use the hdfs command instead, as it is less complex.

hdfs dfs -cp s3a://bucket/path/to/object hdfs:///hdfs/path
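
If the copy has to be incremental, DistCp with -update may be a better fit than a plain copy, since it skips files that already appear unchanged at the target. The following is only a rough sketch, not something I have run on CDH 5.8: the bucket and paths are placeholders, the distcp_cmd helper is a made-up name for illustration, and it assumes the S3A credentials (fs.s3a.access.key / fs.s3a.secret.key) are already configured for the cluster.

```shell
#!/bin/sh
# Sketch: incremental S3 -> HDFS copy with DistCp.
# SRC and DST are placeholder paths, not values from this thread.
SRC="${SRC:-s3a://bucket/path/to/dir}"
DST="${DST:-hdfs:///hdfs/path}"

# Hypothetical helper: builds the DistCp command line for a source and target.
# -update copies only files that are missing at the target or differ from it.
distcp_cmd() {
  printf 'hadoop distcp -update %s %s\n' "$1" "$2"
}

CMD=$(distcp_cmd "$SRC" "$DST")

# Print the command by default; set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
  $CMD
else
  echo "$CMD"
fi
```

Run it once without RUN=1 to check the command it would issue, then schedule it (cron, Oozie, etc.) with RUN=1 if it behaves as expected on a test directory.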

What is your specific question about the link you provided? Have you tried it, or is there a portion of it you don't understand?
New Contributor
Posts: 2
Registered: ‎06-23-2017

Re: Is it possible to import data from AWS S3 to Hadoop using SQOOP?

Hi,
I got a customer request to use Sqoop to pull data from S3 into HDFS on an incremental basis. As far as I know, Sqoop is only used for importing/exporting data between an RDBMS and Hadoop.

However, I found in the Sqoop documentation that importing data from S3 to HDFS is possible, so I need your guidance on this. As you mentioned in your response, which connector do I need here?

When I try to create the HDFS link (create link -c hdfs-connector) from the Sqoop CLI, as described in the Sqoop 2 documentation, it always gives me a connection refused error, even though my Sqoop 2 server is up and running fine. When I run "show connector", it displays the available connectors (hdfs-connector and generic-jdbc-connector). I don't know why I am getting the connection refused error.


Thanks
Govind