
How to use S3 bucket as target location for data load using SQOOP

Can someone help with using an S3 bucket as the target directory in Sqoop? I am getting the error below.

17/01/24 15:25:53 ERROR tool.ImportTool: Imported Failed: Wrong FS: s3a://av-horton-poc/test, expected: hdfs://ip-10-228-210-170:8020

Regards,

Shashi


Re: How to use S3 bucket as target location for data load using SQOOP

@shashi cheppela

Are you using --hive-import? If so, you may be running into this: SQOOP-3403. Workaround is to not use --hive-import with s3:// as --target-dir.
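If you do need the data in Hive, one way to apply that workaround is to run a plain import to S3 and then point an external Hive table at the location, instead of using --hive-import. A rough sketch (bucket name, table name, columns, and connection details below are placeholders, not from your setup):

```shell
# Plain import straight to S3 -- no --hive-import involved.
sqoop import \
  --connect jdbc:postgresql://dbhost:5432/test \
  --username gpadmin -P \
  --table orders \
  --target-dir s3a://my-bucket/sqoop/orders/ \
  --fields-terminated-by ',' \
  -m 1

# Then expose the S3 data to Hive with an external table
# instead of letting Sqoop do the Hive import.
hive -e "CREATE EXTERNAL TABLE orders (id INT, amount DOUBLE)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
         LOCATION 's3a://my-bucket/sqoop/orders/';"
```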

Re: How to use S3 bucket as target location for data load using SQOOP

Thanks for the reply.

I am not using --hive-import; I am writing data to a file in an S3 directory.

Below is my sample sqoop command.

sqoop import --connect jdbc:postgresql://<IP>:5432/test --username gpadmin -P --query "<query>" \
  --delete-target-dir --target-dir s3a://test-bucket/sqoop-load/ \
  --fields-terminated-by '\001' --hive-delims-replacement "SPECIAL" \
  -m 1

Regards,

Shashi

Re: How to use S3 bucket as target location for data load using SQOOP

@shashi cheppela

I see you are using --delete-target-dir. This attempts to delete the import target directory if it exists. Permissions on your S3 bucket may be tripping you up. I'd start by removing that.

You may want to double-check the s3a classpath dependencies and authentication properties you are using.
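For reference, one common way to supply those authentication properties is to pass them as -D options before the tool arguments (Sqoop forwards generic Hadoop options to the job configuration). A minimal sketch, assuming key-based authentication rather than IAM roles, with placeholder credentials and bucket:

```shell
# Make sure the S3A filesystem jars are on the classpath
# (paths vary by distribution; these are illustrative).
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/usr/hdp/current/hadoop-client/hadoop-aws.jar"

# Generic -D options must come before the tool-specific arguments.
sqoop import \
  -Dfs.s3a.access.key=YOUR_ACCESS_KEY \
  -Dfs.s3a.secret.key=YOUR_SECRET_KEY \
  --connect jdbc:postgresql://dbhost:5432/test \
  --username gpadmin -P \
  --table orders \
  --target-dir s3a://my-bucket/sqoop-load/ \
  -m 1
```

If the cluster nodes run with an IAM instance role instead, the access/secret key properties can usually be omitted.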

see: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-a...

see also: http://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.8.0/bk_hdcloud-aws/content/s3-troub...

Re: How to use S3 bucket as target location for data load using SQOOP

@shashi cheppela

Let me know how things are going. If I've helped answer your question, I'd very much appreciate if you would recognize my effort by Accepting the answer. Thanks. Tom

Re: How to use S3 bucket as target location for data load using SQOOP

@Tom McCuch

Hi, after removing --delete-target-dir I was able to write. However, when I add the option back, it fails. I have verified my IAM role, and it has s3:* permissions. Any thoughts?

Re: How to use S3 bucket as target location for data load using SQOOP

@Anandakrishnan Ramakrishnan

--delete-target-dir is meant to delete the <HDFS-target-dir> provided in the command before writing data to that directory. If it isn't a permissions issue, I suspect it may be failing because this isn't an HDFS directory.
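If that is the case, one sketch of an alternative is to clear the S3 target yourself before the import and leave --delete-target-dir out entirely (bucket and paths below are placeholders):

```shell
# Remove the target prefix first; -f keeps this from failing
# when the directory does not exist yet.
hadoop fs -rm -r -f s3a://my-bucket/sqoop-load/

# Then run the import without --delete-target-dir.
sqoop import \
  --connect jdbc:postgresql://dbhost:5432/test \
  --username gpadmin -P \
  --table orders \
  --target-dir s3a://my-bucket/sqoop-load/ \
  -m 1
```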

Re: How to use S3 bucket as target location for data load using SQOOP


@shashi cheppela, curious if you were able to resolve this and, if so, what you did? Thanks.

Re: How to use S3 bucket as target location for data load using SQOOP

There is an open Apache JIRA for this issue:

https://issues.apache.org/jira/browse/SQOOP-3043

One workaround is to import the data to HDFS first, and then use DistCp to copy it from HDFS to S3.
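That workaround can be sketched in two steps (table name, HDFS path, bucket, and connection details are placeholders):

```shell
# Step 1: land the import in HDFS, where --delete-target-dir
# and the rest of Sqoop's directory handling work as expected.
sqoop import \
  --connect jdbc:postgresql://dbhost:5432/test \
  --username gpadmin -P \
  --table orders \
  --delete-target-dir \
  --target-dir /tmp/sqoop-load/orders \
  -m 1

# Step 2: copy the result from HDFS to the S3 bucket.
hadoop distcp /tmp/sqoop-load/orders s3a://my-bucket/sqoop-load/orders
```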