Support Questions
Find answers, ask questions, and share your expertise

How to use an S3 bucket as the target location for a data load using Sqoop

Can someone help by providing a solution to use an S3 bucket as the target dir in Sqoop? I am getting the below error.

17/01/24 15:25:53 ERROR tool.ImportTool: Imported Failed: Wrong FS: s3a://av-horton-poc/test, expected: hdfs://ip-10-228-210-170:8020




@shashi cheppela

Are you using --hive-import? If so, you may be running into this: SQOOP-3403. The workaround is not to use --hive-import with an s3:// --target-dir.

Thanks for the reply.

I am not using --hive-import. I am writing data into a file in an S3 directory.

Below is my sample sqoop command.

sqoop import \
  --connect jdbc:postgresql://<IP>:5432/test \
  --username gpadmin -P \
  --query "<query>" \
  --delete-target-dir \
  --target-dir s3a://test-bucket/sqoop-load/ \
  --fields-terminated-by '\001' \
  --hive-delims-replacement "SPECIAL" \
  -m 1



@shashi cheppela

I see you are using --delete-target-dir. This attempts to delete the import target directory if it exists. Permissions on your S3 bucket may be tripping you up. I'd start by removing that.

You may want to double-check the s3a classpath dependencies and authentication properties you are using.
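For reference, here is a minimal sketch of how the s3a authentication properties could be passed to Sqoop as generic Hadoop arguments. The access/secret key values and bucket name are placeholders, and this assumes the hadoop-aws and matching aws-java-sdk jars are already on the classpath; adjust for your own setup.

```shell
# Hypothetical sketch: supply s3a credentials as Hadoop properties.
# Generic -D arguments must come immediately after the "import" tool name.
# YOUR_ACCESS_KEY / YOUR_SECRET_KEY and the bucket are placeholders.
sqoop import \
  -Dfs.s3a.access.key=YOUR_ACCESS_KEY \
  -Dfs.s3a.secret.key=YOUR_SECRET_KEY \
  --connect jdbc:postgresql://<IP>:5432/test \
  --username gpadmin -P \
  --query "<query>" \
  --target-dir s3a://test-bucket/sqoop-load/ \
  -m 1
```

If the cluster is on EC2 with an IAM instance role, the keys can usually be omitted and the default s3a credential chain used instead.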



@shashi cheppela

Let me know how things are going. If I've helped answer your question, I'd very much appreciate if you would recognize my effort by Accepting the answer. Thanks. Tom

@Tom McCuch

Hi, after removing --delete-target-dir, I was able to write. However, when I add the option back it fails. I have verified my IAM role and it has s3:* permission. Any thoughts?

@Anandakrishnan Ramakrishnan

--delete-target-dir is meant to delete the <HDFS-target-dir> provided in the command before writing data to that directory. If it isn't a permissions issue, I suspect it may be failing because this isn't an HDFS directory.


@shashi cheppela, curious if you were able to resolve this and, if so, what you did? Thanks.

There is an open Apache JIRA for this issue:

One workaround is to import the data to HDFS first, and then use DistCp to copy it from HDFS to S3.
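The two-step workaround could look something like the following sketch. The HDFS staging path and bucket name are placeholders, and the Sqoop connection details are taken from the command earlier in the thread.

```shell
# Hypothetical sketch: step 1 - import into a temporary HDFS directory.
sqoop import \
  --connect jdbc:postgresql://<IP>:5432/test \
  --username gpadmin -P \
  --query "<query>" \
  --target-dir /tmp/sqoop-load \
  -m 1

# Step 2 - copy the result from HDFS to S3 with DistCp,
# then optionally clean up the staging directory.
hadoop distcp /tmp/sqoop-load s3a://test-bucket/sqoop-load
hdfs dfs -rm -r -skipTrash /tmp/sqoop-load
```

This keeps Sqoop writing only to HDFS (where --delete-target-dir behaves as expected) and leaves the S3 interaction to DistCp.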