Loading to S3 Fails - CDH 5.3.0

New Contributor

Since upgrading our cluster from 5.1.2 to 5.3.0, we have been unable to load data to a Hive table that points to S3. It fails with the following error:

Loading data to table schema.table_name partition (dt=null)
Failed with exception Wrong FS: s3n://<s3_bucket>/converted_installs/.hive-staging_hive_2015-01-26_11-05-32_849_2677145287515034575-1/-ext-10000/dt=2015-01-25/000000_0.gz, expected: hdfs://<name_node>:8020
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

The table itself was created using the following DDL (columns omitted, since they are not relevant here):

...
ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde'
STORED AS TEXTFILE
LOCATION 's3n://<s3_bucket>/data/warehouse_v1/converted_installs';

We don't have any issues writing to tables that reside locally on HDFS, but for some reason, writing to S3 fails. Does anyone have an idea how to fix this?

27 REPLIES

New Contributor

This was fixed in release 5.4.4. We have installed and successfully retested.

Explorer
Thank you for following up on this, Jim; it's greatly appreciated. I will migrate to 5.4.4 when time allows.

Hi,

Are you sure this is fixed? We just upgraded to 1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6 and are seeing the same issue.

Inserting into an S3 table by selecting from an HDFS table fails. We are using HiveServer1 and the Hive CLI.

Thanks,

-Brandon

Explorer
Which S3 provider are you using? Can you show the failing query and the stack trace? It may prove helpful.

We've tried s3n and s3a; both give the same result:

hive> insert overwrite table s3_brand_test select * from brand;

Error:

Failed with exception Wrong FS: s3a://hdfs.hive/s3_brand_test/.hive-staging_hive_2015-07-27_15-27-59_802_7825996587812451043-1/-ext-10000/000000_0, expected: hdfs://myhost.com:8020
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

Stack trace:

java.lang.IllegalArgumentException: Wrong FS: s3a://hdfs.hive/s3_brand_test/.hive-staging_hive_2015-07-27_15-57-08_961_2773119355707995501-1/-ext-10000/000000_0, expected: hdfs://myhost.com:8020
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:1916)
        at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:262)
        at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1195)
        at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2487)
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2765)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1614)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:297)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:702)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
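
For anyone reading the trace: the exception comes out of FileSystem.checkPath, reached from Hive.moveFile via isPathEncrypted and DistributedFileSystem.getEZForPath. In other words, an HDFS filesystem object is being handed an s3a path. Below is a rough Python sketch of that kind of scheme/authority check, just to illustrate why the message reads the way it does. The function name check_path and the simplified matching rules are my own assumptions; this is not Hadoop's actual implementation.

```python
from urllib.parse import urlparse


def check_path(fs_uri: str, path: str) -> None:
    """Raise ValueError if `path` does not belong to the filesystem at `fs_uri`.

    Hypothetical, simplified stand-in for the kind of check that Hadoop's
    FileSystem.checkPath performs; the real method handles more cases.
    """
    fs = urlparse(fs_uri)
    target = urlparse(path)

    # A path with no scheme is treated as relative to this filesystem.
    if not target.scheme:
        return

    same_scheme = target.scheme.lower() == fs.scheme.lower()
    same_authority = target.netloc.lower() == fs.netloc.lower()
    if not (same_scheme and same_authority):
        raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")


if __name__ == "__main__":
    try:
        # An S3 staging path checked against the HDFS default filesystem.
        check_path("hdfs://myhost.com:8020",
                   "s3a://hdfs.hive/s3_brand_test/000000_0")
    except ValueError as err:
        print(err)
```

The point of the sketch is only that the failure depends on which filesystem object performs the check, not on the S3 data itself, which would be consistent with HDFS-to-HDFS writes working fine.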

Cloudera Employee
The fix is in CDH 5.4.5, which has not been released yet. CDH 5.4.4 does not include it.

Ah, sorry, I missed that.

Do we know the ETA for 5.4.5, or is there a patch we could apply?

Thanks

-B

Cloudera Employee
The current ETA for 5.4.5 is the end of August.

Thanks!