Cloudera Employee
Posts: 4
Registered: ‎12-16-2014

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

CDH 5.4.5 was released on Aug 18; it should have the S3 fix.

New Contributor
Posts: 7
Registered: ‎07-27-2015

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

Just verified it's working!

Thanks!

-B

New Contributor
Posts: 1
Registered: ‎08-20-2015

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

I upgraded to 5.4.5 but am still facing the problem. Are there any config changes we need to make?

15/08/20 14:14:42 INFO BlockManagerMasterActor: Registering block manager ip-10-224-15-31.aws.chotel.com:34645 with 530.0 MB RAM, BlockManagerId(1, ip-10-224-15-31.aws.chotel.com, 34645)
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: s3a://spark-poc-1/in/paceCYShell.parquet, expected: hdfs://ip-10-224-15-26.aws.chotel.com:8020
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:465)
at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:252)
at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:251)

New Contributor
Posts: 7
Registered: ‎07-27-2015

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

It's working from the Hive CLI. I'm not sure about Spark.

Explorer
Posts: 6
Registered: ‎09-02-2015

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

I am facing the same issue as well; I tried running the query from both Beeline and the Hive CLI.

Hive version -- /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/jars/hive-common-1.1.0-cdh5.4.5.jar!/hive-log4j.properties

Beeline version -- Beeline version 1.1.0-cdh5.4.5 by Apache Hive

 

 

CREATE TABLE IF NOT EXISTS <s3_table>
STORED AS AVRO
LOCATION 's3a://***/***/***/***/'
AS
SELECT * FROM <hdfs_table>;

--------------------

HIVE LOGS:

Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1442849170635_4506, Tracking URL = http://********************/
Kill Command = /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job -kill job_1442849170635_4506
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-09-29 22:16:53,027 Stage-1 map = 0%, reduce = 0%
2015-09-29 22:17:12,722 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.28 sec
MapReduce Total cumulative CPU time: 3 seconds 280 msec
Ended Job = job_***************
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://**************/hive_hive_2015-09-29_22-16-37_095_1255538331105842480-1/-ext-10001
Moving data to: s3a://******************
Failed with exception Wrong FS: s3a://***********, expected: hdfs://*************
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 3.28 sec HDFS Read: 2455930 HDFS Write: 2746940 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 280 msec

-------------------------------------------

BEELINE LOGS:

INFO : Moving data to: s3a://********** from hdfs://**********/hive_hive_2015-09-29_22-28-16_959_3663235137172518120-507/-ext-10001
ERROR : Failed with exception Wrong FS: s3a://************, expected: hdfs://*************
java.lang.IllegalArgumentException: Wrong FS: s3a://*********, expected: hdfs://************
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
at org.apache.hadoop.hive.shims.Hadoop23Shims.getFullFileStatus(Hadoop23Shims.java:724)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2471)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:105)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:222)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1398)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1182)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1048)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1043)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask (state=08S01,code=1)

---------------------------------------------

Any help is appreciated!

Cloudera Employee
Posts: 30
Registered: ‎12-09-2014

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

Yeah, I think there are multiple issues; sorry for the inconvenience. This one concerns moving data between tables on different file systems (S3 and non-S3).

It is HIVE-7476, fixed in CDH 5.5.

Explorer
Posts: 6
Registered: ‎09-02-2015

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

Thank you for the reply.

FYI:

If, instead of using CTAS, I create an external or managed table on S3 first and then insert into it from the HDFS table, it works.
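
For reference, that two-step workaround might look roughly like the sketch below. The column list is a hypothetical placeholder (the actual schema was not posted), and the table names and S3 path are the same redacted placeholders used in the CTAS statement earlier in the thread. Treat it as an illustration of the approach, not a verified recipe for every CDH 5.4.x setup.

```sql
-- Step 1: create the S3-backed table up front instead of using CTAS.
-- <col1>/<type1> etc. are hypothetical placeholders for the real schema.
CREATE EXTERNAL TABLE IF NOT EXISTS <s3_table> (
  <col1> <type1>,
  <col2> <type2>
)
STORED AS AVRO
LOCATION 's3a://***/***/***/***/';

-- Step 2: populate it from the HDFS-backed table with a plain INSERT.
INSERT INTO TABLE <s3_table>
SELECT * FROM <hdfs_table>;
```

Drop the EXTERNAL keyword to get the managed-table variant. Either way the CTAS statement that failed in MoveTask is not used.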

Cloudera Employee
Posts: 30
Registered: ‎12-09-2014

Re: Loading to S3 Fails - CDH 5.3.0 FIXED IN 5.4.4

Nice, thanks for the update.
