Created 02-12-2018 01:39 AM
Issue:
I am hitting the problem described in the thread below, but setting the two configs mentioned there does not work for me (it seems to work for some people and not for others): https://forums.aws.amazon.com/message.jspa?messageID=768332
Goal:
I am running this test query on a small data set to write the results to S3.
INSERT OVERWRITE DIRECTORY 's3a://demo/' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE select * from demo_table;
Notes:
Error:
The error when attempting the query to output to S3 is:
2018-02-12 01:12:58,790 INFO [HiveServer2-Background-Pool: Thread-363]: log.PerfLogger (PerfLogger.java:PerfLogEnd(177)) - </PERFLOG method=releaseLocks start=1518397978790 end=1518397978790 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2018-02-12 01:12:58,791 ERROR [HiveServer2-Background-Pool: Thread-363]: operation.Operation (SQLOperation.java:run(258)) - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:324)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:199)
    at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
    at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-02-12 01:12:58,794 INFO [HiveServer2-Handler-Pool: Thread-106]: session.HiveSessionImpl (HiveSessionImpl.java:acquireAfterOpLock(342)) - We are setting the hadoop caller context to 5e6f48a9-7014-4d15-b02c-579557b5fb98 for thread HiveServer2-Handler-Pool: Thread-106
Additional note:
The query writes the tmp files to 's3a://demo/' but then fails with the above error. The tmp files look like this:
[hdfs@gkeys0 centos]$ hdfs dfs -ls -R s3a://demo/
drwxrwxrwx - hdfs hdfs 0 2018-02-12 02:12 s3a://demo/.hive-staging_hive_2018-02-12_02-08-27_090_2945283769634970656-1
drwxrwxrwx - hdfs hdfs 0 2018-02-12 02:12 s3a://demo/.hive-staging_hive_2018-02-12_02-08-27_090_2945283769634970656-1/-ext-10000
-rw-rw-rw- 1 hdfs hdfs 38106 2018-02-12 02:09 s3a://demo/.hive-staging_hive_2018-02-12_02-08-27_090_2945283769634970656-1/-ext-10000/000000_0
-rw-rw-rw- 1 hdfs hdfs 6570 2018-02-12 02:09 s3a://demo/.hive-staging_hive_2018-02-12_02-08-27_090_2945283769634970656-1/-ext-10000/000001_0
Am I missing a config to set, or something like that?
Created 02-13-2018 01:01 AM
Greg,
See if you can write to a folder inside the bucket rather than directly to the root of the bucket:
INSERT OVERWRITE DIRECTORY 's3a://demo/testdata' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE select * from demo_table;
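If the query is generated by a script, a small guard can catch a bucket-root target before the statement is ever submitted. The following is only a sketch: is_bucket_root is a hypothetical helper, not part of Hive or Hadoop, and it only inspects the URI string. It assumes s3a:// URIs of the form used in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive or Hadoop): returns success when the
# given s3a:// URI has no object key after the bucket name, i.e. it points at
# the bucket root ("s3a://demo" or "s3a://demo/").
is_bucket_root() {
    rest="${1#s3a://}"                  # drop the scheme, e.g. "demo/testdata"
    case "$rest" in
        */*) key="${rest#*/}" ;;        # everything after the first slash
        *)   key="" ;;                  # no slash at all, e.g. "demo"
    esac
    # Strip trailing slashes so "demo//" is also treated as the root.
    while [ "${key%/}" != "$key" ]; do key="${key%/}"; done
    [ -z "$key" ]
}

target='s3a://demo/testdata'
if is_bucket_root "$target"; then
    echo "refusing to write to bucket root: $target" >&2
else
    # Only prints the statement; submit it with whichever Hive client you use.
    echo "INSERT OVERWRITE DIRECTORY '$target' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE select * from demo_table;"
fi
```

With target set to 's3a://demo/' the script takes the first branch and prints the refusal message instead of the statement.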
Created 02-13-2018 01:51 AM
Created 02-13-2018 04:00 PM
Let's just say there's "ambiguity" about how root directories are treated in object stores and filesystems, and rename() is a key trouble spot everywhere. It's known that there are quirks here, but since normal s3/wasb/adl usage goes to subdirectories, nobody has ever sat down with HDFS to argue the subtleties of renaming something into the root directory.