Created 12-26-2016 12:25 PM
I am on HDP 2.5, and when I try to write Hive query output to S3, I get the exception below.
Caused by: java.lang.NoClassDefFoundError: org/jets3t/service/ServiceException
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:342)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:332)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:348)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.initializeOp(VectorFileSinkOperator.java:70)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:363)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:489)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:231)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: org.jets3t.service.ServiceException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 34 more
Below is what I ran from the Hive shell:
INSERT OVERWRITE DIRECTORY 's3n://santhosh.aws.com/tmp' SELECT * FROM REGION
Is the jets3t library part of the Hive classpath?
Created 12-26-2016 10:50 PM
S3N is really old and pretty much deprecated. Can you change your URL to "s3a://santhosh.aws.com/tmp" and ensure that you have "fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem" set? If you are not using InstanceProfileCredentialsProvider (that is, credentials do not come from an EC2 instance profile), you have to configure "fs.s3a.access.key" and "fs.s3a.secret.key".
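For anyone following along, the suggestion above might look roughly like this as a Hive script. This is only a sketch: it assumes your cluster lets you override these properties per session (on some setups they have to go into core-site.xml instead), and YOUR_ACCESS_KEY / YOUR_SECRET_KEY are placeholders, not real values.

```sql
-- Assumption: fs.s3a.* properties can be overridden at session level;
-- if not, set the same keys in core-site.xml instead.
SET fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem;

-- Placeholder credentials (hypothetical values). Skip these two lines
-- if credentials are supplied by an EC2 instance profile.
SET fs.s3a.access.key=YOUR_ACCESS_KEY;
SET fs.s3a.secret.key=YOUR_SECRET_KEY;

-- Same query as in the question, with the s3n:// scheme changed to s3a://.
INSERT OVERWRITE DIRECTORY 's3a://santhosh.aws.com/tmp'
SELECT * FROM REGION;
```

Keeping the keys out of the script and in the cluster configuration is usually preferable, since session-level SET statements can end up in shell history and logs.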
Created 12-27-2016 06:58 AM
Thanks, this works!