Unable to write Hive query output to S3

I am on HDP 2.5, and when trying to write Hive query output to S3, I get the exception below.



Caused by: java.lang.NoClassDefFoundError: org/jets3t/service/ServiceException
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:342)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:332)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:348)
	at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.initializeOp(VectorFileSinkOperator.java:70)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:363)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
	at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:489)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:231)
	... 15 more
Caused by: java.lang.ClassNotFoundException: org.jets3t.service.ServiceException
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 34 more

Below is what I ran from the Hive shell:

INSERT OVERWRITE DIRECTORY 's3n://santhosh.aws.com/tmp'
SELECT * FROM REGION;

Is the jets3t library part of the Hive classpath?


What version of the jets3t library is on your classpath? JetS3t 0.9.0 introduced ServiceException; if you have an older library, you need to upgrade it.
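For reference, a quick way to check the bundled jets3t version on an HDP node (a sketch; the lib path is the one that appears later in this thread and may vary by HDP version):

# The version is part of the jar file name, e.g. jets3t-0.9.0.jar
ls /usr/hdp/*/hadoop/lib/jets3t-*.jar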

@Rajkumar Singh

Thank you for your reply. Shouldn't HDP take care of packaging this correctly? I see this issue in HDP 2.5.

@Santhosh B Gowda

I can see Hive is picking up the right jar from these locations. Are you seeing a different jar version on the classpath?



java    25940 hive  mem    REG              252,1   539735  1180054 /usr/hdp/2.5.0.0-1133/hadoop-mapreduce/jets3t-0.9.0.jar
java    25940 hive  mem    REG              252,1   539735  1179933 /usr/hdp/2.5.0.0-1133/hadoop-yarn/lib/jets3t-0.9.0.jar
java    25940 hive  mem    REG              252,1   539735  1053479 /usr/hdp/2.5.0.0-1133/hadoop/lib/jets3t-0.9.0.jar
java    25940 hive  183r   REG              252,1   539735  1053479 /usr/hdp/2.5.0.0-1133/hadoop/lib/jets3t-0.9.0.jar
java    25940 hive  297r   REG              252,1   539735  1179933 /usr/hdp/2.5.0.0-1133/hadoop-yarn/lib/jets3t-0.9.0.jar
java    25940 hive  415r   REG              252,1   539735  1180054 /usr/hdp/2.5.0.0-1133/hadoop-mapreduce/jets3t-0.9.0.jar

@Rajkumar Singh

I can see the jars in the specified location. How do I check whether Hive is actually loading these jars?

ls -lrt /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar
-rw-r--r--. 1 root root 539735 Nov 10 18:00 /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar

@Santhosh B Gowda

If you are using the Hive CLI or HiveServer2, get the process ID and check:

lsof -p <pid> | grep jets3t

It will tell you which jets3t jar is available on the classpath.
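For example, a minimal sketch assuming a single HiveServer2 process whose command line matches the pattern below (adjust the pattern for the Hive CLI or for your setup):

# Find the (oldest) HiveServer2 pid and list the jets3t jars it has open
pid=$(pgrep -o -f HiveServer2)
lsof -p "$pid" | grep jets3t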

@Rajkumar Singh Thanks. I could see that jets3t-0.9.0.jar is loaded.

Also, as per @Rajesh Balamohan's suggestion to move from s3n to s3a, I got it working.

@Santhosh B Gowda

Was this a fresh install or an upgrade from an older version of HDP? If this was an upgrade, this thread may be useful:

http://stackoverflow.com/questions/33852044/why-can-i-not-read-from-the-aws-s3-in-spark-application-...

In your last post you mention the path /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar; could you also run the following and post the result?

ls -lrt /usr/hdp/

See the link below to learn why s3a is a better option than s3n, though that may not be the cause of your issue.

https://wiki.apache.org/hadoop/AmazonS3

@Constantin Stanca I see this issue on both fresh and upgraded systems, and moving from s3n to s3a helped me upload to S3.

Contributor (accepted solution)

S3N is really old and pretty much deprecated. Can you change your URL to "s3a://santhosh.aws.com/tmp" and ensure that you have "fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem"? If you do not have the InstanceProfileCredentialsProvider, you have to configure "fs.s3a.access.key" and "fs.s3a.secret.key".
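To make that concrete, here is a minimal sketch of the fix from the Hive shell. The bucket is the one from the original question; YOUR_ACCESS_KEY and YOUR_SECRET_KEY are placeholders for your own credentials, and these properties can also be set cluster-wide in core-site.xml instead of per session:

-- Switch the target URL to s3a and supply credentials for this session
SET fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem;
SET fs.s3a.access.key=YOUR_ACCESS_KEY;
SET fs.s3a.secret.key=YOUR_SECRET_KEY;

INSERT OVERWRITE DIRECTORY 's3a://santhosh.aws.com/tmp'
SELECT * FROM REGION;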

Thanks, this works!
