Unable to write Hive query output to S3

I am on HDP 2.5. When I try to write Hive query output to S3, I get the exception below.



Caused by: java.lang.NoClassDefFoundError: org/jets3t/service/ServiceException
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:342)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:332)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:348)
	at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.initializeOp(VectorFileSinkOperator.java:70)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:363)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
	at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:489)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:231)
	... 15 more
Caused by: java.lang.ClassNotFoundException: org.jets3t.service.ServiceException
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 34 more

This is what I ran from the Hive shell:

INSERT OVERWRITE DIRECTORY 's3n://santhosh.aws.com/tmp'
SELECT * FROM REGION;

Is the jets3t library part of the Hive classpath?

1 ACCEPTED SOLUTION

Rising Star

S3N is really old and pretty much deprecated. Can you change your URL to "s3a://santhosh.aws.com/tmp" and ensure that you have "fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem" set? If you are not using InstanceProfileCredentialsProvider, you have to configure "fs.s3a.access.key" and "fs.s3a.secret.key".
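
As a rough illustration of the suggestion above, the sketch below rewrites the s3n URI from the question to s3a and prints a Hive CLI command that passes the S3A settings with --hiveconf. The helper name to_s3a and the YOUR_ACCESS_KEY / YOUR_SECRET_KEY placeholders are made up for this example, and passing the keys on the command line is only one option (they could equally live in core-site.xml). It assumes the Hive CLI is in use; nothing is executed, the command is only printed for review.

```shell
#!/usr/bin/env bash
# Sketch only: bucket and table come from the question, key values are placeholders.

# Rewrite a leading s3n:// scheme to s3a://; other URIs pass through unchanged.
to_s3a() {
  printf '%s\n' "$1" | sed 's|^s3n://|s3a://|'
}

target="$(to_s3a 's3n://santhosh.aws.com/tmp')"
query="INSERT OVERWRITE DIRECTORY '${target}' SELECT * FROM REGION;"

# Print the command instead of running it, so it can be checked first.
echo hive \
  --hiveconf fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  --hiveconf fs.s3a.access.key=YOUR_ACCESS_KEY \
  --hiveconf fs.s3a.secret.key=YOUR_SECRET_KEY \
  -e "\"${query}\""
```

If the cluster nodes run with an instance profile, the two key settings can presumably be dropped, as the answer notes.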

11 REPLIES

Super Guru

What version of the jets3t library is on your classpath? jets3t 0.9.0 introduced ServiceException, so if you have an older library you need to upgrade it.

@Rajkumar Singh

Thank you for your reply. Shouldn't HDP take care of packaging this correctly? I see this issue in HDP 2.5.

Super Guru

@Santhosh B Gowda

I can see Hive picking up the right jar from these locations. Are you seeing a different jar version on your classpath?



java    25940 hive  mem    REG              252,1   539735  1180054 /usr/hdp/2.5.0.0-1133/hadoop-mapreduce/jets3t-0.9.0.jar
java    25940 hive  mem    REG              252,1   539735  1179933 /usr/hdp/2.5.0.0-1133/hadoop-yarn/lib/jets3t-0.9.0.jar
java    25940 hive  mem    REG              252,1   539735  1053479 /usr/hdp/2.5.0.0-1133/hadoop/lib/jets3t-0.9.0.jar
java    25940 hive  183r   REG              252,1   539735  1053479 /usr/hdp/2.5.0.0-1133/hadoop/lib/jets3t-0.9.0.jar
java    25940 hive  297r   REG              252,1   539735  1179933 /usr/hdp/2.5.0.0-1133/hadoop-yarn/lib/jets3t-0.9.0.jar
java    25940 hive  415r   REG              252,1   539735  1180054 /usr/hdp/2.5.0.0-1133/hadoop-mapreduce/jets3t-0.9.0.jar

@Rajkumar Singh

I can see the jars in the specified location. How do I check whether Hive is actually loading them?

ls -lrt /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar
-rw-r--r--. 1 root root 539735 Nov 10 18:00 /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar

Super Guru

@Santhosh B Gowda

If you are using the Hive CLI or HiveServer2, get the process ID and check:

lsof -p <pid> | grep jets3t

This will tell you which jets3t jar is available on the classpath.
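
Because lsof tends to list the same jar several times (as memory-mapped and as open file descriptors), a small filter can reduce the output to distinct jar paths. This is a minimal sketch: the function name jets3t_jars is made up for illustration, and the pgrep pattern in the commented usage is an assumption about how the HiveServer2 process appears in the process list.

```shell
#!/usr/bin/env bash
# Reduce `lsof -p <pid>` output (read from stdin) to the distinct jets3t jar paths.
# Takes the last whitespace-separated field, so paths containing spaces are not handled.
jets3t_jars() {
  awk '/jets3t/ { print $NF }' | sort -u
}

# Example usage (the pgrep pattern is an assumption, adjust it for your process):
# pid="$(pgrep -f HiveServer2 | head -n 1)"
# lsof -p "$pid" | jets3t_jars
```

More than one version number in the result would suggest conflicting jets3t jars on the classpath.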

@Rajkumar Singh Thanks. I can see that jets3t-0.9.0.jar is loaded.

Also, following @Rajesh Balamohan's suggestion to move from s3n to s3a, I got it working.

Super Guru

@Santhosh B Gowda

Was this a fresh install or an upgrade from an older version of HDP? If this was an upgrade, this thread may be useful:

http://stackoverflow.com/questions/33852044/why-can-i-not-read-from-the-aws-s3-in-spark-application-...

In your last post you mention the path /usr/hdp/2.5.3.0-14/hadoop/lib/jets3t-0.9.0.jar. Could you also run the following and post the result?

ls -lrt /usr/hdp/

Super Guru

See link below to learn why s3a is a better option than s3n, but that may not be the cause for your issue.

https://wiki.apache.org/hadoop/AmazonS3

@Constantin Stanca I see this issue on both fresh and upgraded systems, and moving from s3n to s3a let me upload to S3.