BUG REPORT (with fix): metadata server is missing a jar required for S3 metadata extraction

Explorer

I followed all the instructions here: https://www.cloudera.com/documentation/enterprise/latest/topics/navigator_s3.html

Things weren't working and I noticed that the metadata server was logging an exception:

Host: FOO
File: /var/log/cloudera-scm-navigator/mgmt-cmf-mgmt-NAVIGATORMETASERVER-FOO.log.out

[ExtractorServicePoller-0]: Unable to execute task
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: com/mysql/jdbc/StringUtils
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at com.cloudera.nav.extract.ExtractorScheduler.removeCompletedTasks(ExtractorScheduler.java:214)
	at com.cloudera.nav.extract.ExtractorScheduler.poll(ExtractorScheduler.java:154)
	at com.cloudera.nav.extract.ExtractorScheduler.poll(ExtractorScheduler.java:149)
	at com.cloudera.nav.extract.ExtractorScheduler.access$000(ExtractorScheduler.java:50)
	at com.cloudera.nav.extract.ExtractorScheduler$1.run(ExtractorScheduler.java:94)
	at com.cloudera.nav.extract.ExtractorScheduler$RefreshPollPeriod.run(ExtractorScheduler.java:278)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: com/mysql/jdbc/StringUtils
	at com.cloudera.nav.s3.AwsRegionUtils.getBucketRegionViaS3Api(AwsRegionUtils.java:15)
	at com.cloudera.nav.s3.extractor.S3ExtractorTaskFactory.getIncrementalEnabledTasks(S3ExtractorTaskFactory.java:79)
	at com.cloudera.nav.s3.extractor.S3ExtractorTaskFactory.getTasks(S3ExtractorTaskFactory.java:59)
	at com.cloudera.nav.s3.extractor.S3ExtractorRunnable.run(S3ExtractorRunnable.java:109)
	at com.cloudera.nav.s3.extractor.S3ExtractorFactory$S3TasksSequentialRunnable.run(S3ExtractorFactory.java:186)
	at com.cloudera.nav.extract.ExtractorScheduler$ErrorLoggingRunnable.run(ExtractorScheduler.java:249)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more

In short, the class com.mysql.jdbc.StringUtils is missing from the metadata server's classpath.
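One way to check whether any jar on the node provides the missing class is to scan the jar files directly, since a jar is a zip archive. The following is a minimal sketch, not part of any Cloudera tooling: the function name `find_jars_providing` is made up for illustration, and the directory is simply the one used in the fix described in this post, so adjust it for your install.

```python
import zipfile
from pathlib import Path


def find_jars_providing(class_name, jar_dir):
    """Return the jars under jar_dir that contain the given class.

    class_name may use the slash form seen in the stack trace
    (com/mysql/jdbc/StringUtils) or the dotted form.
    """
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    matches.append(jar)
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable or corrupt archives rather than aborting the scan.
            continue
    return matches


if __name__ == "__main__":
    # Assumed location: the jars directory named in this post.
    jar_dir = "/usr/share/cmf/cloudera-navigator-server/jars/"
    for jar in find_jars_providing("com/mysql/jdbc/StringUtils", jar_dir):
        print(jar)
```

An empty result means no jar in that directory tree carries the class, which would be consistent with the NoClassDefFoundError above.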

I poked around on the node and couldn't find any MySQL jars, so I took matters into my own hands:

1. Download the MySQL connector from here: https://dev.mysql.com/downloads/file/?id=468319
2. Extract it.
3. scp mysql-connector-java-5.1.41-bin.jar to /usr/share/cmf/cloudera-navigator-server/jars/ on the metadata server host.
4. Restart the metadata server.
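Steps 2 and 3 could be scripted roughly as follows when run directly on the metadata server host. This is only a sketch under assumptions: the download is taken to be a .tar.gz archive (the post names only the jar, not the archive), the function name `install_connector_jar` is hypothetical, and the destination directory must already exist and be writable. Step 4, the restart, is still done by hand.

```python
import shutil
import tarfile
from pathlib import Path


def install_connector_jar(archive_path, jar_name, dest_dir):
    """Extract jar_name from a .tar.gz archive and write it into dest_dir."""
    dest = Path(dest_dir) / jar_name
    with tarfile.open(archive_path, "r:gz") as tar:
        # Match on the file name only, since the archive's folder layout is assumed.
        member = next(
            (m for m in tar.getmembers()
             if m.isfile() and Path(m.name).name == jar_name),
            None,
        )
        if member is None:
            raise FileNotFoundError(f"{jar_name} not found in {archive_path}")
        with tar.extractfile(member) as src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out)
    return dest


# Example call (the archive name is an assumption, the other values are from this post):
# install_connector_jar(
#     "mysql-connector-java-5.1.41.tar.gz",
#     "mysql-connector-java-5.1.41-bin.jar",
#     "/usr/share/cmf/cloudera-navigator-server/jars/",
# )
```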

Things work great. Navigator shows S3 as a data source now.

2 ACCEPTED SOLUTIONS

Expert Contributor

Hello,

Thank you for reaching out to us on this issue. I'll pass this information along; however, you should note that we do not provide nor ship the odbc/jdbc connectors for a variety of database types. When you are installing Navigator and several other services, our documentation does reference the need to obtain and install the proper database driver in order for services to work. Unfortunately, this may be slightly difficult to locate in our documentation. It certainly does not appear in the S3 documentation you referenced.

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ig_install_path_b.html#cmig_topic...

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ig_mysql.html#cmig_topic_5_5

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ig_mysql.html#cmig_topic_5_5_3

---
Customer Operations Engineer | Security SME | Cloudera, Inc.

Expert Contributor

Hello,

Our engineering teams have confirmed the conditions you have identified. We expect corrections to be available in future releases.

---
Customer Operations Engineer | Security SME | Cloudera, Inc.

REPLIES

Explorer
BTW, this was with CDH 5.10.0 with latest CM and Navigator versions.

Explorer

"you should note that we do not provide nor ship the odbc/jdbc connectors for a variety of database types."

Sure, that makes sense. But I just want to point out that the Cloudera S3 code has a hard functional dependency on a class provided by MySQL, which is kind of silly when you think about it.

Bear in mind that nowhere in my stack is there a running instance of MySQL. There shouldn't be any need for this dependency.

If I had to venture a guess, I'd say that an errant dev accidentally introduced the dependency while trying to use another StringUtils library.

Expert Contributor

Hi,

The log data you provided points to the following service:

mgmt-cmf-mgmt-NAVIGATORMETASERVER

This service requires a database backend such as Oracle, MySQL, or PostgreSQL. The database is usually defined and matched to what is used directly by Cloudera Manager, though it can be configured separately.

When you followed the documentation to enable S3 with Navigator, you essentially enabled a plugin set which allows you to use Navigator to view technical metadata, assign business metadata, and view lineage for S3 objects on your cluster. The metadata is stored in a backing database of some type. Can you please review the configuration of the Cloudera Management Service and let me know what type of database is listed there? I am not aware of a hard requirement for the MySQL driver; rather, the requirement depends on the type of backing database the service uses to store information.

Cloudera Manager -> Cloudera Management Service -> Configuration -> Database -> Navigator Metadata Server Database Type

If this is something other than MySQL, then I think we are on the right path for me to submit information to our engineering teams.

---
Customer Operations Engineer | Security SME | Cloudera, Inc.

Explorer

navms.db.type is configured to be a PostgreSQL instance.
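The reasoning here (a PostgreSQL-backed service should not need a MySQL class) can be written down as a small check. This is a rough sketch only: the helper name `class_matches_backend` and the lowercase database type strings are assumptions made for illustration, and the mapping lists just the usual package prefixes of the common JDBC drivers.

```python
# Package prefixes of the JDBC drivers commonly used with each backend.
DRIVER_PACKAGES = {
    "mysql": ("com/mysql/",),
    "postgresql": ("org/postgresql/",),
    "oracle": ("oracle/jdbc/",),
}


def class_matches_backend(missing_class, db_type):
    """Return True if missing_class belongs to the driver for db_type."""
    prefixes = DRIVER_PACKAGES.get(db_type.lower())
    if prefixes is None:
        raise ValueError(f"unknown database type: {db_type}")
    # Accept both the slash form from the stack trace and the dotted form.
    return missing_class.replace(".", "/").startswith(prefixes)


# The class from the stack trace does not belong to the PostgreSQL driver,
# so the configured backend does not explain the missing class.
print(class_matches_backend("com/mysql/jdbc/StringUtils", "postgresql"))
```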

Expert Contributor

Hi,

Thank you for the information you have provided. I have created an internal case to investigate this. If we have any additional questions, we will attempt to reach out to you directly. If you are a licensed customer, you may also submit a case or contact a member of your account team.

---
Customer Operations Engineer | Security SME | Cloudera, Inc.
