03-08-2017 02:51 PM
I followed all the instructions here: https://www.cloudera.com/documentation/enterprise/latest/topics/navigator_s3.html
Things weren't working and I noticed that the metadata server was logging an exception:
Host: FOO
File: /var/log/cloudera-scm-navigator/mgmt-cmf-mgmt-NAVIGATORMETASERVER-FOO.log.out
[ExtractorServicePoller-0]: Unable to execute task
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: com/mysql/jdbc/StringUtils
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.cloudera.nav.extract.ExtractorScheduler.removeCompletedTasks(ExtractorScheduler.java:214)
    at com.cloudera.nav.extract.ExtractorScheduler.poll(ExtractorScheduler.java:154)
    at com.cloudera.nav.extract.ExtractorScheduler.poll(ExtractorScheduler.java:149)
    at com.cloudera.nav.extract.ExtractorScheduler.access$000(ExtractorScheduler.java:50)
    at com.cloudera.nav.extract.ExtractorScheduler$1.run(ExtractorScheduler.java:94)
    at com.cloudera.nav.extract.ExtractorScheduler$RefreshPollPeriod.run(ExtractorScheduler.java:278)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: com/mysql/jdbc/StringUtils
    at com.cloudera.nav.s3.AwsRegionUtils.getBucketRegionViaS3Api(AwsRegionUtils.java:15)
    at com.cloudera.nav.s3.extractor.S3ExtractorTaskFactory.getIncrementalEnabledTasks(S3ExtractorTaskFactory.java:79)
    at com.cloudera.nav.s3.extractor.S3ExtractorTaskFactory.getTasks(S3ExtractorTaskFactory.java:59)
    at com.cloudera.nav.s3.extractor.S3ExtractorRunnable.run(S3ExtractorRunnable.java:109)
    at com.cloudera.nav.s3.extractor.S3ExtractorFactory$S3TasksSequentialRunnable.run(S3ExtractorFactory.java:186)
    at com.cloudera.nav.extract.ExtractorScheduler$ErrorLoggingRunnable.run(ExtractorScheduler.java:249)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more
Essentially a missing class.
I poked around on the node and couldn't find any MySQL jars, so I took matters into my own hands:
1. Download the mysql connector from here: https://dev.mysql.com/downloads/file/?id=468319
2. Extract it.
3. scp mysql-connector-java-5.1.41-bin.jar to /usr/share/cmf/cloudera-navigator-server/jars/
4. Restart the metadata server.
Things work great. Navigator shows S3 as a data source now.
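For anyone who wants to repeat the "poking around" step before dropping in a jar, here is a minimal sketch that scans a directory of jars for a given class. It assumes Python 3 is available on the node; the function name find_class_in_jars is made up for illustration, and the directory is simply the one from step 3, so adjust it for your install.

```python
import zipfile
from pathlib import Path


def find_class_in_jars(jar_dir, class_name):
    """Return the names of the jars in jar_dir that contain class_name."""
    # A class such as com.mysql.jdbc.StringUtils is stored inside a jar
    # as the archive entry com/mysql/jdbc/StringUtils.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    matches.append(jar.name)
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable files and files that are not valid archives.
            continue
    return matches


if __name__ == "__main__":
    # Directory from step 3 above (an assumption about your layout).
    found = find_class_in_jars(
        "/usr/share/cmf/cloudera-navigator-server/jars",
        "com.mysql.jdbc.StringUtils",
    )
    print(found or "class not found in any jar")
```

An empty result before step 3 and a hit on the connector jar afterwards would be consistent with what I saw. Note that it only looks at jars directly in that one directory, not at the full classpath of the metadata server.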
03-14-2017 08:29 AM
Thank you for reaching out to us on this issue. I'll pass this information along; however, you should note that we do not provide nor ship the odbc/jdbc connectors for a variety of database types. When you install Navigator and several other services, our documentation does reference the need to obtain and install the proper database driver in order for the services to work. This may be slightly more difficult to locate in our documentation, unfortunately. It certainly does not appear in the S3 documentation you referenced.
03-14-2017 09:34 AM
> you should note that we do not provide nor ship the odbc/jdbc connectors for a variety of database types.
Sure, that makes sense. But I just want to point out that the Cloudera S3 code has a hard functional dependency on a class provided by MySQL, which is kind of silly when you think about it.
Bear in mind that nowhere in my stack is there a running instance of MySQL. There shouldn't be any need for this dependency.
If I had to venture a guess, I'd say that an errant dev accidentally introduced the dependency while trying to use another StringUtils library.
03-14-2017 10:26 AM - edited 03-14-2017 10:27 AM
The log data you provided points to the following service.
This service requires a database backend such as Oracle, MySQL, or PostgreSQL. The database is usually defined and matched to what is used directly by Cloudera Manager, though it can be configured separately.
When you followed the documentation to enable S3 with Navigator, you essentially enabled a plugin set which allows you to use Navigator to view technical metadata, assign business metadata, and view lineage for S3 objects on your cluster. The meta information is stored in a backing database of some type. Can you please review the configuration of the Cloudera Management Services and let me know what type of database is listed there? I am not aware of a hard requirement for the MySQL driver; rather, the requirement depends on the type of backing database the service is trying to use to store information.
Cloudera Manager -> Cloudera Management Service -> Configuration -> Database -> Navigator Metadata Server Database Type
If this is something other than MySQL, then I think we are on the right path for me to submit information to our engineering teams.
03-14-2017 11:22 AM
Thank you for the information you have provided. I have created an internal case to investigate this. If we have any additional questions, we will attempt to reach out to you directly. If you are a licensed customer, you may also submit a case or contact a member of your account team.
03-16-2017 01:00 PM
Our engineering teams have confirmed the conditions you have identified. We expect corrections to be available in future releases.