05-02-2015 04:43 AM
I installed CDH 5.4 with Cloudera Manager from http://archive.cloudera.com/cdh5/parcels/5.4/
Now when I start pyspark or spark-shell I get: "Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/Module"
This might be caused by libraries missing from spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar:
there is no com.fasterxml subdirectory in the lib directory of the jar.
In the 1.3.1 assembly downloaded from the Apache Spark website there is a com.fasterxml lib directory, and starting spark-shell and pyspark works perfectly.
How do I replace the Spark version in the CDH 5.4 parcel with the Apache Spark version (for Hadoop 2.6)? Is it enough to just upload the new spark-assembly to the Spark HDFS directory?
And is there something wrong with the CDH 5.4 Spark jar?
Thanks for any assistance.
05-02-2015 09:51 AM
OK, I finally got Spark working by upgrading to Spark 1.3.1, downloaded from the Spark project website.
I replaced all the old Spark jars and executables in parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/spark, and now everything works.
I didn't change the environment variables or other config files after installing CDH 5.4, so that could not be the cause of Spark not working.
05-02-2015 09:57 AM
I don't think this is a great idea, and would not advise anyone to do this. You're incompletely modifying the default deployment in a way that does not necessarily work with the rest of the ecosystem. Since CDH 5.4 works as-is (try a new VM if you don't believe me), it must be something with your environment.
05-02-2015 10:24 AM
I agree this is not the ideal fix, but it seems to work.
I could search for another week to find out what is wrong with my environment config. Something must have gone slightly wrong when I upgraded from 5.3 to 5.4.
When a new CDH version comes out I will change everything back to the officially supported version.
05-07-2015 01:45 AM - edited 05-07-2015 01:46 AM
I recently upgraded Spark from 1.2 to 1.3 through the package upgrade route. I'm also facing the same issue:
2015-05-07 04:26:51,625 ERROR akka.actor.ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
at java.lang.Class.forName0(Native Method)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.databind.Module
at java.security.AccessController.doPrivileged(Native Method)
... 22 more
I don't find the fasterxml libraries in the spark-assembly jars.
05-07-2015 10:33 PM
I think this is all patching over some deeper problem in your configuration then. You have old versions of something on some classpath. Nothing requires you to rebuild any code.
11-03-2015 09:37 PM
I had the same issue, running spark-submit on cdh5.4.0, and a search led to this forum post.
Running a Spark build normally produces a spark-assembly-<version>.jar that includes all of the jackson classes (com.fasterxml.jackson.*) and also includes the class that is missing here.
Looking at the Cloudera-built spark-assembly jar, it doesn't contain any jackson classes at all.
Instead, Cloudera separately adds a number of jackson jars from its jars folder to the classpath. Note that jackson-module-scala is missing from the output of:
[ah_tmp_guest@apollo-mini-jp-cdhspark-001 bin]$ ./compute-classpath.sh | tr ':' '\n' | grep jackson
So it really appears that this is a Cloudera bug. What am I missing?
Also, here's a Jira for this issue that someone filed back in May. No love at all:
11-04-2015 01:40 AM
Generally, Spark distributions are built using its "hadoop-provided" profile, which I think is not how you are building Spark. That is, Hadoop classes and a bunch of its dependencies are provided at runtime by the cluster. This is why you find a lot less bundled in the CDH assembly (or if you build with hadoop-provided yourself). That much is not a bug, no.
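For context, a hadoop-provided build looks roughly like this. This is a sketch for a Spark 1.3.x source tree; the script and profile names are taken from the Spark build documentation of that era, so double-check them against your checkout:

```shell
# Sketch: build a Spark distribution where Hadoop classes are "provided",
# i.e. supplied at runtime by the cluster rather than bundled in the assembly.
# Run from the root of a Spark 1.3.x source checkout.
./make-distribution.sh --tgz -Phadoop-2.6 -Phadoop-provided -DskipTests
```

An assembly built this way will be much smaller and, like the CDH one, will not contain the Hadoop-side dependency classes.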
I am not clear on the cause in the OP, since CDH does not show this error if you simply run pyspark. There must be more to it that's missing from the description, if it's not indeed just down to a local deployment problem.