problem about compiling mahout from source in cdh5 environment

Explorer

I am using CDH5 with Hadoop 2.3.0-cdh5.1.3.

I got mahout-0.9-cdh5.1.3-src.tar.gz from this location: http://archive.cloudera.com/cdh5/cdh/5/

I tried to compile Mahout from source with mvn, using the following command line:

mvn -Dhadoop2.version=2.3.0-cdh5.1.3 -DskipTests clean package

 

Then I got this error:

[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.apache.mahout:mahout-buildtools:jar:0.9-cdh5.1.3
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-site-plugin is missing. @ line 47, column 15
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.apache.mahout:mahout-math-scala:jar:0.9-cdh5.1.3
[WARNING] 'build.plugins.plugin.version' for org.scala-tools:maven-scala-plugin is missing. @ org.apache.mahout:mahout-math-scala:[unknown-version], /home/powerlee/mahout-0.9-cdh5.1.3/math-scala/pom.xml, line 108, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[ERROR] The projects in the reactor contain a cyclic reference: Edge between 'Vertex{label='org.apache.mahout:mahout-buildtools:0.9-cdh5.1.3'}' and 'Vertex{label='org.apache.mahout:mahout:0.9-cdh5.1.3'}' introduces to cycle in the graph org.apache.mahout:mahout:0.9-cdh5.1.3 --> org.apache.mahout:mahout-buildtools:0.9-cdh5.1.3 --> org.apache.mahout:mahout:0.9-cdh5.1.3 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectCycleException

 

Does anyone know how to compile Mahout from source in a CDH5 environment?

 

I also tried compiling mahout-distribution-0.9-src.tar.gz from the Apache web site. Even though the compilation succeeded, the result probably cannot run in this Cloudera Hadoop 2 environment, because Mahout 0.9 is not compatible with Hadoop 2. A common runtime error is:

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

1 ACCEPTED SOLUTION

Master Collaborator

Ah, add "-Dhadoop.profile=23" to the build line. Does that work?


9 REPLIES

Master Collaborator

Yes, I've seen this too. The issue is that you need to not build "buildtools". Our packaging system, I believe, does not build or include this since it is not relevant. You can simply comment out "<module>buildtools</module>".

 

I don't believe Mahout 0.9 works with any Hadoop 2 distribution out of the box, no. You can get it to work with the right build profile settings though. We also contributed some fixes to the project (after 0.9) to make it work, and these are backported in the CDH release.

 

You don't need to build any of this if you just want the artifacts. They're published at https://repository.cloudera.com/artifactory/cloudera-repos/

 

Of course they are also already present in CDH.

Explorer

Yes, but I want to make a few modifications to Mahout; that's why I need to compile from source.

I tried what you said and commented out "<module>buildtools</module>". This time it started downloading dependencies and compiling. Mahout math built successfully, but I hit an error when compiling Mahout core:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project mahout-core: Compilation failure: Compilation failure:
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[28,30] package org.apache.hadoop.conf does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[29,30] package org.apache.hadoop.conf does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[30,28] package org.apache.hadoop.fs does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[31,28] package org.apache.hadoop.fs does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[32,30] cannot find symbol
[ERROR] symbol: class Tool
[ERROR] location: package org.apache.hadoop.util
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[33,30] cannot find symbol
[ERROR] symbol: class ToolRunner
[ERROR] location: package org.apache.hadoop.util
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[45,40] cannot find symbol
[ERROR] symbol: class Configured
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/classifier/df/tools/Frequencies.java:[45,62] cannot find symbol
[ERROR] symbol: class Tool
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/GramKey.java:[25,28] package org.apache.hadoop.io does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/GramKey.java:[26,28] package org.apache.hadoop.io does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/GramKey.java:[27,28] package org.apache.hadoop.io does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/Gram.java:[27,28] package org.apache.hadoop.io does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/Gram.java:[28,28] package org.apache.hadoop.io does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/Gram.java:[29,28] package org.apache.hadoop.io does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/Gram.java:[38,27] cannot find symbol
[ERROR] symbol: class BinaryComparable
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/Gram.java:[38,55] cannot find symbol
[ERROR] symbol: class WritableComparable
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/Gram.java:[38,74] cannot find symbol
[ERROR] symbol: class BinaryComparable
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/GramKey.java:[32,36] cannot find symbol
[ERROR] symbol: class BinaryComparable
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/GramKey.java:[32,64] cannot find symbol
[ERROR] symbol: class WritableComparable
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/vectorizer/collocations/llr/GramKey.java:[32,83] cannot find symbol
[ERROR] symbol: class BinaryComparable
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/DenseBlockWritable.java:[24,28] package org.apache.hadoop.io does not exist
[ERROR] /home/powerlee/mahout-0.9-cdh5.1.3/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/DenseBlockWritable.java:[38,44] cannot find symbol
[ERROR] symbol: class Writable

 

Do you know what the problem is here? I guess it is related to the Hadoop version...

Master Collaborator

These look like errors in the modifications you made. For instance, this string is not in the original source code:

 

packagejavax.servlet不存在

 

 

Explorer

Sorry, "packagejavax.servlet不存在" is just the error message; it means "package javax.servlet does not exist".

Explorer

At this point, I haven't made any modifications yet.

Master Collaborator

Oh right, OK. Is Maven able to download the dependencies? Do you see any earlier errors?

Master Collaborator

Ah, add "-Dhadoop.profile=23" to the build line. Does that work?

Explorer

Yes, you are right; this time it works. Thanks a lot!

 

After successfully building Mahout, I ran a sample job, and it also worked well.

 

Let me spell out the whole procedure in case someone else needs the same help:

First, comment out "<module>buildtools</module>" in pom.xml; make no other changes.

Second, run: mvn -Dhadoop.profile=23 -DskipTests clean package

Then you get what you want.
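
For anyone who wants to script it, the two steps can be sketched as a small shell function. This is only a sketch: the function name build_mahout and the MVN override variable are made up here, and it assumes the module entry sits on a single line exactly as quoted in this thread.

```shell
# Build a CDH Mahout source tree using the two steps from this thread.
# MVN is a hypothetical override for the Maven executable (defaults to mvn).
build_mahout() {
  src_dir="$1"
  (
    cd "$src_dir" || exit 1
    # Step 1: comment out the buildtools module. Lines that already contain
    # a comment opener are skipped, so running this twice is harmless.
    # The previous pom.xml is kept as pom.xml.bak.
    sed -i.bak '/<!--/!s|<module>buildtools</module>|<!-- <module>buildtools</module> -->|' pom.xml || exit 1
    # Step 2: build with the Hadoop profile property and skip the tests.
    "${MVN:-mvn}" -Dhadoop.profile=23 -DskipTests clean package
  )
}

# Usage:
#   build_mahout /path/to/mahout-0.9-cdh5.1.3
```

The work runs in a subshell, so the caller's working directory is left unchanged whether or not the build succeeds.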

 

By the way, I still wonder how this "-Dhadoop.profile=23" affects everything. Can you explain it?

 

Master Collaborator

Have a look at the parent packaging pom to see some additional settings like this that affect the CDH packaging. I don't know how much they're documented beyond this, as it's generally rare for anyone to try to rebuild the source. Still, it ought not to be hard.
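
As a rough pointer on where to look: in Maven, a -D property can activate a profile whose <activation> block names that property, and the profile can then change dependencies or versions. Whether the Mahout or CDH parent pom wires hadoop.profile exactly this way is an assumption here, not something confirmed in this thread. A small helper (the function name is made up) to find the relevant lines in a pom:

```shell
# Print every line of a pom that mentions hadoop.profile, with a few lines
# of context so the surrounding <profile> and <activation> elements show up.
show_hadoop_profile_usage() {
  grep -n -B3 -A3 'hadoop\.profile' "$1"
}

# Usage, from the root of the source tree:
#   show_hadoop_profile_usage pom.xml
# Maven itself can also list the profiles a property switches on:
#   mvn help:active-profiles -Dhadoop.profile=23
```

If the helper prints nothing for the project pom, the setting most likely lives in the parent pom mentioned above.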