New Contributor
Posts: 5
Registered: ‎12-17-2018

Conflicting versions of classes in Cloudera 5.14.4

Hi,

 
On our current project we are using a slightly customized Cloudera distribution (version 5.14.4). It's essentially a standard data pipeline with Sqoop, Spark, Spark Structured Streaming, Kafka, HBase, etc.
I am trying to figure out whether I can replicate the whole pipeline locally with an embedded database, local Spark, embedded Kafka, and the HBase testing utilities to start an in-memory HBase cluster. I keep running into multiple library version conflicts.
 
1. There is a problem with Google Guava. I had to exclude it from all Cloudera dependencies and add version 12.0.1 explicitly.
2. I see that org.apache.hadoop:hadoop-mapreduce-client-core:2.6.0-cdh5.14.4:jar and org.apache.hadoop:hadoop-mapreduce-client-core:2.6.0-cdh5.14.4:jar contain conflicting classes in org.apache.hadoop.mapreduce, e.g. JobID.
3. I also see the error
     java.lang.NoSuchFieldError: IS_SECURITY_ENABLED
   while starting the mini DFS cluster, when the Hadoop NameNode starts Jetty. That suggests there is a conflicting dependency pulled in alongside Jetty.
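
For anyone chasing conflicts like the ones listed above, here is a rough sketch of how to check whether a class is present more than once on the test classpath. It only uses the standard ClassLoader.getResources call; the class name ClasspathDuplicates and its helper methods are made up for illustration, and JobID is just the default because it is the class mentioned in point 2.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of any Cloudera or Hadoop API).
public class ClasspathDuplicates {

    // Turns a fully qualified class name into the resource path used inside jars.
    static String resourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns every classpath location that provides the given class file.
    static List<URL> locations(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(resourcePath(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default is the class named in the post; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.JobID";
        List<URL> found = locations(name);
        System.out.println(name + " found in " + found.size() + " location(s)");
        for (URL url : found) {
            System.out.println("  " + url);
        }
    }
}
```

If this is run with the same classpath as the failing test and prints more than one location, the listed jars are the candidates for an exclusion.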
 
Have people seen these issues while working on Cloudera or another big data platform? Are there any Cloudera user groups where I can post these questions? I did a quick Google search but did not find a group or mailing list for Cloudera users.
Has anyone tried building a miniature data pipeline integration test with Sqoop, embedded HBase, and Spark?
New Contributor
Posts: 5
Registered: ‎12-17-2018

Re: Conflicting versions of classes in Cloudera 5.14.4

The IS_SECURITY_ENABLED issue is caused by a conflict between

org.mortbay.jetty:jsp-2.1:6.1.14:jar
tomcat:jasper-runtime:5.5.23:jar

 

I had to explicitly exclude tomcat:jasper-runtime:5.5.23:jar from all Spark testing and hbase-testing-util dependencies, and then it worked.
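
In case it helps others confirm a fix like this, below is a small sketch that reports which jar a class was actually loaded from and whether it declares a given field. The class WhichJar is hypothetical, and the default class name org.apache.jasper.Constants is an assumption on my part about where IS_SECURITY_ENABLED lives (the stack trace above does not say); pass the real class name as the first argument.

```java
import java.lang.reflect.Field;
import java.security.CodeSource;

// Hypothetical diagnostic helper, standard library only.
public class WhichJar {

    // Location (usually a jar URL) the class was loaded from.
    static String sourceOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "bootstrap or unknown";
        }
        return source.getLocation().toString();
    }

    // True if the class declares or inherits a public field with this name.
    static boolean hasField(Class<?> type, String fieldName) {
        for (Field f : type.getDeclaredFields()) {
            if (f.getName().equals(fieldName)) {
                return true;
            }
        }
        for (Field f : type.getFields()) {
            if (f.getName().equals(fieldName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed defaults; neither is confirmed by the original error message.
        String className = args.length > 0 ? args[0] : "org.apache.jasper.Constants";
        String fieldName = args.length > 1 ? args[1] : "IS_SECURITY_ENABLED";
        try {
            // initialize=false: inspect the class without running its static initializers.
            Class<?> type = Class.forName(className, false, WhichJar.class.getClassLoader());
            System.out.println(className + " loaded from " + sourceOf(type));
            System.out.println("declares " + fieldName + ": " + hasField(type, fieldName));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

If the class resolves to the jar you meant to exclude, or the field is reported as missing, the exclusion has not taken effect for that module yet.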

New Contributor
Posts: 5
Registered: ‎12-17-2018

Re: Conflicting versions of classes in Cloudera 5.14.4

Also, in my test project I split Sqoop and Spark into two separate modules, which keeps their conflicting Hadoop MapReduce dependencies apart.
