Storm JAR Version Conflicts

avatar
Rising Star

Hi all,

We have a Storm topology with a bolt that must go through a proxy. To do so we are using httpcore and httpclient, but the versions we use are newer than the versions that ship with Storm. The newer versions have some methods that we rely on and that the older versions packaged with Storm lack. This causes our bolt to fail repeatedly with a 'NoSuchMethod' error. We believe the bolt is picking up the older version on the classpath rather than the newer version that we packaged into the fat jar with the topology.

In MapReduce we can set the job to respect the user classpath first. Is there any such feature in Storm that we can use to get around this, other than implementing our own ClassLoader?
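
One way to confirm that suspicion is to ask the JVM where it actually loaded the class from. The sketch below is only a diagnostic, not a fix. `ClassOrigin` and `locate` are hypothetical names introduced here; the only API it relies on is the standard `Class.getProtectionDomain().getCodeSource()`.

```java
import java.security.CodeSource;

// Diagnostic sketch: report which jar (or directory) a class was loaded from.
final class ClassOrigin {
    private ClassOrigin() {}

    /** Returns the code source location of the class, or a note when the JVM exposes none. */
    static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typical for classes loaded by the bootstrap loader, e.g. java.lang.String.
            return "(no code source; likely a JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass the fully qualified class name from the stack trace as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " loaded from " + locate(Class.forName(name)));
    }
}
```

Called from a bolt's prepare method with the class named in the stack trace (for example org.apache.http.client.HttpClient), the result should show whether the class came from the topology jar or from some other location on the worker's classpath.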

1 ACCEPTED SOLUTION

avatar
Rising Star

The solution was to remove the offending jars from the /extlib-daemon folder. It turns out these jars are there for Ranger and not for Storm itself. Because we are not using Ranger with Storm, removing them was not a problem.
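
For anyone hunting for the offending jars, a small scan of the folder can list the jars that bundle a given package. This is a minimal sketch using only the JDK; `JarScanner` and `jarsContaining` are hypothetical names, and the directory and the `org/apache/http/` prefix are assumptions you would adapt to your install.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Sketch: list the jars in a directory that contain entries under a given path prefix.
final class JarScanner {
    private JarScanner() {}

    static List<Path> jarsContaining(Path dir, String entryPrefix) throws IOException {
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    // Jar entry names use '/' separators, e.g. org/apache/http/HttpHost.class
                    boolean match = jarFile.stream()
                            .anyMatch(entry -> entry.getName().startsWith(entryPrefix));
                    if (match) {
                        hits.add(jar);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: directory to scan (assumption: your extlib-daemon folder)
        // args[1]: optional entry prefix; defaults to the HttpComponents package path
        Path dir = Paths.get(args[0]);
        String prefix = args.length > 1 ? args[1] : "org/apache/http/";
        for (Path jar : jarsContaining(dir, prefix)) {
            System.out.println(jar);
        }
    }
}
```

The scan only reports candidates; whether a jar is safe to remove still depends on what else on the node needs it (in our case, Ranger).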


8 REPLIES

avatar
Master Mentor

Did you try using the Maven <exclusions> tag to exclude the versions shipped with Storm? https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html

avatar

@jniemiec Which version of Storm are you using? From 0.10 onwards we relocated the dependencies in Storm so that user topologies won't run into conflicts like the one above.

avatar

@jniemiec I had a NoSuchMethod error when trying to run the Twitter demo in HDP 2.3.2. Attached is the pom.xml that worked for me; maybe it can help: pomxml.txt

avatar
Expert Contributor

You'll need to use the Maven Shade plugin to relocate your httpcore and httpclient classes to a different package in order to avoid a conflict. See "Relocating Classes" in the plugin documentation.

Should be something like this (not tested):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.2</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Match the Java package (org.apache.http), not the Maven groupId (org.apache.httpcomponents) -->
            <pattern>org.apache.http</pattern>
            <shadedPattern>org.shaded.http</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>

avatar
Rising Star

Hi guys,

Using Maven is not allowed by the Legal team here; we can only use Gradle, and only what Gradle packages out of the box, no add-ons. Out of the box, Gradle does not have the ability to relocate package names to prevent collisions the way Maven Shade does.

Yes @schintalapani, we are using Storm 0.10.

This Jira looks like EXACTLY what we need.

https://issues.apache.org/jira/browse/STORM-129

avatar

Per the Storm documentation (http://storm.apache.org/releases/0.10.0/Setting-up-a-Storm-cluster.html): "If you need support from external libraries or custom plugins, you can place such jars into the extlib/ and extlib-daemon/ directories. Note that the extlib-daemon/ directory stores jars used only by daemons (Nimbus, Supervisor, DRPC, UI, Logviewer), e.g., HDFS and customized scheduling libraries. Accordingly, two environmental variables STORM_EXT_CLASSPATH and STORM_EXT_CLASSPATH_DAEMON can be configured by users for including the external classpath and daemon-only external classpath."

This means that extlib-daemon should not be on the classpath for workers. This functionality was introduced with STORM-483 (https://github.com/apache/storm/commit/05306d5053ff91bd323c4b54cd246c9f928ca339), and supervisor.clj was supposed to be updated as follows:

           topo-classpath (if-let [cp (storm-conf TOPOLOGY-CLASSPATH)]
                             [cp]
                             [])
 -          classpath (-> (current-classpath)
 +          classpath (-> (worker-classpath)
                          (add-to-classpath [stormjar])
                          (add-to-classpath topo-classpath))
            top-gc-opts (storm-conf TOPOLOGY-WORKER-GC-CHILDOPTS)

However, I decompiled the Hortonworks storm-core jar, and the old version of the code that calls current-classpath still appears:

    public static final Var const__10 = RT.var((String)"backtype.storm.util", (String)"add-to-classpath");
    public static final Var const__11 = RT.var((String)"backtype.storm.util", (String)"current-classpath");

...
        v15 = new Object[1];
        v16 = stormjar;
        stormjar = null;
        v15[0] = v16;
        v17 = topo_classpath;
        topo_classpath = null;
        classpath = ((IFn)supervisor$fn__6546.const__10.getRawRoot()).invoke(((IFn)supervisor$fn__6546.const__10.getRawRoot()).invoke(((IFn)supervisor$fn__6546.const__11.getRawRoot()).invoke(), (Object)RT.vector((Object[])v15)), (Object)v17);

I believe that worker-classpath was designed to construct a classpath for a worker JVM that does not include daemon-specific locations (like extlib-daemon). However, since the HDP version does not call worker-classpath, the worker ends up inheriting the supervisor's classpath via the call to current-classpath. I checked storm-core-0.10.0.2.4.1.1-3.jar, which I believe is the latest HDP build, and it still does not call worker-classpath. This seems like a bug.
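
To check this on a running cluster without decompiling anything, a worker can inspect its own classpath. The following is a rough sketch, not part of Storm: `WorkerClasspathCheck` and `entriesContaining` are hypothetical names, and it assumes the daemon-only jars can be recognized by "extlib-daemon" appearing in their path. It reads only the standard `java.class.path` system property.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch: find classpath entries whose path contains a marker such as "extlib-daemon".
final class WorkerClasspathCheck {
    private WorkerClasspathCheck() {}

    static List<String> entriesContaining(String classpath, String marker) {
        List<String> hits = new ArrayList<>();
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows.
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.contains(marker)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path", "");
        List<String> hits = entriesContaining(classpath, "extlib-daemon");
        if (hits.isEmpty()) {
            System.out.println("No extlib-daemon entries on this JVM's classpath");
        } else {
            hits.forEach(entry -> System.out.println("Daemon-only entry visible: " + entry));
        }
    }
}
```

If the same check, run inside a bolt, reports extlib-daemon entries, that would be consistent with the worker inheriting the supervisor's classpath as described above.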