<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to Use Spark MLLib Model in Storm? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139819#M102444</link>
    <description>&lt;P&gt;You can use PMML (https://de.wikipedia.org/wiki/Predictive_Model_Markup_Language).&lt;/P&gt;&lt;P&gt;Spark supports exporting some, but not all, models to PMML:&lt;/P&gt;&lt;P&gt;&lt;A href="http://spark.apache.org/docs/latest/mllib-pmml-model-export.html" target="_blank"&gt;http://spark.apache.org/docs/latest/mllib-pmml-model-export.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;(&lt;STRONG&gt;UPDATE&lt;/STRONG&gt;: As &lt;A rel="user" href="https://community.cloudera.com/users/104/sball.html" nodeid="104"&gt;@Simon Elliston Ball&lt;/A&gt; rightly points out in his answer, if a model cannot be exported to PMML, the Spark libraries can be reused directly, as most of them have no dependency on the SparkContext.)&lt;/P&gt;&lt;P&gt;One way is to use JPMML with Java in Storm:&lt;/P&gt;&lt;P&gt;&lt;A href="http://henning.kropponline.de/2015/09/06/jpmml-example-random-forest/" target="_blank"&gt;http://henning.kropponline.de/2015/09/06/jpmml-example-random-forest/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/jpmml/jpmml-storm" target="_blank"&gt;https://github.com/jpmml/jpmml-storm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Another option is to use R in Storm. I have seen it done, but don't have a reference at hand.&lt;/P&gt;</description>
    <pubDate>Tue, 22 Mar 2016 23:14:50 GMT</pubDate>
    <dc:creator>hkropp</dc:creator>
    <dc:date>2016-03-22T23:14:50Z</dc:date>
    <item>
      <title>How to Use Spark MLLib Model in Storm?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139818#M102443</link>
      <description>&lt;P&gt;Is there a way to train the model offline in Spark MLLib, and then use it for online ML in Storm?&lt;/P&gt;</description>
      <pubDate>Tue, 22 Mar 2016 22:58:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139818#M102443</guid>
      <dc:creator>yjiang</dc:creator>
      <dc:date>2016-03-22T22:58:48Z</dc:date>
    </item>
    <item>
      <title>Re: How to Use Spark MLLib Model in Storm?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139819#M102444</link>
      <description>&lt;P&gt;You can use PMML (https://de.wikipedia.org/wiki/Predictive_Model_Markup_Language).&lt;/P&gt;&lt;P&gt;Spark supports exporting some, but not all, models to PMML:&lt;/P&gt;&lt;P&gt;&lt;A href="http://spark.apache.org/docs/latest/mllib-pmml-model-export.html" target="_blank"&gt;http://spark.apache.org/docs/latest/mllib-pmml-model-export.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;(&lt;STRONG&gt;UPDATE&lt;/STRONG&gt;: As &lt;A rel="user" href="https://community.cloudera.com/users/104/sball.html" nodeid="104"&gt;@Simon Elliston Ball&lt;/A&gt; rightly points out in his answer, if a model cannot be exported to PMML, the Spark libraries can be reused directly, as most of them have no dependency on the SparkContext.)&lt;/P&gt;&lt;P&gt;One way is to use JPMML with Java in Storm:&lt;/P&gt;&lt;P&gt;&lt;A href="http://henning.kropponline.de/2015/09/06/jpmml-example-random-forest/" target="_blank"&gt;http://henning.kropponline.de/2015/09/06/jpmml-example-random-forest/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/jpmml/jpmml-storm" target="_blank"&gt;https://github.com/jpmml/jpmml-storm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Another option is to use R in Storm. I have seen it done, but don't have a reference at hand.&lt;/P&gt;</description>
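      <!--
      To illustrate what a PMML file carries into the Storm side, here is a
      minimal sketch that reads the intercept and coefficients of one regression
      table with the JDK XML parser and scores a record. It is a hand-rolled
      stand-in, not JPMML: the class name TinyPmmlRegression, the field names x1
      and x2, and the sample fragment are assumptions made for illustration, and
      the fragment is deliberately stripped down rather than a complete PMML
      document. In a real topology a library such as JPMML would do this work.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: scores a single linear regression table taken from PMML text.
final class TinyPmmlRegression {
    // Hand-written, stripped-down fragment (assumption, not real Spark output).
    static final String SAMPLE_PMML =
        "<PMML xmlns='http://www.dmg.org/PMML-4_2' version='4.2'>"
        + "<RegressionModel functionName='regression'>"
        + "<RegressionTable intercept='0.5'>"
        + "<NumericPredictor name='x1' coefficient='2.0'/>"
        + "<NumericPredictor name='x2' coefficient='-1.0'/>"
        + "</RegressionTable>"
        + "</RegressionModel>"
        + "</PMML>";

    private final double intercept;
    private final Map<String, Double> coefficients;

    private TinyPmmlRegression(double intercept, Map<String, Double> coefficients) {
        this.intercept = intercept;
        this.coefficients = coefficients;
    }

    /** Parses the first RegressionTable element found in the given PMML text. */
    static TinyPmmlRegression parse(String pmml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(pmml.getBytes(StandardCharsets.UTF_8)));
            NodeList tables = doc.getElementsByTagName("RegressionTable");
            if (tables.getLength() == 0) {
                throw new IllegalArgumentException("no RegressionTable element found");
            }
            Element table = (Element) tables.item(0);
            double intercept = Double.parseDouble(table.getAttribute("intercept"));
            Map<String, Double> coefficients = new LinkedHashMap<>();
            NodeList predictors = table.getElementsByTagName("NumericPredictor");
            for (int i = 0; i < predictors.getLength(); i++) {
                Element p = (Element) predictors.item(i);
                coefficients.put(p.getAttribute("name"),
                    Double.parseDouble(p.getAttribute("coefficient")));
            }
            return new TinyPmmlRegression(intercept, coefficients);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse PMML", e);
        }
    }

    /** Computes intercept + sum(coefficient * feature value) over all predictors. */
    double score(Map<String, Double> features) {
        double result = intercept;
        for (Map.Entry<String, Double> e : coefficients.entrySet()) {
            Double value = features.get(e.getKey());
            if (value == null) {
                throw new IllegalArgumentException("missing feature: " + e.getKey());
            }
            result += e.getValue() * value;
        }
        return result;
    }

    public static void main(String[] args) {
        TinyPmmlRegression model = parse(SAMPLE_PMML);
        Map<String, Double> features = new LinkedHashMap<>();
        features.put("x1", 3.0);
        features.put("x2", 1.0);
        System.out.println(model.score(features));
    }
}
```

      Inside a bolt, the parsed model would typically be built once in prepare()
      and reused for every tuple, rather than re-parsed per record.
      -->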
      <pubDate>Tue, 22 Mar 2016 23:14:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139819#M102444</guid>
      <dc:creator>hkropp</dc:creator>
      <dc:date>2016-03-22T23:14:50Z</dc:date>
    </item>
    <item>
      <title>Re: How to Use Spark MLLib Model in Storm?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139820#M102445</link>
      <description>&lt;P&gt;In a more advanced architecture, you would use ZooKeeper to announce a new model to the running topology without taking it offline.&lt;/P&gt;</description>
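      <!--
      A minimal sketch of the hot-swap half of that idea, assuming the ZooKeeper
      watcher lives elsewhere and simply hands over a freshly loaded model. The
      names HotSwappableModel, Model and onModelAnnounced are hypothetical, and
      no ZooKeeper or Storm API is used here; the point is only that the scoring
      path reads the model through an atomic reference, so it can be replaced
      while tuples keep flowing.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder a bolt could keep as a field.
final class HotSwappableModel {
    /** Anything that can turn a feature vector into a score. */
    interface Model {
        double score(double[] features);
    }

    private final AtomicReference<Model> current;

    HotSwappableModel(Model initial) {
        this.current = new AtomicReference<>(Objects.requireNonNull(initial));
    }

    /** Would be called from the watcher thread once a new model has been loaded. */
    void onModelAnnounced(Model replacement) {
        current.set(Objects.requireNonNull(replacement));
    }

    /** Would be called on the tuple-processing path; always uses the latest model. */
    double score(double[] features) {
        return current.get().score(features);
    }

    public static void main(String[] args) {
        HotSwappableModel holder = new HotSwappableModel(f -> f[0] + f[1]);
        System.out.println(holder.score(new double[] {1.0, 2.0}));
        // Simulate an announcement of a retrained model.
        holder.onModelAnnounced(f -> 2.0 * (f[0] + f[1]));
        System.out.println(holder.score(new double[] {1.0, 2.0}));
    }
}
```

      Loading and validating the new model before calling onModelAnnounced keeps
      a broken model file from ever reaching the scoring path.
      -->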
      <pubDate>Tue, 22 Mar 2016 23:17:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139820#M102445</guid>
      <dc:creator>hkropp</dc:creator>
      <dc:date>2016-03-22T23:17:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to Use Spark MLLib Model in Storm?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139821#M102446</link>
      <description>&lt;P&gt;PMML is certainly a good option, but be aware that Spark's PMML export does not support the transformation elements of PMML, so you will need to recreate any feature scaling and transformation yourself before the scoring step.&lt;/P&gt;&lt;P&gt;The other thing to note is that many of the Spark model classes do not depend on the SparkContext, so you can add Spark as a dependency of your Storm topology and use the Spark model directly.&lt;/P&gt;&lt;P&gt;This pulls some unnecessary code into your jar, but has the advantage that you don't need to go through the PMML format.&lt;/P&gt;</description>
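      <!--
      A minimal sketch of what recreating the scaling step can look like, assuming
      the training job standardized each feature as (x - mean) / std and fitted a
      logistic model on the result. The class name ScaledLinearScorer and every
      number below are made-up placeholders, not values from a real Spark model;
      the real means, standard deviations and weights would have to be exported
      from the training job alongside the model.

```java
// Hypothetical scorer that re-applies training-time standardization before scoring.
final class ScaledLinearScorer {
    private final double[] mean;
    private final double[] std;
    private final double[] weights;
    private final double intercept;

    ScaledLinearScorer(double[] mean, double[] std, double[] weights, double intercept) {
        if (mean.length != std.length || mean.length != weights.length) {
            throw new IllegalArgumentException("mean, std and weights must have the same length");
        }
        this.mean = mean.clone();
        this.std = std.clone();
        this.weights = weights.clone();
        this.intercept = intercept;
    }

    /** Standardizes a raw feature vector: (x - mean) / std per feature. */
    double[] scale(double[] raw) {
        if (raw.length != mean.length) {
            throw new IllegalArgumentException("expected " + mean.length + " features");
        }
        double[] scaled = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            // A zero std means the feature was constant in training; map it to 0.
            scaled[i] = std[i] == 0.0 ? 0.0 : (raw[i] - mean[i]) / std[i];
        }
        return scaled;
    }

    /** Linear margin: intercept plus the dot product of weights and scaled features. */
    double margin(double[] raw) {
        double[] scaled = scale(raw);
        double sum = intercept;
        for (int i = 0; i < scaled.length; i++) {
            sum += weights[i] * scaled[i];
        }
        return sum;
    }

    /** Logistic probability of the positive class. */
    double probability(double[] raw) {
        return 1.0 / (1.0 + Math.exp(-margin(raw)));
    }

    public static void main(String[] args) {
        ScaledLinearScorer scorer = new ScaledLinearScorer(
            new double[] {10.0, 0.5},   // placeholder training means
            new double[] {2.0, 0.25},   // placeholder training standard deviations
            new double[] {1.0, -2.0},   // placeholder weights
            0.0);                       // placeholder intercept
        System.out.println(scorer.probability(new double[] {12.0, 0.75}));
    }
}
```

      Feeding unscaled values to a model trained on scaled ones gives scores that
      look plausible but are wrong, so it is worth checking a few records against
      the predictions of the original Spark job.
      -->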
      <pubDate>Wed, 23 Mar 2016 00:08:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139821#M102446</guid>
      <dc:creator>sball</dc:creator>
      <dc:date>2016-03-23T00:08:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to Use Spark MLLib Model in Storm?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139822#M102447</link>
      <description>&lt;P&gt;+1 for reusing the Spark code itself.&lt;/P&gt;</description>
      <pubDate>Wed, 23 Mar 2016 02:22:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Use-Spark-MLLib-Model-in-Storm/m-p/139822#M102447</guid>
      <dc:creator>hkropp</dc:creator>
      <dc:date>2016-03-23T02:22:58Z</dc:date>
    </item>
  </channel>
</rss>

