<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: how to query data from mongodb with spark in zeeplin? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177720#M77510</link>
    <description>&lt;P&gt;Thanks &lt;A rel="user" href="https://community.cloudera.com/users/11048/falbani.html" nodeid="11048"&gt;@Felix Albani&lt;/A&gt;! I have a question: is the Spark interpreter the best choice in this case? Does Spark work with MongoDB the same way it does with HDFS (in terms of memory and speed)?&lt;/P&gt;</description>
    <pubDate>Fri, 01 Jun 2018 21:39:23 GMT</pubDate>
    <dc:creator>chaouki_trabels</dc:creator>
    <dc:date>2018-06-01T21:39:23Z</dc:date>
    <item>
      <title>how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177717#M77507</link>
      <description>&lt;P&gt;Hello, I recently installed HDP using the HDP 2.6 sandbox in VMware. I ran some Spark jobs to transform CSV data and saved the results in MongoDB. Now I want to build a dashboard of charts from my MongoDB database, so I added the MongoDB interpreter to Zeppelin. However, the MongoDB interpreter does not seem to cope well, since one of my collections contains 2 GB of data, so I decided to work with Spark instead. How can I read data from MongoDB? I need to import some libraries, for example com.mongodb.spark.sql._, com.mongodb.spark.config.ReadConfig and com.mongodb.spark.MongoSpark. How can I do this in Zeppelin?&lt;/P&gt;&lt;P&gt;Also, is the Spark interpreter a better choice than the MongoDB interpreter in this context?&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Mon, 23 Apr 2018 16:38:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177717#M77507</guid>
      <dc:creator>chaouki_trabels</dc:creator>
      <dc:date>2018-04-23T16:38:59Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177718#M77508</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/71223/chaoukitrabelsi.html" nodeid="71223"&gt;@chaouki trabelsi &lt;/A&gt;&lt;P&gt;Are there any updates on this issue? It is very important.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jun 2018 19:25:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177718#M77508</guid>
      <dc:creator>144675</dc:creator>
      <dc:date>2018-06-01T19:25:25Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177719#M77509</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/71223/chaoukitrabelsi.html" nodeid="71223"&gt;@chaouki trabelsi&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/47540/144675.html" nodeid="47540"&gt;@Victor&lt;/A&gt;&lt;/P&gt;&lt;P&gt;There are two approaches you can take: one uses a package, and the other uses jars (which you need to download yourself).&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Package approach&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Add the following configuration to your Zeppelin Spark interpreter:&lt;/P&gt;&lt;PRE&gt;spark.jars.packages = org.mongodb.spark:mongo-spark-connector_2.11:2.2.2
# for more information read here &lt;A href="https://spark-packages.org/package/mongodb/mongo-spark" target="_blank"&gt;https://spark-packages.org/package/mongodb/mongo-spark&lt;/A&gt;&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;Jar approach&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;You need to add the MongoDB connector jars to the Spark interpreter configuration. &lt;/P&gt;&lt;P&gt;1. Download the MongoDB connector jar for Spark. Make sure the jar's Scala version matches your Spark version (for Spark 2, use Scala 2.11). &lt;/P&gt;&lt;P&gt;2. Add the jars to the Zeppelin Spark interpreter using the spark.jars property:&lt;/P&gt;&lt;PRE&gt;spark.jars = /location/of/jars&lt;/PRE&gt;&lt;P&gt;In both cases, you need to save the configuration and restart the interpreter.&lt;/P&gt;&lt;P&gt;HTH&lt;/P&gt;&lt;P&gt;*** If you found this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jun 2018 21:06:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177719#M77509</guid>
      <dc:creator>falbani</dc:creator>
      <dc:date>2018-06-01T21:06:12Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177720#M77510</link>
      <description>&lt;P&gt;Thanks &lt;A rel="user" href="https://community.cloudera.com/users/11048/falbani.html" nodeid="11048"&gt;@Felix Albani&lt;/A&gt;! I have a question: is the Spark interpreter the best choice in this case? Does Spark work with MongoDB the same way it does with HDFS (in terms of memory and speed)?&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jun 2018 21:39:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177720#M77510</guid>
      <dc:creator>chaouki_trabels</dc:creator>
      <dc:date>2018-06-01T21:39:23Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177721#M77511</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/71223/chaoukitrabelsi.html" nodeid="71223"&gt;@chaouki trabelsi&lt;/A&gt; The MongoDB connector is built to leverage Spark parallelism, so I think it is a good alternative in this case. If you have further questions on how to use it or anything else, please open a separate thread. Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jun 2018 21:47:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177721#M77511</guid>
      <dc:creator>falbani</dc:creator>
      <dc:date>2018-06-01T21:47:27Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177722#M77512</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11048/falbani.html" nodeid="11048" target="_blank"&gt;@Felix Albani&lt;/A&gt; Can you kindly help me with this error?&lt;/P&gt;&lt;P&gt;My code is:&lt;/P&gt;&lt;PRE&gt;%spark2.pyspark
from pyspark.sql import SparkSession
my_spark = (SparkSession.builder
    .appName("myApp")
    .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/db.col")
    .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/db.col")
    .getOrCreate())&lt;/PRE&gt;&lt;DIV&gt;The output is:&lt;/DIV&gt;&lt;PRE&gt;&amp;lt;pyspark.sql.session.SparkSession object at 0x7ffa96a92c18&amp;gt; &lt;/PRE&gt;&lt;DIV&gt;Then this line causes an error:&lt;/DIV&gt;&lt;PRE&gt;df = my_spark.read.format("com.mongodb.spark.sql.DefaultSource").load()&lt;/PRE&gt;&lt;P&gt;The error is:&lt;/P&gt;&lt;P&gt;": java.lang.NoClassDefFoundError: com/mongodb/ConnectionString"&lt;/P&gt;&lt;P&gt;Here is the jar file I added in the Zeppelin interpreter:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="77820-1.png" style="width: 1222px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19547iF62283CA9C25E869/image-size/medium?v=v2&amp;amp;px=400" role="button" title="77820-1.png" alt="77820-1.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 09:40:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177722#M77512</guid>
      <dc:creator>144675</dc:creator>
      <dc:date>2019-08-18T09:40:04Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177723#M77513</link>
      <description>&lt;P&gt;
	Hi all, I am getting the following error when trying to query MongoDB from Zeppelin with Spark:&lt;/P&gt;&lt;P&gt;
	java.lang.IllegalArgumentException: Missing collection name. Set via the 'spark.mongodb.input.uri' or 'spark.mongodb.input.collection' property&lt;/P&gt;&lt;P&gt;
	I have set mongo-spark-connector_2.11:2.2.2 in the dependencies of the spark2 interpreter,&lt;/P&gt;&lt;P&gt;
	and my code is:&lt;/P&gt;&lt;PRE&gt;%spark2
import com.mongodb.spark._
spark.conf.set("spark.mongodb.input.uri", "mongodb://myip:myport/mydb.collection")
spark.conf.set("spark.mongodb.output.uri", "mongodb://myip:myport/mydb.collection")

val rdd = MongoSpark.load(sc)
&lt;/PRE&gt;&lt;P&gt;I also tried:&lt;/P&gt;&lt;PRE&gt;%spark2
sc.stop()
import org.apache.spark.sql.SparkSession
import com.mongodb.spark._
import com.mongodb.spark.config._

val spark_custom_session = SparkSession.builder()
      .master("local")
      .appName("ZeplinMongo")
      .config("spark.mongodb.input.database", "mongodb://myip:myport/mydb.collection")
      .config("spark.mongodb.output.uri", "mongodb://myip:myport/mydb.collection")
      .config("spark.mongodb.output.collection", "mongodb://myip:myport/mydb.collection")
      .getOrCreate()
val customRdd = MongoSpark.load(spark_custom_session)
rdd.count
&lt;/PRE&gt;&lt;P&gt;And&lt;/P&gt;&lt;PRE&gt;import com.mongodb.spark.config._
val readConfig = ReadConfig(Map(
    "spark.mongodb.input.uri" -&amp;gt; "mongodb://myip:myport/mydb.collection", 
    "spark.mongodb.input.readPreference.name" -&amp;gt; "secondaryPreferred"), 
    Some(ReadConfig(sc)))
val customRdd = MongoSpark.load(sc, readConfig)
customRdd.count&lt;/PRE&gt;&lt;P&gt;Whatever I do, I get:&lt;/P&gt;&lt;PRE&gt;import org.apache.spark.sql.SparkSession
import com.mongodb.spark._
import com.mongodb.spark.config._
spark_custom_session: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@4f9c7e5f
java.lang.IllegalArgumentException: Missing collection name. Set via the 'spark.mongodb.input.uri' or 'spark.mongodb.input.collection' property
  at com.mongodb.spark.config.MongoCompanionConfig$class.collectionName(MongoCompanionConfig.scala:270)
  at com.mongodb.spark.config.ReadConfig$.collectionName(ReadConfig.scala:39)
  at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:60)
  at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:39)
  at com.mongodb.spark.config.MongoCompanionConfig$class.apply(MongoCompanionConfig.scala:124)
  at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:39)
  at com.mongodb.spark.config.MongoCompanionConfig$class.apply(MongoCompanionConfig.scala:113)
  at com.mongodb.spark.config.ReadConfig$.apply(ReadConfig.scala:39)
  at com.mongodb.spark.MongoSpark$Builder.build(MongoSpark.scala:231)
  at com.mongodb.spark.MongoSpark$.load(MongoSpark.scala:84)
  ... 73 elided&lt;/PRE&gt;&lt;P&gt;PLEASE HELP! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 19:48:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177723#M77513</guid>
      <dc:creator>dpevni</dc:creator>
      <dc:date>2018-09-18T19:48:41Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177724#M77514</link>
      <description>&lt;P&gt;Hello &lt;A rel="user" href="https://community.cloudera.com/users/86330/dpevni.html" nodeid="86330"&gt;@Daniel Pevni&lt;/A&gt;, which Spark and MongoDB versions are you using?&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 20:15:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177724#M77514</guid>
      <dc:creator>chaoukitrabelsi</dc:creator>
      <dc:date>2018-09-18T20:15:08Z</dc:date>
    </item>
    <item>
      <title>Re: how to query data from mongodb with spark in zeeplin?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177725#M77515</link>
      <description>&lt;P&gt;Spark 2, MongoDB 3.2&lt;/P&gt;</description>
      <pubDate>Fri, 21 Sep 2018 05:20:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/how-to-query-data-from-mongodb-with-spark-in-zeeplin/m-p/177725#M77515</guid>
      <dc:creator>dpevni</dc:creator>
      <dc:date>2018-09-21T05:20:58Z</dc:date>
    </item>
  </channel>
</rss>

