<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark : How to make calls to database using foreachPartition in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Spark-How-to-make-calls-to-database-using-foreachPartition/m-p/123340#M86084</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/13772/chmamidala.html" nodeid="13772"&gt;@Aditya Mamidala&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Here's a working example of foreachPartition that I've used as part of a project. It comes from a Spark Streaming job in which "event" is a DStream, and each record is written to HBase via Phoenix (JDBC). The structure is similar to what you tried in your code: foreachRDD first, then foreachPartition, so that one JDBC connection is opened per partition rather than per record.&lt;/P&gt;&lt;PRE&gt;// requires: import java.sql.DriverManager
event.map(x =&amp;gt; x._2).foreachRDD { rdd =&amp;gt;
    rdd.foreachPartition { rddpartition =&amp;gt;
        val thinUrl = "jdbc:phoenix:phoenix.dev:2181:/hbase"
        val conn = DriverManager.getConnection(thinUrl)
        try {
            // Reuse one Statement for the whole partition
            val stmt = conn.createStatement()
            rddpartition.foreach { record =&amp;gt;
                stmt.execute("UPSERT INTO myTable VALUES (" + record._1 + ")")
            }
            conn.commit()
        } finally {
            conn.close()  // release the connection even if an upsert fails
        }
    }
}
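
// --- A PreparedStatement variant (my sketch, not part of the original
// answer): parameterizing the upsert avoids building SQL by string
// concatenation. It assumes record._1 is the single column value and
// reuses the same connection-per-partition pattern as above.
rdd.foreachPartition { rddpartition =&amp;gt;
    val conn = DriverManager.getConnection("jdbc:phoenix:phoenix.dev:2181:/hbase")
    try {
        val ps = conn.prepareStatement("UPSERT INTO myTable VALUES (?)")
        rddpartition.foreach { record =&amp;gt;
            ps.setObject(1, record._1)  // bind the value instead of concatenating
            ps.executeUpdate()
        }
        conn.commit()
    } finally {
        conn.close()
    }
}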
&lt;/PRE&gt;&lt;P&gt;The full project is located &lt;A href="https://github.com/zaratsian/network_topology_analysis/blob/master/SparkNetworkAnalysis/src/main/scala/com/github/zaratsian/SparkStreaming/SparkNetworkAnalysis.scala"&gt;here&lt;/A&gt;. &lt;/P&gt;</description>
    <pubDate>Mon, 27 Feb 2017 21:11:03 GMT</pubDate>
    <dc:creator>dzaratsian</dc:creator>
    <dc:date>2017-02-27T21:11:03Z</dc:date>
  </channel>
</rss>

