<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Not able to create SparkR dataframe using read.df in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126613#M47394</link>
    <description>&lt;P&gt;Using the Hortonworks Sandbox, I am setting up SparkR in both RStudio and Zeppelin. The code below works properly in RStudio and in the SparkR shell, but not in Zeppelin:&lt;/P&gt;&lt;PRE&gt;if (nchar(Sys.getenv("SPARK_HOME")) &amp;lt; 1) {
  Sys.setenv(SPARK_HOME = "/usr/hdp/2.5.0.0-1245/spark")
}
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sc &amp;lt;- sparkR.init(master = "local[*]", sparkEnvir = list(spark.driver.memory = "2g"), sparkPackages = "com.databricks:spark-csv_2.10:1.4.0")
sqlContext &amp;lt;- sparkRSQL.init(sc)
train_df &amp;lt;- read.df(sqlContext, "/tmp/first_8.csv", "csv", header = "true", inferSchema = "true")&lt;/PRE&gt;&lt;P&gt;But when I run the same code in Zeppelin using the livy.spark interpreter, I get a ClassNotFoundException:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;java.lang.ClassNotFoundException: Failed to find data source: csv. Please find packages at &lt;A href="http://spark-packages.org" target="_blank"&gt;http://spark-packages.org&lt;/A&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I am also loading the dependency through the dep interpreter:&lt;/P&gt;&lt;PRE&gt;%dep
z.reset()
z.load("com.databricks:spark-csv_2.10:1.4.0")&lt;/PRE&gt;&lt;P&gt;But this seems to have no effect. I have also tried manually copying &lt;STRONG&gt;spark-csv_2.10-1.4.0.jar&lt;/STRONG&gt; to /usr/hdp/2.5.0.0-1245/spark/lib, but that did not work either. Has anyone experienced this before? Thanks in advance.&lt;/P&gt;</description>
    <pubDate>Tue, 29 Nov 2016 07:36:55 GMT</pubDate>
    <dc:creator>mrizvi</dc:creator>
    <dc:date>2016-11-29T07:36:55Z</dc:date>
    <item>
      <title>Not able to create SparkR dataframe using read.df</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126613#M47394</link>
      <description>&lt;P&gt;Using the Hortonworks Sandbox, I am setting up SparkR in both RStudio and Zeppelin. The code below works properly in RStudio and in the SparkR shell, but not in Zeppelin:&lt;/P&gt;&lt;PRE&gt;if (nchar(Sys.getenv("SPARK_HOME")) &amp;lt; 1) {
  Sys.setenv(SPARK_HOME = "/usr/hdp/2.5.0.0-1245/spark")
}
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sc &amp;lt;- sparkR.init(master = "local[*]", sparkEnvir = list(spark.driver.memory = "2g"), sparkPackages = "com.databricks:spark-csv_2.10:1.4.0")
sqlContext &amp;lt;- sparkRSQL.init(sc)
train_df &amp;lt;- read.df(sqlContext, "/tmp/first_8.csv", "csv", header = "true", inferSchema = "true")&lt;/PRE&gt;&lt;P&gt;But when I run the same code in Zeppelin using the livy.spark interpreter, I get a ClassNotFoundException:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;java.lang.ClassNotFoundException: Failed to find data source: csv. Please find packages at &lt;A href="http://spark-packages.org" target="_blank"&gt;http://spark-packages.org&lt;/A&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I am also loading the dependency through the dep interpreter:&lt;/P&gt;&lt;PRE&gt;%dep
z.reset()
z.load("com.databricks:spark-csv_2.10:1.4.0")&lt;/PRE&gt;&lt;P&gt;But this seems to have no effect. I have also tried manually copying &lt;STRONG&gt;spark-csv_2.10-1.4.0.jar&lt;/STRONG&gt; to /usr/hdp/2.5.0.0-1245/spark/lib, but that did not work either. Has anyone experienced this before? Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Nov 2016 07:36:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126613#M47394</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2016-11-29T07:36:55Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to create SparkR dataframe using read.df</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126614#M47395</link>
      <description>&lt;P&gt;Please specify com.databricks:spark-csv_2.10:1.4.0 on the interpreter settings page.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Dec 2016 12:11:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126614#M47395</guid>
      <dc:creator>jzhang</dc:creator>
      <dc:date>2016-12-05T12:11:41Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to create SparkR dataframe using read.df</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126615#M47396</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/453/jzhang.html" nodeid="453"&gt;@jzhang&lt;/A&gt;, should I add it in the livy interpreter?&lt;/P&gt;</description>
      <pubDate>Tue, 06 Dec 2016 03:52:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126615#M47396</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2016-12-06T03:52:12Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to create SparkR dataframe using read.df</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126616#M47397</link>
      <description>&lt;P&gt;I tried that, but it didn't work.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Dec 2016 04:17:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126616#M47397</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2016-12-07T04:17:29Z</dc:date>
    </item>
    <item>
      <title>Re: Not able to create SparkR dataframe using read.df</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126617#M47398</link>
      <description>&lt;P&gt;Got it working finally, thanks to &lt;A rel="user" href="https://community.cloudera.com/users/2373/rhryniewicz.html" nodeid="2373"&gt;@Robert Hryniewicz&lt;/A&gt;. Go to the interpreter settings page and, under the livy settings, add the new property &lt;STRONG&gt;livy.spark.jars.packages&lt;/STRONG&gt; with the value &lt;STRONG&gt;com.databricks:spark-csv_2.10:1.4.0&lt;/STRONG&gt;. Then restart the interpreter and retry the query.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Dec 2016 06:34:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Not-able-to-create-SparkR-dataframe-using-read-df/m-p/126617#M47398</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2016-12-07T06:34:29Z</dc:date>
    </item>
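    <!--
      The accepted fix above, sketched as a plain properties line. This layout is
      illustrative only: in Zeppelin the value is entered on the livy interpreter
      settings page, not in a file. As the thread suggests, the %dep interpreter
      and the sparkPackages argument do not reach the livy-managed Spark session,
      which is why only this interpreter property resolved the error:

      livy.spark.jars.packages=com.databricks:spark-csv_2.10:1.4.0

      After saving the property, restart the livy interpreter and rerun read.df.
    -->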
  </channel>
</rss>