<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark SQL - Update Command in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136803#M23375</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3273/sridharbabu1138.html" nodeid="3273"&gt;@Sridhar Babu M&lt;/A&gt; you cannot update text files using sparksql. I would suggest saving/appending results using an RDD or DataFrame instead. &lt;/P&gt;</description>
    <pubDate>Mon, 28 Mar 2016 20:13:35 GMT</pubDate>
    <dc:creator>azeltov</dc:creator>
    <dc:date>2016-03-28T20:13:35Z</dc:date>
    <item>
      <title>Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136799#M23371</link>
      <description>&lt;P&gt;I am trying to update the value of a record using Spark SQL in the spark-shell.&lt;/P&gt;&lt;P&gt;I executed the command UPDATE tablename SET age=20 WHERE name=justin, and I am getting the following error:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;scala&amp;gt; val teenagers = sqlContext.sql("UPDATE people SET age=20 WHERE name=Justin") &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier UPDATE found

UPDATE people SET age=20 WHERE name=Justin
^&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;at scala.sys.package$.error(package.scala:27)
   at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;......&lt;/STRONG&gt;
&lt;STRONG&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Sridhar &lt;/P&gt;</description>
      <pubDate>Mon, 21 Mar 2016 11:16:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136799#M23371</guid>
      <dc:creator>sridharbabu1138</dc:creator>
      <dc:date>2016-03-21T11:16:07Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136800#M23372</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3273/sridharbabu1138.html" nodeid="3273"&gt;@Sridhar Babu M&lt;/A&gt; it depends on the data source you are updating; not all sources can be updated. What backend are you using for the people DataFrame?&lt;/P&gt;&lt;P&gt;For example, with Hive it’s possible to update data stored in ORC format:
&lt;A href="https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sql-hive-orc-example.html" target="_blank"&gt;https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sql-hive-orc-example.html&lt;/A&gt;
&lt;/P&gt;</description>
      <pubDate>Tue, 22 Mar 2016 00:21:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136800#M23372</guid>
      <dc:creator>azeltov</dc:creator>
      <dc:date>2016-03-22T00:21:31Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136801#M23373</link>
      <description>&lt;P&gt;The backend is the default Spark SQL; I run the Spark SQL queries in the spark-shell.&lt;/P&gt;&lt;P&gt;I have a people.txt file containing names along with ages.&lt;/P&gt;&lt;P&gt;I want to change the age for a particular name to some value.&lt;/P&gt;&lt;P&gt;Is it possible to change the value in a txt file using a Spark SQL query?&lt;/P&gt;&lt;P&gt;Is it possible to modify the value during map and reduce operations in Spark?&lt;/P&gt;&lt;P&gt;Note: I do not have Hive installed.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 22:15:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136801#M23373</guid>
      <dc:creator>sridharbabu1138</dc:creator>
      <dc:date>2016-03-25T22:15:59Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136802#M23374</link>
      <description>&lt;P&gt;I don't think Spark SQL supports DML on a text file data source yet. You need to create a DataFrame from the source file, register it as a temp table, select with a predicate to get the person whose age you want to update, apply a function to increment the age field, and then overwrite the old table with the new DataFrame. Here is the code:&lt;/P&gt;&lt;PRE&gt;import org.apache.spark.sql._

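// One record per line in the form "name, age"
case class Person(name: String, age: Int)
val personRDD = sc.textFile("/user/spark/people.txt")
val personDF = personRDD.map(x=&amp;gt;x.split(",")).map(x=&amp;gt;Person(x(0),(x(1).trim.toInt))).toDF()
personDF.registerTempTable("people")
// Select only the rows to update (new name, so personDF is not redefined)
val selectedDF = sqlContext.sql("SELECT name, age FROM people WHERE age = 19")
selectedDF.show()

// DataFrames are immutable, so build new rows with the age incremented
val agedPerson = selectedDF.map(x=&amp;gt;Person(x.getAs[String]("name"), x.getAs[Int]("age")+2)).toDF()
// Re-registering under the same name replaces the old temp table;
// it now holds only the selected rows
agedPerson.registerTempTable("people")
val agedPeopleDF = sqlContext.sql("SELECT * FROM people")
agedPeopleDF.show()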
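
// A sketch (not part of the steps above) of changing one person's age while
// keeping every other row, using the DataFrame API in place of SQL UPDATE.
// Assumptions: Spark 1.4+ in the spark-shell, the same people.txt layout, and
// the illustrative names allPeopleDF, updatedDF and the "people_updated" table.
import org.apache.spark.sql.functions.{col, lit, when}
import sqlContext.implicits._

val allPeopleDF = sc.textFile("/user/spark/people.txt").map(_.split(",")).map(x=&amp;gt;(x(0).trim, x(1).trim.toInt)).toDF("name", "age")
// when/otherwise rewrites age only where the name matches; nothing changes on disk
val updatedDF = allPeopleDF.withColumn("age", when(col("name") === "Justin", lit(20)).otherwise(col("age")))
updatedDF.registerTempTable("people_updated")
sqlContext.sql("SELECT * FROM people_updated").show()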
&lt;/PRE&gt;&lt;P&gt;This assumes that you have a SparkContext (sc) and a SQLContext (sqlContext) available, one person per line in the file, the file on HDFS at /user/spark/people.txt, and that you are running the shell as Spark client or in Zeppelin.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 11:03:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136802#M23374</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2016-03-28T11:03:56Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136803#M23375</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3273/sridharbabu1138.html" nodeid="3273"&gt;@Sridhar Babu M&lt;/A&gt; you cannot update text files using sparksql. I would suggest saving/appending results using an RDD or DataFrame instead. &lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 20:13:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136803#M23375</guid>
      <dc:creator>azeltov</dc:creator>
      <dc:date>2016-03-28T20:13:35Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136804#M23376</link>
      <description>&lt;P&gt;Yup, this code works fine, but after executing the lines above, &lt;/P&gt;&lt;P&gt;the contents of /user/spark/people.txt still show Justin's age as 19:&lt;/P&gt;&lt;P&gt;justin, 19&lt;/P&gt;&lt;P&gt;The value is not modified.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Mar 2016 20:29:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136804#M23376</guid>
      <dc:creator>sridharbabu1138</dc:creator>
      <dc:date>2016-03-30T20:29:24Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136805#M23377</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/3273/sridharbabu1138.html"&gt;Sridhar Babu M&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If you actually need to change the value in the file, you will need to export the resulting DataFrame to a file. The DataFrame save function creates one file per partition. If you need a single file, convert back to an RDD and use coalesce(1) to bring everything down to a single partition so you get one file. Make sure that you add the spark-csv dependency in &lt;/P&gt;&lt;P&gt;Zeppelin&lt;/P&gt;&lt;P&gt;%dep&lt;/P&gt;&lt;P&gt;z.load("com.databricks:spark-csv_2.10:1.4.0") &lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;spark-shell --packages com.databricks:spark-csv_2.10:1.4.0 &lt;/P&gt;&lt;PRE&gt;import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SaveMode


// One record per line in the form "name, age"
case class Person(name: String, age: Int)
val personRDD = sc.textFile("/user/spark/people.txt")
val personDF = personRDD.map(x=&amp;gt;x.split(",")).map(x=&amp;gt;Person(x(0),(x(1).trim.toInt))).toDF()
personDF.registerTempTable("people")


// Add 2 to Justin's age and copy every other row unchanged
val agedPerson = personDF.map(x=&amp;gt;if(x.getAs[String]("name")=="Justin"){Person(x.getAs[String]("name"), x.getAs[Int]("age")+2)}else{Person(x.getAs[String]("name"), x.getAs[Int]("age"))}).toDF()
// Replace the old temp table with the updated rows
agedPerson.registerTempTable("people")
val agedPeopleDF = sqlContext.sql("SELECT * FROM people")
agedPeopleDF.show()
 
// Write CSV with spark-csv; the result is a directory with one part file per partition
agedPeopleDF.select("name", "age").write.format("com.databricks.spark.csv").mode(SaveMode.Overwrite).save("agedPeople")
// For a single output file, coalesce to one partition before saving as text
val agedPeopleRDD = agedPeopleDF.rdd
agedPeopleRDD.coalesce(1).saveAsTextFile("agedPeopleSingleFile")
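
// A sketch of writing the result back as plain "name,age" lines, assuming the
// agedPeopleDF defined above; "agedPeopleCsvLines" is an illustrative output path.
// Row.mkString(",") avoids the [ ] wrapper that Row.toString adds. The output is a
// directory holding one part file, and the original people.txt is left untouched.
agedPeopleDF.rdd.map(_.mkString(",")).coalesce(1).saveAsTextFile("agedPeopleCsvLines")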

&lt;/PRE&gt;</description>
      <pubDate>Wed, 30 Mar 2016 23:04:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136805#M23377</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2016-03-30T23:04:48Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136806#M23378</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3656/vvaks.html" nodeid="3656"&gt;@Vadim&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I am getting the following errors after executing the statement&lt;/P&gt;&lt;PRE&gt;agedPeopleDF.select("name", "age").write.format("com.databricks.spark.csv").mode(SaveMode.Overwrite).save("agedPeople")&lt;/PRE&gt;&lt;P&gt;Output&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$less$colon$less;
at com.databricks.spark.csv.util.CompressionCodecs$.&amp;lt;init&amp;gt;(CompressionCodecs.scala:29)
at com.databricks.spark.csv.util.CompressionCodecs$.&amp;lt;clinit&amp;gt;(CompressionCodecs.scala)
at com.databricks.spark.csv.DefaultSource.createRelation(DefaultSource.scala:189)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139) ..............&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
      <pubDate>Thu, 31 Mar 2016 07:38:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136806#M23378</guid>
      <dc:creator>sridharbabu1138</dc:creator>
      <dc:date>2016-03-31T07:38:59Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136807#M23379</link>
      <description>&lt;PRE&gt;groupId: com.databricks
artifactId: spark-csv_2.10
version: 1.4.0&lt;/PRE&gt;
&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/3273/sridharbabu1138.html"&gt;@Sridhar Babu M&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Make sure that you add the dependency in&lt;/P&gt;&lt;P&gt;Zeppelin&lt;/P&gt;&lt;P&gt;%dep&lt;/P&gt;&lt;P&gt;z.load("com.databricks:spark-csv_2.10:1.4.0")&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;spark-shell --packages com.databricks:spark-csv_2.10:1.4.0&lt;/P&gt;</description>
      <pubDate>Thu, 31 Mar 2016 08:21:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136807#M23379</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2016-03-31T08:21:26Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136808#M23380</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3656/vvaks.html" nodeid="3656"&gt;@Vadim&lt;/A&gt; &lt;/P&gt;&lt;P&gt;It works fine if I run the command below&lt;/P&gt;&lt;P&gt;spark-shell --packages com.databricks:spark-csv_2.10:1.3.0&lt;/P&gt;&lt;P&gt;with the version set to 1.3.0 instead of 1.4.0.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Mar 2016 15:27:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136808#M23380</guid>
      <dc:creator>sridharbabu1138</dc:creator>
      <dc:date>2016-03-31T15:27:50Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL - Update Command</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136809#M23381</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/3273/sridharbabu1138.html"&gt;@Sridhar Babu M&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Glad it worked out. Would you mind accepting this answer and the one from the other thread?&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/24518/spark-sql-query-to-modify-values.html"&gt;https://community.hortonworks.com/questions/24518/spark-sql-query-to-modify-values.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Mar 2016 18:45:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-Update-Command/m-p/136809#M23381</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2016-03-31T18:45:17Z</dc:date>
    </item>
  </channel>
</rss>

