<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to Create/LOAD data into table through sparkQL with scala code only..? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159560#M121949</link>
    <description>&lt;P&gt;
This Scala code will create a DataFrame and load it into Hive. Hope this helps!&lt;/P&gt;&lt;PRE&gt;// Create dummy data and load it into a DataFrame
case class rowschema(id:Int, record:String)
val df = sqlContext.createDataFrame(Seq(rowschema(1,"record1"), rowschema(2,"record2"), rowschema(3,"record3")))
df.registerTempTable("tempTable")

// Create a new Hive table from the contents of tempTable
sqlContext.sql("create table newHiveTable as select * from tempTable")&lt;/PRE&gt;</description>
    <pubDate>Mon, 19 Sep 2016 23:24:07 GMT</pubDate>
    <dc:creator>dzaratsian</dc:creator>
    <dc:date>2016-09-19T23:24:07Z</dc:date>
    <item>
      <title>How to Create/LOAD data into table through sparkQL with scala code only..?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159559#M121948</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I want to create a Hive table and load data into it through Spark SQL using Scala code (I have to build a jar and execute it through spark-submit). Please help me; any guidance would be much appreciated.&lt;/P&gt;</description>
      <pubDate>Mon, 19 Sep 2016 19:35:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159559#M121948</guid>
      <dc:creator>ANJIREDDY</dc:creator>
      <dc:date>2016-09-19T19:35:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to Create/LOAD data into table through sparkQL with scala code only..?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159560#M121949</link>
      <description>&lt;P&gt;
This Scala code will create a DataFrame and load it into Hive. Hope this helps!&lt;/P&gt;&lt;PRE&gt;// Create dummy data and load it into a DataFrame
case class rowschema(id:Int, record:String)
val df = sqlContext.createDataFrame(Seq(rowschema(1,"record1"), rowschema(2,"record2"), rowschema(3,"record3")))
df.registerTempTable("tempTable")

// Create a new Hive table from the contents of tempTable
sqlContext.sql("create table newHiveTable as select * from tempTable")&lt;/PRE&gt;</description>
      <pubDate>Mon, 19 Sep 2016 23:24:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159560#M121949</guid>
      <dc:creator>dzaratsian</dc:creator>
      <dc:date>2016-09-19T23:24:07Z</dc:date>
    </item>
    <item>
      <title>Re: How to Create/LOAD data into table through sparkQL with scala code only..?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159561#M121950</link>
      <description>&lt;P&gt;Try with Zeppelin:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/34894/apache-zeppelin-with-hive.html" target="_blank"&gt;https://community.hortonworks.com/questions/34894/apache-zeppelin-with-hive.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;It makes it really easy to try things out.&lt;/P&gt;&lt;P&gt;See this article: &lt;A href="https://community.hortonworks.com/content/kbentry/34784/data-ingest-with-apache-zeppelin-apache-spark-16-h.html" target="_blank"&gt;https://community.hortonworks.com/content/kbentry/34784/data-ingest-with-apache-zeppelin-apache-spark-16-h.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 19 Sep 2016 23:29:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159561#M121950</guid>
      <dc:creator>TimothySpann</dc:creator>
      <dc:date>2016-09-19T23:29:59Z</dc:date>
    </item>
    <item>
      <title>Re: How to Create/LOAD data into table through sparkQL with scala code only..?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159562#M121951</link>
      <description>&lt;P&gt;Hi &lt;A href="https://community.hortonworks.com/users/11452/reddyam3.html"&gt;AnjiReddy&lt;/A&gt;&lt;/P&gt;&lt;P&gt;
Like Dan said, using some type of notebook would probably make your life a lot easier.
It will allow you to work with Spark interactively, with visual outputs to work from. &lt;/P&gt;&lt;P&gt;
Check out: &lt;/P&gt;&lt;P&gt;
- Zeppelin - &lt;A href="https://zeppelin.apache.org/"&gt;https://zeppelin.apache.org/ &lt;/A&gt;
Note: you can find a pre-installed version of Zeppelin in the HDP sandbox (if you don't want to go through the hassle of installing and configuring the tool). &lt;/P&gt;&lt;P&gt;- Jupyter - &lt;A href="http://jupyter.org/"&gt;http://jupyter.org/ &lt;/A&gt;&lt;/P&gt;&lt;P&gt;   Note: when working with Spark, you'll probably want to install the Spark (Scala) kernel &lt;A href="https://github.com/ibm-et/spark-kernel"&gt;https://github.com/ibm-et/spark-kernel&lt;/A&gt; &lt;/P&gt;&lt;P&gt;That said, you can make it work using only the Spark shell.
I assume that there is a file already available in HDFS at /user/zeppelin/yahoo_stocks.csv &lt;/P&gt;&lt;P&gt;CSV file: &lt;/P&gt;&lt;PRE&gt;date,open,high,low,close,volume,adj_close
2015-04-28,44.34,44.57,43.94,44.34,7188300,44.34
2015-04-27,44.65,45.10,44.25,44.36,10840900,44.36
2015-04-24,43.73,44.71,43.69,44.52,11267500,44.52
2015-04-23,43.92,44.06,43.58,43.70,14274900,43.70
2015-04-22,44.58,44.85,43.67,43.98,32241200,43.98
2015-04-21,45.15,45.18,44.45,44.49,16103700,44.49
2015-04-20,44.73,44.91,44.41,44.66,10052900,44.66
2015-04-17,45.30,45.44,44.25,44.45,13305700,44.45
2015-04-16,45.82,46.13,45.53,45.78,13800300,45.78
2015-04-15,45.46,45.83,45.23,45.73,15033500,45.73
2015-04-14,44.82,45.64,44.79,45.53,12365800,45.53
2015-04-13,45.25,45.59,44.72,44.77,8837300,44.77
2015-04-10,45.79,45.79,45.00,45.18,8436400,45.18
2015-04-09,45.70,46.17,45.16,45.63,13678000,45.63
2015-04-08,43.86,45.19,43.80,45.17,16071000,45.17
2015-04-07,43.79,44.22,43.56,43.61,11382000,43.61
2015-04-06,43.82,44.03,43.61,43.67,10717000,43.67
2015-04-02,44.24,44.36,43.68,44.15,12229400,44.15
2015-04-01,44.45,44.60,43.95,44.13,14722300,44.13
2015-03-31,44.82,45.20,44.42,44.44,10415500,44.44
2015-03-30,45.36,45.42,44.82,44.95,8884300,44.95
2015-03-27,45.20,45.67,45.01,45.10,20563500,45.10
2015-03-26,43.78,44.67,43.68,44.47,16162900,44.47
2015-03-25,44.59,44.93,44.13,44.20,14036900,44.20
2015-03-24,44.64,44.78,44.28,44.42,7559100,44.42
2015-03-23,45.25,45.54,44.71,44.72,8268800,44.72
2015-03-20,45.37,45.58,44.91,45.04,14194200,45.04
2015-03-19,44.90,45.45,44.81,44.98,14758000,44.98
2015-03-18,43.58,44.71,43.43,44.67,18919900,44.67


&lt;/PRE&gt;&lt;P&gt;Launch the Spark shell. &lt;/P&gt;&lt;P&gt;Note: the spark-csv package makes reading CSV files a lot easier - &lt;A href="https://github.com/databricks/spark-csv" target="_blank"&gt;https://github.com/databricks/spark-csv&lt;/A&gt; &lt;/P&gt;&lt;PRE&gt;spark-shell --master yarn --packages com.databricks:spark-csv_2.10:1.5.0 &lt;/PRE&gt;&lt;P&gt;Code:&lt;/P&gt;&lt;PRE&gt;// create a DataFrame from the file
val input_df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("delimiter",",").load("hdfs://sandbox.hortonworks.com:8020/user/zeppelin/yahoo_stocks.csv")

// save the DataFrame to Hive (the Spark way)
input_df.write.saveAsTable("%s.%s".format("default", "stockPrice"))

// save it the SQL way
input_df.registerTempTable("stock_price")
sqlContext.sql("create table hivestockprice as select * from stock_price")

// You can read the data back from Hive in the same way
sqlContext.sql("Select * from hivestockprice").show(10)
&lt;/PRE&gt;&lt;P&gt;The Zeppelin way: &lt;/P&gt;&lt;P&gt;Paragraph 1: Read data and save to Hive: &lt;/P&gt;&lt;PRE&gt;%spark
// read the file
val input_df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("delimiter",",").load("hdfs://sandbox.hortonworks.com:8020/user/zeppelin/yahoo_stocks.csv")

// save to Hive
input_df.write.saveAsTable("%s.%s".format("default", "stockPrice"))
&lt;/PRE&gt;&lt;P&gt;Paragraph 2: Read back from Hive using SQL&lt;/P&gt;&lt;PRE&gt;%sql
Select * from stockPrice &lt;/PRE&gt;</description>
      <pubDate>Tue, 20 Sep 2016 04:25:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159562#M121951</guid>
      <dc:creator>mlamairesse</dc:creator>
      <dc:date>2016-09-20T04:25:44Z</dc:date>
    </item>
    <item>
      <title>Re: How to Create/LOAD data into table through sparkQL with scala code only..?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159563#M121952</link>
      <description>&lt;P&gt;This worked well after adding a semicolon: &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;val df = sqlContext.createDataFrame(Seq(rowschema(1,"record1"), rowschema(2,"record2"), rowschema(3,"record3")));&lt;/P&gt;</description>
      <pubDate>Thu, 22 Feb 2018 22:36:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159563#M121952</guid>
      <dc:creator>Hagedorn</dc:creator>
      <dc:date>2018-02-22T22:36:50Z</dc:date>
    </item>
    <item>
      <title>Re: How to Create/LOAD data into table through sparkQL with scala code only..?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159564#M121953</link>
      <description>&lt;P&gt;It looks like tempTable is not sticking around:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;scala&amp;gt; case class rowschema(id:Int, record:String){&lt;BR /&gt;  | val df = sqlContext.createDataFrame(Seq(rowschema(1,"record1"), rowschema(2,"record2"), rowschema(3,"record3")));{&lt;BR /&gt;  |  df.registerTempTable("tempTable")&lt;BR /&gt;  | }&lt;BR /&gt;  | }&lt;BR /&gt;defined class rowschema&lt;/P&gt;&lt;P&gt;scala&amp;gt; sqlContext.sql("create table newHiveTable as select * from tempTable")&lt;BR /&gt;org.apache.spark.sql.AnalysisException:&lt;STRONG&gt; Table not found: tempTable&lt;/STRONG&gt;; line 1 pos 43&lt;/P&gt;</description>
      <pubDate>Thu, 22 Feb 2018 23:49:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-Create-LOAD-data-into-table-through-sparkQL-with/m-p/159564#M121953</guid>
      <dc:creator>Hagedorn</dc:creator>
      <dc:date>2018-02-22T23:49:43Z</dc:date>
    </item>
  </channel>
</rss>

