<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161263#M45107</link>
    <description>&lt;P&gt;Um, which version of Spark? The HDP 2.3 releases ship different ones:&lt;/P&gt;&lt;P&gt;1.3.1 =&amp;gt; HDP 2.3.0&lt;/P&gt;&lt;P&gt;1.4.1 =&amp;gt; HDP 2.3.2&lt;/P&gt;&lt;P&gt;1.5.2 =&amp;gt; HDP 2.3.4&lt;/P&gt;&lt;P&gt;I have a feeling it's Spark 1.3; they made some major improvements in Spark &amp;lt;=&amp;gt; Hive integration starting with Spark 1.4.1.&lt;/P&gt;</description>
    <pubDate>Thu, 10 Nov 2016 01:23:08 GMT</pubDate>
    <dc:creator>mlamairesse</dc:creator>
    <dc:date>2016-11-10T01:23:08Z</dc:date>
    <item>
      <title>Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161258#M45102</link>
      <description>&lt;P&gt;Hive table:&lt;/P&gt;&lt;P&gt;Original table&lt;/P&gt;&lt;P&gt;Database name: &lt;STRONG&gt;Student&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Table name: &lt;STRONG&gt;Student_detail&lt;/STRONG&gt;&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;id&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;STRONG&gt;name&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;STRONG&gt;dept&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;siva&lt;/TD&gt;&lt;TD&gt;cse&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;Required output:&lt;/P&gt;&lt;P&gt;Database name: &lt;STRONG&gt;CSE&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Table name: &lt;STRONG&gt;New_tudent_detail&lt;/STRONG&gt;&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;s_id&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;STRONG&gt;s_name&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;STRONG&gt;s_dept&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;siva&lt;/TD&gt;&lt;TD&gt;cse&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;I want to migrate the Student_detail Hive table into New_tudent_detail without data loss, &lt;STRONG&gt;using Spark&lt;/STRONG&gt;. The target has:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Different column names&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;A different database&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;A different table&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Nov 2016 20:29:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161258#M45102</guid>
      <dc:creator>hadoopsmi</dc:creator>
      <dc:date>2016-11-02T20:29:46Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161259#M45103</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/838/hadoopsmi.html" nodeid="838"&gt;@Sivasaravanakumar K&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Here's one way of going about this.&lt;/P&gt;&lt;P&gt;
Note that the example below is based on the sample data available on the Hortonworks sandbox.
Just change the database, table and column names to suit your needs.&lt;/P&gt;&lt;P&gt;0. Get database and table info&lt;/P&gt;&lt;PRE&gt;//show databases in Hive
sqlContext.sql("show databases").show

//show table in a database
sqlContext.sql("show tables in default").show

//read the table headers
sqlContext.sql("select * from default.sample_07").printSchema
&lt;/PRE&gt;&lt;P style="margin-left: 20px;"&gt;result &lt;/P&gt;&lt;PRE&gt;+--------+
|  result|
+--------+
| default|
|foodmart|
|  xademo|
+--------+

+---------+-----------+
|tableName|isTemporary|
+---------+-----------+
|sample_07|      false|
|sample_08|      false|
+---------+-----------+

root
 |-- code: string (nullable = true)
 |-- description: string (nullable = true)
 |-- total_emp: integer (nullable = true)
 |-- salary: integer (nullable = true)
&lt;/PRE&gt;&lt;P&gt;1. Read table data into a DataFrame :&lt;/P&gt;&lt;PRE&gt;// read data from Hive
val df = sqlContext.sql("select * from default.sample_07")
//Show Table Schema 
df.printSchema&lt;/PRE&gt;&lt;P style="margin-left: 20px;"&gt;result &lt;/P&gt;&lt;PRE&gt;root
 |-- code: string (nullable = true)
 |-- description: string (nullable = true)
 |-- total_emp: integer (nullable = true)
 |-- salary: integer (nullable = true)
&lt;/PRE&gt;&lt;P&gt;2. Change column names&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;Change a single column name with the withColumnRenamed function&lt;/P&gt;&lt;PRE&gt;val df_renamed = df.withColumnRenamed("salary", "money") 
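// To rename several columns this way, the calls can be chained, for example by folding
// over (old, new) pairs. A sketch only: renamePairs, df_multi and the target name
// "employees" are illustrative and not part of the original answer.
val renamePairs = Seq("salary" -> "money", "total_emp" -> "employees")
val df_multi = renamePairs.foldLeft(df) { case (acc, (from, to)) => acc.withColumnRenamed(from, to) }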
df_renamed.printSchema &lt;/PRE&gt;&lt;P style="margin-left: 20px;"&gt;Or all at once using a list of headers (toDF assigns the names by position)&lt;/P&gt;&lt;PRE&gt;val newNames = Seq("code_1", "description_1", "total_emp_1", "money_1") 
val df_renamed = df.toDF(newNames: _*) 
df_renamed.printSchema &lt;/PRE&gt;&lt;P style="margin-left: 20px;"&gt;Note that you can combine the read and the rename in one statement, so no intermediate DataFrame variable is needed&lt;/P&gt;&lt;PRE&gt;val newNames = Seq("code_1", "description_1", "total_emp_1", "money_1") 
val df = sqlContext.sql("select * from default.sample_07").toDF(newNames: _*)&lt;/PRE&gt;&lt;P style="margin-left: 20px;"&gt;Or all at once using SQL aliases (preferred)&lt;/P&gt;&lt;PRE&gt;val df = sqlContext.sql("select code as code_1, description as description_1, total_emp as total_emp_1, salary as money from default.sample_07") 
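// The aliased select can also be generated from (old, new) pairs instead of typed by hand.
// A sketch only: renames, selectList and aliasQuery are illustrative names that are not
// part of the original answer.
val renames = Seq("code" -> "code_1", "description" -> "description_1", "total_emp" -> "total_emp_1", "salary" -> "money")
val selectList = renames.map { case (from, to) => from + " as " + to }.mkString(", ")
val aliasQuery = "select " + selectList + " from default.sample_07"
// aliasQuery holds the same statement as the literal string above and can be passed to sqlContext.sql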

df.printSchema&lt;/PRE&gt;&lt;P style="margin-left: 20px;"&gt;result (using SQL alias)&lt;/P&gt;&lt;PRE&gt;df: org.apache.spark.sql.DataFrame = [code_1: string, description_1: string, total_emp_1: int, money: int]
root
 |-- code_1: string (nullable = true)
 |-- description_1: string (nullable = true)
 |-- total_emp_1: integer (nullable = true)
 |-- money: integer (nullable = true)
&lt;/PRE&gt;&lt;P&gt;3. Save back to Hive&lt;/P&gt;&lt;PRE&gt;//write to Hive (in ORC format) 
df.write.format("orc").saveAsTable("default.sample_07_new_schema") 

//read back and check new_schema
sqlContext.sql("select * from default.sample_07_new_schema").printSchema
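
// The same steps applied to the tables in the question (a sketch, not run against a real cluster).
// Assumptions: the CSE database may not exist yet, ORC is an acceptable format, and
// Spark 1.4 or later is in use. The names students, sourceRows and targetRows are illustrative.
sqlContext.sql("create database if not exists CSE")
val students = sqlContext.sql("select id as s_id, name as s_name, dept as s_dept from Student.Student_detail")
students.write.format("orc").saveAsTable("CSE.New_tudent_detail")

// Compare row counts to confirm that no rows were lost in the copy
val sourceRows = sqlContext.sql("select * from Student.Student_detail").count
val targetRows = sqlContext.sql("select * from CSE.New_tudent_detail").count
require(sourceRows == targetRows, "row counts differ: " + sourceRows + " vs " + targetRows)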
&lt;/PRE&gt;&lt;P&gt;result&lt;/P&gt;&lt;PRE&gt;root
 |-- code_1: string (nullable = true)
 |-- description_1: string (nullable = true)
 |-- total_emp_1: integer (nullable = true)
 |-- money: integer (nullable = true)
&lt;/PRE&gt;</description>
      <pubDate>Tue, 08 Nov 2016 03:41:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161259#M45103</guid>
      <dc:creator>mlamairesse</dc:creator>
      <dc:date>2016-11-08T03:41:37Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161260#M45104</link>
      <description>&lt;P&gt;Hi @Matthieu Lamairesse&lt;/P&gt;&lt;P&gt;Error:&lt;/P&gt;&lt;PRE&gt;scala&amp;gt; df.write.format("orc").saveAsTable("default.sample_07_new_schema")
&amp;lt;console&amp;gt;:33: error: value write is not a member of org.apache.spark.sql.DataFrame
              df.write.format("orc").saveAsTable("default.sample_07_new_schema")
                 ^
&lt;/PRE&gt;</description>
      <pubDate>Tue, 08 Nov 2016 15:29:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161260#M45104</guid>
      <dc:creator>hadoopsmi</dc:creator>
      <dc:date>2016-11-08T15:29:50Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161261#M45105</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/838/hadoopsmi.html" nodeid="838"&gt;@Sivasaravanakumar K&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I've simplified my answer a bit.
What version of Spark are you using? This was tested on Spark 1.6.2 on an HDP 2.5 sandbox.&lt;/P&gt;&lt;P&gt;Note: when using spark-shell, did you import the following?&lt;/P&gt;&lt;PRE&gt;import org.apache.spark.sql.hive.orc._
import org.apache.spark.sql._
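
// To check the version from the shell itself (sc is the SparkContext that spark-shell creates):
sc.version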
&lt;/PRE&gt;</description>
      <pubDate>Tue, 08 Nov 2016 19:37:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161261#M45105</guid>
      <dc:creator>mlamairesse</dc:creator>
      <dc:date>2016-11-08T19:37:23Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161262#M45106</link>
      <description>&lt;P&gt;I have already imported:&lt;/P&gt;&lt;PRE&gt;import org.apache.spark.sql.hive.orc._
import org.apache.spark.sql._&lt;/PRE&gt;&lt;P&gt;I still have the same issue. I am using HDP 2.3.&lt;/P&gt;</description>
      <pubDate>Wed, 09 Nov 2016 14:11:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161262#M45106</guid>
      <dc:creator>hadoopsmi</dc:creator>
      <dc:date>2016-11-09T14:11:41Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161263#M45107</link>
      <description>&lt;P&gt;Um, which version of Spark? The HDP 2.3 releases ship different ones:&lt;/P&gt;&lt;P&gt;1.3.1 =&amp;gt; HDP 2.3.0&lt;/P&gt;&lt;P&gt;1.4.1 =&amp;gt; HDP 2.3.2&lt;/P&gt;&lt;P&gt;1.5.2 =&amp;gt; HDP 2.3.4&lt;/P&gt;&lt;P&gt;I have a feeling it's Spark 1.3; they made some major improvements in Spark &amp;lt;=&amp;gt; Hive integration starting with Spark 1.4.1.&lt;/P&gt;</description>
      <pubDate>Thu, 10 Nov 2016 01:23:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161263#M45107</guid>
      <dc:creator>mlamairesse</dc:creator>
      <dc:date>2016-11-10T01:23:08Z</dc:date>
    </item>
    <item>
      <title>Re: Migrating from one Hive table to another Hive table using Spark, with different column names and database on the same cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161264#M45108</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/838/hadoopsmi.html" nodeid="838"&gt;@Sivasaravanakumar K&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The write function was only added in Spark 1.4, so it is not available on Spark 1.3.&lt;/P&gt;&lt;P&gt;Try simply:&lt;/P&gt;&lt;PRE&gt;df.saveAsTable("default.sample_07_new_schema") 
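
// An alternative that avoids the DataFrame save methods altogether (an untested sketch).
// Assumptions: sqlContext is a HiveContext, and the table name sample_07_new_schema_ctas
// is a hypothetical one chosen for illustration. Hive creates the table, stored as ORC,
// directly from an aliased select.
sqlContext.sql("create table default.sample_07_new_schema_ctas stored as orc as select code as code_1, description as description_1, total_emp as total_emp_1, salary as money from default.sample_07")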
&lt;/PRE&gt;&lt;P&gt;With no format specified, saveAsTable writes the table as Parquet (the default format for Spark).&lt;/P&gt;</description>
      <pubDate>Thu, 10 Nov 2016 02:19:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Migrating-from-one-hive-table-to-another-hive-table-Using/m-p/161264#M45108</guid>
      <dc:creator>mlamairesse</dc:creator>
      <dc:date>2016-11-10T02:19:06Z</dc:date>
    </item>
  </channel>
</rss>

