<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Is there is any workaround to map csv columns to hive columns? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118790#M38769</link>
    <description>&lt;P&gt;Consider creating a Hive external table over the CSV file, using the file's column names and data types, and then map the columns while loading from the source (the external table) to the target (the Hive table).&lt;/P&gt;&lt;P&gt;Examples of creating an external table are available here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.dezyre.com/hadoop-tutorial/apache-hive-tutorial-tables" target="_blank"&gt;https://www.dezyre.com/hadoop-tutorial/apache-hive-tutorial-tables&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 25 Aug 2016 04:09:32 GMT</pubDate>
    <dc:creator>srai1</dc:creator>
    <dc:date>2016-08-25T04:09:32Z</dc:date>
    <item>
      <title>Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118789#M38768</link>
      <description>&lt;PRE&gt;Consider the following scenario

Hive table with 5 columns (col1, col2, col3, col4, col5)
CSV file with 3 columns (col1, col3, col5)

Now I want to load the CSV file data into the Hive table with the exact CSV-to-Hive column mapping shown below.

hive 		csv data
col1     &amp;lt;-&amp;gt; 	col1
col2 		empty
col3     &amp;lt;-&amp;gt; 	col3
col5     &amp;lt;-&amp;gt; 	col5
col4 		empty

Any kind of help would be greatly appreciated.&lt;/PRE&gt;</description>
      <pubDate>Thu, 25 Aug 2016 03:55:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118789#M38768</guid>
      <dc:creator>girish02c</dc:creator>
      <dc:date>2016-08-25T03:55:58Z</dc:date>
    </item>
    <item>
      <title>Re: Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118790#M38769</link>
      <description>&lt;P&gt;Consider creating a Hive external table over the CSV file, using the file's column names and data types, and then map the columns while loading from the source (the external table) to the target (the Hive table).&lt;/P&gt;&lt;P&gt;Examples of creating an external table are available here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.dezyre.com/hadoop-tutorial/apache-hive-tutorial-tables" target="_blank"&gt;https://www.dezyre.com/hadoop-tutorial/apache-hive-tutorial-tables&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Aug 2016 04:09:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118790#M38769</guid>
      <dc:creator>srai1</dc:creator>
      <dc:date>2016-08-25T04:09:32Z</dc:date>
    </item>
    <item>
      <title>Re: Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118791#M38770</link>
      <description>&lt;P&gt;My instinct is that the default Hive SerDe would be used and would not automatically skip over the col2 value as you've shown in your example. A few options for you:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Ingest the raw CSV data into a 3-column temp Hive table. Then perform an "Insert ... Select" from temp_hive_table that pushes those three column values into your destination Hive table, supplying NULL for col2 and col4.&lt;/LI&gt;&lt;LI&gt;Write a brief Pig script to parse the CSV file and push the data to your destination Hive table.&lt;/LI&gt;&lt;LI&gt;Write your own Hive SerDe - &lt;A href="https://cwiki.apache.org/confluence/display/Hive/SerDe#SerDe-Built-inandCustomSerDes" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/SerDe#SerDe-Built-inandCustomSerDes&lt;/A&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Cheers!&lt;/P&gt;&lt;P&gt;Reference: &lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormat,StorageFormat,andSerDe" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormat,StorageFormat,andSerDe&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Aug 2016 04:13:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118791#M38770</guid>
      <dc:creator>wfloyd</dc:creator>
      <dc:date>2016-08-25T04:13:35Z</dc:date>
    </item>
    <item>
      <title>Re: Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118792#M38771</link>
      <description>&lt;P&gt;When creating the table, use the following:&lt;/P&gt;&lt;P&gt;TBLPROPERTIES ('serialization.null.format'='')&lt;/P&gt;&lt;P&gt;Then run INSERT INTO table_name (col1, col3, col5) SELECT * FROM csvtable. This should just work. See the following for details:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/1216/techniques-for-dealing-with-malformed-data-hive.html" target="_blank"&gt;https://community.hortonworks.com/questions/1216/techniques-for-dealing-with-malformed-data-hive.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Aug 2016 04:14:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118792#M38771</guid>
      <dc:creator>mqureshi</dc:creator>
      <dc:date>2016-08-25T04:14:49Z</dc:date>
    </item>
    <item>
      <title>Re: Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118793#M38772</link>
      <description>&lt;P&gt;Your statement might look something like &lt;STRONG&gt;insert into csvinternal (col2) select col1 from csvexternal;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Aug 2016 04:19:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118793#M38772</guid>
      <dc:creator>srai1</dc:creator>
      <dc:date>2016-08-25T04:19:18Z</dc:date>
    </item>
    <item>
      <title>Re: Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118794#M38773</link>
      <description>&lt;DIV&gt;&lt;A rel="user" href="https://community.cloudera.com/users/170/wfloyd.html" nodeid="170"&gt;@Wes Floyd&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/10337/srai.html" nodeid="10337"&gt;@srai&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/10969/mqureshi.html" nodeid="10969"&gt;@mqureshi&lt;/A&gt; Thank you so much for your quick responses to my question. Since I already have a staging-table-to-ORC-table pipeline in place, I will use the CSV column headers to create the staging tables and then load the staging data into the actual table.&lt;/DIV&gt;</description>
      <pubDate>Thu, 25 Aug 2016 04:26:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118794#M38773</guid>
      <dc:creator>girish02c</dc:creator>
      <dc:date>2016-08-25T04:26:58Z</dc:date>
    </item>
    <item>
      <title>Re: Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118795#M38774</link>
      <description>&lt;P&gt;&lt;A href="http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-hive/" target="_blank"&gt;http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-hive/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Check out section 3.4 of the tutorial, which uses a regex. It should help you load the table with only the columns you need.&lt;/P&gt;&lt;P&gt;Another way, as @srai said, is to create an external table mapped to the CSV file, then create a managed table and load it with an insert into the managed table that selects from the external table, explicitly stating in the insert statement the columns you want to load.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Aug 2016 05:43:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118795#M38774</guid>
      <dc:creator>sbomma</dc:creator>
      <dc:date>2016-08-25T05:43:01Z</dc:date>
    </item>
    <item>
      <title>Re: Is there is any workaround to map csv columns to hive columns?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118796#M38775</link>
      <description>&lt;P&gt;Even though the selected answer already addresses the main requirement, I thought I should log this for future reference:&lt;/P&gt;&lt;PRE&gt;0: jdbc:hive2://node1.hortonworks.com:10000/d&amp;gt; select * from src;
+----------+------------+--+
| src.key  | src.value  |
+----------+------------+--+
| 1        | Value1     |
| 2        | Value2     |
+----------+------------+--+
2 rows selected (0.187 seconds)
0: jdbc:hive2://node1.hortonworks.com:10000/d&amp;gt; select * from tgt;
+----------+------------+--+
| tgt.key  | tgt.value  |
+----------+------------+--+
+----------+------------+--+
No rows selected (0.154 seconds)
0: jdbc:hive2://node1.hortonworks.com:10000/d&amp;gt; from (from src select transform(src.key,src.value) using '/bin/cat' as (tkey,tvalue) )tmap insert overwrite table tgt select tkey,tvalue;
INFO  : Tez session hasn't been created yet. Opening session
INFO  : Dag name: from (from src select transfor...tkey,tvalue(Stage-1)
INFO  : 


INFO  : Status: Running (Executing on YARN cluster with App id application_1471888656011_0009)


INFO  : Map 1: -/-	
INFO  : Map 1: 0/1	
INFO  : Map 1: 0/1	
INFO  : Map 1: 0(+1)/1	
INFO  : Map 1: 1/1	
INFO  : Loading data to table default.tgt from hdfs://node1.hortonworks.com:8020/apps/hive/warehouse/tgt/.hive-staging_hive_2016-08-25_21-51-10_715_1000932141605500109-1/-ext-10000
INFO  : Table default.tgt stats: [numFiles=1, numRows=2, totalSize=18, rawDataSize=16]
No rows affected (19.992 seconds)
0: jdbc:hive2://node1.hortonworks.com:10000/d&amp;gt; select * from tgt;
+----------+------------+--+
| tgt.key  | tgt.value  |
+----------+------------+--+
| 1        | Value1     |
| 2        | Value2     |
+----------+------------+--+
2 rows selected (0.197 seconds)
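&lt;/PRE&gt;&lt;P&gt;For the five-column scenario in the original question, the staging-table approach suggested earlier in this thread could be sketched roughly as below. This is only a sketch under assumptions: the table names csv_staging and target_tbl, the HDFS location, and the STRING column types are hypothetical and do not come from the thread, and the plain comma-delimited row format assumes the CSV has no quoted fields or embedded commas.&lt;/P&gt;&lt;PRE&gt;-- Hypothetical staging table laid over the 3-column CSV file (col1, col3, col5).
-- The location and the STRING types are assumptions; adjust them to the real data.
CREATE EXTERNAL TABLE csv_staging (
  col1 STRING,
  col3 STRING,
  col5 STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/tmp/csv_staging';

-- target_tbl stands in for the existing 5-column table (col1 .. col5).
-- The SELECT lists values in the target's column order and supplies NULL
-- for col2 and col4, which have no counterpart in the CSV.
INSERT INTO TABLE target_tbl
SELECT col1,
       CAST(NULL AS STRING),
       col3,
       CAST(NULL AS STRING),
       col5
FROM csv_staging;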
&lt;/PRE&gt;</description>
      <pubDate>Fri, 26 Aug 2016 04:50:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-is-any-workaround-to-map-csv-columns-to-hive/m-p/118796#M38775</guid>
      <dc:creator>srai1</dc:creator>
      <dc:date>2016-08-26T04:50:47Z</dc:date>
    </item>
  </channel>
</rss>

