<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question "Error parsing row: file" Table consists of multiple csv files in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/quot-Error-parsing-row-file-quot-Table-consists-of-multiple/m-p/286571#M212523</link>
    <description>&lt;P&gt;Dear community,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I created a new table by uploading a CSV file to HDFS via Hue (the file includes a header row and contains the data for one specific month). Afterwards I cleared the cache and uploaded the other CSV files (all of them have the same column order BUT NO HEADER; average size of each monthly CSV file: ~2-4 GB; number of columns: 54).&lt;/P&gt;
&lt;P&gt;Typical procedure after uploading a new csv file to the database: INVALIDATE METADATA database_xy&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When I run a query that selects every column, I get the following error messages in Impala:&lt;/P&gt;
&lt;P&gt;Error converting column: 6 to TIMESTAMP&lt;BR /&gt;Error converting column: 8 to TIMESTAMP&lt;BR /&gt;Error converting column: 23 to TIMESTAMP&lt;BR /&gt;Error converting column: 50 to TIMESTAMP&lt;BR /&gt;Error converting column: 35 to TIMESTAMP&lt;BR /&gt;Error converting column: 43 to TIMESTAMP&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data for these columns only becomes available after 4 months. Until then, the columns contain only NULL values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Query to reproduce these error messages:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SELECT *&lt;BR /&gt;FROM database_xy&lt;BR /&gt;LIMIT 100&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For a specific TIMESTAMP column:&lt;/P&gt;
&lt;P&gt;SELECT min(exp_date)&lt;BR /&gt;FROM database_xy&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Error (just a sample from the log box in Hue):&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Error parsing row: file: hdfs://blabla/foo_042019.csv,&lt;BR /&gt;before offset: 2432696320&lt;BR /&gt;Error converting column: 21 to TIMESTAMP&lt;BR /&gt;Error parsing row: file: hdfs://blabla/foo_032019.csv,&lt;BR /&gt;before offset: 1895825408&lt;BR /&gt;Error converting column: 21 to TIMESTAMP&lt;BR /&gt;Error converting column: 21 to TIMESTAMP&lt;BR /&gt;Error parsing row: file: hdfs://blabla/foo_022019.csv,&lt;BR /&gt;before offset: 2969567232&lt;BR /&gt;Error converting column: 21 to TIMESTAMP&lt;BR /&gt;Error converting column: 21 to TIMESTAMP&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When I run the same queries in Hive, I get no error messages at all. Why is that? And how do I get rid of these error messages in Impala?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Information about how I created the csv-files locally:&lt;/P&gt;
&lt;P&gt;First CSV:&lt;/P&gt;
&lt;P&gt;Python (pandas) options: separator: pipe, header=True (only for the first CSV), index=False (so that no extra index column is written)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Subsequent CSVs:&lt;/P&gt;
&lt;P&gt;Python (pandas) options: separator: pipe, header=False, index=False (so that no extra index column is written)&lt;/P&gt;
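&lt;P&gt;For illustration, here is a minimal sketch of the export described above. The helper name write_month, the two sample columns, and the in-memory buffers are hypothetical stand-ins, not my real code or data; the real files have 54 columns and are written to disk.&lt;/P&gt;

```python
import io

import pandas as pd


def write_month(df, target, include_header):
    """Write one monthly frame as pipe-separated text without an index column."""
    # header is True only for the first file; later files carry data rows only.
    df.to_csv(target, sep="|", header=include_header, index=False)


# Hypothetical sample data standing in for one month of the real table.
frame = pd.DataFrame(
    {"id": [1, 2], "exp_date": ["2019-02-01 00:00:00", None]}
)

# First CSV: with header row.
first = io.StringIO()
write_month(frame, first, include_header=True)

# Subsequent CSVs: same column order, no header row.
later = io.StringIO()
write_month(frame, later, include_header=False)

print(first.getvalue())
print(later.getvalue())
```

&lt;P&gt;In this sketch, missing values are written as empty fields, and only the first buffer starts with the column names.&lt;/P&gt;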
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When I created the table from the&amp;nbsp;&lt;STRONG&gt;first CSV&lt;/STRONG&gt; in Hue, I selected the following options:&lt;/P&gt;
&lt;P&gt;Field Separator: Pipe&lt;/P&gt;
&lt;P&gt;Record Separator: New line&lt;/P&gt;
&lt;P&gt;Quote Character: Double Quote&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Afterwards I uploaded all the other CSVs into the database's folder to add the new months and invalidated the metadata.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for your help in advance! I hope you enjoyed the Christmas holidays and I wish you a happy New Year's Eve!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best&lt;/P&gt;
&lt;P&gt;somedatadude&lt;/P&gt;</description>
    <pubDate>Mon, 30 Dec 2019 15:08:59 GMT</pubDate>
    <dc:creator>somedatadude</dc:creator>
    <dc:date>2019-12-30T15:08:59Z</dc:date>
  </channel>
</rss>

