<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Creating Hive external table on specific files within folder in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Creating-Hive-external-table-on-specific-files-within-folder/m-p/211087#M173029</link>
    <description>&lt;P&gt;Thanks for the response, Frank. I guess my question really was how to easily move these files into the correct folder structure without it being a manual process of running "hdfs dfs" commands.&lt;/P&gt;&lt;P&gt;Including all the data in the Hive table and then letting Hive control what can be selected or seen is an interesting idea. That might be a way of doing what we are after without having to change the underlying structure of the data in HDFS. We could then create views on top of this single Hive table to split the data, and insert into Hive internal tables if needed.&lt;/P&gt;</description>
    <pubDate>Wed, 26 Apr 2017 17:47:49 GMT</pubDate>
    <dc:creator>aaron_harris</dc:creator>
    <dc:date>2017-04-26T17:47:49Z</dc:date>
    <item>
      <title>Creating Hive external table on specific files within folder</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Hive-external-table-on-specific-files-within-folder/m-p/211085#M173027</link>
      <description>&lt;P&gt;I have data being dropped into our HDFS file system on a daily basis, into a single folder that contains multiple CSV files, such as:&lt;/P&gt;&lt;P&gt;/data/yyyy/mm/dd/file1.csv&lt;/P&gt;&lt;P&gt;/data/yyyy/mm/dd/file2.csv&lt;/P&gt;&lt;P&gt;I want to create a Hive external table over all the file1.csv files across all the folders under /data, but it does not seem to be possible to use a regex in the Hive external table command.&lt;/P&gt;&lt;P&gt;My next thought is to copy the files into separate structures so that Hive can read each set of files on its own, such as:&lt;/P&gt;&lt;P&gt;/data/file1/yyyy/mm/dd/file1.csv&lt;/P&gt;&lt;P&gt;/data/file2/yyyy/mm/dd/file2.csv&lt;/P&gt;&lt;P&gt;I am not sure what the best way of doing this would be. Whatever I use would first need to bulk copy the existing data between these folder structures, and then be schedulable so that files are copied over daily as new folders are created.&lt;/P&gt;&lt;P&gt;Any help would be greatly appreciated. Please let me know if any of the above is unclear.&lt;/P&gt;</description>
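      <!-- Editorial sketch, not part of the original post. The folder remapping
      described above could be scripted roughly as follows. The function names
      per_file_target and copy_commands are hypothetical, and the layout is assumed
      to be exactly /data/yyyy/mm/dd/name.csv. The sketch only builds "hdfs dfs"
      argument lists and does not run them.

```python
from pathlib import PurePosixPath


def per_file_target(source: str, root: str = "/data") -> str:
    """Map /data/yyyy/mm/dd/fileN.csv to /data/fileN/yyyy/mm/dd/fileN.csv."""
    src = PurePosixPath(source)
    # relative_to raises ValueError if source is not under root.
    date_parts = src.parent.relative_to(root).parts  # ('yyyy', 'mm', 'dd')
    if len(date_parts) != 3:
        raise ValueError(f"expected {root}/yyyy/mm/dd/name.csv, got {source}")
    # The file stem (for example 'file1') becomes the new top-level folder.
    return str(PurePosixPath(root, src.stem, *date_parts, src.name))


def copy_commands(sources):
    """Build (but do not run) one mkdir and one cp command per source file."""
    commands = []
    for source in sources:
        target = per_file_target(source)
        parent = str(PurePosixPath(target).parent)
        commands.append(["hdfs", "dfs", "-mkdir", "-p", parent])
        commands.append(["hdfs", "dfs", "-cp", source, target])
    return commands
```

      Each argument list could then be handed to subprocess.run, once for the
      initial bulk copy and afterwards from whatever daily scheduler is already
      in use. -->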
      <pubDate>Tue, 25 Apr 2017 21:18:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Hive-external-table-on-specific-files-within-folder/m-p/211085#M173027</guid>
      <dc:creator>aaron_harris</dc:creator>
      <dc:date>2017-04-25T21:18:48Z</dc:date>
    </item>
    <item>
      <title>Re: Creating Hive external table on specific files within folder</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Hive-external-table-on-specific-files-within-folder/m-p/211086#M173028</link>
      <description>&lt;P&gt;I am not sure about your use case. If you want to include just file1 in the Hive table, you have to copy those files into separate folders. An alternative might be to include all the data in the Hive table and let Hive control what data can be selected or seen.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Apr 2017 00:01:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Hive-external-table-on-specific-files-within-folder/m-p/211086#M173028</guid>
      <dc:creator>ylu</dc:creator>
      <dc:date>2017-04-26T00:01:16Z</dc:date>
    </item>
    <item>
      <title>Re: Creating Hive external table on specific files within folder</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Hive-external-table-on-specific-files-within-folder/m-p/211087#M173029</link>
      <description>&lt;P&gt;Thanks for the response, Frank. I guess my question really was how to easily move these files into the correct folder structure without it being a manual process of running "hdfs dfs" commands.&lt;/P&gt;&lt;P&gt;Including all the data in the Hive table and then letting Hive control what can be selected or seen is an interesting idea. That might be a way of doing what we are after without having to change the underlying structure of the data in HDFS. We could then create views on top of this single Hive table to split the data, and insert into Hive internal tables if needed.&lt;/P&gt;</description>
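      <!-- Editorial sketch, not part of the original post. One way the "views on
      top of a single table" idea might look is to filter on Hive's
      INPUT__FILE__NAME virtual column. The helper view_for_file and the table
      name all_files are hypothetical, and the sketch assumes the single table
      can already read the nested date folders. It only builds the HiveQL text.

```python
def view_for_file(table: str, file_name: str) -> str:
    """Return HiveQL for a view limited to rows read from files named file_name."""
    # Accept only simple names such as 'file1.csv' so the LIKE pattern stays literal.
    if not file_name.replace(".", "").isalnum():
        raise ValueError(f"unexpected file name: {file_name}")
    # View name is the table name plus the file stem, for example all_files_file1.
    view = f"{table}_{file_name.rsplit('.', 1)[0]}"
    return (
        f"CREATE VIEW IF NOT EXISTS {view} AS "
        f"SELECT * FROM {table} "
        f"WHERE INPUT__FILE__NAME LIKE '%/{file_name}'"
    )
```

      Calling view_for_file("all_files", "file1.csv") yields a statement for a view
      named all_files_file1. Whether such a filter avoids scanning the other files
      is not something this sketch establishes. -->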
      <pubDate>Wed, 26 Apr 2017 17:47:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Hive-external-table-on-specific-files-within-folder/m-p/211087#M173029</guid>
      <dc:creator>aaron_harris</dc:creator>
      <dc:date>2017-04-26T17:47:49Z</dc:date>
    </item>
  </channel>
</rss>

