<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to merge multiple part files while creating hive ORC files using &quot;insert overwrite directory&quot; in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-merge-multiple-part-files-while-creating-hive-ORC/m-p/176331#M138583</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/18374/saravananp1.html" nodeid="18374"&gt;@saravanan p&lt;/A&gt;&lt;/P&gt;&lt;P&gt;One way to do this is to modify the job so that it runs with a single reducer; the output will then be a single file.&lt;/P&gt;&lt;P&gt;Use this property to set the number of reducers to one: set mapred.reduce.tasks=1;&lt;/P&gt;&lt;P&gt;By default, the number of files a Hive query writes depends on the input file size and the number of map and reduce tasks. The input split size, which determines the number of map tasks, is computed as:&lt;/P&gt;&lt;P&gt;max(mapreduce.input.fileinputformat.split.minsize, min(mapreduce.input.fileinputformat.split.maxsize, dfs.block.size))&lt;/P&gt;&lt;P&gt;If you have reducers running, then you should also look at&lt;/P&gt;&lt;P&gt;hive.exec.max.created.files, mapred.reduce.tasks, and hive.exec.reducers.bytes.per.reducer&lt;/P&gt;&lt;P&gt;Hope it helps!&lt;/P&gt;</description>
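    <!-- A minimal HiveQL sketch of the advice above; the table name and output
         path are placeholders, not from the original post. Note that forcing
         one reducer only affects queries that actually run a reduce stage;
         a map-only SELECT is governed by the input split size instead.

         SET mapred.reduce.tasks=1;  (newer Hive/MapReduce versions use
                                      mapreduce.job.reduces instead)

         INSERT OVERWRITE DIRECTORY '/tmp/orc_out'
         STORED AS ORC
         SELECT col1, count(*)
         FROM my_table
         GROUP BY col1;              (GROUP BY introduces a reduce stage,
                                      so a single part file is written)
    -->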
    <pubDate>Wed, 17 May 2017 17:34:53 GMT</pubDate>
    <dc:creator>balavignesh_nag</dc:creator>
    <dc:date>2017-05-17T17:34:53Z</dc:date>
  </channel>
</rss>

