<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Does Hadoop run dfs -du automatically when a new job starts? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Does-hadoop-run-dfs-du-automatically-when-a-new-job-starts/m-p/231298#M193142</link>
    <description>&lt;P&gt;Could you provide some context on why you think hdfs dfs -du is needed at the start of each job?&lt;BR /&gt;In any case, I am sure that Spark will not run hdfs dfs -du automatically at job start, because a Spark job does not necessarily access HDFS; Spark can also be operated without HDFS.&lt;/P&gt;</description>
    <pubDate>Mon, 26 Feb 2018 21:33:57 GMT</pubDate>
    <dc:creator>arald</dc:creator>
    <dc:date>2018-02-26T21:33:57Z</dc:date>
    <item>
      <title>Does Hadoop run dfs -du automatically when a new job starts?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Does-hadoop-run-dfs-du-automatically-when-a-new-job-starts/m-p/231297#M193141</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am using HDP 2.6 with Spark 2.1 (and also Spark 1.6), with YARN as the resource manager. I am trying out TeraSort benchmarking jobs on an experimental cluster.&lt;/P&gt;&lt;P&gt;I want to run 'hdfs dfs -du' or 'hadoop fs -du' every time before starting a Spark job, to analyse the available disk space on the data nodes.&lt;/P&gt;&lt;P&gt;From the following question, I understand that running these commands is expensive on the cluster:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/92214/can-hdfs-dfsadmin-and-hdfs-dsfs-du-be-taxing-on-my.html" target="_blank"&gt;https://community.hortonworks.com/questions/92214/can-hdfs-dfsadmin-and-hdfs-dsfs-du-be-taxing-on-my.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;So I wanted to know whether Hadoop automatically runs the dfs -du command in the background whenever a new Spark job is started, or whether I need to run it manually.&lt;/P&gt;&lt;P&gt;Thanks,&lt;BR /&gt;Steev&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2018 18:53:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Does-hadoop-run-dfs-du-automatically-when-a-new-job-starts/m-p/231297#M193141</guid>
      <dc:creator>steevan_rodrigu</dc:creator>
      <dc:date>2018-02-26T18:53:05Z</dc:date>
    </item>
    <item>
      <title>Re: Does Hadoop run dfs -du automatically when a new job starts?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Does-hadoop-run-dfs-du-automatically-when-a-new-job-starts/m-p/231298#M193142</link>
      <description>&lt;P&gt;Could you provide some context on why you think hdfs dfs -du is needed at the start of each job?&lt;BR /&gt;In any case, I am sure that Spark will not run hdfs dfs -du automatically at job start, because a Spark job does not necessarily access HDFS; Spark can also be operated without HDFS.&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2018 21:33:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Does-hadoop-run-dfs-du-automatically-when-a-new-job-starts/m-p/231298#M193142</guid>
      <dc:creator>arald</dc:creator>
      <dc:date>2018-02-26T21:33:57Z</dc:date>
    </item>
    <item>
      <title>Re: Does Hadoop run dfs -du automatically when a new job starts?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Does-hadoop-run-dfs-du-automatically-when-a-new-job-starts/m-p/231299#M193143</link>
      <description>&lt;P&gt;Thank you for the information.&lt;/P&gt;&lt;P&gt;I need dfs -du to check how much disk space is available before starting the job, and to check how much data the job generates.&lt;/P&gt;</description>
      <pubDate>Tue, 27 Feb 2018 14:11:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Does-hadoop-run-dfs-du-automatically-when-a-new-job-starts/m-p/231299#M193143</guid>
      <dc:creator>steevan_rodrigu</dc:creator>
      <dc:date>2018-02-27T14:11:48Z</dc:date>
    </item>
  </channel>
</rss>

