<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: PySpark Logging to HDFS instead of local filesystem in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/PySpark-Logging-to-HDFS-instead-of-local-filesystem/m-p/52716#M58186</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/16903"&gt;@aj&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You can achieve this by giving a fully qualified path.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## To use an HDFS path&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;hdfs://&amp;lt;cluster-node&amp;gt;:8020/user/&amp;lt;path&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## To use a local path&lt;/STRONG&gt;&lt;BR /&gt;file:///home/&amp;lt;path&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Some additional notes: keeping logs in HDFS is not recommended, for two reasons:&lt;/P&gt;&lt;P&gt;1. HDFS uses a replication factor of 3 by default, so every log file is stored three times.&lt;/P&gt;&lt;P&gt;2. If HDFS goes down, you cannot check the logs.&lt;/P&gt;</description>
    <pubDate>Mon, 27 Mar 2017 19:30:34 GMT</pubDate>
    <dc:creator>saranvisa</dc:creator>
    <dc:date>2017-03-27T19:30:34Z</dc:date>
    <item>
      <title>PySpark Logging to HDFS instead of local filesystem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/PySpark-Logging-to-HDFS-instead-of-local-filesystem/m-p/52699#M58185</link>
      <description>&lt;P&gt;I would like to use Python's logging library, but I want the log output to land in HDFS instead of the local file system on the worker node. Is there a way to do that?&lt;/P&gt;&lt;P&gt;My code for setting up logging is below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;import logging&lt;BR /&gt;# A second basicConfig() call is ignored once the root logger has a handler, so pass every option in one call&lt;BR /&gt;logging.basicConfig(filename='/var/log/DataFramedriversRddConvert.log', level=logging.DEBUG, format='%(asctime)s %(message)s')&lt;BR /&gt;logging.info('++++Started DataFramedriversRddConvert++++')&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 11:20:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/PySpark-Logging-to-HDFS-instead-of-local-filesystem/m-p/52699#M58185</guid>
      <dc:creator>aj</dc:creator>
      <dc:date>2022-09-16T11:20:47Z</dc:date>
    </item>
    <item>
      <title>Re: PySpark Logging to HDFS instead of local filesystem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/PySpark-Logging-to-HDFS-instead-of-local-filesystem/m-p/52716#M58186</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/16903"&gt;@aj&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You can achieve this by giving a fully qualified path.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## To use an HDFS path&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;hdfs://&amp;lt;cluster-node&amp;gt;:8020/user/&amp;lt;path&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## To use a local path&lt;/STRONG&gt;&lt;BR /&gt;file:///home/&amp;lt;path&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Some additional notes: keeping logs in HDFS is not recommended, for two reasons:&lt;/P&gt;&lt;P&gt;1. HDFS uses a replication factor of 3 by default, so every log file is stored three times.&lt;/P&gt;&lt;P&gt;2. If HDFS goes down, you cannot check the logs.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Mar 2017 19:30:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/PySpark-Logging-to-HDFS-instead-of-local-filesystem/m-p/52716#M58186</guid>
      <dc:creator>saranvisa</dc:creator>
      <dc:date>2017-03-27T19:30:34Z</dc:date>
    </item>
    <item>
      <title>Re: PySpark Logging to HDFS instead of local filesystem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/PySpark-Logging-to-HDFS-instead-of-local-filesystem/m-p/302534#M58187</link>
      <description>&lt;P&gt;This is not working. Please let me know how to use the full path.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Sep 2020 17:33:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/PySpark-Logging-to-HDFS-instead-of-local-filesystem/m-p/302534#M58187</guid>
      <dc:creator>KGF</dc:creator>
      <dc:date>2020-09-08T17:33:46Z</dc:date>
    </item>
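    <!--
    One possible workaround for the "not working" report in this thread, offered
    as a minimal sketch rather than a confirmed fix. Python's logging.FileHandler
    opens its target with the built-in open(), so it only writes to the local
    filesystem and treats a string that starts with hdfs:// as a local path.
    The sketch below logs to a local file and then builds an "hdfs dfs -put"
    command that copies the finished file to HDFS. The helper names
    (scheme_of, make_local_logger, hdfs_put_command) and the example host and
    paths are hypothetical, and the sketch assumes the hdfs command-line client
    is installed on the node that runs the copy.

```python
import logging
from urllib.parse import urlparse


def scheme_of(path):
    """Return the URI scheme of a path ('hdfs', 'file'), or '' for a bare path."""
    return urlparse(path).scheme


def make_local_logger(name, local_path):
    """Create a logger that writes timestamped records to a local file.

    Returns the logger and its handler so the caller can close the handler
    (flushing the file) before copying the log elsewhere.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(local_path)
    handler.setFormatter(logging.Formatter('%(asctime)s %(message)s'))
    logger.addHandler(handler)
    return logger, handler


def hdfs_put_command(local_path, hdfs_uri):
    """Build the argument list for 'hdfs dfs -put -f', which copies a local
    file to HDFS and overwrites the destination if it already exists."""
    if scheme_of(hdfs_uri) != 'hdfs':
        raise ValueError('expected an hdfs:// URI, got %r' % hdfs_uri)
    return ['hdfs', 'dfs', '-put', '-f', local_path, hdfs_uri]
```

    To use it, create the logger with a local path, write records as usual,
    call handler.close(), and pass the list from hdfs_put_command() to
    subprocess.run(cmd, check=True). The destination would be a fully
    qualified URI of the form shown in the reply above, for example
    hdfs://namenode.example.com:8020/user/aj/logs/ (a made-up host and path).
    Copying once at the end keeps the logs readable locally if HDFS is down.
    -->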
  </channel>
</rss>

