Created on 03-27-2017 08:23 AM - edited 09-16-2022 04:20 AM
I would like to use Python's logging library, but I want the log output to land in HDFS instead of on the worker node's local file system. Is there a way to do that?
My code for setting up logging is below:
import logging

# Note: basicConfig only takes effect on its first call, so the filename,
# format, and level must all be set together in one call.
logging.basicConfig(filename='/var/log/DataFramedriversRddConvert.log',
                    format='%(asctime)s %(message)s',
                    level=logging.DEBUG)
logging.info('++++Started DataFramedriversRddConvert++++')
Created 03-27-2017 12:30 PM
You can achieve this by giving a fully qualified path.
## To use HDFS path
hdfs://<cluster-node>:8020/user/<path>
## To use Local path
file:///home/<path>
Some additional notes: keeping logs in HDFS is not recommended, for two reasons:
1. HDFS stores 3 replicas of every block by default, so logs consume triple the space.
2. If HDFS goes down, you cannot check the logs to diagnose the problem.
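One caveat: Python's logging module opens `filename` with an ordinary local `open()`, so passing an `hdfs://` URL to `basicConfig` will not actually write to HDFS. A common workaround is a custom `logging.Handler` that buffers formatted records and flushes them through an HDFS client call (for example, the third-party `hdfs` package's `InsecureClient.write(path, data, append=True)`). The sketch below assumes that pattern but keeps the sink pluggable, since the exact client and cluster details will vary:

```python
import logging

class BufferedSinkHandler(logging.Handler):
    """Buffer formatted log lines and flush them in batches to a sink
    callable -- in production, an HDFS client's append/write method."""

    def __init__(self, sink, capacity=10):
        super().__init__()
        self.sink = sink          # callable taking one str payload
        self.capacity = capacity
        self.buffer = []

    def emit(self, record):
        self.buffer.append(self.format(record))
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink("\n".join(self.buffer) + "\n")
            self.buffer = []
        super().flush()

# Demo sink: append chunks to an in-memory list. In production you might
# use something like (assumption, not verified against your cluster):
#   client = hdfs.InsecureClient('http://<namenode>:9870')
#   sink = lambda data: client.write('/user/<path>/app.log', data,
#                                    append=True, encoding='utf-8')
chunks = []
handler = BufferedSinkHandler(chunks.append, capacity=2)
handler.setFormatter(logging.Formatter('%(asctime)s %(message)s'))

logger = logging.getLogger('DataFramedriversRddConvert')
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

logger.info('++++Started DataFramedriversRddConvert++++')
logger.info('second message')   # reaches capacity, triggers a flush
```

Buffering matters here because HDFS appends are expensive; flushing every record individually would hammer the NameNode. Call `handler.flush()` (or `logging.shutdown()`) before the job exits so a partial buffer is not lost.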
Created 09-08-2020 10:33 AM
This is not working. Please let me know how to use the full path.