Log! Don't Print! Use the Python logging library

January 11, 2022

Python has become one of the most popular application languages in IT, shadow-IT, and data science. Python developers continually improve their systems by iterating from example patterns to best practices.

The Python logging package should be used wherever print() statements were in the past. The logging package makes it possible to classify output at different severities. logging has the ability to enable and disable the generation of output at those different levels. This means you can create debug-level statements that are useful to programmers without letting those statements bleed into a production application. The referenced GitHub project shows how to load logging configurations, and how to change where logging goes based on those configurations.

Classify output by severity
Filter output generation by severity
Send data to different sinks based on the program module and the severity

Logging Design

Logging is routed through loggers that are instantiated in each python module. Those loggers are configured with one or more handlers. Each handler writes to its own log sink based on the filter criteria it was initialized with. This means a log message can be written out by 0 or more handlers depending on the logging level of the message and the filter message of the handlers.

Watch the video or read the Python logging documentation for an explanation of how levels work. I'm too tired to write it up right now.

Video Walkthrough

References

Python logging documentation

https://docs.python.org/3/howto/logging.html

The sample code for this blog came from

https://github.com/freemansoft/Network-intrusion-dataset-creator

Logging configuration - logging_config.yaml

Our log output format, handler configurations, and logger configurations are all stored in a yaml file. This file shows a variety of formatters and handlers. It contains a custom configuration for each python module that intends to log. Small programs can probably just run with the console logger and the root configuration. Larger or production programs will probably have something more detailed like this example.

version: 1
disable_existing_loggers: true
formatters:
  brief:
    format: "%(levelname)-8s; %(name)-15s; %(message)s"
  single_line:
    format: "%(asctime)s; %(levelname)-8s; %(name)-15s; %(funcName)-15s;%(lineno)4d; %(message)s"
  multi_thread:
    format: "%(asctime)s; %(levelname)-8s; [%(thread;)s]; %(name)-15s; %(funcName)-15s;%(lineno)-4d; %(message)s"
  multi_process:
    format: "%(asctime)s; %(levelname)-8s; [%(process)d]; %(name)-15s; %(funcName)-15s;%(lineno)-4d; %(message)s"
handlers:
  console:
    level: INFO
    class: logging.StreamHandler
    formatter: multi_process
    stream: ext://sys.stdout
  info_file_handler:
    level: INFO
    class: logging.handlers.WatchedFileHandler
    formatter: multi_process
    filename: ./info.log
    # file behavior a=append w=overwrite
    mode: w
    encoding: utf-8
  error_file_handler:
    level: WARNING
    class: logging.handlers.WatchedFileHandler
    formatter: multi_process
    filename: ./errors.log
    # file behavior a=append w=overwrite
    mode: w
    encoding: utf-8
  debug_file_handler:
    level: DEBUG
    class: logging.handlers.WatchedFileHandler
    formatter: multi_process
    filename: ./debug.log
    # file behavior a=append w=overwrite
    mode: w
    encoding: utf-8
loggers:
  capture:
    #level: INFO
    handlers: [error_file_handler, info_file_handler, debug_file_handler]
  services:
    #level: INFO
    handlers: [error_file_handler, info_file_handler, debug_file_handler]
  detectors:
    #level: INFO
    handlers: [error_file_handler, info_file_handler, debug_file_handler]
  counts:
    #level: INFO
    handlers: [error_file_handler, info_file_handler, debug_file_handler]
  __main__:
    #level: INFO
    handlers: [error_file_handler, debug_file_handler]

root:
  level: INFO
  handlers:
    - console

Logging initialization - loggingconfig.py

The logging module must be initialized one time. Here we have a function that we can call on program startup to initialize the logging configuration.

import logging
import logging.config
import yaml


def load_logging():
    with open("logging_config.yaml", "r") as f:
        config = yaml.safe_load(f.read())
        logging.config.dictConfig(config)

Creating a logger in main()

Load the logging module one time on startup. Here we demonstrate loading the config at the top of main()

import logging
from loggingconfig import load_logging

def main():

load_logging()

    logger = logging.getLogger(__name__)

Creating a logger in a Multi-processor Class

Each python file will need access to a logger. The sample program is a bunch of multi-processing modules. We can create a logger in the __init__() function. Non-class python files can just do it at the top.

class PacketCapture(multiprocessing.Process):
    def __init__(
        self, name, tshark_program, input_file_name, interface, how_long, outQ
    ):
        multiprocessing.Process.__init__(self)
        self.name = name
        self.logger = logging.getLogger(__name__)

Creating a logger in main()

Non-class python files also need to create a logger. In this case the module will actually be __main__ . Note that this example shows both initializing the logging system and creating the logger for this file.

if __name__ == "__main__":

    load_logging()
    logger = logging.getLogger(__name__)

Logging a message and deferring string construction

It is very important that you do not pre-construct logging strings. The logger will format the logging string only if the message is actually going to be sent to one of the sinks. A log of logging, especially debug(), will never make it past the handlers. The type of string formatting shown below will insure that the debug() log strings would never be created in those situations.

self.logger.debug("Working with line %s", line)

Created 2021 01

Blog de Joe Freeman