Log! Don't Print! Use the Python logging library
Python has become one of the most popular application languages in IT, shadow-IT, and data science. Python developers continually improve their systems by iterating from example patterns to best practices.
The Python logging package should be used wherever print() statements were used in the past. The logging package makes it possible to classify output at different severities and to enable or disable output generation at each of those levels. This means you can create debug-level statements that are useful to programmers without letting those statements bleed into a production application. The referenced GitHub project shows how to load logging configurations and how to change where logging goes based on those configurations. With logging you can:
Classify output by severity
Filter output generation by severity
Send data to different sinks based on the program module and the severity
Logging Design
Logging is routed through loggers that are instantiated in each Python module. Those loggers are configured with one or more handlers. Each handler writes to its own log sink based on the filter criteria it was initialized with. This means a log message can be written out by zero or more handlers, depending on the logging level of the message and the filter criteria of the handlers.
Watch the video or read the Python logging documentation for a full explanation of how levels work. In short, every message carries a severity (DEBUG, INFO, WARNING, ERROR, CRITICAL, in increasing order), and a logger or handler only emits messages at or above its configured level.
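As a quick illustration, here is a minimal sketch of how a level threshold filters messages, using only the standard library:

import logging

logging.basicConfig(level=logging.INFO)  # root logger emits INFO and above
logger = logging.getLogger(__name__)

logger.debug("not emitted: DEBUG is below the INFO threshold")
logger.info("emitted: INFO meets the threshold")
logger.warning("emitted: WARNING is above the threshold")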
Our log output format, handler configurations, and logger configurations are all stored in a YAML file. That file defines a variety of formatters and handlers and contains a custom configuration for each Python module that intends to log. Small programs can probably just run with the console logger and the root configuration. Larger or production programs will probably want something more detailed, like the example below.
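The project's actual file is not reproduced here; a minimal logging_config.yaml sketch in the same spirit (the handler names, file path, and the "worker" module name are illustrative, not from the referenced project) might look like:

version: 1
disable_existing_loggers: false
formatters:
  simple:
    format: "%(asctime)s %(name)s %(levelname)s %(message)s"
handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
    stream: ext://sys.stdout
  file:
    class: logging.handlers.RotatingFileHandler
    level: DEBUG
    formatter: simple
    filename: app.log
    maxBytes: 1048576
    backupCount: 3
loggers:
  worker:               # per-module configuration; module name is illustrative
    level: DEBUG
    handlers: [file]
    propagate: false
root:
  level: INFO
  handlers: [console]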
The logging system must be initialized exactly once. Here we have a function that we can call on program startup to load the logging configuration.
import logging
import logging.config

import yaml

def load_logging():
    # Read the YAML file and hand the resulting dict to dictConfig().
    with open("logging_config.yaml", "r") as f:
        config = yaml.safe_load(f.read())
    logging.config.dictConfig(config)
Creating a logger in main()
Initialize the logging system one time on startup. Here we demonstrate loading the config at the top of main().
import logging

from loggingconfig import load_logging

def main():
    load_logging()  # configure handlers/formatters before any logging call
    logger = logging.getLogger(__name__)  # logger named after this module
Creating a logger in a Multiprocessing Class
Each Python file will need access to a logger. The sample program is a set of multiprocessing modules, so we can create a logger in each class's __init__() function, as shown in the sketch below. Non-class Python files can just create one at the top of the file.
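A minimal sketch of the pattern (the Worker class and its queue parameter are hypothetical, not taken from the referenced project):

import logging
import multiprocessing

class Worker(multiprocessing.Process):
    def __init__(self, work_queue):
        super().__init__()
        # Logger named after this module so the YAML config can target it.
        self.logger = logging.getLogger(__name__)
        self.work_queue = work_queue

    def run(self):
        self.logger.info("worker %s started", self.name)

One caveat to flag as an assumption: with the spawn start method (the default on Windows and macOS), a child process does not inherit the parent's dictConfig() setup, so calling load_logging() again inside run() is one way to keep child output configured.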
Non-class Python files also need to create a logger; in that case the module name will actually be __main__. Note that this example shows both initializing the logging system and creating the logger for this file.
if __name__ == "__main__":
    load_logging()
    logger = logging.getLogger(__name__)  # the name here is "__main__"
Logging a message and deferring string construction
It is very important that you do not pre-construct logging strings. The logger formats the logging string only if the message is actually going to be sent to one of the sinks. A lot of logging, especially debug(), never makes it past the handlers. The style of string formatting shown below ensures that debug() log strings are never constructed in those situations.
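The original post's snippet is not included in this excerpt; a minimal sketch of the deferred style (the variable name is illustrative):

import logging

logger = logging.getLogger(__name__)
record_count = 42

# Deferred: the format string and arguments are only combined
# if a handler actually emits the message.
logger.debug("processed %d records", record_count)

# Avoid: the f-string is built even when DEBUG output is disabled.
logger.debug(f"processed {record_count} records")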