Recording history, picking the perfect Logging system

Thefuturebymob
7 min readOct 24, 2021

Logging is a crucial part of any application. Be it a back end web service, GUI application, command line script, any program can benefit from having some form of logging.

This quick guide to logging, provides an overview to help you choose and use a very critical part of your system.

Logging — the benefits include — easier to root cause, helps to find intermittent failures ….

Uses of Logging

Using logging we can

  • Record the state of the system at a given time
  • Record errors
  • Record interesting events
  • Log unexpected states or events
  • Persist log messages
  • Stream messages to stdout or stderr

Logging can help us during development, it can also help to root cause issues, and also to debug issues. If we persist the data from logging, it can also be used to analyze historical trends, allowing us to watch for changes in behaviour over time, things like performance degradation, or monitoring stability etc.

There are a few basic points to consider when choosing your logging implementation. This is a quick guide.

Basic Functionality

A good logger typically includes these basic features

  • Supports writing the messages to stdout, stderr and backing up all messages to a file
  • Supports different levels of messages. Typical levels are Debug, Info, Warning, Error, Fatal
  • Stamps each log message with date and time
  • Supports variable argument strings
  • Can easily add stack traces, exception messages
  • Can be enabled and disabled dynamically
  • Supports a configuration file with runtime settings
  • May even support a configuration update thread, that allows for checking the config file periodically, allowing the logging behaviour to be truely runtime dynamic

The logging in your code will look something like this

LogInstance.LogMessage(INFO, “Just entered %d Interrupt Service Routine”, INT_TIMEOUT)

There will be a log instance (if supported), and a log function that takes a log level, the message, and some variable length arguments. For some languages a static/global function is used, and some loggers have a function for each type of log level, therefore not needing the log level type passed in.

Logging Levels

Setting the minimum logging level allows you to control the granularity of messages that will appear in your log. So, setting the lowest log level of Debug, will mean Debug, Info, Warn, Error and Fatal will be logged. Setting the minimum level to Warn means Warn, Error and Fatal. And setting to Fatal will cause only Fatal messages to be logged.

You’ll probably add many Debug level log messages during development. These are to record anything that maybe of interest to do with any system states, variables etc. This type of logging is the most verbose, and is often enabled to help root cause or fix bugs. This state is often disabled when the system is developed and shipped, and is running normally in production.

For recording events that are not at the Debug level, so for events that can be enabled without full debug logging, the lowest level is the Info level. This is set when we want verbose but not Debug logging.

You can use the Warn level for logging low severity errors or exceptions, normally these are for unexpected behaviours, perhaps even low severity errors that you can recover from.

The Error and Fatal levels are self-explanatory.

The minimum log level can be set so all levels of messages are logged. When you’re developing your application you probably want to have full verbose logging enabled. But this will cause the runtime performance to be relatively very slow.

Once development is complete, your project maybe going through testing (regression/release testing), so it’s not quite ready for relase yet. In this case if you are trying to root cause any issues, or fix bugs, you will have the Debug level enabled.

Once your system is in production, to optimise for speed and log space, the minimum level will probably be Error. Fatal should always be enabled (or the highest level if Fatal is not supported).

Threading

If your application code runs on a single thread, or if it at least has a main thread that handles most of your code execution, you don’t want to slow the thread down with logging.

Consider using a separate thread when logging. So initialize your logger so it has it’s own thread, then call the logging API from your main thread, and have that call the log thread to do the logging.

Opening and Closing the log file

Some logging systems need to open and close the log file to ensure the write buffer to that file is flushed. If this doesn’t happen and the application terminates abnormally, no code will execute to close the file. Therefore your log file will miss the last messages, and ironically these are probably the most important last messages that could explaining why the application shut down.

Experiment to make sure you are capturing the last messages, to be on the safe side you may have to open and close the file every time you log a message, to avoid losing any messages.

Cache Messages

If you are opening and closing the log file each time a log event occurs, this is extremely slow. File I/O is often the bottleneck in any system. You can use a cache to speed things up, simply set the size of your cache in terms of log messages. Then store your log messages in memory (a list or vector/array). Only open the log file to write, and then close, when your cache is full. Again, doing the logging this way will mean if your application crashes and there were messages in the cache, the log file will not include them. Caching can be enabled or disabled with a configuration file setting. This can ensure all messages are in the log file. If at runtime you set the cache size to 1 for example, this would mean all log messages should be written to the log file.

Human readable vs Optimum size

Log messages have 2 states. The first state is the static string i.e. a string that doesn’t change (consider this the type of the log message)

“Just detected the error %s”

State 2 is the runtime dynamic state (consider this the instance of a log message)

“Just detected the error No such file

Our code contains all the static log message types, and at runtime these are instantiated, and have the parameters filled in.

When logging at runtime, you can either use full strings in your log, or you can use just the type of log message, and also the runtime parameters. So our log file could contain the text

“Just detected the error No such file”

Or it could contain the log message type (541), and the logged dynamic parameter value — like this

541, No such file

In this case 541 is the log type that corresponds to the log message “Just detected the error %s”.

Using log types/numbers/tokens saves a lot of space in the log file, rather than writing in the log the full string, we just write the token/numerical type. For embedded systems this is quite common. The downside to this approach is you can not read the log without some processing first. The log needs to be parsed, and constructed from the log type numbers back into the corresponding string. This can sometimes make it hard to watch live logging, as there needs to be some processing before the message can be seen. On systems with a large number of log message types live-logging can be very very slow. There are other positive aspects to using log types rather than verbose strings, not just the log size, namely it allows support for multiple languages, and it can obfuscate your log.

Multi Language Support

If you use the token logging approach above you can provide a log parser with any language you like to substitute the strings. You just need to have the log Id strings translated to as many languages as you need. For international customer support, where log files are used, this can be quite helpful.

Security

Log files are a treasure trove for hackers. If you want to know how a system works, get the log file. If you need your system to be secure (so if you are using any secure practises), you must obfuscate your logs. A good start is the technique above. However, to be even more secure consider encrypting the log. This again slows things down. For an embedded device logs can be encrypted as each log message is created, which will really slow down the logging. Or, it can be done when the log file is downloaded from the device — which slows down log extraction.

Log wrapping

You must set a maximum log file size, especially for storage constrained systems, i.e. embedded devices. You’ll also need to consider what to do when the size is reached, either wrap around — losing the start of your log, or create a new log file. But either way, logs can not be left to grow indefinitely. Many an embedded device has failed letting logs grow, without wrapping around.

Multiple Log Files

If your system creates a new log file at start up, consider if you want to overwrite or backup any previous log files. You can use a folder for all log files, and it also helps to consider if you’ll need a maximum size for such a folder. As mentioned above, you can set maximum log file sizes, and then each new system start up can just append to the last non-full log file.

There’s nothing more disappointing to finally see a system fail from an intermittent issue, knowing you finally caught it with full logging enabled, and then see only one log file with a few lines in it.

Hopefully I’ve gone through most logging considerations. If you have any other points you feel are important to consider, please add them in the comments, Thanks for reading !

--

--