Benefits of Structured Logging vs basic logging

Solutions:


There are two fundamental advances with the structured approach that can’t be emulated using text logs without (sometimes extreme levels of) additional effort.

Event Types

When you write two events with log4net like:

log.DebugFormat("Disk quota {0} exceeded by user {1}", 100, "DTI-Matt");
log.DebugFormat("Disk quota {0} exceeded by user {1}", 150, "nblumhardt");

These will produce similar text:

Disk quota 100 exceeded by user DTI-Matt
Disk quota 150 exceeded by user nblumhardt

But, as far as machine processing is concerned, they’re just two lines of different text.

You may wish to find all “disk quota exceeded” events, but the simplistic case of looking for events like 'Disk quota%' will fall down as soon as another event occurs looking like:

Disk quota 100 set for user DTI-Matt

Text logging throws away the information we initially have about the source of the event, and this has to be reconstructed when reading the logs, usually with more and more elaborate match expressions.
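To make the false positive concrete, here is a minimal sketch in Python (chosen only for brevity; the names lines and naive_match are hypothetical). It emulates the 'Disk quota%' prefix search over the three example lines above:

```python
# The three example log lines from the text above.
lines = [
    "Disk quota 100 exceeded by user DTI-Matt",
    "Disk quota 150 exceeded by user nblumhardt",
    "Disk quota 100 set for user DTI-Matt",
]

def naive_match(log_lines):
    """Emulate a LIKE 'Disk quota%' search with a plain prefix test."""
    return [line for line in log_lines if line.startswith("Disk quota")]

matches = naive_match(lines)
print(len(matches))  # 3: the "set for" line is matched as well
```

Only two of the three hits are "quota exceeded" events, so the filter has to grow more elaborate as new messages appear.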

By contrast, when you write the following two Serilog events:

log.Debug("Disk quota {Quota} exceeded by user {Username}", 100, "DTI-Matt");
log.Debug("Disk quota {Quota} exceeded by user {Username}", 150, "nblumhardt");

These produce similar text output to the log4net version, but behind the scenes, the "Disk quota {Quota} exceeded by user {Username}" message template is carried by both events.

With an appropriate sink, you can later write queries where MessageTemplate="Disk quota {Quota} exceeded by user {Username}" and get exactly the events where the disk quota was exceeded.

It’s not always convenient to store the entire message template with every log event, so some sinks hash the message template into a numeric EventType value (e.g. 0x1234abcd); alternatively, you can add an enricher to the logging pipeline to do this yourself.

This difference is more subtle than the next one, but it is massively powerful when dealing with large log volumes.
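As a rough illustration of the template-hashing idea, the Python sketch below derives a 32-bit id from the template text. CRC32 from the standard zlib module is only a stand-in here; it is an assumption, not the hash any particular Serilog sink uses, and event_type is a hypothetical helper name.

```python
import zlib

def event_type(message_template: str) -> int:
    """Return a 32-bit id derived from the template text (CRC32 as a stand-in hash)."""
    return zlib.crc32(message_template.encode("utf-8"))

exceeded = "Disk quota {Quota} exceeded by user {Username}"
quota_set = "Disk quota {Quota} set for user {Username}"

# Every event written with the same template gets the same id,
# regardless of the values substituted for {Quota} and {Username}.
print(f"0x{event_type(exceeded):08x}")
print(f"0x{event_type(quota_set):08x}")
```

Because the id depends only on the template, it stays stable across events and is cheap to store and index.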

Structured Data

Again considering the two events about disk space usage, it may be easy enough using text logs to query for a particular user with LIKE 'Disk quota%' and LIKE '%DTI-Matt'.

But production diagnostics aren’t always so straightforward. Imagine you need to find events where the exceeded disk quota was below 125 MB.

With Serilog, this is possible in most sinks using a variant of:

Quota < 125

Constructing this kind of query from a regular expression is possible, but it gets tiring fast and usually ends up being a measure of last resort.
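For comparison, this is roughly what the text-log route looks like, sketched in Python. The pattern and the helper name exceeded_below are hypothetical, and the pattern is tied to the exact wording of the message:

```python
import re

# Tightly coupled to the message text: any rewording breaks the match.
PATTERN = re.compile(r"^Disk quota (\d+) exceeded by user (\S+)$")

def exceeded_below(log_lines, limit):
    """Return (quota, user) pairs for 'exceeded' lines whose quota is under limit."""
    hits = []
    for line in log_lines:
        m = PATTERN.match(line)
        if m and int(m.group(1)) < limit:
            hits.append((int(m.group(1)), m.group(2)))
    return hits

text_log = [
    "Disk quota 100 exceeded by user DTI-Matt",
    "Disk quota 150 exceeded by user nblumhardt",
    "Disk quota 100 set for user DTI-Matt",
]
print(exceeded_below(text_log, 125))  # [(100, 'DTI-Matt')]
```

The number has to be recovered from the text and converted before it can be compared, and that work is repeated for every new kind of question.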

Now add to this an event type:

Quota < 125 and EventType = 0x1234abcd

You start to see here how these capabilities combine in a straightforward way to make production debugging with logs feel like a first-class development activity.
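The combined query can be pictured as a plain filter over events that already carry their properties. This Python sketch is an assumption about what a sink does conceptually, not any sink's real query engine; 0x0badf00d is a made-up id for the "quota set" event.

```python
# Events as a structured sink might store them: properties kept as typed fields.
events = [
    {"EventType": 0x1234abcd, "Quota": 100, "Username": "DTI-Matt"},
    {"EventType": 0x1234abcd, "Quota": 150, "Username": "nblumhardt"},
    {"EventType": 0x0badf00d, "Quota": 100, "Username": "DTI-Matt"},  # hypothetical "quota set" event
]

def query(stored_events, event_type, max_quota):
    """Equivalent of: Quota < max_quota and EventType = event_type."""
    return [
        e for e in stored_events
        if e["EventType"] == event_type and e["Quota"] < max_quota
    ]

result = query(events, 0x1234abcd, 125)
print([e["Username"] for e in result])  # ['DTI-Matt']
```

No text parsing is involved: Quota is compared as a number and the event type is compared as an id.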

One further benefit is perhaps not as easy to predict up front: once production debugging has been lifted out of the land of regex hackery, developers start to value logs a lot more and exercise more care and consideration when writing them. Better logs -> better quality applications -> more happiness all around.

When you are collecting logs for processing, whether to parse them into a database or to search through the processed logs later, structured logging makes some of that processing easier and more efficient. The parser can take advantage of the known structure (e.g. JSON, XML, ASN.1, whatever) and use state machines for parsing, as opposed to regular expressions, which can be relatively expensive to compile and execute. Parsing free-form text, such as that suggested by your coworker, tends to rely on regular expressions, and on that text not changing. This makes parsing free-form text rather fragile: the parser is tightly coupled to the exact text in the code.
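A small Python sketch of the coupling point, assuming JSON log lines with hypothetical field names (level, user, quota, text). The human-readable text changes between the two lines, but the field lookup is unaffected:

```python
import json

# Two renderings of the same hypothetical event; only the readable text differs.
before = '{"level": "DEBUG", "user": "DTI-Matt", "quota": 100, "text": "Disk quota 100 exceeded by user DTI-Matt"}'
after = '{"level": "DEBUG", "user": "DTI-Matt", "quota": 100, "text": "User DTI-Matt went over the 100 MB disk quota"}'

def quota_and_user(line):
    """Parse one JSON log line and return the fields a report would need."""
    event = json.loads(line)
    return event["quota"], event["user"]

# The field lookup gives the same answer for both wordings.
print(quota_and_user(before) == quota_and_user(after))  # True
```

A regular expression written against the first wording would silently stop matching after the message was reworded.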

Consider also the search/lookup case, e.g.:

SELECT text FROM logs WHERE text LIKE 'Disk quota';

LIKE conditions require comparisons with every text row value; again, this is relatively computationally expensive, particularly so when wildcards are used:

SELECT text FROM logs WHERE text LIKE 'Disk %';

With structured logging, your disk-error related log message might look like this in JSON:

{ "level": "DEBUG", "user": "username", "error_type": "disk", "text": "Disk quota ... exceeded by user ..." }

The fields of this kind of structure can map pretty easily to e.g. SQL table column names, which in turn means the lookup can be more specific/granular:

SELECT user, text FROM logs WHERE error_type = 'disk';

You can place indexes on the columns whose values you expect to search frequently, as long as you don’t use LIKE clauses for those column values. The more you can break down your log message into specific categories, the more targeted you can make your lookup. For example, in addition to the error_type field/column in the example above, you could even have "error_category": "disk", "error_type": "quota" or some such.
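Here is a minimal sketch of the column-plus-index idea using Python's built-in sqlite3 module with an in-memory database. The table layout, the index name idx_logs_error and the sample rows are assumptions made for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (level TEXT, user TEXT, error_category TEXT, error_type TEXT, text TEXT)"
)
# Index the columns we expect to filter on with equality comparisons.
conn.execute("CREATE INDEX idx_logs_error ON logs (error_category, error_type)")

rows = [
    ("DEBUG", "DTI-Matt", "disk", "quota", "Disk quota 100 exceeded by user DTI-Matt"),
    ("DEBUG", "nblumhardt", "disk", "io", "Disk read failed for user nblumhardt"),
    ("DEBUG", "DTI-Matt", "network", "timeout", "Request timed out for user DTI-Matt"),
]
conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?, ?)", rows)

sql = "SELECT user, text FROM logs WHERE error_category = ? AND error_type = ?"
result = conn.execute(sql, ("disk", "quota")).fetchall()
print(result)

# Ask SQLite how it intends to run the query; the plan shows whether the index is used.
for step in conn.execute("EXPLAIN QUERY PLAN " + sql, ("disk", "quota")):
    print(step)
```

The equality filters can be answered from the index, whereas a LIKE pattern with a leading wildcard on the text column cannot.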

The more structure you have in your log messages, the more your parsing/searching systems (such as Fluentd, Elasticsearch, Kibana) can take advantage of that structure, and perform their tasks with greater speed and less CPU/memory.

Hope this helps!

You won’t find much benefit from structured logging when your app creates a few hundred log messages per day. You definitely will when you have a few hundred log messages per second coming from many different deployed apps.

Relatedly, a setup where log messages end up in the ELK Stack is also appropriate at a scale where logging to SQL becomes a bottleneck.

I have seen the setup of “basic logging and searching” with SQL SELECT ... LIKE and regexps pushed to its limits, where it falls apart: there are false positives, omissions, horrible filter code with known bugs that’s hard to maintain and that no one wants to touch, new log messages that don’t follow the filter’s assumptions, reluctance to touch logging statements in code lest they break reports, etc.

So several software packages are emerging to deal with this problem in a better way. There is Serilog; I hear that the NLog team is looking at it; and we wrote StructuredLogging.Json for NLog. I also see that the new ASP.NET Core logging abstractions “make it possible for logging providers to implement … structured logging”.

An example with StructuredLogging. You log to an NLog logger like this:

logger.ExtendedError("Order send failed", new { OrderId = 1234, RestaurantId = 4567 } );

This structured data goes to Kibana. The value 1234 is stored in the OrderId field of the log entry. You can then search using Kibana query syntax for e.g. all log entries where @LogType:nlog AND Level:Error AND OrderId:1234.

Message and OrderId are now just fields that can be searched for exact or inexact matches as you need, or aggregated for counts. This is powerful and flexible.

From the StructuredLogging best practices:

The message logged should be the same every time. It should be a constant string, not a string formatted to contain data values such as ids or quantities. Then it is easy to search for.

The message logged should be distinct i.e. not the same as the message produced by an unrelated log statement. Then searching for it does not match unrelated things as well.
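The two practices above can be sketched in Python as follows. log_entry is a hypothetical helper written for this example, not the StructuredLogging.Json API; the message stays a constant string and the varying values travel as separate fields:

```python
import json

def log_entry(level, message, **properties):
    """Build one JSON log line: constant message text plus data carried as fields."""
    entry = {"Level": level, "Message": message}
    entry.update(properties)
    return json.dumps(entry)

# The message never changes; the ids are properties, not part of the text.
line = log_entry("Error", "Order send failed", OrderId=1234, RestaurantId=4567)
parsed = json.loads(line)
print(parsed["Message"], parsed["OrderId"])  # Order send failed 1234
```

Searching for the exact message then finds every occurrence of this event, and OrderId can be filtered or aggregated on its own.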
