Home » Understand logging in Linux

Understand logging in Linux


Simplified, it goes more or less like this:

The kernel logs messages (using the printk() function) to a ring buffer in kernel space. These messages are made available to user-space applications in two ways: via the /proc/kmsg file (provided that /proc is mounted), and via the sys_syslog syscall.

There are two main applications that read (and, to some extent, can control) the kernel’s ring buffer: dmesg(1) and klogd(8). The former is intended to be run on demand by users, to print the contents of the ring buffer. The latter is a daemon that reads the messages from /proc/kmsg (or calls sys_syslog, if /proc is not mounted) and sends them to syslogd(8), or to the console. That covers the kernel side.

In user space, there’s syslogd(8). This is a daemon that listens on a number of UNIX domain sockets (mainly /dev/log, but others can be configured too), and optionally to the UDP port 514 for messages. It also receives messages from klogd(8) (syslogd(8) doesn’t care about /proc/kmsg). It then writes these messages to some files in /log, or to named pipes, or sends them to some remote hosts (via the syslog protocol, on UDP port 514), as configured in /etc/syslog.conf.

User-space applications normally use the libc function syslog(3) to log messages. libc sends these messages to the UNIX domain socket /dev/log (where they are read by syslogd(8)), but if an application is chroot(2)-ed the messages might end up being written to other sockets, f.i. to /var/named/dev/log. It is, of course, essential for the applications sending these logs and syslogd(8) to agree on the location of these sockets. For these reason syslogd(8) can be configured to listen to additional sockets aside from the standard /dev/log.

Finally, the syslog protocol is just a datagram protocol. Nothing stops an application from sending syslog datagrams to any UNIX domain socket (provided that its credentials allows it to open the socket), bypassing the syslog(3) function in libc completely. If the datagrams are correctly formatted syslogd(8) can use them as if the messages were sent through syslog(3).

Of course, the above covers only the “classic” logging theory. Other daemons (such as rsyslog and syslog-ng, as you mention) can replace the plain syslogd(8), and do all sorts of nifty things, like send messages to remote hosts via encrypted TCP connections, provide high resolution timestamps, and so on. And there’s also systemd, that is slowly phagocytosing the UNIX part of Linux. systemd has its own logging mechanisms, but that story would have to be told by somebody else. 🙂

Differences with the *BSD world:

On *BSD there is no klogd(8), and /proc either doesn’t exist (on OpenBSD) or is mostly obsolete (on FreeBSD and NetBSD). syslogd(8) reads kernel messages from the character device /dev/klog, and dmesg(1) uses /dev/kmem to decode kernel names. Only OpenBSD has a /dev/log. FreeBSD uses two UNIX domain sockets /var/run/log and var/rub/logpriv instead, and NetBSD has a /var/run/log.

The other answer explains, as its author says, “classic logging” in Linux. That’s not how things work in a lot of systems nowadays.

The kernel

The kernel mechanisms have changed.

The kernel generates output to an in-memory buffer. Application softwares can access this in two ways. The logging subsystem usually accesses it as a pseudo-FIFO named /proc/kmsg. This source of log information cannot usefully be shared amongst log readers, because it is read-once. If multiple processes share it, they each get only a part of the kernel log data stream. It is also read-only.

The other way to access it is the newer /dev/kmsg character device. This is a read-write interface that is shareable amongst multiple client processes. If multiple processes share it, they all read the same complete data stream, unaffected by one another. If they open it for write access, they can also inject messages into the kernel’s log stream, as if they were generated by the kernel.

/proc/kmsg and /dev/kmsg provide log data in a non-RFC-5424 form.


Applications have changed.

The GNU C library’s syslog() function in the main attempts to connect to an AF_LOCAL datagram socket named /dev/log and write log entries to it. (The BSD C library’s syslog() function nowadays uses /var/run/log as the socket name, and tries /var/run/logpriv first.) Applications can of course have their own code to do this directly. The library function is just code (to open, connect, write to, and close a socket) executing in the application’s own process context, after all.

Applications can also send RFC 5424 messages via UDP to a local RFC 5426 server, if one is listening on an AF_INET/AF_INET6 datagram socket on the machine.

Thanks to pressure from the daemontools world over the past two decades, a lot of dæmons support running in a mode where they don’t use the GNU syslog() C library function, or UDP sockets, but just spit their log data out to standard error in the ordinary Unix fashion.

log management with nosh and the daemontools family in general

With the daemontools family of toolsets there’s a lot of flexibility in logging. But in general across the whole family the idea is that each “main” dæmon has an associated “logging” dæmon. “main” dæmons work just like non-dæmon processes and write their log messages to standard error (or standard output), which the service management subsystem arranges to have connected via a pipe (which it holds open so that log data are not lost over a service restart) to the standard input of the “logging” dæmon.

All of the “logging” dæmons run a program that logs somewhere. Generally this program is something like multilog or cyclog that reads from its standard input and writes (nanosecond timestamped) log files in a strictly size-capped, automatically rotated, exclusive-write, directory. Generally, too, these dæmons all run under the aegises of individual dedicated unprivileged user accounts.

So one ends up with a largely distributed logging system, with each service’s log data processed separately.

One can run something like klogd or syslogd or rsyslogd under a daemontools-family service management. But the daemontools world realized many years ago that the service management structure with “logging” dæmons lends itself quite neatly to doing things in a simpler fashion. There’s no need to fan all of the log streams into one giant mish-mash, parse the log data, and then fan the streams back out to separate log files; and then (in some cases) bolt an unreliable external log rotation mechanism on the side. The daemontools-family structure as part of its standard log management already does the log rotation, the logfile writing, and the stream separation.

Furthermore: The chain-loading model of dropping privileges with tools common across all services means that the logging programs have no need of superuser privileges; and the UCSPI model means that they only need to care about differences such as stream versus datagram transports.

The nosh toolset exemplifies this. Whilst one can run rsyslogd under it, out of the box, and just manage kernel, /run/log, and UDP log input in the old way; it also provides more “daemontools native” ways of logging these things:

  • a klogd service that reads from /proc/kmsg and simply writes that log stream to its standard error. This is done by a simple program named klog-read. The associated logging dæmon feeds the log stream on its standard input into a /var/log/sv/klogd log directory.
  • a local-syslog-read service that reads datagrams from /dev/log (/run/log on the BSDs) and simply writes that log stream to its standard error. This is done by a program named syslog-read. The associated logging dæmon feeds the log stream on its standard input into a /var/log/sv/local-syslog-read log directory.
  • a udp-syslog-read service that listens on the UDP syslog port, reads what it is sent to it and simply writes that log stream to its standard error. Again, the program is syslog-read. The associated logging dæmon feeds the log stream on its standard input into a /var/log/sv/udp-syslog-read log directory.
  • (on the BSDs) a local-priv-syslog-read service that reads datagrams from /run/logpriv and simply writes that log stream to its standard error. Again, the program is syslog-read. The associated logging dæmon feeds the log stream on its standard input into a /var/log/sv/local-priv-syslog-read log directory.

The toolset also comes with an export-to-rsyslog tool that can monitor one or several log directories (using a system of non-intrusive log cursors) and send new entries in RFC 5424 form over the network to a designated RFC 5426 server.

log management with systemd

systemd has a single monolithic log management program, systemd-journald. This runs as a service managed by systemd.

  • It reads /dev/kmsg for kernel log data.
  • It reads /dev/log (a symbolic link to /run/systemd/journal/dev-log) for application log data from the GNU C library’s syslog() function.
  • It listens on the AF_LOCAL stream socket at /run/systemd/journal/stdout for log data coming from systemd-managed services.
  • It listens on the AF_LOCAL datagram socket at /run/systemd/journal/socket for log data coming from programs that speak the systemd-specific journal protocol (i.e. sd_journal_sendv() et al.).
  • It mixes these all together.
  • It writes to a set of system-wide and per-user journal files, in /run/log/journal/ or /var/log/journal/.
  • If it can connect (as a client) to an AF_LOCAL datagram socket at /run/systemd/journal/syslog it writes journal data there, if forwarding to syslog is configured.
  • If configured, it writes journal data to the kernel buffer using the writable /dev/kmsg mechanism.
  • If configured, it writes journal data to terminals and the console device as well.

Bad things happen system-wide if this program crashes, or the service is stopped.

systemd itself arranges for the standard outputs and errors of (some) services to be attached to the /run/systemd/journal/stdout socket. So dæmons that log to standard error in the normal fashion have their output sent to the journal.

This completely supplants klogd, syslogd, syslog-ng, and rsyslogd.

These are now required to be systemd-specific. On a systemd system they don’t get to be the server end of /dev/log. Instead, they take one of two approaches:

  • They get to be the server end of /run/systemd/journal/syslog, which (if you remember) systemd-journald attempts to connect and write journal data to. A couple of years ago, one would have configured rsyslogd’s imuxsock input method to do this.
  • They read directly from the systemd journal, using a systemd-specific library that understands the binary journal format and that can monitor the journal files and directory for new entries being added. Nowadays, one configures rsyslogd’s imjournal input method to do this.

Related Solutions

Joining bash arguments into single string with spaces

[*] I believe that this does what you want. It will put all the arguments in one string, separated by spaces, with single quotes around all: str="'$*'" $* produces all the scripts arguments separated by the first character of $IFS which, by default, is a space....

AddTransient, AddScoped and AddSingleton Services Differences

TL;DR Transient objects are always different; a new instance is provided to every controller and every service. Scoped objects are the same within a request, but different across different requests. Singleton objects are the same for every object and every...

How to download package not install it with apt-get command?

Use --download-only: sudo apt-get install --download-only pppoe This will download pppoe and any dependencies you need, and place them in /var/cache/apt/archives. That way a subsequent apt-get install pppoe will be able to complete without any extra downloads....

What defines the maximum size for a command single argument?

Answers Definitely not a bug. The parameter which defines the maximum size for one argument is MAX_ARG_STRLEN. There is no documentation for this parameter other than the comments in binfmts.h: /* * These are the maximum length and maximum number of strings...

Bulk rename, change prefix

I'd say the simplest it to just use the rename command which is common on many Linux distributions. There are two common versions of this command so check its man page to find which one you have: ## rename from Perl (common in Debian systems -- Ubuntu, Mint,...

Output from ls has newlines but displays on a single line. Why?

When you pipe the output, ls acts differently. This fact is hidden away in the info documentation: If standard output is a terminal, the output is in columns (sorted vertically) and control characters are output as question marks; otherwise, the output is...

mv: Move file only if destination does not exist

mv -vn file1 file2. This command will do what you want. You can skip -v if you want. -v makes it verbose - mv will tell you that it moved file if it moves it(useful, since there is possibility that file will not be moved) -n moves only if file2 does not exist....

Is it possible to store and query JSON in SQLite?

SQLite 3.9 introduced a new extension (JSON1) that allows you to easily work with JSON data . Also, it introduced support for indexes on expressions, which (in my understanding) should allow you to define indexes on your JSON data as well. PostgreSQL has some...

Combining tail && journalctl

You could use: journalctl -u service-name -f -f, --follow Show only the most recent journal entries, and continuously print new entries as they are appended to the journal. Here I've added "service-name" to distinguish this answer from others; you substitute...

how can shellshock be exploited over SSH?

One example where this can be exploited is on servers with an authorized_keys forced command. When adding an entry to ~/.ssh/authorized_keys, you can prefix the line with command="foo" to force foo to be run any time that ssh public key is used. With this...

Why doesn’t the tilde (~) expand inside double quotes?

The reason, because inside double quotes, tilde ~ has no special meaning, it's treated as literal. POSIX defines Double-Quotes as: Enclosing characters in double-quotes ( "" ) shall preserve the literal value of all characters within the double-quotes, with the...

What is GNU Info for?

GNU Info was designed to offer documentation that was comprehensive, hyperlinked, and possible to output to multiple formats. Man pages were available, and they were great at providing printed output. However, they were designed such that each man page had a...

Set systemd service to execute after fstab mount

a CIFS network location is mounted via /etc/fstab to /mnt/ on boot-up. No, it is not. Get this right, and the rest falls into place naturally. The mount is handled by a (generated) systemd mount unit that will be named something like mnt-wibble.mount. You can...

Merge two video clips into one, placing them next to each other

To be honest, using the accepted answer resulted in a lot of dropped frames for me. However, using the hstack filter_complex produced perfectly fluid output: ffmpeg -i left.mp4 -i right.mp4 -filter_complex hstack output.mp4 ffmpeg -i input1.mp4 -i input2.mp4...

How portable are /dev/stdin, /dev/stdout and /dev/stderr?

It's been available on Linux back into its prehistory. It is not POSIX, although many actual shells (including AT&T ksh and bash) will simulate it if it's not present in the OS; note that this simulation only works at the shell level (i.e. redirection or...

How can I increase the number of inodes in an ext4 filesystem?

It seems that you have a lot more files than normal expectation. I don't know whether there is a solution to change the inode table size dynamically. I'm afraid that you need to back-up your data, and create new filesystem, and restore your data. To create new...

Why doesn’t cp have a progress bar like wget?

The tradition in unix tools is to display messages only if something goes wrong. I think this is both for design and practical reasons. The design is intended to make it obvious when something goes wrong: you get an error message, and it's not drowned in...