Home » How can I find the implementations of Linux kernel system calls?

How can I find the implementations of Linux kernel system calls?

Solutons:


System calls aren’t handled like regular function calls. It takes special code to make the transition from user space to kernel space, basically a bit of inline assembly code injected into your program at the call site. The kernel side code that “catches” the system call is also low-level stuff you probably don’t need to understand deeply, at least at first.

In include/linux/syscalls.h under your kernel source directory, you find this:

asmlinkage long sys_mkdir(const char __user *pathname, int mode);

Then in /usr/include/asm*/unistd.h, you find this:

#define __NR_mkdir                              83
__SYSCALL(__NR_mkdir, sys_mkdir)

This code is saying mkdir(2) is system call #83. That is to say, system calls are called by number, not by address as with a normal function call within your own program or to a function in a library linked to your program. The inline assembly glue code I mentioned above uses this to make the transition from user to kernel space, taking your parameters along with it.

Another bit of evidence that things are a little weird here is that there isn’t always a strict parameter list for system calls: open(2), for instance, can take either 2 or 3 parameters. That means open(2) is overloaded, a feature of C++, not C, yet the syscall interface is C-compatible. (This is not the same thing as C’s varargs feature, which allows a single function to take a variable number of arguments.)

To answer your first question, there is no single file where mkdir() exists. Linux supports many different file systems and each one has its own implementation of the “mkdir” operation. The abstraction layer that lets the kernel hide all that behind a single system call is called the VFS. So, you probably want to start digging in fs/namei.c, with vfs_mkdir(). The actual implementations of the low-level file system modifying code are elsewhere. For instance, the ext4 implementation is called ext4_mkdir(), defined in fs/ext4/namei.c.

As for your second question, yes there are patterns to all this, but not a single rule. What you actually need is a fairly broad understanding of how the kernel works in order to figure out where you should look for any particular system call. Not all system calls involve the VFS, so their kernel-side call chains don’t all start in fs/namei.c. mmap(2), for instance, starts in mm/mmap.c, because it’s part of the memory management (“mm”) subsystem of the kernel.

I recommend you get a copy of “Understanding the Linux Kernel” by Bovet and Cesati.

This probably doesn’t answer your question directly, but I’ve found strace to be really cool when trying to understand the underlying system calls, in action, that are made for even the simplest shell commands. e.g.

strace -o trace.txt mkdir mynewdir

The system calls for the command mkdir mynewdir will be dumped to trace.txt for your viewing pleasure.

A good place to read the Linux kernel source is the Linux cross-reference (LXR)¹. Searches return typed matches (functions prototypes, variable declarations, etc.) in addition to free text search results, so it’s handier than a mere grep (and faster too).

LXR doesn’t expand preprocessor definitions. System calls have their name mangled by the preprocessor all over the place. However, most (all?) system calls are defined with one of the SYSCALL_DEFINEx families of macros. Since mkdir takes two arguments, a search for SYSCALL_DEFINE2(mkdir leads to the declaration of the mkdir syscall:

SYSCALL_DEFINE2(mkdir, const char __user *, pathname, int, mode)
{
    return sys_mkdirat(AT_FDCWD, pathname, mode);
}

ok, sys_mkdirat means it’s the mkdirat syscall, so clicking on it only leads you to the declaration in include/linux/syscalls.h, but the definition is just above.

The main job of mkdirat is to call vfs_mkdir (VFS is the generic filesystem layer). Cliking on that shows two search results: the declaration in include/linux/fs.h, and the definition a few lines above. The main job of vfs_mkdir is to call the filesystem-specific implementation: dir->i_op->mkdir. To find how this is implemented, you need to turn to the implementation of the individual filesystem, and there’s no hard-and-fast rule — it could even be a module outside the kernel tree.

¹ LXR is an indexing program. There are several websites that provide an interface to LXR, with slightly different sets of known versions and slightly different web interfaces. They tend to come and go, so if the one you’re used to isn’t available, do a web search for “linux cross-reference” to find another.

Related Solutions

Extract file from docker image?

You can extract files from an image with the following commands: docker create $image # returns container ID docker cp $container_id:$source_path $destination_path docker rm $container_id According to the docker create documentation, this doesn't run the...

Transfer files using scp: permission denied

Your commands are trying to put the new Document to the root (/) of your machine. What you want to do is to transfer them to your home directory (since you have no permissions to write to /). If path to your home is something like /home/erez try the following:...

What’s the purpose of DH Parameters?

What exactly is the purpose of these DH Parameters? These parameters define how OpenSSL performs the Diffie-Hellman (DH) key-exchange. As you stated correctly they include a field prime p and a generator g. The purpose of the availability to customize these...

How to rsync multiple source folders

You can pass multiple source arguments. rsync -a /etc/fstab /home/user/download bkp This creates bkp/fstab and bkp/download, like the separate commands you gave. It may be desirable to preserve the source structure instead. To do this, use / as the source and...

Benefits of Structured Logging vs basic logging

There are two fundamental advances with the structured approach that can't be emulated using text logs without (sometimes extreme levels of) additional effort. Event Types When you write two events with log4net like: log.Debug("Disk quota {0} exceeded by user...

Interfaces vs Types in TypeScript

2019 Update The current answers and the official documentation are outdated. And for those new to TypeScript, the terminology used isn't clear without examples. Below is a list of up-to-date differences. 1. Objects / Functions Both can be used to describe the...

Get total as you type with added column (append) using jQuery

One issue if that the newly-added column id's are missing the id number. If you look at the id, it only shows "price-", when it should probably be "price-2-1", since the original ones are "price-1", and the original ones should probably be something like...

Determining if a file is a hard link or symbolic link?

Jim's answer explains how to test for a symlink: by using test's -L test. But testing for a "hard link" is, well, strictly speaking not what you want. Hard links work because of how Unix handles files: each file is represented by a single inode. Then a single...

How to restrict a Google search to results of a specific language?

You can do that using the advanced search options: http://www.googleguide.com/sharpening_queries.html I also found this, which might work for you: http://www.searchenginejournal.com/how-to-see-google-search-results-for-other-locations/25203/ Just wanted to add...

Random map generation

Among the many other related questions on the site, there's an often linked article for map generation: Polygonal Map Generation for Games you can glean some good strategies from that article, but it can't really be used as is. While not a tutorial, there's an...

How to prettyprint a JSON file?

The json module already implements some basic pretty printing in the dump and dumps functions, with the indent parameter that specifies how many spaces to indent by: >>> import json >>> >>> your_json = '["foo", {"bar":["baz", null,...

How can I avoid the battery charging when connected via USB?

I have an Android 4.0.3 phone without root access so can't test any of this but let me point you to /sys/class/power_supply/battery/ which gives some info/control over charging issues. In particular there is charging_enabled which gives the current state (0 not...

How to transform given dataset in python? [closed]

From your expected result, it appears that each "group" is based on contiguous id values. For this, you can use the compare-cumsum-groupby pattern, and then use agg to get the min and max values. # Sample data. df = pd.DataFrame( {'id': [1, 2, 2, 2, 2, 2, 1, 1,...

Output of the following C++ Program [closed]

It works exactly like this non-recursive translation: int func_0() { return 2; } int func_1() { return 3; } int func_2() { return func_1() + func_0(); } // Returns 3 + 2 = 5 int func_3() { return func_2() + func_1(); } // Returns 5 + 3 = 8 int func_4() { return...

Making a circle out of . (periods) [closed]

Here's the maths and even an example program in C: http://pixwiki.bafsoft.com/mags/5/articles/circle/sincos.htm (link no longer exists). And position: absolute, left and top will let you draw: http://www.w3.org/TR/CSS2/visuren.html#choose-position Any further...

Should I use a code converter (Python to C++)?

Generally it's an awful way to write code, and does not guarantee that it will be any faster. Things which are simple and fast in one language can be complex and slow in another. You're better off either learning how to write fast Python code or learning C++...

tkinter: cannot concatenate ‘str’ and ‘float’ objects

This one line is more than enough to cause the problem: text="რეგულარი >> "+2.23+ 'GEL' 2.23 is a floating-point value; 'GEL' is a string. What does it mean to add an arithmetic value and a string of letters? If you want the string label 'რეგულარი...