How to determine the maximum number to pass to make -j option?


nproc gives the number of CPU cores/threads available, e.g. 8 on a quad-core CPU supporting two-way SMT.

The number of jobs you can run in parallel with make using the -j option depends on a number of factors:

  • the amount of available memory
  • the amount of memory used by each make job
  • the extent to which make jobs are I/O- or CPU-bound

make -j$(nproc) is a decent place to start, but you can usually use higher values, as long as you don’t exhaust your available memory and start thrashing.
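As a rough way to see where memory starts to limit you, here is a minimal sketch that divides available memory by an assumed per-job footprint. The function name mem_job_ceiling and the 1024 MiB per-job figure are hypothetical; measure what your own compiler and linker jobs really use.

```shell
#!/bin/sh
# Hypothetical helper: how many jobs fit in memory at once?
# $1 = available memory in MiB, $2 = assumed memory per job in MiB.
mem_job_ceiling() {
    ceiling=$(( $1 / $2 ))
    # Always allow at least one job, even on a tight machine.
    if [ "$ceiling" -lt 1 ]; then
        ceiling=1
    fi
    echo "$ceiling"
}

# Example on Linux (MemAvailable is reported in kB in /proc/meminfo):
# avail_mib=$(awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo)
# echo "nproc: $(nproc), memory ceiling: $(mem_job_ceiling "$avail_mib" 1024)"
```

If the memory ceiling comes out below nproc, going above nproc is unlikely to help and may cause thrashing.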

For really fast builds, if you have enough memory, I recommend building in a tmpfs. That way most jobs will be CPU-bound and make -j$(nproc) will work as fast as possible.

The most straightforward way is to use nproc like so:

make -j`nproc`

The nproc command prints the number of processing units available to the current process, which is typically the number of logical cores on your machine. Wrapping it in backticks makes the shell run nproc first and substitute the resulting number into the make command line.

You may have some anecdotal experience where using core-count + 1 results in faster compile times. This has more to do with factors like I/O delays and other resource constraints: the extra job can keep a core busy while another job is waiting.

To do this with nproc+1, try this:

make -j$((`nproc`+1))

Unfortunately even different portions of the same build may be optimal with conflicting j factor values, depending on what’s being built, how, which of the system resources are the bottleneck at that time, what else is happening on the build machine, what’s going on in the network (if using distributed build techniques), status/location/performance of the many caching systems involved in a build, etc.

Compiling 100 tiny C files may be faster than compiling a single huge one, or vice versa. Building small, highly convoluted code can be slower than building huge amounts of straightforward, linear code.

Even the context of the build matters: a j factor optimized for builds on dedicated servers fine-tuned for exclusive, non-overlapping builds may yield very disappointing results when used by developers building in parallel on the same shared server (each such build may take more time than all of them combined if serialized), or on virtualized servers or servers with different hardware configurations.

There’s also the aspect of correctness of the build specification. Very complex builds may have race conditions causing intermittent build failures, with occurrence rates that can vary wildly as the j factor increases or decreases.

I can go on and on. The point is that you have to actually evaluate your build in the exact context for which you want the j factor optimized. @Jeff Schaller’s comment applies: iterate until you find your best fit. Personally I’d start from the nproc value, try upwards first and downwards only if the upwards attempts show immediate degradation.
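The iteration can be scripted. The following is only a sketch: time_build and best_j are hypothetical helper names, the candidate values are arbitrary, and it assumes your Makefile has a working clean target so that every run starts from the same state.

```shell
#!/bin/sh
# Hypothetical helper: do a clean build with -jN and print "N seconds"
# (elapsed wall-clock time). Assumes a working "clean" target.
time_build() {
    make clean >/dev/null 2>&1
    start=$(date +%s)
    make -j"$1" >/dev/null 2>&1
    end=$(date +%s)
    echo "$1 $(( end - start ))"
}

# Hypothetical helper: read "jobs seconds" pairs on stdin and print the
# job count with the lowest time (the first one wins on a tie).
best_j() {
    awk 'NR == 1 || $2 + 0 < best { best = $2 + 0; j = $1 } END { print j }'
}

# Example search, upwards from nproc (uncomment to run real builds):
# n=$(nproc)
# for j in "$n" $((n + 1)) $((n + 2)) $((n * 2)); do time_build "$j"; done | best_j
```

If every value above nproc is slower than nproc itself, repeat the loop with values below it.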

It might be a good idea to first measure several identical builds in supposedly identical contexts, just to get an idea of the variability of your measurements. If it is too high, it could jeopardise your entire optimisation effort (a 20% variability would completely eclipse a 10% improvement/degradation reading in the j factor search).
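One simple way to put a number on that variability is the spread between the fastest and slowest of several identical runs. This is a minimal sketch; spread_percent is a hypothetical name, and (max - min) / min is just one possible measure.

```shell
#!/bin/sh
# Hypothetical helper: read one build time per line on stdin and print
# the spread between slowest and fastest run as a whole percentage of
# the fastest run.
spread_percent() {
    awk '
        NR == 1 { min = $1 + 0; max = $1 + 0 }
        {
            v = $1 + 0
            if (v < min) min = v
            if (v > max) max = v
        }
        END { printf "%d\n", (max - min) * 100 / min }
    '
}

# Example: times in seconds from three identical builds.
# printf '100\n110\n120\n' | spread_percent
```

If this figure is larger than the differences you see between j values, those differences are not yet meaningful.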

Lastly, IMHO it’s better to use an (adaptive) jobserver if supported and available instead of a fixed j factor – it consistently provides a better build performance across wider ranges of contexts.
