
Recommended # of iterations when using PBKDF2-SHA256?

Solutions:


You should use the maximum number of rounds which is tolerable, performance-wise, in your application. The number of rounds is a slowdown factor, which you use on the basis that under normal usage conditions, such a slowdown has negligible impact for you (the user will not see it, the extra CPU cost does not imply buying a bigger server, and so on). This heavily depends on the operational context: what machines are involved, how many user authentications per second… so there is no one-size-fits-all response.

The big picture goes like this:

  • The time to verify a single password is v on your system. You can adjust this time by selecting the number of rounds in PBKDF2.
  • A potential attacker can gather f times more CPU power than you (e.g. you have a single server, and the attacker has 100 big PCs, each twice as fast as your server: this leads to f=200).
  • The average user has a password of entropy n bits (this means that trying to guess a user password, with a dictionary of “plausible passwords”, will take on average 2^(n-1) tries).
  • The attacker will find your system worth attacking if the average password can be cracked in time less than p (that’s the attacker’s “patience”).

Your goal is to make the average cost to break a single password exceed the attacker’s patience, so that he does not even try, and goes on to concentrate on another, easier target. With the notations detailed above, this means that you want:

v·2^(n-1) > f·p

p is beyond your control; it can be estimated with regard to the value of the data and systems protected by the user passwords. Let’s say that p is one month (if it takes more than one month, the attacker will not bother trying). You can make f smaller by buying a bigger server; on the other hand, the attacker will try to make f bigger by buying bigger machines. An aggravating point is that password cracking is an embarrassingly parallel task, so the attacker will get a large boost by using a GPU which supports general programming; a typical f will therefore still be on the order of a few hundred.

n relates to the quality of the passwords, which you can somehow influence through a strict password-selection policy, but realistically you will have a hard time getting a value of n beyond, say, 32 bits. If you try to enforce stronger passwords, users will begin to actively fight you, with workarounds such as reusing passwords from elsewhere, writing passwords on sticky notes, and so on.

So the remaining parameter is v. With f = 200 (an attacker with a dozen good GPUs), a patience of one month, and n = 32, you need v to be at least 241 milliseconds (note: I initially wrote “8 milliseconds” here, which is wrong; that is the figure for a patience of one day, not one month). So you should set the number of rounds in PBKDF2 such that computing it over a single password takes at least that much time on your server. You will still be able to verify four passwords per second with a single core, so the CPU impact is probably negligible(*). Actually, it is safer to use more rounds than that, because, let’s face it, getting 32 bits’ worth of entropy out of the average user password is a bit optimistic; on the other hand, not many attackers will devote dozens of PCs for one full month to the task of cracking a single password, so maybe an “attacker’s patience” of one day is more realistic, leading to a password verification cost of 8 milliseconds.
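To make the arithmetic concrete, here is a minimal sketch in Python that solves the inequality v·2^(n-1) > f·p for v, using the figures above (f = 200, n = 32, a patience of one month or one day) purely as assumptions:

    # Minimal sketch: solve v * 2^(n-1) > f * p for the verification time v.
    # The figures below (f = 200, n = 32 bits, p = one month or one day) are
    # the assumptions used in the text, not universal constants.

    def min_verification_time(f, n_bits, patience_seconds):
        """Smallest per-password verification time v (in seconds) that makes
        the average cracking cost exceed the attacker's patience."""
        return f * patience_seconds / 2 ** (n_bits - 1)

    month = 30 * 24 * 3600   # one month of patience, in seconds
    day = 24 * 3600          # one day of patience

    print(min_verification_time(200, 32, month))  # ~0.241 s, i.e. 241 ms
    print(min_verification_time(200, 32, day))    # ~0.008 s, i.e. 8 ms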

So you need to make a few benchmarks. Also, the above works as long as your PBKDF2/SHA-256 implementation is fast. For instance, if you use a fully C#/Java-based implementation, you will get a typical slowdown factor of 2 to 3 (compared to C or assembly) for CPU-intensive tasks; in the notation above, this is equivalent to multiplying f by 2 or 3. As a comparison baseline, a 2.4 GHz Core2 CPU can perform about 2.3 million elementary SHA-256 computations per second (with a single core), so this would imply, on that CPU, about 20,000 rounds to achieve the “8 milliseconds” goal.
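If you want to run that benchmark yourself, the sketch below times Python’s built-in hashlib.pbkdf2_hmac and scales the iteration count toward a target verification time. The 8 ms target and the 20,000-iteration starting point are taken from the figures above as assumptions, and a Python harness around a C-backed primitive only approximates what your real stack will do, so treat the output as a rough guide.

    import hashlib
    import os
    import time

    def time_pbkdf2(iterations, trials=5):
        """Average wall-clock time of one PBKDF2-HMAC-SHA256 derivation."""
        password = b"correct horse battery staple"
        salt = os.urandom(16)
        start = time.perf_counter()
        for _ in range(trials):
            hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
        return (time.perf_counter() - start) / trials

    target = 0.008        # the 8 ms goal discussed above (an assumption)
    iterations = 20_000   # starting guess, also taken from the text
    measured = time_pbkdf2(iterations)
    # PBKDF2 cost is linear in the iteration count, so scale proportionally.
    suggested = int(iterations * target / measured)
    print(f"{iterations} iterations take {measured * 1000:.1f} ms; "
          f"about {suggested} iterations would hit {target * 1000:.0f} ms")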


(*) Take care that making password verification more expensive also makes your server more vulnerable to Denial-of-Service attacks. You should apply some basic countermeasures, such as temporarily blacklisting client IP addresses that send too many requests per second. You need to do that anyway, to thwart online dictionary attacks.
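As a deliberately simplified illustration of that countermeasure (not a production rate limiter; the window and threshold are arbitrary assumptions), you can refuse attempts from an IP that has already made too many recent requests before doing any expensive PBKDF2 work:

    import time
    from collections import defaultdict

    WINDOW_SECONDS = 60            # assumed window
    MAX_ATTEMPTS_PER_WINDOW = 10   # assumed threshold

    _attempts = defaultdict(list)  # ip -> timestamps of recent attempts

    def allow_login_attempt(ip):
        """Return False if this IP should be temporarily refused."""
        now = time.monotonic()
        recent = [t for t in _attempts[ip] if now - t < WINDOW_SECONDS]
        _attempts[ip] = recent
        if len(recent) >= MAX_ATTEMPTS_PER_WINDOW:
            return False           # refuse before the expensive hash check
        recent.append(now)
        return True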

Run openssl speed on the command line to get an idea of how fast message digest functions are. I can calculate about 1.6 million SHA-256 hashes a second, or about 145 billion guesses per day, on my quad-core 2.2 GHz Sandy Bridge. If someone has a password that is in the English dictionary and it was hashed with a single round of SHA-256, then it would take longer to load the word list off the disk than to iterate over the list and break the hash. If you did PBKDF2-SHA256 with a few hundred thousand rounds, it would take a few minutes to break. Enforcing a strong password policy helps, a lot.
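As a back-of-the-envelope way to see how the round count changes that picture, the sketch below uses the 1.6 million SHA-256 hashes per second quoted above and assumes a word list of roughly 20,000 entries and a single core; a multi-core or GPU attacker divides the times accordingly.

    # Rough cost of trying an entire word list at different PBKDF2 round counts.
    # The hash rate is the single-core figure quoted above; the word-list size
    # is an assumption.
    sha256_per_second = 1_600_000
    wordlist_size = 20_000

    for rounds in (1, 1_000, 200_000):
        seconds = wordlist_size * rounds / sha256_per_second
        print(f"{rounds:>7} rounds: about {seconds:,.2f} s to try the whole list")
    # One round is effectively free; hundreds of thousands of rounds push the
    # cost into the tens of minutes per core.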

The real answer: How much time do you have to burn?

Thomas’ answer provides a helpful baseline model and data. But the goal that he posits doesn’t make sense to me. A typical attacker won’t know your iteration count until after actually hacking the site and grabbing the database of hashes. Having done that, they won’t move on just because you use a large iteration count. They’ll try to crack as many as they can, and quite possibly publicize the hashes so others will continue trying to crack them for years to come with more and more powerful hardware. So “p” and “f” will both continue to increase long after the hack.

Also, real user passwords aren’t well modeled by a complexity measure like 32 bits of entropy. The article Reusable Security: New Paper on Password Security Metrics is helpful in this regard, and documents what we’ve long known: many users pick easy-to-guess passwords, and there is a long tail. This also means attackers will always find some if they try hard enough.

I’d say that a more likely goal would be to protect as large a percentage of your users as possible from having their passwords cracked. For example, table 4.2.1 of the PDF shows that if you managed to limit your attacker during some attack campaign from an average of 1 million attempts per hash down to 500,000 attempts, you might protect the passwords of 5% of your users (assuming a mix with more 7-character passwords than 8+ character passwords, thus reducing the cracked percentage from 35% to 30%). Of course the exact shape of the curve, and where your users sit on it, will vary widely.

So I’d just go for the maximum iteration count that you can budget for, so long as it doesn’t delay real users doing normal logins. And you should increase the value as your compute capacity grows over the years.
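One practical way to follow that advice is to store the iteration count next to the salt and hash, and re-hash on a successful login whenever the stored count lags behind your current target. The sketch below is a minimal illustration using Python’s hashlib.pbkdf2_hmac; the record layout and the specific counts are assumptions, not a standard format.

    import hashlib
    import hmac
    import os

    def hash_password(password, iterations):
        """Return (iterations, salt, digest); store all three."""
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
        return iterations, salt, digest

    def verify_password(password, record, current_iterations):
        iterations, salt, digest = record
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
        if not hmac.compare_digest(candidate, digest):
            return False, record
        # Correct password: if the stored count is below today's target,
        # upgrade the record now, while the plaintext is available.
        if iterations < current_iterations:
            record = hash_password(password, current_iterations)
        return True, record

    record = hash_password("hunter2", 100_000)
    ok, record = verify_password("hunter2", record, 300_000)  # upgraded to 300,000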
