This attack is supposed to be presented 10 days from now, but my guess is that they use compression.
SSL/TLS optionally supports data compression. In the ClientHello message, the client states the list of compression algorithms that it knows of, and the server responds, in the ServerHello, with the compression algorithm that will be used. Compression algorithms are specified by one-byte identifiers, and TLS 1.2 (RFC 5246) defines only the null compression method (i.e. no compression at all). Other documents specify compression methods, in particular RFC 3749 which defines compression method 1, based on DEFLATE, the LZ77-derivative which is at the core of the GZip format and also modern Zip archives.
When compression is used, it is applied on all the transferred data, as a long stream. In particular, when used with HTTPS, compression is applied on all the successive HTTP requests in the stream, header included. DEFLATE works by locating repeated subsequences of bytes.
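The effect of repeated subsequences is easy to observe with Python's zlib module, which implements DEFLATE. This is only a minimal sketch, not TLS itself: the helper name deflate_len and the sample byte strings are made up for illustration, and a raw DEFLATE stream stands in for TLS compression method 1.

```python
import zlib

def deflate_len(data: bytes) -> int:
    """Length in bytes of `data` after raw DEFLATE compression."""
    # wbits=-15 requests a raw DEFLATE stream, without the zlib header and checksum.
    comp = zlib.compressobj(9, wbits=-15)
    return len(comp.compress(data) + comp.flush())

line = b"Cookie: secret=7xc89f+94/wa\r\n"

# Second half repeats the first: DEFLATE can encode it as one back-reference.
repeated = line + line
# Second half shares no bytes with the first: nothing can be reused.
unrelated = line + bytes(range(128, 128 + len(line)))

print(len(repeated), deflate_len(repeated))
print(len(unrelated), deflate_len(unrelated))
```

Both inputs have the same raw length, but the one with the repetition should come out noticeably smaller.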
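For this example, we suppose that the cookie in each HTTP request looks like this:
    Cookie: secret=7xc89f+94/wa
The attacker knows the “Cookie: secret=” part and wants to learn the secret value that follows it. He therefore makes the victim's browser send a request whose body contains “Cookie: secret=0”. The HTTP request will look like this: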
    POST / HTTP/1.1
    Host: thebankserver.com
    (...)
    Cookie: secret=7xc89f+94/wa
    (...)
    Cookie: secret=0
When DEFLATE sees that, it will recognize the repeated “Cookie: secret=” sequence and represent the second instance with a very short token (one which states “previous sequence has length 15 and was located n bytes in the past”); DEFLATE will have to emit an extra token for the ‘0’.
The request goes to the server. From the outside, the eavesdropping part of the attacker sees an opaque blob (SSL encrypts the data), but he can see the blob length. With RC4, the length is visible with byte granularity. With block ciphers there is a bit of padding, but the attacker can adjust the contents of his requests so that their length lines up with block boundaries; in practice, the attacker can therefore know the length of the compressed request.
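The alignment trick for block ciphers comes down to a little arithmetic. The sketch below assumes a simplified CBC record (compressed data, then a 20-byte MAC, then at least one byte of padding, rounded up to a 16-byte block, with record header and IV ignored) and assumes that each filler byte adds exactly one byte to the compressed length. The names record_len and find_filler and the sizes 50 and 51 are hypothetical.

```python
def record_len(compressed_len: int, mac_len: int = 20, block: int = 16) -> int:
    """Encrypted length of a simplified CBC record (header and IV not counted)."""
    # Data + MAC + at least one padding byte, rounded up to a whole block.
    unpadded = compressed_len + mac_len + 1
    return -(-unpadded // block) * block  # ceiling division

def find_filler(short_len: int, long_len: int, block: int = 16) -> int:
    """Smallest filler that makes the two compressed lengths encrypt to different sizes."""
    for filler in range(block):
        a = record_len(short_len + filler, block=block)
        b = record_len(long_len + filler, block=block)
        if a != b:
            return filler
    raise ValueError("the two lengths are equal, nothing to distinguish")

# Hypothetical compressed sizes: a good guess is one byte shorter than a bad one.
filler = find_filler(50, 51)
print(filler, record_len(50 + filler), record_len(51 + filler))  # 9 80 96
```

Without filler, both sizes would encrypt to 80 bytes; with 9 filler bytes, the one-byte difference pushes the longer request into an extra block.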
Now, the attacker tries again, with “Cookie: secret=1” in the request body. Then, “Cookie: secret=2”, and so on. All these requests will compress to the same size (almost; there are subtleties with Huffman codes as used in DEFLATE), except the one which contains “Cookie: secret=7”, which compresses better (16 bytes of repeated subsequence instead of 15), and thus will be shorter. The attacker sees that. Therefore, in a few dozen requests, the attacker has guessed the first byte of the secret value.
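The guessing step can be simulated locally with Python's zlib module. This is a sketch under stated assumptions, not the actual exploit: the request is reduced to three header lines, observed_len is a hypothetical stand-in for what the eavesdropper measures, and the compressor is forced to use fixed Huffman codes (zlib.Z_FIXED) so that the Huffman subtleties mentioned above do not blur the result.

```python
import string
import zlib

ALPHABET = string.ascii_letters + string.digits + "+/"  # Base64 characters

# Headers added by the browser, including the cookie the attacker cannot read.
HEADERS = (
    b"POST / HTTP/1.1\r\n"
    b"Host: thebankserver.com\r\n"
    b"Cookie: secret=7xc89f+94/wa\r\n"
    b"\r\n"
)

def observed_len(body: bytes) -> int:
    """Compressed size of headers + body: all that the eavesdropper learns."""
    # Raw DEFLATE with fixed Huffman codes (an assumption, to keep sizes predictable).
    comp = zlib.compressobj(9, wbits=-15, strategy=zlib.Z_FIXED)
    return len(comp.compress(HEADERS + body) + comp.flush())

# One request per candidate for the first character of the secret.
sizes = {ch: observed_len(b"Cookie: secret=" + ch.encode()) for ch in ALPHABET}
shortest = min(sizes.values())
best = [ch for ch, size in sizes.items() if size == shortest]
print(best)
```

Only the guess that extends the repeated sequence should end up in `best`; with dynamic Huffman codes the size differences are less clean and several guesses may tie.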
He then just has to repeat the process (“Cookie: secret=71”, and so on) and obtain, byte by byte, the complete secret.
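Repeating the process is a simple loop around the length measurement. The following self-contained sketch makes the same simplifying assumptions as a local simulation must: a toy three-line request, fixed Huffman codes via zlib.Z_FIXED, and a secret length known in advance. The names observed_len and recover_secret are illustrative only.

```python
import string
import zlib

ALPHABET = string.ascii_letters + string.digits + "+/"  # Base64 characters
PREFIX = b"Cookie: secret="
HEADERS = (
    b"POST / HTTP/1.1\r\n"
    b"Host: thebankserver.com\r\n"
    b"Cookie: secret=7xc89f+94/wa\r\n"
    b"\r\n"
)

def observed_len(body: bytes) -> int:
    """Compressed size of headers + body, as seen by the eavesdropper."""
    comp = zlib.compressobj(9, wbits=-15, strategy=zlib.Z_FIXED)
    return len(comp.compress(HEADERS + body) + comp.flush())

def recover_secret(length: int) -> str:
    """Guess the secret one character at a time from compressed sizes alone."""
    known = ""
    for _ in range(length):
        sizes = {ch: observed_len(PREFIX + (known + ch).encode()) for ch in ALPHABET}
        shortest = min(sizes.values())
        winners = [ch for ch, size in sizes.items() if size == shortest]
        if len(winners) != 1:
            break  # ambiguous sizes: a real attack would vary the request and retry
        known += winners[0]
    return known

print(recover_secret(12))
```

The loop never reads the cookie directly; it only compares the sizes returned by observed_len, which is the information available to the attacker.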
What I describe above is what I thought of when I read the article, which talks about “information leak” from an “optional feature”. I cannot know for sure that what will be published as the CRIME attack is really based upon compression. However, I do not see how the attack on compression could fail to work. Therefore, regardless of whether CRIME turns out to abuse compression or be something completely different, you should turn off compression support in your client (or your server).
Note that I am talking about compression at the SSL level. HTTP also includes optional compression, but this one applies only to the body of the requests and responses, not the header, and thus does not cover the “Cookie:” header line. HTTP-level compression is fine.
Edit 2012/09/12: The attack above can be optimized a bit by doing a dichotomy. Imagine that the secret value is in Base64, i.e. there are 64 possible values for each unknown character. The attacker can make a request containing 32 copies of “Cookie: secret=X” (for 32 variants of the X character). If one of them matches the actual cookie, the total compressed length will be shorter than otherwise. Once the attacker knows which half of his alphabet the unknown byte is part of, he can try again with a 16/16 split, and so on. In 6 requests, this homes in on the unknown byte value (because 2^6 = 64). If the secret value is in hexadecimal, the 6 requests become 4 requests (2^4 = 16). Dichotomy explains this recent tweet of Juliano Rizzo.
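The request count of the dichotomy can be checked with a short sketch. Here the compression side channel is replaced by an idealized oracle that simply reports whether any guess in a batch is correct; in the attack, that answer would have to be inferred from the compressed length. The names bisect_char and oracle are hypothetical.

```python
import string
from typing import Callable, List, Tuple

def bisect_char(alphabet: str, any_match: Callable[[List[str]], bool]) -> Tuple[str, int]:
    """Find the unknown character by halving the candidate set on each request."""
    candidates = list(alphabet)
    requests = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        requests += 1
        # One request carries every guess in `half`; a shorter result means a hit.
        if any_match(half):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0], requests

# Idealized oracle (an assumption): stands in for comparing compressed lengths.
secret_char = "7"
def oracle(guesses: List[str]) -> bool:
    return secret_char in guesses

base64_alphabet = string.ascii_letters + string.digits + "+/"
print(bisect_char(base64_alphabet, oracle))     # ('7', 6)
print(bisect_char("0123456789abcdef", oracle))  # ('7', 4)
```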
Edit 2012/09/13: IT IS CONFIRMED. The CRIME attack abuses compression, in a way similar to what is explained above. The actual “body” in which the attacker inserts presumed copies of the cookie can actually be the path in a simple request which can be triggered by a most basic <img> tag; no need for fancy exploits of the same-origin-policy.
To add to Thomas Pornin’s outstanding answer, I wanted to point out some prior work on the subject of compression and cryptography. Take a look at the following research paper:
- John Kelsey. Compression and Information Leakage of Plaintext. FSE 2002.
That paper describes chosen-plaintext attacks against systems that (a) compress data before encrypting it, and (b) let an eavesdropper observe the length of the resulting ciphertexts.
The attacks are conceptually similar, at least vaguely, to what Thomas Pornin describes. The paper even mentions that TLS uses optional compression before encryption. However, at the time I don’t think anyone realized that this enables an attack on HTTP over TLS, or that an attacker could learn the value of secret cookies sent over a TLS-encrypted connection. The paper looks at attacks on compression mainly in the abstract, rather than in the specific context of the web, and is pretty theoretical. So, CRIME (or Thomas Pornin’s attack) is still a significant novel extension of these ideas.
Nonetheless, this is an interesting paper that anticipates the general sort of attack at issue here, even if it did not realize the consequences for web security. It is interesting that the general sort of issue was first described in the research literature 10 years ago, yet it took that long for the security community to fully appreciate the practical consequences of this work. Crypto sure ain’t easy, is it?
Just to add to Thomas's great answer: it seems that, to successfully leak the cookie value, the actual POST body sent should contain not only the guessed Cookie line but also much more text from the POST header, like so:
    POST / HTTP/1.1
    Host: thebankserver.com
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1
    Accept: */*
    Referer: https://thebankserver.com/
    Cookie: secret=...
The attacker already knows these header values (they can be read, for example, from the navigator object), so that is not a problem in the attack scenario.
In practice, the attacker can mutate the body even more, e.g. by including the cookie value multiple times, adding multiple Cookie headers, or using only parts of the POST headers in the body.
Based on @xorninja's code, I’ve constructed an adaptive algorithm that duplicates the whole request header in the body and tries to shorten it iteratively if the results for the next cookie character are unclear.
Results are promising: at least 8 characters are detected now. When no character can be detected this way, the request body is shortened by removing one header and the process continues. It can successfully leak arbitrary cookie values. Feel free to improve.