Ben Laurie blathering

Debian and OpenSSL: The Aftermath

There have been an astonishing number of comments on my post about the Debian OpenSSL debacle; clearly this is a subject people have strong feelings about. But some of the points raised need addressing, so here we go.

Firstly, many, many people seem to think that I am opposed to removing the use of uninitialised memory. I am not. As has been pointed out, this leads to undefined behaviour – and whilst that’s probably not a real issue given the current state of compiler technology, I can certainly believe in a future where compilers are clever enough to work out that on some calls the memory is not initialised and take action that might be unfortunate. I would also note in passing that my copy of K&R (second edition) does not discuss this issue, and ISO/IEC 9899, which some have quoted in support, rather post-dates the code in OpenSSL. To be clear, I am now in favour of addressing this issue correctly.

And this leads me to the second point. Many people seem to be confused about what change was actually made. There were, in fact, two changes. The first concerned a function called ssleay_rand_add(). As a developer using OpenSSL you would never call this function directly, but it is usually (unless a custom PRNG has been substituted, as happens in FIPS mode, for example) called indirectly via RAND_add(). This call is the only way entropy can be added to the PRNG’s pool. OpenSSL calls RAND_add() on buffers that may not have been initialised in a couple of places, and this is the cause of the valgrind warnings. However, rather than fix the calls to RAND_add(), the Debian maintainer instead removed the code that added the buffer handed to ssleay_rand_add() to the pool. This meant that the pool ended up with essentially no entropy. Clearly this was a very bad idea.
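
To make the difference between the two possible fixes concrete, here is a minimal sketch in C. It is a toy model, not OpenSSL code: toy_pool, toy_pool_add(), toy_pool_add_patched() and toy_caller_fixed() are hypothetical names invented for illustration, and the mixing step is a trivial stand-in for the real hash. It only shows that removing the mixing inside the add function discards every caller's input, whereas fixing the one offending caller leaves the pool intact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the PRNG pool. Hypothetical; not OpenSSL's real structure. */
typedef struct {
    uint64_t state;       /* mixed pool contents */
    size_t   mixed_bytes; /* how many input bytes actually reached the pool */
} toy_pool;

void toy_pool_init(toy_pool *p) {
    assert(p != NULL);
    p->state = 14695981039346656037ULL; /* arbitrary non-zero starting value */
    p->mixed_bytes = 0;
}

/* Analogue of ssleay_rand_add(): the only route by which input reaches the pool. */
void toy_pool_add(toy_pool *p, const void *buf, size_t len) {
    const unsigned char *b = buf;
    assert(p != NULL && buf != NULL);
    for (size_t i = 0; i < len; i++) {
        /* Toy mixing step; the real code uses a cryptographic hash. */
        p->state = (p->state ^ b[i]) * 1099511628211ULL;
    }
    p->mixed_bytes += len;
}

/* Analogue of the patched function: the mixing is gone, so every caller's
 * input is silently dropped, including genuine entropy. */
void toy_pool_add_patched(toy_pool *p, const void *buf, size_t len) {
    (void)p;
    (void)buf;
    (void)len;
}

/* The narrower fix: leave the pool alone and make the one offending caller
 * pass defined bytes instead of an uninitialised buffer. */
void toy_caller_fixed(toy_pool *p) {
    unsigned char scratch[16] = {0}; /* initialised, so nothing to warn about */
    toy_pool_add(p, scratch, sizeof scratch);
}
```

With the patched variant the pool never changes, however much seed material callers hand it; with the caller-side fix, everything else that feeds the pool keeps working.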

The second change was in ssleay_rand_bytes(), a function that extracts randomness from the pool into a buffer. Again, applications would access this via RAND_bytes() rather than directly. In this function, the contents of the buffer before it is filled are added to the pool. Once more, this could be uninitialised. The Debian developer also removed this call, and that is fine.
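
The shape of this second case can be sketched as follows. Again this is a toy model with hypothetical names (out_pool, out_pool_bytes(), out_step()), not the OpenSSL source. The assumption it illustrates is that the bonus mixing of the caller's buffer sits behind the PURIFY macro, so building with -DPURIFY drops only that optional step and leaves the seeding path untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy pool for the extraction side. Hypothetical; not OpenSSL's structure. */
typedef struct {
    uint64_t state;
    size_t   bonus_bytes; /* caller-buffer bytes mixed in as a bonus */
} out_pool;

/* Toy mixing step; the real code uses a cryptographic hash. */
static uint64_t out_step(uint64_t s, unsigned char b) {
    return (s ^ b) * 1099511628211ULL;
}

/* Analogue of ssleay_rand_bytes(): fills `out` with `len` bytes from the pool. */
void out_pool_bytes(out_pool *p, unsigned char *out, size_t len) {
    assert(p != NULL && out != NULL);
#ifndef PURIFY
    /* Bonus only: whatever happened to be in the caller's buffer is mixed in
     * before it is overwritten. Compiling with -DPURIFY removes this read. */
    for (size_t i = 0; i < len; i++) {
        p->state = out_step(p->state, out[i]);
    }
    p->bonus_bytes += len;
#endif
    /* The output itself depends only on the pool, guarded or not. */
    for (size_t i = 0; i < len; i++) {
        p->state = out_step(p->state, (unsigned char)i);
        out[i] = (unsigned char)(p->state >> 56);
    }
}
```

Removing the guarded read costs nothing that the pool was relying on, which is why this half of the change is harmless.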

The third point: several people have come to the conclusion that OpenSSL relies on uninitialised memory for entropy. This is not so. OpenSSL gets its entropy from a variety of platform-dependent sources. Uninitialised memory is merely a bonus source of potential entropy, and is not counted as “real” entropy.
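
The distinction between mixing data in and counting it as entropy can be sketched like this. It is a toy model under stated assumptions: acct_pool, acct_add(), acct_seeded() and the 32-byte threshold are all hypothetical, loosely modelled on the entropy estimate that RAND_add() takes as its third argument. Bonus material is mixed with an estimate of zero, so it can never make the pool look seeded on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACCT_ENTROPY_NEEDED 32.0 /* hypothetical threshold, in bytes */

/* Toy pool that tracks credited entropy. Hypothetical; not OpenSSL's structure. */
typedef struct {
    uint64_t state;
    double   credited; /* entropy the pool has been credited with, in bytes */
} acct_pool;

/* Mix `len` bytes into the pool and credit `entropy` bytes of real entropy.
 * Bonus sources pass entropy = 0.0: they are mixed in but never counted. */
void acct_add(acct_pool *p, const void *buf, size_t len, double entropy) {
    const unsigned char *b = buf;
    assert(p != NULL && buf != NULL && entropy >= 0.0);
    for (size_t i = 0; i < len; i++) {
        /* Toy mixing step; the real code uses a cryptographic hash. */
        p->state = (p->state ^ b[i]) * 1099511628211ULL;
    }
    p->credited += entropy;
}

/* The pool counts as seeded only once enough real entropy has been credited. */
int acct_seeded(const acct_pool *p) {
    assert(p != NULL);
    return p->credited >= ACCT_ENTROPY_NEEDED;
}
```

Feeding this pool any amount of zero-credit data leaves acct_seeded() false; only a source that claims real entropy moves it over the threshold.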

Fourthly, I said in my original post that if the Debian maintainer had asked the developers, then we would have advised against such a change. About 50% of the comments on my post point to this conversation on the openssl-dev mailing list. In this thread, the Debian maintainer states his intention to remove for debugging purposes a couple of lines that are “adding an unintialiased buffer to the pool”. In fact, the first line he quotes is the first one I described above, i.e. the only route to adding anything to the pool. Two OpenSSL developers responded, the first saying “use -DPURIFY” and the second saying “if it helps with debugging, I’m in favor of removing them”. Had they been inspired to check carefully what these lines of code actually were, rather than believing the description, then they would, indeed, have noticed the problem and said something, I am sure. But their response can hardly be taken as unconditional endorsement of the change.

Fifthly, I said that openssl-dev was not the way to ensure you had the attention of the OpenSSL team. Many have pointed out that the website says it is the place to discuss the development of OpenSSL, and this is true, it is what it says. But it is wrong. The reality is that the list is used to discuss application development questions and is not reliably read by the development team.

Sixthly, my objection to the fix Debian put in place has been misunderstood. The issue is not that they did not fully reverse their previous patch – as I say above, the second removal is actually fine. My issue is that it was committed to a public repository five days before an advisory was issued. Only a single attacker has to notice that and realise its import in order to start exploiting vulnerable systems – and I will be surprised if that has not happened.

I think that’s about enough clarification. The question is: what should we do to avoid this happening again? Firstly, if package maintainers think they are fixing a bug, then they should try to get it fixed upstream, not fix it locally. Had that been done in this case, there is no doubt none of this would have happened. Secondly, it seems clear that we (the OpenSSL team) need to find a way that people can reliably communicate with us in these kinds of cases.

The problem with the second is that there are a lot of people who think we should assist them, and OpenSSL is spectacularly underfunded compared to most other open source projects of its importance. No-one that I am aware of is paid by their employer to work full-time on it. Despite the widespread use of OpenSSL, almost no-one funds development on it. And, indeed, many commercial companies who absolutely depend on it refuse to even acknowledge publicly that they use it, despite the requirements of the licence, let alone contribute towards it in any way.

I welcome any suggestions to improve this situation.

Incidentally, some of the comments are not exactly what I would consider appropriate, and there’s a lot of repetition. I moderate comments on my blog, but only to remove spam (and the occasional cockup, such as people posting twice, not realising they are being moderated). I do not censor the comments, so don’t blame me for their content!


  1. Somebody ought to tell developers about not trying to be the proverbial apprentice that messes with crypto software. It’s way too easy to break. And the rest of the world feels the consequences. Best solution is not to follow those (Linux) distributions that add their own patches to things like OpenSSL, PGP, … and/or pull crypto components directly from the source.

    As for Debian taking the blame: they made the bad change, nobody else did. If they didn’t find a channel to get authoritative answers, that’s their problem. You cannot expect the very few crypto experts there are on this planet to be reachable for every wannabe.

    Crypto software is different!

    FD: Not affiliated with OpenSSL, just sane enough to know where I can mess and where it’s dangerous.

    Comment by anonymous coward — 20 May 2008 @ 16:59

  2. Many people have stated that the problem is that OpenSSL reads uninitialized memory. But isn’t it the case (I haven’t read them, so correct me if I’m wrong) that OpenSSL’s docs clearly state that the buffer will be read and mixed into the entropy pool, and so responsibility for doing something about the whole uninitialized-memory-access problem is on the caller?

    Comment by Robert Obryk — 26 May 2008 @ 21:22

  3. Firstly, a Valgrind warning can hardly be regarded as a bug that needs to be fixed; claiming that it is one reveals a failure to understand that the warning is harmless. I don’t want someone with that level of misunderstanding of the code (somehow managing to miss the big red #ifdef PURIFY hints) near any crypto library I’ll be using in production code. Lesson learned: Debian can’t be trusted.

    The SSL team can’t be blamed for this fuckup, the blame for making the change, and shipping it, lies squarely with the *person who made the change*.

    Stock upstream OpenSSL is secure for this particular issue, and Debian OpenSSL is (was) not.

    You can’t whine that upstream didn’t respond to you in time. Upstream isn’t a commercial organization, do you have an SLA or support contract with them? No? Well then, you introduced this fuckup, you carry the can for it. You can’t shift blame whatsoever.

    You fucked up, admit it, eat humble pie, and maybe consider changing the way you fork almost everyone’s software to “improve” it.

    Comment by James Wu — 27 May 2008 @ 23:08

  4. […] security flaw was introduced when some poorly informed developer (see here for lots of info, also here) modified a grand total of two (2) lines of codes. (and if you’re curious how this poor […]

    Pingback by Overheard In Providence » Blog Archive » By chance, flawed — 28 May 2008 @ 4:20

  5. For what it’s worth, I am paid to work — part-time — on OpenSSL in my company’s product. I’d be glad to have some of that work go back into OpenSSL; this is how most people are “paid to work on” most open-source projects.

    Generally speaking I have found attempts to propose any kind of nontrivial change to OpenSSL very frustrating. Either they’re ignored by the developers (e.g. one poor soul’s *two years* of attempts to get the developers’ acceptance of a change, with several iterations of a patch, to fix, compatibly, an API botch in shutdown for nonblocking sockets) or if one goes to quite a bit of effort to attract the developers’ attention, large, baroque alternative solutions are proposed but not, ultimately, checked in; so stasis is maintained in the situation where corporations who need to customize OpenSSL for use in their products keep their changes privately even if they _want_ to feed them back.

    I was interested — and then dismayed — to see a sudden burst of email activity from the OpenSSL developers over the past weeks bashing the Debian developers (justly, I should add) and various others who got caught in the crossfire (perhaps justly, perhaps not). I was interested because, frankly, this is the most activity from the OpenSSL developers *on their own mailing lists* I’ve seen in years of observation. Propose a patch to fix a problem that makes the non-blocking version of the API unusable without incurring a low-probability race condition? Expect to wait months or years with no developer feedback. Give up in frustration and don’t try to discuss your changes with the OpenSSL team? Well, if you *break* OpenSSL, expect to hear plenty from the same people whose attention you couldn’t get to save your life before.

    Given this I think it is really really unreasonable to complain that companies use OpenSSL but don’t contribute their work back. Contributing work back to OpenSSL is extremely difficult because of the poor communication around most if not all aspects of the project for outsiders. If that could be fixed, a lot of resulting problems might get much better.

    Comment by Tls — 30 May 2008 @ 20:12

  6. I welcome any suggestions to improve this situation.

    How about improving the code quality? Any time you do something unusual, like deliberately using uninitialized data, it ought to be commented to explain why you’re doing it.

    If the code had been commented adequately, perhaps the poor sap trying to maintain it wouldn’t have removed the critical lines. (I note that as it stands, the code doesn’t even bother to say what ssleay_rand_add is supposed to do or what its parameters represent.)

    Comment by mathew — 12 Jul 2008 @ 0:32
