There have been an astonishing number of comments on my post about the Debian OpenSSL debacle, clearly this is a subject people have strong feelings about. But there are some points raised that need addressing, so here we go.
Firstly, many, many people seem to think that I am opposed to removing the use of uninitialised memory. I am not. As has been pointed out, this leads to undefined behaviour – and whilst that’s probably not a real issue given the current state of compiler technology, I can certainly believe in a future where compilers are clever enough to work out that on some calls the memory is not initialised and take action that might be unfortunate. I would also note in passing that my copy of K&R (second edition) does not discuss this issue, and ISO/IEC 9899, which some have quoted in support, rather post-dates the code in OpenSSL. To be clear, I am now in favour of addressing this issue correctly.
And this leads me to the second point. Many people seem to be confused about what change was actually made. There were, in fact, two changes. The first concerned a function called
ssleay_rand_add(). As a developer using OpenSSL you would never call this function directly, but it is usually (unless a custom PRNG has been substituted, as happens in FIPS mode, for example) called indirectly via
RAND_add(). This call is the only way entropy can be added to the PRNG’s pool. OpenSSL calls
RAND_add() on buffers that may not have been initialised in a couple of places, and this is the cause of the valgrind warnings. However, rather than fix the calls to
RAND_add(), the Debian maintainer instead removed the code that added the buffer handed to
ssleay_rand_add() to the pool. This meant that the pool ended up with essentially no entropy. Clearly this was a very bad idea.
The second change was in
ssleay_rand_bytes(), a function that extracts randomness from the pool into a buffer. Again, applications would access this via
RAND_bytes() rather than directly. In this function, the contents of the buffer before it is filled are added to the pool. Once more, this could be uninitialised. The Debian developer also removed this call, and that is fine.
The third point: several people have come to the conclusion that OpenSSL relies on uninitialised memory for entropy. This is not so. OpenSSL gets its entropy from a variety of platform-dependent sources. Uninitialised memory is merely a bonus source of potential entropy, and is not counted as “real” entropy.
Fourthly, I said in my original post that if the Debian maintainer had asked the developers, then we would have advised against such a change. About 50% of the comments on my post point to this conversation on the openssl-dev mailing list. In this thread, the Debian maintainer states his intention to remove for debugging purposes a couple of lines that are “adding an unintialiased buffer to the pool”. In fact, the first line he quotes is the first one I described above, i.e. the only route to adding anything to the pool. Two OpenSSL developers responded, the first saying “use -DPURIFY” and the second saying “if it helps with debugging, I’m in favor of removing them”. Had they been inspired to check carefully what these lines of code actually were, rather than believing the description, then they would, indeed, have noticed the problem and said something, I am sure. But their response can hardly be taken as unconditional endorsement of the change.
Fifthly, I said that openssl-dev was not the way to ensure you had the attention of the OpenSSL team. Many have pointed out that the website says it is the place to discuss the development of OpenSSL, and this is true, it is what it says. But it is wrong. The reality is that the list is used to discuss application development questions and is not reliably read by the development team.
Sixthly, my objection to the fix Debian put in place has been misunderstood. The issue is not that they did not fully reverse their previous patch – as I say above, the second removal is actually fine. My issue is that it was committed to a public repository five days before an advisory was issued. Only a single attacker has to notice that and realise its import in order to start exploiting vulnerable systems – and I will be surprised if that has not happened.
I think that’s about enough clarification. The question is: what should we do to avoid this happening again? Firstly, if package maintainers think they are fixing a bug, then they should try to get it fixed upstream, not fix it locally. Had that been done in this case, there is no doubt none of this would have happened. Secondly, it seems clear that we (the OpenSSL team) need to find a way that people can reliably communicate with us in these kinds of cases.
The problem with the second is that there are a lot of people who think we should assist them, and OpenSSL is spectacularly underfunded compared to most other open source projects of its importance. No-one that I am aware of is paid by their employer to work full-time on it. Despite the widespread use of OpenSSL, almost no-one funds development on it. And, indeed, many commercial companies who absolutely depend on it refuse to even acknowledge publicly that they use it, despite the requirements of the licence, let alone contribute towards it in any way.
I welcome any suggestions to improve this situation.
Incidentally, some of the comments are not exactly what I would consider appropriate, and there’s a lot of repetition. I moderate comments on my blog, but only to remove spam (and the occasional cockup, such as people posting twice, not realising they are being moderated). I do not censor the comments, so don’t blame me for their content!