Ben Laurie blathering

10 Jan 2009

Jabber Pain

Filed under: Programming,Troubleshooting — Ben @ 17:59

For a while, it's been apparent to me that Jabber was occasionally dropping messages. Last week I finally got annoyed enough to investigate it in earnest.

Unfortunately, I started off on entirely the wrong track, and blamed GTalk (sorry, guys!) – but much investigation later, with help from some very patient friends (you know who you are: thanks!), I found that it was my own Jabber server that was to blame.

However, it was not an easy journey. First of all, how do you tell messages are being dropped? I am pretty certain my server has been dropping messages since before Christmas – i.e. at least four weeks, and I am fairly certain it has been doing it ever since I first built it – which must be a year or two now. Could it be that it could drop messages for that long and no-one noticed? It seems to me, in retrospect, that it could! A wise friend of mine once said, “you know, 90% of what we say to each other could be completely different and it would make no difference”. This is even more true for IM. We send messages out. Sometimes we get answers. Sometimes we don’t. If we don’t, well, the other guy was away, or not interested, or got busy and forgot to respond. It’s fine. It was probably one of the 90%. When it’s one of the 10%, well, then we say it again. And this time we get an answer, and we’re both happy. So you can go on for years and not notice that stuff is missing.

It wasn’t until I started badgering my friends to tell me when they thought messages were going missing that it became clear that they were, indeed. And not just a few – a lot! I now know that it was dropping about 50% of incoming messages (i.e. messages sent to me) and no outgoing messages. God knows what kind of rude bastard my friends think I am by now! An interesting feature is that it would drop them in batches – i.e. drop for 5 minutes, forward for 5 minutes, drop for 5 minutes and so on. If it had been every second message it would have been apparent sooner, I suspect, because the conversation would be quite choppy.

But even knowing that messages were being dropped was not the end of the story. How do you figure out what is to blame? In the typical scenario, because I run my own server, there are at least 3 connections and 4 pieces of software that could be at fault.

  1. The other guy’s client,
  2. the connection from that to their server,
  3. their server,
  4. the connection from their server to my server,
  5. my server,
  6. the connection from my server to my client,
  7. my client

As I said above, I started at the wrong end – with GTalk. With some help, it became apparent that GTalk was unlikely to be to blame (and because it was upstream from the other guy’s client, we could eliminate that, too). So the next easiest target to look at was my server – which I did, with the help of tcpdump and Wireshark, though investigation was complicated by both OTR and SSL, which make it very hard to interpret and track messages. Luckily the server-to-server connection was in plain text (which is one reason I use OTR), so it could be done, with difficulty – and since it turned out that my jabber daemon was the culprit, I could see messages coming in in the traces with no corresponding activity on the server-to-client connection. Sometimes.

To cut a long story short, after much poking at my existing jabber server, which was jabberd14, I decided to replace it with jabberd2. But before I did that, I wanted to be really sure that jabberd14 was to blame, and that jabberd2 would fix it. So, I wanted the Jabber equivalent of ping. To my amazement, there appears to be no such thing! There is a Jabber ping extension but I can’t find anything that uses it. Which is the final reason I am writing this blog post – I wrote a pair of scripts that will do a Jabber ping test; feel free to use them. And if you are using jabberd14, I’d really like to know if you, too, get message drops…

I was planning to make them count and produce statistics and such, but I got lazy. Since you can see both ends, eyeballing them is enough to let you know what’s going on – Ping does count how many it got back, though, so you can leave it running without watching it all the time. To run them, you need two Jabber accounts, one on the suspect server and the other elsewhere. You can run them like this:

./ account1 password1
./ account2 password2

Pong will actually answer multiple Pings running simultaneously. Ping pings every 10 seconds. Output should be reasonably obvious. Because Jabber does store-and-forward, Ping will ignore Pongs from a different session. And because they use different resources, you can use the same account at both ends, if you want. Like I say, I’d be really interested to hear from anyone that experiences drops – a couple of hundred pings was always enough to show them when I was testing.

Oh yeah, and the good news: jabberd2 has now answered over 500 pings without a single drop. So, if you felt ignored, I hope things will improve!

7 Jan 2009

Yet Another Serious Bug That’s Been Around Forever

Filed under: Crypto,Open Source,Programming,Rants,Security — Ben @ 17:13

Late last year the Google Security Team found a bug in OpenSSL that’s been there, well, forever. That is, nearly 10 years in OpenSSL and, I should think, for as long as SSLeay existed, too. This bug means that anyone can trivially fake DSA and ECDSA signatures, which is pretty damn serious. What’s even worse, numerous other packages copied (or independently invented) the same bug.

I’m not sure what to say about this, except to reiterate that it seems people just aren’t very good at writing or reviewing security-sensitive code. It seems to me that we need better static analysis tools to catch this kind of obvious error – and so I should bemoan, once more, that there’s really no-one working seriously on static analysis in the open source world, which is a great shame. I’ve even offered to pay real money to anyone (credible) that wants to work in this area, and still, nothing. The closed source tools aren’t that great, either – OpenSSL is using Coverity’s free-for-open-source service, and it gets a lot of false positives. And it didn’t find this rather obvious (and, obviously statically analysable) bug.
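The post doesn’t say exactly what the coding error was, but bugs in this family are very often a mishandled tri-state return value. Here is a toy sketch in Python; the `verify` function is hypothetical, and its convention of 1 = valid, 0 = invalid, -1 = error is an assumption borrowed from common C library style, not a claim about OpenSSL’s actual code:

```python
# Toy verifier: 1 = signature valid, 0 = invalid, -1 = processing error.
def verify(signature):
    if signature == "valid":
        return 1
    if signature == "malformed":
        return -1  # e.g. the signature could not even be parsed
    return 0

def buggy_check(signature):
    # WRONG: treats any non-zero result as success, so a malformed
    # signature (result -1) is accepted as genuine.
    return verify(signature) != 0

def correct_check(signature):
    # RIGHT: only an explicit 1 means the signature verified.
    return verify(signature) == 1

assert correct_check("valid")
assert not correct_check("malformed")
assert buggy_check("malformed")  # the forgery slips through
```

A static analyser that tracked the full range of a function’s return values could flag the buggy comparison mechanically, which is precisely the kind of tooling whose absence is bemoaned above.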

Oh, I should also say that we (that is, the OpenSSL Team) worked with oCERT for the first time on coordinating a response with other affected packages. It was a very easy and pleasant experience, I recommend them highly.

29 Oct 2008

Yahoo, Caja, OpenSocial

Filed under: Caja,Capabilities,Open Source,Programming,Security — Ben @ 13:01

I’m very excited that Yahoo! have launched their gadget platforms, including an OpenSocial platform. Why am I excited? Because Yahoo! require all gadgets to use Caja so that they can be sure the gadgets behave themselves without review. Caja allows the container (i.e. Yahoo!’s platform, in this case) to completely confine the untrusted Javascript (i.e. the gadget, in this case), only allowing it to perform “safe” operations. All other platforms either have to manually review gadgets or take the risk that the gadgets will do something evil to their users.

19 Oct 2008


Filed under: Crypto,Open Source,Programming,Security — Ben @ 19:12

Just for fun, I wrote a demo implementation of J-PAKE in C, using OpenSSL for the crypto, of course. I’ve pushed it into the OpenSSL CVS tree; you can find it in demos/jpake. For your convenience, there’s also a copy here.

I’ve tried to write the code so the data structures reflect the way a real implementation would work, so there’s a structure representing what each end of the connection knows (JPakeUser), one for the zero-knowledge proofs (JPakeZKP) and one for each step of the protocol (JPakeStep1 and JPakeStep2). Normally there should be a third step, where each end proves knowledge of the shared key (for example, by Alice sending Bob H(H(K)) and Bob sending Alice H(K)), since differing secrets do not break any of the earlier steps, but because both ends are in the same code I just compare the resulting keys.
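The omitted third step is simple enough to sketch. A minimal Python illustration of the confirmation flow described above, assuming SHA-256 as H and ignoring how K was actually derived:

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

# After a successful J-PAKE run, both sides hold the same key material.
K_alice = K_bob = b"shared key material"

# Alice sends H(H(K)); Bob checks it against his own key...
alice_msg = H(H(K_alice))
assert alice_msg == H(H(K_bob))

# ...then Bob replies with H(K), which Alice checks in turn.
bob_msg = H(K_bob)
assert bob_msg == H(K_alice)
```

Note the asymmetry: because Alice sends H(H(K)) and Bob sends H(K), neither side can simply echo the other’s confirmation message back.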

The code also implements the protocol steps in a modular way, except that communications happen by magic. This will get cleaned up when I implement J-PAKE as a proper OpenSSL library component.

The cryptographic implementation differs from the Java demo (which I used for inspiration) in a few ways. I think only one of them really matters: the calculation of the hash for the Schnorr signature used in the zero-knowledge proofs – the Java implementation simply concatenates a byte representation of the various parameters. This is a security flaw, as it can be subjected to a “moving goalposts” attack. That is, the attacker could use parameters that gave the same byte representation, but with different boundaries between the parameters. I avoid this attack by including a length before each parameter. Note that I do not claim this attack is feasible, but why gamble? It worked on PGP, after all.
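The attack is easy to demonstrate concretely. In this Python sketch (SHA-256 and a 4-byte length prefix are illustrative choices, not the actual encoding used in the demo), two different parameter lists concatenate to identical bytes, so the naive hash cannot distinguish them, while the length-prefixed hash can:

```python
import hashlib

def naive_hash(params):
    # Concatenate the raw byte representations: ambiguous.
    h = hashlib.sha256()
    for p in params:
        h.update(p)
    return h.hexdigest()

def length_prefixed_hash(params):
    # Prefix each parameter with its length, fixing the boundaries.
    h = hashlib.sha256()
    for p in params:
        h.update(len(p).to_bytes(4, "big"))
        h.update(p)
    return h.hexdigest()

a = [b"\x12\x34", b"\x56"]  # same bytes, different boundaries
b = [b"\x12", b"\x34\x56"]
assert naive_hash(a) == naive_hash(b)                      # goalposts moved
assert length_prefixed_hash(a) != length_prefixed_hash(b)  # attack blocked
```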

The code and data structures are completely different, though. Also, because of the cryptographic difference, the two implementations would not interoperate.

19 Jul 2008

Caja Security Review

Filed under: Programming,Security — Ben @ 16:00

A few weeks ago, we invited a group of external security experts to come and spend a week trying to break Caja. As we expected, they did. Quite often. In fact, I believe a team member calculated that they filed a new issue every 5 minutes throughout the week.

The good news, though, was that nothing they found was too hard to fix. Also, their criticism has led to some rethinking about some aspects of our approach which we hope will make the next security review easier as well as Caja more robust.

You can read a summary of their findings.

22 May 2008

Preprint: Access Control

Filed under: Capabilities,Programming,Security — Ben @ 17:07

I have three currently unpublished papers that may be of interest. This one has been submitted but not yet accepted. As you can guess from the title, it’s about access control, particularly in the area of mashups, gadgets and web applications.

This is the introduction:

Access control is central to computer security. Traditionally, we wish to restrict the user to exactly what he should be able to do, no more and no less.

You might think that this only applies to legitimate users: where do attackers fit into this worldview? Of course, an attacker is a user whose access should be limited just like any other. Increasingly, of course, computers expose services that are available to anyone — in other words, anyone can be a legitimate user.

As well as users there are also programs we would like to control. For example, the program that keeps the clock correctly set on my machine should be allowed to set the clock and talk to other time-keeping programs on the Internet, and probably nothing else\footnote{Perhaps it should also be allowed a little long-term storage, for example to keep its calculation of the drift of the native clock.}.

Increasingly we are moving towards an environment where users choose what is installed on their machines, where their trust in what is installed is highly variable\footnote{A user probably trusts their operating system more than their browser, their browser more than the pages they browse to and some pages more than others.} and where “installation” of software is an increasingly fluid concept, particularly in the context of the Web, where merely viewing a page can cause code to run.

In this paper I explore an alternative to the traditional mechanisms of roles and access control lists. Although I focus on the use case of web pages, mashups and gadgets, the technology is applicable to all access control.

And the paper is here.

Regular readers will not be surprised to hear I am talking about capabilities.

15 May 2008

Debian and OpenSSL: The Last Word?

Filed under: Open Source,Programming,Security — Ben @ 15:59

I am reliably informed that, despite my previous claim, at least one member of the OpenSSL team does read openssl-dev religiously. For which he should be commended. I read it sometimes, too, but not religiously.

So, forget I said that you don’t reach the OpenSSL developers by posting on openssl-dev.

14 May 2008

Debian and OpenSSL: The Aftermath

Filed under: Open Source,Programming,Security — Ben @ 10:09

There have been an astonishing number of comments on my post about the Debian OpenSSL debacle; clearly this is a subject people have strong feelings about. But there are some points raised that need addressing, so here we go.

Firstly, many, many people seem to think that I am opposed to removing the use of uninitialised memory. I am not. As has been pointed out, this leads to undefined behaviour – and whilst that’s probably not a real issue given the current state of compiler technology, I can certainly believe in a future where compilers are clever enough to work out that on some calls the memory is not initialised and take action that might be unfortunate. I would also note in passing that my copy of K&R (second edition) does not discuss this issue, and ISO/IEC 9899, which some have quoted in support, rather post-dates the code in OpenSSL. To be clear, I am now in favour of addressing this issue correctly.

And this leads me to the second point. Many people seem to be confused about what change was actually made. There were, in fact, two changes. The first concerned a function called ssleay_rand_add(). As a developer using OpenSSL you would never call this function directly, but it is usually (unless a custom PRNG has been substituted, as happens in FIPS mode, for example) called indirectly via RAND_add(). This call is the only way entropy can be added to the PRNG’s pool. OpenSSL calls RAND_add() on buffers that may not have been initialised in a couple of places, and this is the cause of the valgrind warnings. However, rather than fix the calls to RAND_add(), the Debian maintainer instead removed the code that added the buffer handed to ssleay_rand_add() to the pool. This meant that the pool ended up with essentially no entropy. Clearly this was a very bad idea.
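To see why that one change was fatal, here is a toy model in Python. It is emphatically not OpenSSL’s actual PRNG, just the minimal hash-based structure needed to show what happens when the add-to-pool path becomes a no-op:

```python
import hashlib

class ToyPool:
    """Toy hash-based entropy pool (illustrative only)."""
    def __init__(self):
        self.state = b"\x00" * 32

    def add(self, data):
        # Analogue of ssleay_rand_add(): mix new entropy into the pool.
        self.state = hashlib.sha256(self.state + data).digest()

    def bytes(self, n):
        # Analogue of ssleay_rand_bytes(): extract output from the pool.
        out = b""
        while len(out) < n:
            self.state = hashlib.sha256(self.state + b"out").digest()
            out += self.state
        return out[:n]

# With mixing intact, different entropy gives different output.
p1, p2 = ToyPool(), ToyPool()
p1.add(b"entropy from machine A")
p2.add(b"entropy from machine B")
assert p1.bytes(16) != p2.bytes(16)

# With add() turned into a no-op (the effect of the Debian patch),
# every pool emits the same, fully predictable stream.
ToyPool.add = lambda self, data: None
p3, p4 = ToyPool(), ToyPool()
p3.add(b"entropy from machine A")
p4.add(b"entropy from machine B")
assert p3.bytes(16) == p4.bytes(16)
```

However much “entropy” is offered, the crippled pool ignores all of it, which is exactly why the pool ended up with essentially none.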

The second change was in ssleay_rand_bytes(), a function that extracts randomness from the pool into a buffer. Again, applications would access this via RAND_bytes() rather than directly. In this function, the contents of the buffer before it is filled are added to the pool. Once more, this could be uninitialised. The Debian developer also removed this call, and that is fine.

The third point: several people have come to the conclusion that OpenSSL relies on uninitialised memory for entropy. This is not so. OpenSSL gets its entropy from a variety of platform-dependent sources. Uninitialised memory is merely a bonus source of potential entropy, and is not counted as “real” entropy.

Fourthly, I said in my original post that if the Debian maintainer had asked the developers, then we would have advised against such a change. About 50% of the comments on my post point to this conversation on the openssl-dev mailing list. In this thread, the Debian maintainer states his intention to remove for debugging purposes a couple of lines that are “adding an uninitialised buffer to the pool”. In fact, the first line he quotes is the first one I described above, i.e. the only route to adding anything to the pool. Two OpenSSL developers responded, the first saying “use -DPURIFY” and the second saying “if it helps with debugging, I’m in favor of removing them”. Had they been inspired to check carefully what these lines of code actually were, rather than believing the description, then they would, indeed, have noticed the problem and said something, I am sure. But their response can hardly be taken as unconditional endorsement of the change.

Fifthly, I said that openssl-dev was not the way to ensure you had the attention of the OpenSSL team. Many have pointed out that the website says it is the place to discuss the development of OpenSSL, and this is true, it is what it says. But it is wrong. The reality is that the list is used to discuss application development questions and is not reliably read by the development team.

Sixthly, my objection to the fix Debian put in place has been misunderstood. The issue is not that they did not fully reverse their previous patch – as I say above, the second removal is actually fine. My issue is that it was committed to a public repository five days before an advisory was issued. Only a single attacker has to notice that and realise its import in order to start exploiting vulnerable systems – and I will be surprised if that has not happened.

I think that’s about enough clarification. The question is: what should we do to avoid this happening again? Firstly, if package maintainers think they are fixing a bug, then they should try to get it fixed upstream, not fix it locally. Had that been done in this case, there is no doubt none of this would have happened. Secondly, it seems clear that we (the OpenSSL team) need to find a way that people can reliably communicate with us in these kinds of cases.

The problem with the second is that there are a lot of people who think we should assist them, and OpenSSL is spectacularly underfunded compared to most other open source projects of its importance. No-one that I am aware of is paid by their employer to work full-time on it. Despite the widespread use of OpenSSL, almost no-one funds development on it. And, indeed, many commercial companies who absolutely depend on it refuse to even acknowledge publicly that they use it, despite the requirements of the licence, let alone contribute towards it in any way.

I welcome any suggestions to improve this situation.

Incidentally, some of the comments are not exactly what I would consider appropriate, and there’s a lot of repetition. I moderate comments on my blog, but only to remove spam (and the occasional cockup, such as people posting twice, not realising they are being moderated). I do not censor the comments, so don’t blame me for their content!

13 May 2008

Vendors Are Bad For Security

Filed under: Open Source,Programming,Security — Ben @ 14:09

I’ve ranted about this at length before, I’m sure – even in print, in O’Reilly’s Open Sources 2. But now Debian have proved me right (again) beyond my wildest expectations. Two years ago, they “fixed” a “problem” in OpenSSL reported by valgrind[1] by removing any possibility of adding any entropy to OpenSSL’s pool of randomness[2].

The result of this is that for the last two years (from Debian’s “Etch” release until now), anyone doing pretty much any crypto on Debian (and hence Ubuntu) has been using easily guessable keys. This includes SSH keys, SSL keys and OpenVPN keys.

What can we learn from this? Firstly, vendors should not be fixing problems (or, really, anything) in open source packages by patching them locally – they should contribute their patches upstream to the package maintainers. Had Debian done this in this case, we (the OpenSSL Team) would have fallen about laughing, and once we had got our breath back, told them what a terrible idea this was. But no, it seems that every vendor wants to “add value” by getting in between the user of the software and its author.

Secondly, if you are going to fix bugs, then you should install this maxim of mine firmly in your head: never fix a bug you don’t understand. I’m not sure I’ve ever put that in writing before, but anyone who’s worked with me will have heard me say it multiple times.

Incidentally, while I am talking about vendors who are bad for security, it saddens me to have to report that FreeBSD, my favourite open source operating system, are also guilty. Not only do they have local patches in their ports system that should clearly be sent upstream, but they also install packages without running the self-tests. This has bitten me twice by installing broken crypto, most recently in the py-openssl package.

[1] Valgrind is a wonderful tool, I recommend it highly.

[2] Valgrind tracks the use of uninitialised memory. Usually it is bad to have any kind of dependency on uninitialised memory, but OpenSSL happens to include a rare case when it’s OK, or even a good idea: its randomness pool. Adding uninitialised memory to it can do no harm and might do some good, which is why we do it. It does cause irritating errors from some kinds of debugging tools, though, including valgrind and Purify. For that reason, we do have a flag (PURIFY) that removes the offending code. However, the Debian maintainers, instead of tracking down the source of the uninitialised memory, chose to remove any possibility of adding memory to the pool at all. Clearly they had not understood the bug before fixing it.

P.S. I’d link to the offending patch in Debian’s source repository. If I could find a source repository. But I can’t.


Thanks to Cat Okita, I have now found the repo. Here’s the offending patch. But I have to admit to being astonished again by the fix, which was committed five days before the advisory! Do these guys have no clue whatsoever?

16 Apr 2008

Nice Review of Caja

Filed under: Capabilities,Open Source,Programming,Security — Ben @ 1:41

Tim Oren posted about Caja.

…this adds up to a very good chance that something that’s right now fairly obscure could turn into a major force in Web 2.0 within months, not years. Because Caja modifies the de facto definition of JavaScript, it would have an immediate impact on any scripts and sites that are doing things regarded as unsafe in the new model. If you’ve got a Web 2.0 based site, get ready for a project to review for ‘Caja-safety’. If the Caja model spreads, then the edges of the sandbox are going to get blurry. Various users and sites will be able to make choices to allow more powerful operations, and figuring out which ones are significant and allow enhanced value could be a fairly torturous product management challenge, and perhaps allow market entry chances for more powerful forms of widgets and Facebook-style ‘apps’.

End of message.

24 Mar 2008

Fun With Dot

Filed under: Programming — Ben @ 14:59

As I’ve mentioned before, people don’t really talk much about the experience of writing and debugging code, so here’s another instalment in my occasional series on doing just that.

Over the Easter weekend the weather has been pretty horrible, so, instead of having fun on my motorbike, I’ve been amusing myself in various ways: trying to finish up a paper I started last year, doing further work on OpenSSL and Deputy, cooking, playing with Tahoe (on which more later), updating my FreeBSD machines, and messing around with Graphviz.

Graphviz is one of my favourite toys. Basically, it lets you specify how a bunch of things are connected, and then it will draw them for you. A project I’d long had in my head was to take all the RFCs, work out which ones reference which other ones and have Graphviz draw it for me. Getting the data turned out to be pretty easy, but unfortunately the resulting dataset proves to be too much for poor old Graphviz, exposing all sorts of bugs in its drawing engine, which led to core dumps and/or complete garbage in the output files. Shame, early experiments promised some quite pretty output. Anyway, after banging my head against that for many hours, I gave up and instead did something I do every few months: got my various FreeBSD machines up-to-date. As part of that process, I had to look at the stuff that FreeBSD runs at startup, configured in /etc/rc.conf (and /etc/defaults/rc.conf), and actually done by scripts in /etc/rc.d and /usr/local/etc/rc.d.

This reminded me that these scripts expose their dependencies in comments, like this (from /etc/rc.d/pfsync)

# PROVIDE: pfsync

So, I thought it would be fun to graph those dependencies – then at least I’d have one pretty thing to show for the weekend. Then, since it only took 15 minutes to do, I thought it might make an interesting subject for a post on how I go about coding such things.

So, first things first, I like some instant gratification, so step one is to eat the rc files and see if I can parse them. I wasn’t quite sure if /etc/rc.d had subdirectories, but since I had some code already to read all files in a directory and all its subdirectories (from my failed attempt at the RFCs) I just grabbed that and edited it slightly:

sub findRCs {
    my $dir = shift;

    opendir(D, $dir) || croak "Can't open $dir: $!";
    while(my $f = readdir D) {
	next if $f eq '.' || $f eq '..';
	my $file = "$dir/$f";
	if(-d $file) {
	    findRCs($file);	# recurse into subdirectories
	} else {
	    readRC($file);
	}
    }
    closedir D;
}
This will call readRC for each file in the provided directory. My first version of readRC looked like this:

sub readRC {
    my $file = shift;

    my $rc = read_file($file);

    my($provide) = $rc =~ /^\# PROVIDE: (\S+)$/m;
    croak "Can't find PROVIDE in $file" if !$provide;

    print "$file: $provide\n";
}

Note that I assume that each file PROVIDEs only one thing, since I match \S+ (i.e. 1 or more non-whitespace characters), and force the matched string to span a whole line. This starts off well

/etc/rc.d/accounting: accounting
/etc/rc.d/amd: amd
/etc/rc.d/addswap: addswap

but ends

Can't find PROVIDE in /etc/rc.d/NETWORKING at ./ line 13
main::readRC('/etc/rc.d/NETWORKING') called at ./ line 30
main::findRCs('/etc/rc.d') called at ./ line 35

oops. If we look at the offending file, we see

# REQUIRE: netif netoptions routing network_ipv6 isdnd ppp
# REQUIRE: routed mrouted route6d mroute6d resolv

OK, so it provides two things, it seems. Fair enough, I can fix that, I just have to elaborate the matching slightly

my($provide) = $rc =~ /^\# PROVIDE: (.+)$/m;
croak "Can't find PROVIDE in $file" if !$provide;

my @provide = split /\s/,$provide;

print "$file: ", join (', ',@provide), "\n";

In other words, match everything after PROVIDE: and then split it on whitespace. Notice that this file also has multiple REQUIRE lines – lucky I noticed that, it could easily have escaped my attention. Anyway, after this modification, I can read the whole of /etc/rc.d. Now I need to match the requirements, which I do like this

my(@lrequire) = $rc =~ /^# REQUIRE: (.+)$/mg;
my @require = split /\s/, join(' ', @lrequire);

Another test, just printing what I extracted (print ' ', join (', ',@require), "\n";) and this seems to work fine. So far I’ve only been testing with /etc/rc.d, but now I’m almost ready to start graphing, I also test /usr/local/etc/rc.d

Can't find PROVIDE in /usr/local/etc/rc.d/ at ./ line 13
main::readRC('/usr/local/etc/rc.d/') called at ./ line 35
main::findRCs('/usr/local/etc/rc.d') called at ./ line 40

OK, so this is a very old rc file of my own and it has no require/provides stuff. In fact, it totally departs from the spec. Whatever … I decide to just skip files that don’t include REQUIRE

    if($rc !~ /PROVIDE/) {
	print STDERR "Skipping $file\n";
	return;
    }

A quick test confirms that it only skips that one file, and now everything works. OK, so time to graph! All I need to do is generate a file in a format Graphviz can read, which is amazingly easy. First I have to output a header

print "digraph rfcs {\n";
print "  node [fontname=\"Courier\"];\n";

then a line for each dependency

    foreach my $p (@provide) {
	foreach my $r (@require) {
	    print "  $r -> $p; \n";
	}
    }

and finally a trailer

print "}\n";

This produces a file that looks like this

digraph rfcs {
  node [fontname="Courier"];
  mountcritremote -> accounting; 
  rpcbind -> amd; 
  ypbind -> amd; 
  nfsclient -> amd; 
  cleanvar -> amd; 

which I can just feed to dot (one of the Graphviz programs), like so

dot -v -Tpng -o ~/tmp/rc.png /tmp/xx

and I get a lovely shiny graph. But while I’m admiring it, I notice that ramdisk has a link to itself, which seems a bit rum. On closer inspection, /etc/rc.d/ramdisk says

# PROVIDE: ramdisk
# REQUIRE: localswap

which doesn’t include a self-reference. Odd. Looking at the output from my script I notice

ramdisk -> ramdisk-own;

Guessing wildly that dot doesn’t like the “-“, I modify the output slightly

    foreach my $p (@provide) {
	foreach my $r (@require) {
	    print "  \"$r\" -> \"$p\"; \n";
	}
    }

and bingo, it works. Putting it all together, here’s the final script in full

#!/usr/bin/perl -w

use strict;
use File::Slurp;
use Carp;

sub readRC {
    my $file = shift;

    my $rc = read_file($file);

    if($rc !~ /PROVIDE/) {
	print STDERR "Skipping $file\n";
	return;
    }

    my($provide) = $rc =~ /^\# PROVIDE: (.+)$/m;
    croak "Can't find PROVIDE in $file" if !$provide;
    my @provide = split /\s/, $provide;

    my(@lrequire) = $rc =~ /^# REQUIRE: (.+)$/mg;
    my @require = split /\s/, join(' ', @lrequire);

    foreach my $p (@provide) {
	foreach my $r (@require) {
	    print "  \"$r\" -> \"$p\"; \n";
	}
    }
}

sub findRCs {
    my $dir = shift;

    opendir(D, $dir) || croak "Can't open $dir: $!";
    while(my $f = readdir D) {
	next if $f eq '.' || $f eq '..';
	my $file = "$dir/$f";
	if(-d $file) {
	    findRCs($file);
	} else {
	    readRC($file);
	}
    }
    closedir D;
}

print "digraph rfcs {\n";
print "  node [fontname=\"Courier\"];\n";

while(my $dir = shift) {
    findRCs($dir);
}

print "}\n";

and running it

./ /etc/rc.d /usr/local/etc/rc.d > /tmp/xx
dot -v -Tpng -o ~/tmp/rc.png /tmp/xx

And finally, here’s the graph. Interesting that randomness is at the root!

RC dependencies

13 Mar 2008

Microsoft’s Open Specification Promise

Filed under: Open Source,Programming — Ben @ 20:11

The Software Freedom Law Centre has published an analysis of the OSP. I don’t really care whether the OSP is compatible with the GPL, but their other points are a concern for everyone relying on the OSP, whether they write free software or not.

5 Feb 2008

Caja in the News

Filed under: Capabilities,Open Source,Programming,Security — Ben @ 20:07

It seems MySpace’s developer launch today is causing Caja to get splattered all over the place.

27 Jan 2008

Open Source Is Just Economics

Filed under: General,Open Source,Programming — Ben @ 5:39

A number of conversations I have had recently indicate to me that a lot of the world still doesn’t get what’s behind open source. It’s easy: economics.

The first thing you can trivially explain is why people work on open source at all. This has been a source of a vast amount of speculation, particularly irritatingly by sociologists. Ben Hyde has a fantastic list to which I will only add the explanation I love to hate: geek pride. We do it just to show off to each other.

Nope, it’s all bollocks – the motivation is simple: by solving your common problem together, you reduce your costs. There is absolutely no point in financing five different companies to produce five different products that don’t quite do what you want – far better to tweak the open source thing to do exactly what you need (often expressed as “scratching your itch” around the ASF).

Some people whine that, because this is an option open only to geeks, open source is not really available to completely open participation. Well, kinda. If you aren’t a geek yourself, you can always hire one. What do you mean, you don’t want to spend your money on free stuff? Why not? We all spend our time on it. Time that we could convert into money, if we so chose.

So why don’t we? Because participating in the open source projects we participate in is worth more to us, in purely monetary terms, in the long run. This is why I no longer have much to do with Apache: it does what I need. I have no itch to scratch.

This leads me into the second easily explainable fact. People complain that open source projects don’t care about users. It’s true. They don’t – they care about people who are participating in the costs of producing the software. If you aren’t contributing, why would your voice matter?

Of course, you have to be careful when applying these obvious truths to what you see around you. For example, the presence of companies like Red Hat in the market complicates analysis. They have their own set of economic drivers, including the needs of their customers, which they then apply to the calculation around their participation in various projects. As the reach of open source extends, so do end users actually start to get an indirect say in what happens. But it costs them. Money.

Back in the good old days, it was so much simpler. All it cost me then was time.

25 Jan 2008

Caja, Shindig and OpenSocial

Filed under: Open Source,Programming,Security — Ben @ 23:07

It’s been a while since I wrote about Caja, but we’ve been working hard on it and it has come along in leaps and bounds, thanks to my excellent team at Google.

Today I’m very pleased to be able to point you at a test gadget container which supports Cajoling of gadgets. This is based on the open source OpenSocial container, Shindig.

Here’s the announcement, and there’s also some documentation on how to get things working with Caja. We’ve even included a couple of malicious gadgets which are defeated by Caja.

Feedback, as always, welcome.

19 Jan 2008

Deputy, Delta and Type Checking in C

Filed under: Programming,Security — Ben @ 19:28

Another thing I never write about but am very interested in is static analysis. For the non-geeks amongst my readers, static analysis is all about looking at code to see what you can figure out about it. For example, you might try to find input values that cause a buffer overflow. Or you might check to see that strings are correctly escaped before being posted to a Web page (that is, the bug that is at the heart of cross-site scripting has been avoided).

Of course static analysis is usually done by programs, perhaps with the assistance of the programmer, rather than by people, so I am always on the look out for new approaches and new software. Unfortunately, as in many areas of academia, the gap between theory and practice is rather large so I do not find myself exactly overwhelmed with choice.

So far the only thing that I’ve been even a little happy with is Coverity. This still gets it wrong about half the time, but that’s a pretty tolerable ratio given how painful a manual audit would be. In contrast, some of the other tools I have tried over the years have false positive rates well over 99.9%.
Most of them just plain don’t work. Pretty much all of them are not supported. And those that are, like Coverity, cost a fortune.

If I was not a convert to the cause of static analysis, I would despair. As it is, I do occasionally feel tempted to sit down for a year or two and tackle the problem myself but sanity soon prevails and I put that idea off for another decade. So, I was happy to come across Deputy recently, after an animated thread or two on the Robust Open Source mailing list (which I am shocked to discover I have been on since it started, way back in 1998 – archives here and here).

Deputy attempts to provide type safety in C programmes. This is, of course, impossible … but it has a good attempt at it. Although ordinary programmers might not think so, to the academic type safety means enforcing things like array lengths, so our favourite C security problem, the buffer overflow, would be a thing of the past if we had typesafe programs.

Anyone who has read the code I wrote for safe stacks in OpenSSL or the module API in Apache 2.0 will know that I am a big fan of type safety in C. Both of these try to ensure that if you get confused about what type you should be using, you will get a compile-time failure. Unfortunately C provides the programmer with a plethora of ways to both deliberately and accidentally avoid any safety nets you might put out for him. The idea behind Deputy is to make it possible to do the type checking rather more rigorously. In order to allow this, you have to provide Deputy with extra clues.

The syntax is a little idiosyncratic, but generally the annotation is quite straightforward, for example

void * (DALLOC(n) malloc)(size_t n);

would tell it that malloc is a memory allocator that returns n bytes of memory. Deputy catches many errors at compile time, but those it can’t it will attempt to catch at runtime instead, by injecting extra code to make sure pointers stay within bounds, for example. I haven’t got that far, though, because my benchmark for these projects is to use them on something real, like OpenSSL. I am pleased to report, though, that Deputy has so far built several OpenSSL source files without driving me completely crazy. But more on that later.

In the course of using Deputy I have been reminded of two things worth mentioning in passing. One is a trick we use in OpenSSL to do type checking. If you want to ensure that something is of type T, then you can write this

(1 ? x : (T)0)

Weird, huh? How it works is that both sides of a ? : operator must have the same type, so if x is not of type T, then you will get a compile-time error. Very handy in macros, especially where you are abusing types heavily – for example when you are implementing a generic stack, but you want to ensure that any particular stack consists only of one type of object (see safestack.h in OpenSSL for an example).

The other is delta. Delta is a very cute tool that cuts down a file with an “interesting” feature to a smaller one with the same feature. For example, suppose (as happened to me) I have an error that I can’t reproduce in a small example. Now what? Delta to the rescue. Today I had a problem with Deputy wanting me to add extra annotation that seemed unnecessary. Small examples of essentially the same code did not show the same issue. What to do? Delta reduced the original offending source from 2424 lines to just 18 that produce the same bug. And it did it in about 5 minutes.

For interest, here are the 18 lines

typedef unsigned int __uint32_t;
typedef __uint32_t __size_t;
typedef __size_t size_t;
void *malloc(size_t);
void *memcpy(void * , const void * , size_t);
# 77 "mem.c"
static void * (DALLOC(n) *malloc_func)(size_t n) = malloc;
static void *default_malloc_ex(size_t num, const char *file, int line) {
return malloc_func(num);
static void *(*malloc_ex_func)(size_t, const char *file, int line) = default_malloc_ex;
void *CRYPTO_realloc_clean(void *str, int old_len, int num, const char *file, int line) {
void *ret = ((void *)0);
if(ret) {

Funnily enough, delta was created to assist in debugging another static analysis system, Oink. So far, I’ve never used it for anything else.

24 Dec 2007

Handling Private Data with Capabilities

Filed under: Anonymity/Privacy,Capabilities,Programming,Security — Ben @ 7:10

A possibility I’ve been musing about that Caja enables is to give gadgets capabilities to sensitive (for example, personal) data which are opaque to the gadgets but nevertheless render appropriately when shown to the user.

This gives rise to some interesting, perhaps non-obvious consequences. One is that a sorted list of these opaque capabilities would itself have to be opaque, otherwise the gadget might be able to deduce things from the order. That is, the capabilities held in the sorted list would have to be unlinkable to the original capabilities (I think that’s the minimum requirement). This is because sort order reveals data – say the capabilities represented age or sexual preference and the gadget knows, for some other reason, what that is for one member of the list. It would then be able to deduce information about people above or below that person in the list.

Interestingly, you could allow the gadget to do arbitrary processing on the contents of the opaque capabilities, so long as it gave you (for example) a piece of code that could be confined only to do processing and no communication. Modulo wall-banging, Caja could make that happen. Although it might initially sound a bit pointless, this would allow the gadget to produce output that could be displayed to the user, despite the gadget itself not being allowed to know that output.

Note that because of covert channels, it should not be thought that this prevents the leakage of sensitive data – to do that, you would have to forbid any processing by the gadget of the secret data. But what this does do is prevent inadvertent leakage of data by (relatively) benign gadgets, whilst allowing them a great deal of flexibility in what they do with that data from the user’s point of view.

3 Dec 2007

Caja and OpenSocial

Filed under: Capabilities,Open Source,Programming,Security — Ben @ 2:03

An obvious place to use Caja is, of course, in OpenSocial. So, a bunch of us at Google have been experimenting with this use case and the first outcome is an update to the container sample which allows you to try running your gadget Caja-ised (gotta think of a better name for that). We even have instructions on how to Caja-ise your gadget.

We haven’t tried many gadgets yet, but the good news is the example gadgets worked with (almost[1]) no change. It seems clear that more complex gadgets are not likely to survive without at least some change but we don’t yet know how hard that’s going to be. Feedback, as always, welcome! And don’t forget to join the mailing list to discuss it.

[1] Right now, because Caja-ised code gets pushed into its own sandbox, any functions that need to be visible to the rest of the page (for example, functions that get called when you click a button) must be explicitly exported – we expect to be able to remove that requirement.

16 Nov 2007

Quilt and SVN: A Slightly Unhappy Marriage

Filed under: Open Source,Programming — Ben @ 5:48

Now that Caja is out in the wild, and I can’t use Google’s internal development tools, I find quilt is coming in handy (why not mercurial queues? I’d prefer it, but the version I can easily install is too old, currently). But, surprisingly for a tool that was designed to assist in open source development, it turns out quilt is a bit weird about co-existing with version control systems.

The issue comes when you finally get approval for your patch and you commit it to the tree. At this point, you want to delete it from the patch series – but quilt won’t let you, because it is applied. If you pop it, then you’ll undo what you’ve just committed. So, what to do? Here’s my ad-hoc recipe

quilt pop -a
patch -p1 < patches/the-bottom-patch
svn ci
quilt delete the-bottom-patch

and there you are, done. You can even do this retroactively if you forgot to do it as you go along – just miss out the svn ci step. Once you’re back up-to-date you should find that you are still in sync with the head of the tree (assuming no-one committed in the meantime).

14 Nov 2007

Caja Code is Available

Filed under: Capabilities,Open Source,Programming,Security — Ben @ 15:51

Yesterday we put the initial (incomplete) version of the Caja code up at

From now on, all development will be done out in the open. External developers are welcome to come and play, too. Join the mailing list. Write code! Find bugs! Laugh at my mistakes! Have fun!
