Links

Ben Laurie blathering

28 Apr 2012

Using Capsicum For Sandboxing

Filed under: Capabilities,General,Programming,Security — Ben @ 18:07

FreeBSD 9.0, released in January 2012, has experimental Capsicum support in the kernel, disabled by default. In FreeBSD 10, Capsicum will be enabled by default.

But unless code uses it, we get no benefit. So far, very little code uses Capsicum, mostly just experiments we did for our paper. I figured it was time to start changing that. Today, I’ll describe my first venture – sandboxing bzip2. I chose bzip2 partly because Ilya Bakulin had already done some of the work for me, but mostly because a common failure mode in modern software is mistakes made in complicated bit twiddling code, such as decompressors and ASN.1 decoders.

These can often lead to buffer overflows or integer over/underflows – and these often lead to remote code execution. Which is bad. bzip2 is no stranger to this problem: CVE-2010-0405 describes an integer overflow that could lead to remote code execution. The question is: would Capsicum have helped – and if it would, how practical is it to convert bzip2 to use Capsicum?

The answers are, respectively, “yes” and “fairly practical”.

First of all, how does Capsicum mitigate this problem? The obvious way to defend a decompressor is to run the decompression engine in a separate process with no privilege beyond that needed to get its job done – which is the ability to read the input and write the output. In Capsicum, this is easy to achieve: once the appropriate files are open, fork the process and enter capability mode in the child. Discard all permissions except the ability to read the input and write the output (in Capsicum, this means close all other file descriptors and limit those two to read and write), and then go ahead and decompress. Should there be a bug in the decompressor, what does the attacker get? Well, pretty much what he had already: the ability to read the input file (he supplied it, so no news there!) and the ability to write arbitrary content to the output file (he already had that, since he could have chosen arbitrary input and compressed it). He also gets to burn CPU and consume memory. But that’s it – no access to your files, the network, any other running process, or anything else interesting.

I think that’s pretty neat.

But how hard is it to do? I answer that question in a series of diffs on GitHub, showing a step-by-step transformation of bzip2 into the desired form. I used a technique I like to call error-driven development; the idea is you attempt to make changes that will cause compilation to fail until you have completely accomplished your goal. This is a useful way to reassure yourself that you have made all necessary updates and there’s nothing hiding away you didn’t take care of. If you follow along by building the various stages, you’ll see how it works.

It turns out that in bzip2 this matters – it isn’t very beautifully written, and the code that looks like it might cleanly just take an input file and an output file and do the work in isolation, actually interacts with the rest of the code through various function calls and globals. This causes a problem: once you’ve forked, those globals and functions are now in the wrong process (i.e. the child) and so it is necessary to use RPC to bridge any such things back to the parent process. Error-driven development assures us that we have caught and dealt with all such cases.

So how did this work out in practice? Firstly, it turns out we have to give the compressor a little more privilege: it writes to stderr if there are problems, so we need to also grant write on stderr (note that we could constrain what it writes with a bit more effort). The callbacks we have to provide do not, I think, let it do anything interesting: cause the program to exit, make the output file’s permissions match the input file’s, and remove the input or output files (ok, removing the input file is slightly interesting – but note that bzip2 does this anyway).

Secondly, because we have not yet decided on an RPC mechanism, this particular conversion involves quite a bit of boilerplate: wrapping and unwrapping arguments for RPCs, wiring them up and all that, all of which would be vastly reduced by a proper RPC generator. Try not to let it put you off 🙂

Finally, the system has (at least) one global, errno. I did not deal with that so far, which means some errors will report the wrong error – but it is not particularly hard to do so.

So, on to the diffs. This is something of an experimental way to present a piece of development, so I’d be interested in feedback. Here they are, in order:

And there you are: bzip2 is now rendered safe from decompressor exploits, and it was only a few hours work. As we refine the support infrastructure, it will be even less work.

2 Jul 2011

Decentralised Currencies Are Probably Impossible (But Let’s At Least Make Them Efficient)

Filed under: General — Ben @ 20:04

How time flies. Following my admittedly somewhat rambling posts on Bitcoin, I decided to write a proper paper about the problem. So, here’s a preprint of “Decentralised Currencies Are Probably Impossible (But Let’s At Least Make Them Efficient)”. It’s short! Enjoy.

I may submit this to a conference, I haven’t decided yet. Suggestions of where are welcome.

By the way, Bitcoin fanboys: I see I have been taken to task for my heretic views on the Bitcoin forums. Since those have cunningly been closed to anyone who does not already have some kind of track record of conforming to the standards of the forums (presumably meaning “don’t diss Bitcoin”) I am unable to respond to comments there, but I would like to note, for the record, that I have not deleted a single non-spam comment on my Bitcoin posts, contrary to claims I see there.

21 May 2011

Bitcoin is Slow Motion

Filed under: Anonymity,Crypto,General,Privacy,Security — Ben @ 5:32

OK, let’s approach this from another angle.

The core problem Bitcoin tries to solve is how to get consensus in a continuously changing, free-for-all group. It “solves” this essentially insoluble problem by making everyone walk through treacle, so it’s always evident who is in front.

But the problem is, it isn’t really evident. Slowing everyone down doesn’t take away the core problem: that someone with more resources than you can eat your lunch. Right now, with only modest resources, I could rewrite all of Bitcoin history. By the rules of the game, you’d have to accept my longer chain and just swallow the fact you thought you’d minted money.

If you want to avoid that, then you have to have some other route to achieve a consensus view of history. Once you have a way to achieve such a consensus, then you could mint coins by just sequentially numbering them instead of burning CPU on slowing yourself down, using the same consensus mechanism.

Now, I don’t claim to have a robust way to achieve consensus; any route seems to open to attacks by people with more resources. But I make this observation: as several people have noted, currencies are founded on trust: trust that others will honour the currency. It seems to me that there must be some way to leverage this trust into a mechanism for consensus.

Right now, for example, in the UK, I can only spend GBP. At any one time, in a privacy preserving way, it would in theory be possible to know who was in the UK and therefore formed part of the consensus group for the GBP. We could then base consensus on current wielders of private keys known to be in the UK, the vast majority of whom would be honest. Or their devices would be honest on their behalf, to be precise. Once we have such a consensus group, we can issue coins simply by agreeing that they are issued. No CPU burning required.

18 Mar 2011

Completely Redundant Disks with ZFS and FreeBSD

Filed under: General,Open Source — Ben @ 15:00

A while back, I bought a ReadyNAS device for my network, attracted by the idea of RAID I can grow over time and mirrored disks.

Today I just finished building the same thing “by hand”, using FreeBSD and ZFS. At a fraction of the cost. Here’s how.

First off, I bought this amazing bargain: an HP ProLiant MicroServer. These would be cheap even at list price, but with the current £100 cashback offer, they’re just stupidly cheap. And rather nice.

Since I want to cater for a realistic future, I am assuming by the time I need to replace a drive I will no longer be able to buy a matching device, so I started from day one with a different second drive (the primary is 250 GB, secondary is 500 GB – both Seagate, which was not the plan, but I’ll remedy that in the next episode). I also added an extra 1GB of RAM to the machine (this is important for ZFS which is apparently not happy with less than 2GB of system RAM).

I then followed, more or less, Pawel’s excellent instructions for creating a fully mirrored setup. However, I had to deviate from them somewhat, so here’s my version.

The broad overview of the process is as follows

  1. Install FreeBSD on the primary disk, using a standard sysinstall.
  2. Create and populate gmirror and ZFS partitions on the secondary disk.
  3. Boot from the primary disk, but mount the secondary.
  4. Create and populate gmirror and ZFS partitions on the primary disk.
  5. Use excess secondary disk as scratch.

In my case the two disks are ad4 (primary, 250 GB) and ad8 (secondary, 500 GB). Stuff I typed is in italic.

Since we need identical size partitions for the mirror, we need to simulate the first disk (since it happens to be smaller). Get the disk’s size

# diskinfo -v /dev/ad4
/dev/ad4
        512             # sectorsize
        250059350016    # mediasize in bytes (233G)
        488397168       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        484521          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        9VMQN8T5        # Disk ident.

Create a memory disk the same size. Note that the sector sizes must match!

# mdconfig -a -t swap -s 488397168
md0

Verify they are the same.

# diskinfo -v /dev/ad4 /dev/md0
/dev/ad4
        512             # sectorsize
        250059350016    # mediasize in bytes (233G)
        488397168       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        484521          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        9VMQN8T5        # Disk ident.

/dev/md0
        512             # sectorsize
        250059350016    # mediasize in bytes (233G)
        488397168       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset

Now partition the memory disk as we will the first disk later on.

# gpart create -s gpt md0
md0 created
# gpart add -b 34 -s 128 -t freebsd-boot md0
md0p1 added
# gpart add -s 2g -t freebsd-swap -l swap1 md0
md0p2 added
# gpart add -t freebsd-zfs -l systemx md0
md0p3 added

and show the resulting sizes

# gpart show md0
=>       34  488397101  md0  GPT  (233G)
         34        128    1  freebsd-boot  (64K)
        162    4194304    2  freebsd-swap  (2.0G)
    4194466  484202669    3  freebsd-zfs  (231G)

Now blow away the memory disk, we don’t need it any more.

# mdconfig -d -u 0

Create the partitions on the second disk.

# gpart create -s gpt ad8
ad8 created
# gpart add -b 34 -s 128 -t freebsd-boot ad8
ad8p1 added
# gpart add -s 2g -t freebsd-swap -l swap1 ad8
ad8p2 added
# gpart add -s 484202669 -t freebsd-zfs -l system8 ad8
ad8p3 added

And eat the rest of the disk as a scratch area (this area will not be mirrored, and so should only be used for disposable stuff).

# gpart add -t freebsd-zfs -l scratch8 ad8
ad8p4 added

Check it matches the md0 simulation

# gpart show ad8
=>       34  976773101  ad8  GPT  (466G)
         34        128    1  freebsd-boot  (64K)
        162    4194304    2  freebsd-swap  (2.0G)
    4194466  484202669    3  freebsd-zfs  (231G)
  488397135  488376000    4  freebsd-zfs  (233G)

And don’t forget to set up the bootloader

# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad8
bootcode written to ad8

I realised as this point I had intended to label everything with an 8, to match the unit number, and had not done so for swap, so for completeness, here’s how you fix it

# gpart modify -i 2 -l swap8 ad8
ad8p2 modified
# gpart show -l ad8
=>       34  976773101  ad8  GPT  (466G)
         34        128    1  (null)  (64K)
        162    4194304    2  swap8  (2.0G)
    4194466  484202669    3  system8  (231G)
  488397135  488376000    4  scratch8  (233G)

Note that the label change is not reflected by the device names in /dev/gpt, which is needed for the next step, so at this point I rebooted.

Now set up the swap mirror.

# gmirror label -F -h -b round-robin swap /dev/gpt/swap8

Create the ZFS storage pool called system, consisting only of our system8 partition.

# zpool create -O mountpoint=/mnt -O atime=off -O setuid=off -O canmount=off system /dev/gpt/system8

And create a dataset – “mountpoint=legacy” stops ZFS from managing it.

# zfs create -o mountpoint=legacy -o setuid=on system/root

Mark it as the default bootable dataset.

# zpool set bootfs=system/root system

Mount it

# mount -t zfs system/root /mnt
# mount
/dev/ad4s1a on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
system/root on /mnt (zfs, local, noatime)

And create the remaining mountpoints according to Pawel’s suggested layout…

# zfs create -o compress=lzjb system/tmp
# chmod 1777 /mnt/tmp
# zfs create -o canmount=off system/usr
# zfs create -o setuid=on system/usr/local
# zfs create -o compress=gzip system/usr/src
# zfs create -o compress=lzjb system/usr/obj
# zfs create -o compress=gzip system/usr/ports
# zfs create -o compress=off system/usr/ports/distfiles
# zfs create -o canmount=off system/var
# zfs create -o compress=gzip system/var/log
# zfs create -o compress=lzjb system/var/audit
# zfs create -o compress=lzjb system/var/tmp
# chmod 1777 /mnt/var/tmp
# zfs create -o canmount=off system/usr/home

And create one for each user:

# zfs create system/usr/home/ben

Now, at a slightly different point from Pawel, I edit the various config files. First /boot/loader.conf. Note that some of these are commented out: this is because, although they appear in Pawel’s version, they are already built into the kernel (this is because I use a GENERIC kernel and he uses a stripped-down one). Including them seems to cause problems (particularly geom_part_gpt, which causes a hang during boot if present).

geom_eli_load=YES
#geom_label_load=YES
geom_mirror_load=YES
#geom_part_gpt_load=YES
zfs_load=YES
vm.kmem_size=3G # This should be 150% of your RAM.

Enable ZFS

# echo zfs_enable=YES >> /etc/rc.conf

Change fstab for the new layout (note, you might want to edit these in – for example, my system had an entry for cd drives).

# cat > /etc/fstab
system/root / zfs rw,noatime 0 0
/dev/mirror/swap.eli none swap sw 0 0
^D

The .eli extension here is magic: geom_eli finds it at startup and automatically encrypts it.

Set the work directory for ports (so that it uses the faster compression scheme during builds).

# echo WRKDIRPREFIX=/usr/obj >> /etc/make.conf

These need to be done now because the next step is to copy the entire install to the new ZFS filesystem. Note that this particular command pastes completely incorrectly from Pawel’s blog post so be careful!

# tar -c --one-file-system -f - . | tar xpf - -C /mnt/

Tar can’t copy some types of file, so expect an error or two at this point:

tar: ./var/run/devd.pipe: tar format cannot archive socket
tar: ./var/run/log: tar format cannot archive socket
tar: ./var/run/logpriv: tar format cannot archive socket

Just for fun, take a look at the ZFS we’ve created so far…

# zfs list
NAME USED AVAIL REFER MOUNTPOINT
system 1.12G 225G 21K /mnt
system/root 495M 225G 495M legacy
system/tmp 30K 225G 30K /mnt/tmp
system/usr 652M 225G 21K /mnt/usr
system/usr/home 50K 225G 21K /mnt/usr/home
system/usr/home/ben 29K 225G 29K /mnt/usr/home/ben
system/usr/local 297M 225G 297M /mnt/usr/local
system/usr/obj 21K 225G 21K /mnt/usr/obj
system/usr/ports 190M 225G 159M /mnt/usr/ports
system/usr/ports/distfiles 30.8M 225G 30.8M /mnt/usr/ports/distfiles
system/usr/src 165M 225G 165M /mnt/usr/src
system/var 100K 225G 21K /mnt/var
system/var/audit 21K 225G 21K /mnt/var/audit
system/var/log 35K 225G 35K /mnt/var/log
system/var/tmp 23K 225G 23K /mnt/var/tmp

Unmount ZFS

# zfs umount -a

And the one we mounted by hand

# umount /mnt

And set the new ZFS-based system to be mounted on /

# zfs set mountpoint=/ system

And … reboot! (this is the moment of truth)

After the reboot, you should see

$ mount
system/root on / (zfs, local, noatime)
devfs on /dev (devfs, local, multilabel)
system/tmp on /tmp (zfs, local, noatime, nosuid)
system/usr/home/ben on /usr/home/ben (zfs, local, noatime, nosuid)
system/usr/local on /usr/local (zfs, local, noatime)
system/usr/obj on /usr/obj (zfs, local, noatime, nosuid)
system/usr/ports on /usr/ports (zfs, local, noatime, nosuid)
system/usr/ports/distfiles on /usr/ports/distfiles (zfs, local, noatime, nosuid)
system/usr/src on /usr/src (zfs, local, noatime, nosuid)
system/var/audit on /var/audit (zfs, local, noatime, nosuid)
system/var/log on /var/log (zfs, local, noatime, nosuid)
system/var/tmp on /var/tmp (zfs, local, noatime, nosuid)
$ zfs list
NAME USED AVAIL REFER MOUNTPOINT
system 1.79G 225G 21K /
system/root 763M 225G 763M legacy
system/tmp 43K 225G 43K /tmp
system/usr 1.04G 225G 21K /usr
system/usr/home 50.5K 225G 21K /usr/home
system/usr/home/ben 29.5K 225G 29.5K /usr/home/ben
system/usr/local 297M 225G 297M /usr/local
system/usr/obj 416M 225G 416M /usr/obj
system/usr/ports 190M 225G 159M /usr/ports
system/usr/ports/distfiles 30.8M 225G 30.8M /usr/ports/distfiles
system/usr/src 165M 225G 165M /usr/src
system/var 106K 225G 21K /var
system/var/audit 21K 225G 21K /var/audit
system/var/log 41.5K 225G 41.5K /var/log
system/var/tmp 23K 225G 23K /var/tmp
$ swapinfo
Device 1K-blocks Used Avail Capacity
/dev/mirror/swap.eli 2097148 0 2097148 0%

Note that system is not actually mounted (it has canmount=off) – it is used to allow all the other filesystems to inherit the / mountpoint. The one that is actually mounted on / is system/root, which is marked as legacy because it is mounted before zfs is up.

Now we’re up on the second disk, time to get the first disk back in the picture (we’re using it for boot but nothing else right now).

First blow away the MBR

# dd if=/dev/zero of=/dev/ad4 count=79
79+0 records in
79+0 records out
40448 bytes transferred in 0.008059 secs (5018970 bytes/sec)

and create the GPT partitions:

# gpart create -s GPT ad4
ad4 created
# gpart add -b 34 -s 128 -t freebsd-boot ad4
ad4p1 added
# gpart add -s 2g -t freebsd-swap -l swap4 ad4
ad4p2 added
# gpart add -t freebsd-zfs -l system4 ad4
ad4p3 added
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad4
bootcode written to ad4

No scratch partition on this one, there’s no room. Now the two disks should match

# gpart show
=>       34  976773101  ad8  GPT  (466G)
         34        128    1  freebsd-boot  (64K)
        162    4194304    2  freebsd-swap  (2.0G)
    4194466  484202669    3  freebsd-zfs  (231G)
  488397135  488376000    4  freebsd-zfs  (233G)

=>       34  488397101  ad4  GPT  (233G)
         34        128    1  freebsd-boot  (64K)
        162    4194304    2  freebsd-swap  (2.0G)
    4194466  484202669    3  freebsd-zfs  (231G)

apart from the scratch partition, of course.

Add the mirrored swap

# gmirror insert -h -p 1 swap /dev/gpt/swap4

And when rebuilding is finished, you should see

# gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  gpt/swap8
                       gpt/swap4

Now add the second disk’s zfs partition

# zpool attach system /dev/gpt/system8 /dev/gpt/system4
If you boot from pool 'system', you may need to update
boot code on newly attached disk '/dev/gpt/system4'.

Assuming you use GPT partitioning and 'da0' is your new boot disk
you may use the following command:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

We already did this part, so no need to do anything. Wait for it to finish. Here it is partway through

# zpool status
  pool: system
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 39.20% done, 0h0m to go
config:

        NAME             STATE     READ WRITE CKSUM
        system           ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            gpt/system8  ONLINE       0     0     0
            gpt/system4  ONLINE       0     0     0  718M resilvered

errors: No known data errors

and now done

# zpool status
  pool: system
 state: ONLINE
 scrub: resilver completed after 0h2m with 0 errors on Fri Mar 18 12:13:19 2011
config:

        NAME             STATE     READ WRITE CKSUM
        system           ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            gpt/system8  ONLINE       0     0     0
            gpt/system4  ONLINE       0     0     0  1.79G resilvered

errors: No known data errors

And we’re done. Reboot one last time to check everything worked.

One final task not relevant to the mirroring is to mount the scratch disk area.

Create a mountpoint

# mkdir /scratch

And a pool

# zpool create -O mountpoint=/scratch -O atime=off -O setuid=off scratch /dev/gpt/scratch8

This filesystem has no redundancy, as previously mentioned. (edit: I am told that the mkdir and mountpoint are both redundant – zfs will create the directory as needed, and uses the pool name as the mount point by default)

In the next installment I will fail and replace one of the disks.

Edit:

daily_status_zfs_enable="YES"
daily_status_gmirror_enable="YES"

should be added to /etc/periodic.conf so checks are added to the daily mails.

13 Jun 2010

FreeBMD Seeks An Executive Director

Filed under: General — Ben @ 15:35

I don’t often mention the FreeBMD project on this blog, perhaps I should. Anyway, we (the trustees of FreeBMD) have decided that it’s time to hire an Executive Director. It occurred to me that some of my readers might be interested, or might know someone who is.

9 Jun 2010

TLS Renegotiation, 7 Months On

Filed under: General,Security — Ben @ 9:18

It’s been 7 months since the TLS renegotiation problem went public and Opera’s security group have a couple of interesting articles about it. The first is about adoption of patched versions and the verdict is not good, as this graph shows…

Only 12% of servers are patched.

At this rate it will be two years before the fix is widely adopted!

The second is about version intolerance – scarily, nearly 90% of patched servers will not work when a future version of TLS bumps the major version number to 4 (it is currently 3). This is pretty astonishingly crap, and is likely to cause us problems in the future, so I’m glad the Opera guys are working hard to track down the culprits.

By the way, at least according to Opera, OpenSSL does not have this problem.

24 Jan 2010

Stupid: A Metalanguage For Cryptography

Filed under: Crypto,General,Open Source,Programming,Security — Ben @ 20:32

Various threads lately have got me thinking about implementing cryptography and cryptographic protocols. As I have mentioned before, this is hard. But obviously the task itself is the same every time, by its very nature – if I want to interoperate with others, then I must implement effectively the same algorithm as them. So why do we ever implement anything more than once? There are various reasons, varying as to their level of bogosity. Here’s a few

  • Trust: “I don’t want to trust anyone else’s code”. This, in my view is mostly bogus. If you don’t trust others to write your crypto, then you’ve got some pretty big problems on your hands…
    • You’re likely to be using some pretty heavyweight stuff like SSL and/or X.509, and reimplementing those is a seriously major job. Are you really going to do that?
    • Unless you are also a cryptographer, then you’re trusting the guys that designed the crypto you’re implementing anyway.
    • Ditto protocol desginer.
  • Languages: an implementation in Java isn’t going to work so well in Python. And although its true that you can plug C implementations into almost everything, there are legitimate and not-so-legitimate reasons for not wanting to do so…
    • You don’t trust native code: see above.
    • It makes your distribution hard to build and/or use and tricky to install.
    • You are running on a platform that doesn’t allow native code, for example, a web hosting company, or Google’s App Engine.
    • Native code has buffer overflows and MyFavouriteLanguage doesn’t: true, but probably the least of your worries, given all the other things that you can get wrong, at least if the native code is widely used and well tested.
  • Efficiency: you are not in the long tail of users who’s transactions per second is measured in fractions. In this case, you may well want specialised implementations that exploit every ounce of power in your platform.

Of these, reimplementation for efficiency clearly needs a completely hand-crafted effort. Trust issues are, in my view, largely bogus, but if you really want to go that way, be my guest. So what does that leave? People who want it in their chosen language, are quite happy to have someone else implement it and are not in need of the most efficient implementation ever. However, they would like correctness!

This line of thinking let me spend the weekend implementing a prototype of a language I call “Stupid”. The idea is to create a language that will permit the details of cryptography and cryptographic protocols to be specified unambiguously, down to the bits and bytes, and then compile that language into the language of your choice. Because we want absolute clarity, Stupid does not want to be like advanced languages, like OCaml and Haskell, or even C, where there’s all sorts of implicit type conversions and undefined behaviour going on – it wants it to be crystal clear to the programmer (or reviewer) exactly what is happening at every stage. This also aids the process of compiling into the target language, of course. So, the size of everything wants to be measured in bits, not vague things like “long” or “size_t”. Bits need to be in known places (for example, big-endian). Operations need to take known inputs and produce known outputs. Sign extension and the like do not want to happen magically. Overflow and underflow should be errors, unless you specifically stated that they were not, and so on.

To that end, I wrote just enough compiler to take as input a strawman Stupid grammar sufficient to do SHA-256, and produce various languages as output, in order to get a feel for what such a language might look like, and how hard it would be to implement.

The result is: you can do something rough in a weekend 🙂

Very rough – but it seems clear to me that proceeding down this road with more care would be very useful indeed. We could write all the cryptographic primitives in Stupid, write relatively simple language plugins for each target language and we’d have substantially improved the crypto world. So, without further ado, what does my proto-Stupid look like? Well, here’s SHA-256, slightly simplified (it only processes one block, I was going cross-eyed before I got round to multiple blocks). Note, I haven’t tested this yet, but I am confident that it implements (or can be easily extended to implement) everything needed to make it work – and the C output the first language plugin produces builds just fine with gcc -Wall -Werror. I will test it soon, and generate another language, just to prove the point. In case the code makes your eyes glaze over, see below for some comments on it…

"This code adapted from Wikipedia pseudocode";

"Note 2: All constants in this pseudo code are in big endian";

"Initialize variables";
"(first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19):";
uint32 h0 = 0x6a09e667;
uint32 h1 = 0xbb67ae85;
uint32 h2 = 0x3c6ef372;
uint32 h3 = 0xa54ff53a;
uint32 h4 = 0x510e527f;
uint32 h5 = 0x9b05688c;
uint32 h6 = 0x1f83d9ab;
uint32 h7 = 0x5be0cd19;

"Initialize table of round constants";
"(first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311):";
array(uint32, 64) k =
(0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc,
0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7,
0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13,
0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3,
0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5,
0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208,
0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2);

"For now, dummy in the message instead of declaring a function wrapper";
"Also, for now, allow enough room in the input for padding, etc, to simplify the loop";
uint32 message_bits = 123;
array(uint8, 64) message =
(0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0,
0x0f, 0xed, 0xcb, 0xa9, 0x87, 0x65, 0x43, 0x21);
uint32 pad_byte = 0;
uint32 pad_bit = 0;
uint32 tmp = 0;
uint32 tmp2 = 0;
array(uint32, 16) w = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
uint32 i = 0;
uint32 s0 = 0;
uint32 s1 = 0;
uint32 a = 0;
uint32 b = 0;
uint32 c = 0;
uint32 d = 0;
uint32 e = 0;
uint32 f = 0;
uint32 g = 0;
uint32 h = 0;
uint32 maj = 0;
uint32 t1 = 0;
uint32 t2 = 0;
uint32 ch = 0;

"Pre-processing:";
"append the bit '1' to the message";

"note that we're using a 32-bit length for now";
"all the op32, op8 etc are _without_ wrap (where applicable) - i.e. wrap is an error";
"they also require left and right to both be the correct type and size";
"also, we have no precedence, it is up to you to bracket things";
"rshift is with zero padding";

pad_bit = 7 minus32 (message_bits mod32 8);
pad_byte = (message_bits plus32 1) rshift32 8;
message[pad_byte] = message[pad_byte] or8 (1 lshift8 pad_bit);

"append k bits '0', where k is the minimum number >= 0 such that the
resulting message length (in bits) is congruent to 448 (mod 512)";

"eq32 and friends return a boolean value (which is not even a bit)";

if (pad_bit eq32 0) {
pad_bit = 7;
pad_byte = pad_byte plus32 1;
} else {
pad_bit = pad_bit minus32 1;
}

"bor is like C || (i.e. RHS is only executed if LHS is false)";

"448/8 = 56";
while (((pad_byte mod32 512) ne32 56) bor (pad_bit ne32 7)) {
message[pad_byte] = message[pad_byte] and8 (not8 (1 lshift8 pad_bit));
if (pad_bit eq32 0) {
pad_bit = 7;
pad_byte = pad_byte plus32 1;
} else {
pad_bit = pad_bit minus32 1;
}
}

"append length of message (before pre-processing), in bits, as 64-bit big-endian integer";

message[pad_byte] = 0;
message[pad_byte plus32 1] = 0;
message[pad_byte plus32 2] = 0;
message[pad_byte plus32 3] = 0;

message[pad_byte plus32 7] = mask32to8 message_bits;
tmp = message_bits rshift32 8;
message[pad_byte plus32 6] = mask32to8 message_bits;
tmp = message_bits rshift32 8;
message[pad_byte plus32 5] = mask32to8 message_bits;
tmp = message_bits rshift32 8;
message[pad_byte plus32 4] = mask32to8 message_bits;

"for each chunk (we only have one, so don't bother with the loop for now)";

" break chunk into sixteen 32-bit big-endian words w[0..15]";
tmp = 0;
while(tmp ne32 16) {
tmp2 = tmp lshift32 2;
w[tmp] = ((widen8to32 message[tmp2]) lshift32 24)
plus32 ((widen8to32 message[tmp2 plus32 1]) lshift32 16)
plus32 ((widen8to32 message[tmp2 plus32 2]) lshift32 8)
plus32 (widen8to32 message[tmp2 plus32 3]);
tmp = tmp plus32 1;
}

" Extend the sixteen 32-bit words into sixty-four 32-bit words";
i = 16;
while(i ne32 64) {
s0 = (w[i minus32 15] rrotate32 7) xor32 (w[i minus32 15] rrotate32 18) xor32 (w[i minus32 15] rshift32 3);
s1 = (w[i minus32 2] rrotate32 17) xor32 (w[i minus32 2] rrotate32 19) xor32 (w[i minus32 2] rshift32 10);
w[i] = w[i minus32 16] plus32 s0 plus32 w[i minus32 7] plus32 s1;
}

" Initialize hash value for this chunk:";

a = h0;
b = h1;
c = h2;
d = h3;
e = h4;
f = h5;
g = h6;
h = h7;

" Main loop:";

i = 0;
while(i ne32 64) {
s0 = (a rrotate32 2) xor32 (a rrotate32 13) xor32 (a rrotate32 22);
maj = (a and32 b) xor32 (a and32 c) xor32 (b and32 c);
t2 = s0 plus32 maj;
s1 = (e rrotate32 6) xor32 (e rrotate32 11) xor32 (e rrotate32 25);
ch = (e and32 f) xor32 ((not32 e) and32 g);
t1 = h plus32 s1 plus32 ch plus32 k[i] plus32 w[i];
h = g;
g = f;
f = e;
e = d plus32 t1;
d = c;
c = b;
b = a;
a = t1 plus32 t2;
}

" Add this chunk's hash to result so far:";

h0 = h0 plus32 a;
h1 = h1 plus32 b;
h2 = h2 plus32 c;
h3 = h3 plus32 d;
h4 = h4 plus32 e;
h5 = h5 plus32 f;
h6 = h6 plus32 g;
h7 = h7 plus32 h;

"end of outer loop (when we do it)";

"Obviously I can also do this part, but right now I am going cross-eyed";
"Produce the final hash value (big-endian):
digest = hash = h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7";

Notice that every operator specifies the input and output sizes. For example plus32 means add two 32-bit numbers to get a 32-bit result, with wrap being an error (this probably means, by the way, that the last few plus32s should be plus32_with_overflow, since SHA-256 actually expects overflow for these operations). So far we only deal with unsigned quantities; some “overflows” are actually expected when dealing with negative numbers, so that would have to be specified differently. Also, I didn’t deal with the size of constants, because I wasn’t sure about a good notation, though I am leaning towards 23_8 to mean an 8-bit representation of 23 (subscripted, like in TeX).

Because Stupid really is stupid, it should be very easy to write static analysis code for it, enabling checks to be omitted sometimes – for example, the fact that we only subtract 1 from pad_bit if pad_bit is non-zero means that we would not have to check for underflow in that case.

Anyway, I’m a bit bemused after writing a lot of rather repetitive code for the compiler, so I think I’ll wait for reactions before commenting further – but it does seem to me that this is a project worth pursuing. The compiler itself, whilst somewhat crude, particularly since it doesn’t yet do most of the checks I suggest should be there, is pretty small and easily understood: less than 1,500 lines of Perl and YAPP. I won’t bore you with the details, but if you want a peek, here’s a tarball.

8 Jan 2010

TLS Renegotiation Fix: Nearly There

Filed under: Crypto,General,Open Source,Open Standards,Security — Ben @ 13:19

Finally, after a lot of discussion, the IESG have approved the latest draft of the TLS renegotation fix. It is possible it’ll still change before an RFC number is assigned, but it seems unlikely to me.

But that doesn’t mean there isn’t plenty of work left to do. Now everyone has to implement it (in fact, many have already done so, including tracking the various changes as the I-D was updated), interop test with each other and roll out to clients and servers. And even then it isn’t over, since until clients are (mostly) universally updated, servers will have to allow old clients to connect and clients may have to be prepared to connect to old servers. In the case of a new server and an old client, it doesn’t hugely matter that the client has not been updated because it is defended by the server, which should not allow a renegotiation to occur if the client is old. However, in the case of an old server and a new client, or an old server and an old client, then there’s a problem – the client could be attacked. Obviously a new client can detect it is talking to an old server, and decline to play, but for some transitional period, it seems likely that clients will have to tolerate this, perhaps warning their user.

We could summarise the situation like this:

Client
Old New
Server Old vulnerable vulnerable but client is aware, client should decline or at least warn
New not vulnerable if renegotiation is forbidden, client is unaware not vulnerable, everyone is aware

7 Apr 2009

CodeCon Is Back!

Filed under: General,Open Source,Programming — Ben @ 10:43

Unfortunately, I can’t be there, but the lineup looks great. The bio-hacking track looks particularly fun.

Not long to go now, less than two weeks. Sign up!

28 Mar 2009

More Banking Stupidity: Phished by Visa

Filed under: General,Rants,Security — Ben @ 14:21

Not content with destroying the world’s economies, the banking industry is also bent on ruining us individually, it seems. Take a look at Verified By Visa. Allegedly this protects cardholders – by training them to expect a process in which there’s absolutely no way to know whether you are being phished or not. Even more astonishing is that this is seen as a benefit!

Frame inline displays the VbV authentication page in
the merchant’s main window with the merchant’s
header. Therefore, VbV is seen as a natural part of the
purchase process. It is recommended that the top
frame include the merchant’s standard branding in a
short and concise manner and keep the cardholder
within the same look and feel of the checkout process.

Or, in other words

Please ensure that there is absolutely no way for your customer to know whether we are showing the form or you are. In fact, please train your customer to give their “Verified by Visa” password to anyone who asks for it.

Craziness. But it gets better – obviously not everyone is pre-enrolled in this stupid scheme, so they also allow for enrolment using the same inline flow. Now the phishers have the opportunity to also get information that will allow them to identify themselves to the bank as you. Yes, Visa have provided a very nicely tailored and packaged identity theft scheme. But, best of all, rather like Chip and PIN, they push all blame for their failures on to the customer

Verified by Visa helps protect you from fraudulent claims from cardholders – that they didn’t take part in, or authorise, a payment. Once you are up and running with Verified by Visa, you are no longer liable for chargebacks of this nature.

In other words, if the phisher uses your Verified by Visa password, then it’s going to be your fault – obviously the only way they could know it is if you told them! If you claim it was not you, then you are guilty of fraud; it says so, right there.

16 Feb 2009

Identification Is Not Security

Filed under: General,Rants,Security — Ben @ 18:51

The New York Times have an article about the Stanford Clean Slate project. It concludes

Proving identity is likely to remain remarkably difficult in a world where it is trivial to take over someone’s computer from half a world away and operate it as your own. As long as that remains true, building a completely trustable system will remain virtually impossible.

As far as I can tell, Clean Slate itself doesn’t make this stupid claim, the NYT decided to add it for themselves. But why do they think identification is relevant? Possibly because we are surrounded by the same spurious claim. For example…

  • We need ID cards because they will prevent terrorism.
  • We shouldn’t run software on our Windows box that isn’t signed because that’ll prevent malware.
  • We should only connect to web servers that have certificates from well-known CAs because only they can be trusted.

But…

  • The guys who crashed the planes were all carrying ID. Didn’t help.
  • The guys who blew up the train in Spain were all carrying ID. Didn’t help.
  • People get hacked via their browser all the time. Did signing it help?
  • What does it take to sign code? A certificate, issued by a CA…
  • What does it take to get a certificate? Not much … proof that you own a domain, in fact. So, I can trust the server because the guy that owns it can afford to pay Joker $10? And I can trust the code he signed? Why?

Nope. Security is not about knowing who gave you the code that ate your lunch – security is about having a system that is robust against code that you don’t trust. The identity of the author of that code should be irrelevant.

9 Jan 2009

CodeCon Comes Back

Filed under: General — Ben @ 11:20

After an absence of several years (the last one was in 2006), CodeCon is back!

CodeCon has long been one of my favourite low-cost conferences, focusing on stuff that actually works. And I hear this year it won’t be in a nightclub, which may be bad from a beer point of view, but is good from the sticky floor perspective…

Check out the CFP.

6 Jan 2009

The Atheist Bus

Filed under: General — Ben @ 19:37

I’m pretty amused by this very successful appeal, which has put on buses all over England

Atheist bus advert

and I’m really impressed that they raised £135,000 (against a target of £5,500) to do it!

As an atheist humanist myself, I can only disagree with them in one respect: the use of the word “probably”. The website they’re reacting to, after all, doesn’t leave much room for doubt. They don’t say “You will probably be condemned to everlasting separation from God and then you might spend all eternity in torment in hell”, do they?

(via Boing Boing)

4 Dec 2008

What Does “Unphishable” Mean?

Filed under: Crypto,General,Identity Management,Security — Ben @ 7:36

About a week ago, I posted about password usability. Somewhere in there I claimed that if passwords were unphishable, then you could use the same password everywhere.

Since then, I have had a steady stream of people saying I am obviously wrong. So, let’s take them one at a time…

…as long as the password I type in there is send over (encrypted of course) to the backend and recoverable there as plaintext password, you have to trust it is stored/used securely there.

This does assume that everywhere you use it actually secures your password, and doesn’t just store it as plain text.

…there are many attacks to finding your password — an administrator at Facebook could look it up in the password database…

OK, OK, that’s three, but they say the same thing. This one is easily dismissed – obviously if we are using an unphishable protocol the password is not sent at all and it is not kept in Facebook’s database. If it were, then clearly a phisher would easily be able to get your password once he tricked you into typing it in on his site.

Even with perfect or near-perfect hardware, somebody will always find a way to game the system via social engineering.

Don’t forget that we are in a utopia here where users only ever type their passwords into the unphishable password gadget. I think it’s pretty reasonable to assume that if we’ve trained users to do that, we have also trained them to never reveal their password at all anywhere else, including in person, over the phone, via video-conference or during a teledildonics session. Yes, this does mean changing the world, but … utopia, remember?

Mythical crypto-gadgets simply won’t save the day. All somebody has to do is replace your crypto-gadget with an identical-looking crypto-gadget of their own making and now it becomes the new “password” input field that is so phishable

This seems to be more a criticism of the idea that we can ever get to the password utopia, which is a fair comment, but doesn’t make my argument incorrect. I will offer, though, hardware devices (such as the one I wrote about recently) as an answer. Clearly much harder to replace with “an identical-looking crypto-gadget of their own making” than software.

There is also the notion of the “trusted path” which, if anyone ever figures out how to implement it in software, would make such a replacement equally difficult even if we don’t use hardware. However, if you read the Red Pill/Blue Pill paper, you’ll see I don’t hold out much hope for this.

you could have a weak password that the hacker could attack via brute force

This one is actually correct! Yes, it’s true that an unphishable password must be strong. Clearly no system relying solely on a password can defend against an attacker guessing the password and seeing if it works. The only defence against this is to make it infeasible for the attacker to guess it in reasonable time. So, yes, you must use a strong password. Sorry about that.

The primary reason one should not use the same password everywhere is that once that password is discovered at one location, then it can be reused at other locations

I feel that we’re veering off into philosophy slightly with this one, particularly since, in the same post, Conor says

I also look forward to being able to login once at the start of my day and maintain that state in a reasonably secure fashion for the entire day without having to re-authenticate every few minutes

which is an interesting piece of doublethink – surely if whatever provides this miraculous experience (one I also look forward to) is compromised then you are just as screwed – so wouldn’t the argument be that I should have a large number of these things, which I have to log into separately?

Nevertheless, I will have a go at it. In our utopia, remember, our password is only ever revealed to trusted widgets (whether hardware, software or something else is immaterial). This means, of course, that the password can’t be “discovered at one location” – this is the nature of unphishability! Therefore, I claim that the criticism is a priori invalid. Isn’t logic wonderful?

I don’t follow.

Because I can’t be fooled into divulging some credential where I shouldn’t means that it is appropriate that I use it everywhere? Are there not other attack vectors that would drool at the thought?

I include this for completeness. Clearly, this is a rhetorical device. When Paul comes up with an actual attack, rather than suggesting that there surely must be one, I shall respond.

Finally…

Conversely, that the fact that I can use the same credential everywhere is somehow a necessary aspect of ‘unphishability’?

Indeed it is. If it were unsafe to use the same credential everywhere, then the protocol must somehow reveal something to the other side that can be used to impersonate you (generally known as a “password equivalent” – for example, HTTP Digest Auth enrollment reveals a password equivalent that is not your password). This would make the protocol phishable. Therefore, it is a necessary requirement that an unphishable protocol allows you to use the same password everywhere.

Even more finally, for those whose heads exploded at the notion that I can log in with a password without ever revealing the password or a password equivalent, I offer you SRP.

10 Aug 2008

NYT Doesn’t Quite Get It, Hilarity From OpenID

Filed under: General,Identity Management,Privacy,Security — Ben @ 13:31

The New York Times’ Randy Stross has a piece about passwords and what a bad idea they are (sorry, behind a loginwall). So far, so good (and I’ll admit to bias here: I was interviewed for this piece, and whilst there’s no attribution, what I was saying is clearly reflected in the article), but Stross weirdly focuses on OpenID as the continuing cause of our password woes, because, he says, it is blocking the deployment of information cards, which will save us all.

Now, I am no fan of OpenID, but Stross is dead wrong here. OpenID says nothing about how you log in. It is not OpenID’s fault that the login is generally done with a password – that blame we must all accept collectively.

And whilst I firmly believe that the only way out of this mess is strong authentication, information cards are hardly the be-all and end-all of that game. They certainly have a way to go in usability before they’re going to be taking the world by storm. Don’t blame OpenID for that.

In the meantime, Scott Kveton, chair of the OpenID Foundation board, reacts:

The OpenID community has identified two key issues it needs to address in 2008 that Randy mentioned in his column; security and usability.

I just have to giggle. I mean, apart from those two minor issues, OpenID is pretty good, right? He forgot to mention privacy, though.

9 Jul 2008

FreeBMD Gets New Boots

Filed under: General — Ben @ 22:04

FreeBMD recently moved its servers from one of The Bunker‘s data centres to the other.

Our marvellous sysadmin posted some pictures. It never fails to amaze me how much tin it takes to keep that crazy idea running.

22 Jun 2008

Can Haz Blogroll

Filed under: General — Ben @ 13:41

I’ve been meaning to put one of these up for years. Literally.

Anyway, I finally got around to it.

21 May 2008

Modern Mail Clients

Filed under: General,Lazyweb — Ben @ 12:54

Way back when, I used to use Pine to read my email. After it had marked everything I read as unread again once too many times (admittedly not entirely its fault, but it did leave everything ’til the last minute), I switched to, well, something else. I don’t remember exactly what. But after a long series of experiments I ended up with Thunderbird, which I mostly like – or at least hate less than all other clients I’ve tried.

But, it really doesn’t handle big mailboxes very well. I’m lazy when it comes to tidying up, as my wife will testify, and so I tend to find myself with 100,000 read messages lying around and a similar number unread.

Thunderbird can read mailboxes like that (which is an improvement – earlier versions couldn’t), but it really doesn’t handle deleting them very well. Select even a small number, like a thousand or so, and hit delete, and watch Thunderbird go away for a very long sleep.

In the end I had to go back to Pine to tidy my mailbox. Incidentally, I tried mutt, but it couldn’t handle more than a few thousand messages at a time. Pine seems to manage whatever I throw at it, though its UI can only be described as arcane.

So, my question to the lazyweb: is there an answer to this? A modern open source client that can do graphical stuff, is nice to use and can handle big IMAP mailboxes? Or is my Thunderbird/Pine hybrid as good as it gets?

29 Apr 2008

Fun With FreeBSD and gmirror

Filed under: General — Ben @ 20:49

A while ago I moved a lot of my stuff from a very ancient box to a quite new one. For some reason the new one has three disks in it, and so we (that is the ultra-patient Lemon and me) decided to mirror two of them. Not really having need of a third enormous disk it was left spare for now (possibly this was unwise in retrospect).

Since I run FreeBSD on my server boxes, we used gmirror. Being adventurous, we also decided we were going to mirror the root partition – slightly nerve-wracking, because when FreeBSD boots, it boots from the (unmirrored) root partition. But the theory is this works fine with mirrored disks.

So, we had three disks, which FreeBSD saw as ad4 (ata2-master), ad5 (ata2-slave) and ad6(ata3-master). We figured that ad4 and ad6 should be the mirrors, since they are on different controllers. So that’s what we did and it all works fine.

Fast forward several months and its time to upgrade the kernel. We’re moving from FreeBSD 6.x to FreeBSD 7.x, so its slightly nerve-wracking, but I do what I always do, which is to build the system from source, following the time-honoured

1. `cd /usr/src' (or to the directory containing your source tree).
2. `make buildworld'
3. `make buildkernel KERNCONF=YOUR_KERNEL_HERE' (default is GENERIC).
4. `make installkernel KERNCONF=YOUR_KERNEL_HERE' (default is GENERIC).
[steps 3. & 4. can be combined by using the "kernel" target]
5. `reboot' (in single user mode: boot -s from the loader prompt).
6. `mergemaster -p'
7. `make installworld'
8. `make delete-old'
9. `mergemaster'
10. `reboot'

(btw, mergemaster -U is good medicine for step 9). Everything goes fine. Until I realise uname -a is still reporting we’re running FreeBSD 6.2! WTF?

Well, to make a long story short, for some reason the BIOS thinks ata2-slave(i.e. our “spare” disk!) is the “first” disk, and so this is what it boots off. Presumably during the build the system got installed on “disk 1”, whichever that happened to be (we didn’t actually do the base build ourselves). Then when mirroring was set up, our “disk 1” didn’t match the BIOSes, and confusion reigned.

The happy ending to this story, though, is that

  • You can run FreeBSD 7 userland on a FreeBSD 6.2 kernel!
  • Switching the BIOS to boot off “disk 2” (i.e. ata2-master) made everything work as it should

I am recording this episode not because I think it is very interesting but because I hope it’ll be useful to someone else.

2 Feb 2008

Eggonomics

Filed under: General — Ben @ 19:14

So, Egg are shedding customers they think are a bad risk. I wonder what happens if this becomes fashionable? Since the worst offenders are in the habit of moving their debt from one free offer to the next, presumably whoever moves last will end up with the credit card equivalent of the sub-prime mortgage disaster. Buy Egg, sell HSBC!

Next Page »

Powered by WordPress