Ben Laurie blathering

Why Online Backups Suck

Cory Doctorow points to some dude called Michael Arrington who talks about how the world needs a better online backup product. Clearly he hasn’t done the sums.

My default policy is to back up everything – in my experience, trying to choose what to back up is a great way to miss something vital. So, let’s say I have a rather modest 100 GB disk, and I have the usual ADSL link, i.e. 128 kb/s up. How long would it take me to do the first backup? 75 days. Really.

So, assume I get smart and can winnow that down to only 10% of my disk: then it still takes 7.5 days. Assume further that, by some miracle, only 10% of those files change each day: then each daily backup takes 0.75 of a day, or 18 hours. That is, my uplink is maxed out most of the time.

Clearly this sucks.
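For anyone who wants to redo the sums, here is a rough sketch in Python. It assumes binary units (1 GB = 2^30 bytes, 1 kb = 1024 bits) and a link that runs flat out with no protocol overhead; the function name `upload_days` is made up for illustration. With decimal units the first figure comes out nearer 72 days, so treat the numbers as ballpark.

```python
SECONDS_PER_DAY = 86400


def upload_days(size_gb: float, uplink_kbit_s: float) -> float:
    """Days needed to push size_gb through an uplink of uplink_kbit_s.

    Assumes binary units and zero protocol overhead (both are assumptions,
    not something the post specifies).
    """
    bits = size_gb * 2**30 * 8
    seconds = bits / (uplink_kbit_s * 1024)
    return seconds / SECONDS_PER_DAY


full_backup = upload_days(100, 128)   # the whole 100 GB disk
subset = upload_days(10, 128)         # only 10% of the disk
daily_delta = upload_days(1, 128)     # 10% of that subset changing per day

print(f"full disk:   {full_backup:.1f} days")
print(f"10% subset:  {subset:.1f} days")
print(f"daily delta: {daily_delta:.2f} days ({daily_delta * 24:.0f} hours)")
```

Changing the second argument shows how much a faster uplink would help: the time scales inversely with uplink speed, so doubling the link only halves the wait.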

He goes on to suggest that whoever produces this non-viable product should also give the use of 500 GB for $20 a year. I’m not sure where he buys disks, but where I buy them this means I might recover the original investment in, oh, 10 years or so. So long as the rest of the hardware is free and I buy in huge bulk. Great business model.

An interesting question is: will this get any better? In other words, does uplink speed per dollar improve faster than disk size per dollar? I’m sure someone has the historical data for that … let me know!


  1. Ben,

    Two good points (upload speed and disk cost). On upload speed, I think I can be happy with BitTorrent-like-speed backups, even over a period of weeks. I don’t need to access these files until some disaster happens, and I’m happy with this running continually in the background.

    On disk cost, $20 per year is clearly not enough at current storage costs. A higher number would still be acceptable. Also, I think that revenues can be augmented with advertising, to a very limited extent. I could also live with a higher price today as long as it falls over time as storage costs decline.

    What I really want are better tools than exist currently for uploading and managing files. I mention some of these in my post.

    In any event, great feedback.

    Comment by Michael Arrington — 23 Nov 2005 @ 10:36

  2. Allmydata may be of interest to you. Distributed BitTorrent meets Freenet meets rsync scheme.


    Comment by Pablos — 23 Nov 2005 @ 10:55

  3. Brad Templeton has a much better idea: a commodity network-based RAID system, running over a LAN (or WLAN). It doesn’t have to be as fast as you’d normally want a disk to be; just fast enough to make a useful RAID backup system.

    An IDE enclosure that fits an IDE hard drive, with a Gumstix board running some kind of self-assembling LAN-RAID software… it’d be a cool product.

    Comment by Justin Mason — 23 Nov 2005 @ 19:45

  4. I think this is where compression and compare-by-hash come in (Pablos alluded to this). The OS and apps reduce to nothing (unless you’re using Gentoo) and the other stuff can shrink maybe 2x. Since my uplink is a blazing 384 kbps maybe my daily backup could be done in ~8 hours while I’m at work.

    As for LAN-RAID, I think it’s called Zetera.

    Comment by Wes Felter — 28 Nov 2005 @ 23:25
