As discussed in my previous article, if a disk fails then a ZFS system will just carry on as if nothing has happened. Of course, we’d like to restore the system to its former redundant glory, so here’s how…
Once more, we simulate a failure by removing the primary disk, but this time we replace it with a new, unformatted disk. (I guess that if the new disk were already bootable, you’d need to fix that first.)
Let’s assume we’re several years down the line and no longer have any documentation at all. First off, find your disks by inspecting dmesg. As before we have ad4 and ad8. ad4 is the new disk.
# diskinfo -v ad4 ad8
ad4
        512             # sectorsize
        500107862016    # mediasize in bytes (466G)
        976773168       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        969021          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        S20BJ9AB212006  # Disk ident.
ad8
        512             # sectorsize
        500107862016    # mediasize in bytes (466G)
        976773168       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        969021          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        9VMYLC5V        # Disk ident.
This time they are conveniently exactly the same size, despite having different manufacturers (Samsung and Seagate respectively). We already know from the first article in this series that we can deal with disks that don’t look the same, and in any case only 250GB is currently replicated. So, let’s partition the new disk as the old one…
# gpart show ad8
=>       34  976773101  ad8  GPT  (466G)
         34        128    1  freebsd-boot  (64K)
        162    4194304    2  freebsd-swap  (2.0G)
    4194466  484202669    3  freebsd-zfs  (231G)
  488397135  488376000    4  freebsd-zfs  (233G)
# gpart show -l ad8
=>       34  976773101  ad8  GPT  (466G)
         34        128    1  (null)  (64K)
        162    4194304    2  swap8  (2.0G)
    4194466  484202669    3  system8  (231G)
  488397135  488376000    4  scratch8  (233G)
# gpart create -s gpt ad4
ad4 created
# gpart add -b 34 -s 128 -t freebsd-boot ad4
ad4p1 added
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad4
bootcode written to ad4
# gpart add -s 4194304 -t freebsd-swap -l swap4 ad4
ad4p2 added
# gpart add -s 484202669 -t freebsd-zfs -l system4 ad4
ad4p3 added
# gpart add -t freebsd-zfs -l scratch4 ad4
ad4p4 added
# gpart show ad4
=>       34  976773101  ad4  GPT  (466G)
         34        128    1  freebsd-boot  (64K)
        162    4194304    2  freebsd-swap  (2.0G)
    4194466  484202669    3  freebsd-zfs  (231G)
  488397135  488376000    4  freebsd-zfs  (233G)
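The -b and -s values passed to gpart add weren’t plucked from thin air: they reproduce the layout that gpart show reported for ad8. As a quick sanity check (the numbers below are sector counts copied from that output), each partition should begin exactly where the previous one ends; the last partition gets no -s at all, so it simply takes the rest of the disk:

```shell
# sector arithmetic copied from `gpart show ad8`:
# each partition begins where the previous one ends
boot_start=34;     boot_size=128
swap_start=162;    swap_size=4194304
sys_start=4194466; sys_size=484202669
echo $((boot_start + boot_size))  # start of swap:    162
echo $((swap_start + swap_size))  # start of system:  4194466
echo $((sys_start + sys_size))    # start of scratch: 488397135
```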
Now we’re ready to reattach the disk to the various filesystems.
First the swap. Since we can’t remove the dead disk from the gmirror setup, we first tell gmirror to forget it, then insert the new swap partition in its place.
# gmirror forget swap
# gmirror insert -h -p 1 swap /dev/gpt/swap4
# gmirror status
       Name    Status  Components
mirror/swap  DEGRADED  gpt/swap8
                       gpt/swap4 (29%)
and after a while
# gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  gpt/swap8
                       gpt/swap4
Next the main filesystem. In this case, since the new device has the same name as the old one, we can just write
# zpool replace system /dev/gpt/system4
If you boot from pool 'system', you may need to update
boot code on newly attached disk '/dev/gpt/system4'.

Assuming you use GPT partitioning and 'da0' is your new boot disk
you may use the following command:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
Once more, we’ve already written the boot code, so there’s no need to do it again. Note that this command took a little while to return; don’t be alarmed!
# zpool status
  pool: scratch
 state: ONLINE
 scrub: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        scratch         ONLINE       0     0     0
          gpt/scratch8  ONLINE       0     0     0

errors: No known data errors

  pool: system
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 9.77% done, 0h2m to go
config:

        NAME                   STATE     READ WRITE CKSUM
        system                 DEGRADED     0     0     0
          mirror               DEGRADED     0     0     0
            gpt/system8        ONLINE       0     0     0
            replacing          DEGRADED     0     0     0
              gpt/system4/old  UNAVAIL      0     0     0  cannot open
              gpt/system4      ONLINE       0     0     0  221M resilvered

errors: No known data errors
and after not very long
# zpool status
  pool: scratch
 state: ONLINE
 scrub: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        scratch         ONLINE       0     0     0
          gpt/scratch8  ONLINE       0     0     0

errors: No known data errors

  pool: system
 state: ONLINE
 scrub: resilver completed after 0h1m with 0 errors on Sun Mar 27 13:04:02 2011
config:

        NAME             STATE     READ WRITE CKSUM
        system           ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            gpt/system8  ONLINE       0     0     0
            gpt/system4  ONLINE       0     0     0  2.21G resilvered

errors: No known data errors
And we’re all good, back to where we were before. Reboot to check everything is fine.
Note, by the way, that all of this was done on a live system in multi-user mode. Apart from the occasional reboot there was no loss of service whatsoever.
Also, because the primary disk didn’t really fail, if I wanted I could put it in my other machine and end up with a working replicated system there without any need for setup.
There is one niggling question remaining: I started off with one 250 GB and one 500 GB disk. I now have two 500 GB disks, which means the non-redundant scratch filesystem I had before could now become redundant. Or the scratch partitions could become part of the system pool. Or they could become a bigger non-redundant scratch filesystem.
In the end I decided to do the simplest thing, which is to make the scratch partitions part of the larger system pool. If I ever need to rearrange things, that is always possible, either with the help of an additional disk or even, with less safety, by taking one of the disks out of the pools and rearranging onto that (see a description of doing this kind of thing on FreeNAS).
So, to make them part of the existing pool, first destroy the scratch filesystem (if I’d already used it I’d have had to copy its contents somewhere first, but since I haven’t I can just blow it away). Since we mounted the pool directly, we destroy it with zpool:
# zpool destroy scratch
(and we can confirm it has gone with zpool list and zfs list). Just for naming sanity, I relabel the two scratch partitions:
# gpart modify -i 4 -l system8.p2 ad8
# gpart modify -i 4 -l system4.p2 ad4
and since the new labels aren’t reflected in /dev/gpt, reboot. Then, finally,
# zpool add system mirror /dev/gpt/system4.p2 /dev/gpt/system8.p2
# zpool list
NAME     SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
system   463G  2.21G   461G     0%  ONLINE  -
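That 463G figure is about what you’d expect: the pool now consists of two mirrored vdevs, one on the roughly 231 GB system partitions and one on the roughly 233 GB former scratch partitions, and a pool’s capacity is the sum of its vdevs’ sizes (give or take rounding and metadata overhead):

```shell
# pool capacity is the sum of the vdev sizes;
# the GiB figures are taken from the gpart output above
echo $((231 + 233))  # GiB, close to the 463G that zpool list reports
```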