Crash Recovery with ZFS
Well…
I got caught with my pants down last month. I made a change to my FreeBSD website and rebooted, which I do from time to time. But … it crashed. And not only did it crash, but it trashed the boot code and would not boot at all.
Not for nothing, but (insert whiny voice) this was not my fault!
Regardless, I had to do the laborious rounds of installation, restore, test, restore, test, restore, test, etc. Wash, rinse, repeat. If you are in the tech world, this will not surprise you at all.
It got me thinking about what I could do for boot block recovery next time, and I began to get very deep into ZFS send/receive for saving and restoring ZFS datasets from a backup server.
Here’s how I managed to get it working.
NOTE: Please read carefully and make sure you understand each step. The following code may cause your system to work poorly, or stop working altogether. Do not cut and paste without understanding what the code does.
Recommended reading:
- zfsconcepts(7) - overview of ZFS concepts
- zfsprops(7) - native and user-defined properties of ZFS datasets
- zpool(8) - configure ZFS storage pools
- zfs(8) - configure ZFS datasets
- zfs-send(8) - generate a backup stream of a ZFS dataset
- zfs-receive(8) - create a snapshot from a backup stream
- zfs-snapshot(8) - create snapshots of ZFS datasets
- gpart(8) - control utility for the disk partitioning GEOM class
- if_bridge(4) - network bridge device
- tap(4) - Ethernet tunnel software network interface
For the boot environment version:
- bectl(8) - utility to manage boot environments on ZFS
QEMU Setup
Below, I describe two virtual machines, hostA and hostB. HostA is the source machine, and hostB is the backup machine that will hold the ZFS snapshots.
See Tapping Into Qemu for a more detailed explanation of a very similar QEMU setup.
HostA uses telnet port 4470 for serial access and hostB uses telnet port 4472. For example, use telnet localhost 4470 in a separate window to access the serial port of hostA.
HostA has one disk (vtbd0) and hostB has two: vtbd0 for boot and vtbd1 for storing snapshots.
The VMs are connected via a virtual bridge, with tap12 and tap13 assigned to hostA and hostB, respectively. You can use static addressing by editing /etc/rc.conf, or you can use DHCP if the host computer's network interface is added to the bridge configuration.
The VM configurations below assume the FreeBSD boot ISO is linked to fbsd.iso and that the VM disk images and startup scripts are all in the current directory.
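For reference, the link can be created like this. The installer filename below is an assumption; point the link at whichever FreeBSD ISO you actually downloaded:

```shell
# Link the downloaded installer to the name the VM scripts expect.
# The ISO filename here is only an example.
ln -sf FreeBSD-14.3-RELEASE-amd64-disc1.iso fbsd.iso
ls -l fbsd.iso
```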
Create the QEMU disk image for hostA:
jpb@jpblt:~ $ qemu-img create -f qcow2 -o preallocation=full \
hostA.qcow2 8G
Create the QEMU disk images for hostB:
jpb@jpblt:~ $ qemu-img create -f qcow2 -o preallocation=full \
hostB.qcow2 8G
jpb@jpblt:~ $ qemu-img create -f qcow2 -o preallocation=full \
hostB_bigDisk.qcow2 32G
On both installations, add the default user to the wheel group.
Note: you may find it convenient to put the QEMU commands below in their own shell scripts, hostA.sh and hostB.sh.
jpb@jpblt:~ $ sudo /usr/local/bin/qemu-system-x86_64 -monitor none \
-serial telnet:localhost:4470,server=on,wait=off \
-cpu qemu64 \
-vga cirrus \
-m 8192 \
-boot order=cd,menu=on \
-cdrom ./fbsd.iso \
-drive if=none,id=drive0,cache=none,aio=threads,format=qcow2,file=./hostA.qcow2 \
-device virtio-blk,drive=drive0 \
-netdev tap,id=nd0,ifname=tap12,script=no,downscript=no \
-device e1000,netdev=nd0,mac=02:49:be:ad:ba:be \
-name "hostA" &
In the hostB install below, select only the first disk (vtbd0) for installation.
jpb@jpblt:~ $ sudo /usr/local/bin/qemu-system-x86_64 -monitor none \
-serial telnet:localhost:4472,server=on,wait=off \
-cpu qemu64 \
-vga cirrus \
-m 8192 \
-boot order=cd,menu=on \
-cdrom ./fbsd.iso \
-drive if=none,id=drive0,cache=none,aio=threads,format=qcow2,file=./hostB.qcow2 \
-device virtio-blk,drive=drive0 \
-drive if=none,id=drive1,cache=none,aio=threads,format=qcow2,file=./hostB_bigDisk.qcow2 \
-device virtio-blk,drive=drive1 \
-netdev tap,id=nd0,ifname=tap13,script=no,downscript=no \
-device e1000,netdev=nd0,mac=02:49:be:ad:ba:dd \
-name "hostB" &
After the installation completes, restart hostB and set up the second disk (hostB_bigDisk.qcow2).
As root, perform the following:
root@hostB:~ # gpart create -s gpt vtbd1
vtbd1 created
root@hostB:~ #
root@hostB:~ # gpart add -a 1m -t freebsd-zfs vtbd1
vtbd1p1 added
root@hostB:~ #
root@hostB:~ # zpool create hostA vtbd1p1
root@hostB:~ #
root@hostB:~ # zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
hostA 31.5G 408K 31.5G - - 0% 0% 1.00x ONLINE -
zroot 5.50G 924M 4.60G - - 0% 16% 1.00x ONLINE -
root@hostB:~ #
The big disk is now ready for use.
In the text below hostA is at 192.168.1.134 and hostB is at 192.168.1.151.
For both machines, enable the serial console in /boot/loader.conf.
On hostA:
root@hostA:~ # sysrc -f /boot/loader.conf console="comconsole"
And on hostB:
root@hostB:~ # sysrc -f /boot/loader.conf console="comconsole"
On reboot:
For hostA, open a terminal window and enter:
jpb@jpblt:~ $ telnet localhost 4470
For hostB, open a terminal window and enter:
jpb@jpblt:~ $ telnet localhost 4472
(Later, when restoring to hostC, use port 4474.)
Both machines should be installed with a very recent version of FreeBSD; the examples here use FreeBSD 14.3. Add a user to the wheel group on both machines. The examples use the name “hostA” for the zpool on hostB that will hold the snapshots for hostA, but you can name it whatever you want.
On hostA add an extra user or two, add some packages, and create some test data, all so that you can verify it once the system is recovered.
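One quick way to create verifiable test data is to drop a random file in a home directory and record its checksum. This is only a sketch with arbitrary paths; FreeBSD base ships sha256(1), while GNU systems use sha256sum:

```shell
# Create a random blob and record its checksum so the restore can be verified.
D="${D:-$HOME/testdata}"              # arbitrary location
mkdir -p "$D"
dd if=/dev/urandom of="$D/blob" bs=1024 count=1024 2>/dev/null
# FreeBSD base has sha256(1); fall back to sha256sum elsewhere.
{ command -v sha256 >/dev/null 2>&1 && sha256 "$D/blob" || sha256sum "$D/blob"; } > "$D/blob.sum"
cat "$D/blob.sum"
```

After the recovery, recompute the checksum and compare it against blob.sum.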
You will also want to set up hostB to allow the root user on hostA to log in without a password.
I usually do this by creating a separate sshd config file:
/etc/ssh/sshd_config.2223
with these changes:
root@hostB:/etc/ssh # diff sshd_config sshd_config.2223
16c16
< #Port 22
---
> Port 2223
35c35
< #PermitRootLogin no
---
> PermitRootLogin yes
root@hostB:/etc/ssh #
and then run this command to start a separate sshd process on port 2223:
root@hostB:~ # /usr/sbin/sshd -f /etc/ssh/sshd_config.2223
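If you would rather not type root's password for every transfer, an SSH key can be installed instead. This is a sketch; the address and port match the examples in this article:

```shell
# Sketch: key-based login for root on hostB, so zfs send/receive over ssh
# needs no password. Run as the sending user on hostA.
KEY="${KEY:-$HOME/.ssh/id_ed25519}"
mkdir -p "$(dirname "$KEY")"
test -f "$KEY" || ssh-keygen -t ed25519 -N "" -f "$KEY"
cat "$KEY.pub"
# Then append the public key to hostB's authorized_keys, e.g.:
#   ssh -p 2223 root@192.168.1.151 'mkdir -p .ssh && cat >> .ssh/authorized_keys' < "$KEY.pub"
```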
We are now ready to test crash recovery with ZFS!
The Easy Way(TM) with bectl(8)
ZFS supports boot environments: ZFS datasets that function as bootable clones of the host's root datasets.
Follow the procedure below to test with ZFS boot environments:
First, create a "deep" boot environment. See bectl(8) for details on
"deep" vs. "shallow" boot environments.
root@hostA:~ # bectl create -r 2025-08-08-deep
Next, take a snapshot of the top level dataset, here zroot.
You can name the snapshot whatever you want, but having a date stamp is helpful
for retrieving a specific one later.
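The date stamp can be generated rather than typed by hand; a small sketch, with the "-deep" suffix taken from the example in this article:

```shell
# Build a date-stamped snapshot name like zroot@2025-08-08-deep.
SNAP="zroot@$(date +%Y-%m-%d)-deep"
echo "$SNAP"
# zfs snapshot -r "$SNAP"    # run this on the real system
```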
root@hostA:~ # zfs snapshot -r zroot@2025-08-08-deep
root@hostA:~ # zfs list -r
NAME USED AVAIL REFER MOUNTPOINT
zroot 1010M 4.34G 96K /zroot
zroot/ROOT 1007M 4.34G 96K none
zroot/ROOT/2025-08-08-deep 8K 4.34G 1006M / <<<<<<< NOTE
zroot/ROOT/default 1007M 4.34G 1006M /
zroot/home 484K 4.34G 96K /home
zroot/home/han 128K 4.34G 128K /home/han
zroot/home/jpb 128K 4.34G 128K /home/jpb
zroot/home/luke 132K 4.34G 132K /home/luke
zroot/tmp 96K 4.34G 96K /tmp
zroot/usr 288K 4.34G 96K /usr
zroot/usr/ports 96K 4.34G 96K /usr/ports
zroot/usr/src 96K 4.34G 96K /usr/src
zroot/var 716K 4.34G 96K /var
zroot/var/audit 96K 4.34G 96K /var/audit
zroot/var/crash 96K 4.34G 96K /var/crash
zroot/var/log 212K 4.34G 152K /var/log
zroot/var/mail 112K 4.34G 112K /var/mail
zroot/var/tmp 104K 4.34G 104K /var/tmp
root@hostA:~ #
Listing the snapshot shows the included "deep" environment:
root@hostA:~ # zfs list -t snap
NAME USED AVAIL REFER MOUNTPOINT
zroot@2025-08-08-deep 0B - 96K - <<<< NOTE
zroot/ROOT@2025-08-08-deep 0B - 96K -
zroot/ROOT/2025-08-08-deep@2025-08-08-deep 0B - 1006M -
zroot/ROOT/default@2025-08-08-14:30:56-0 0B - 1006M -
zroot/ROOT/default@2025-08-08-deep 0B - 1006M -
zroot/home@2025-08-08-deep 0B - 96K -
zroot/home/han@2025-08-08-deep 0B - 128K -
zroot/home/jpb@2025-08-08-deep 0B - 128K -
zroot/home/luke@2025-08-08-deep 0B - 132K -
zroot/tmp@2025-08-08-deep 0B - 96K -
zroot/usr@2025-08-08-deep 0B - 96K -
zroot/usr/ports@2025-08-08-deep 0B - 96K -
zroot/usr/src@2025-08-08-deep 0B - 96K -
zroot/var@2025-08-08-deep 0B - 96K -
zroot/var/audit@2025-08-08-deep 0B - 96K -
zroot/var/crash@2025-08-08-deep 0B - 96K -
zroot/var/log@2025-08-08-deep 60K - 152K -
zroot/var/mail@2025-08-08-deep 0B - 112K -
zroot/var/tmp@2025-08-08-deep 0B - 104K -
root@hostA:~ #
On hostA we have created a boot environment and a snapshot of the entire “deep” environment.
Now save it to hostB. Note that on the ZFS receive, we name the incoming dataset “hostA” to avoid confusion.
On hostA:
root@hostA:~ # zfs send -R zroot@2025-08-08-deep | \
ssh -p 2223 root@192.168.1.151 zfs receive -Fu hostA <<<<< NOTE NAMING
The authenticity of host '[hostb.attlocal.net]:2223 ([2600:1700:3901:4940:49:beff:fead:badd]:2223)' can't be established.
ED25519 key fingerprint is SHA256:EdgJObZFG/hzdwQ1J03tQ/Dncq/RpojPzKdbz71NOt8.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[hostb.attlocal.net]:2223' (ED25519) to the list of known hosts.
receiving full stream of zroot@2025-08-08-deep into hostA@2025-08-08-deep
received 46.0K stream in 0.21 seconds (217K/sec)
receiving full stream of zroot/ROOT@2025-08-08-deep into hostA/ROOT@2025-08-08-deep
received 46.0K stream in 0.24 seconds (191K/sec)
receiving full stream of zroot/ROOT/default@2025-08-08-14:30:56-0 into hostA/ROOT/default@2025-08-08-14:30:56-0
. . .
This is a large dataset and will take some time to send and receive.
If needed, use "netstat -I em0 -w 1" on hostB to watch the live transfer stats.
When the counts drop to zero (0 bytes), the transfer is done.
. . .
input em0 output
packets errs idrops bytes packets errs bytes colls
5229 0 0 7907626 2645 0 229054 0
1837 0 0 2763818 939 0 82698 0
77 0 0 113418 40 0 3432 0
4 0 0 344 7 0 926 0
4 0 0 440 4 0 468 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
^C
The received snapshot on hostB shows the entire list of datasets:
On hostB:
root@hostB:~ # zfs list
NAME USED AVAIL REFER MOUNTPOINT
hostA 925M 29.6G 96K /zroot
hostA/ROOT 922M 29.6G 96K none
hostA/ROOT/default 922M 29.6G 922M /
hostA/ROOT/hostA-2025-08-08-deep 0B 29.6G 922M /
hostA/home 512K 29.6G 96K /home
hostA/home/han 140K 29.6G 140K /home/han
hostA/home/jpb 136K 29.6G 136K /home/jpb
hostA/home/luke 140K 29.6G 140K /home/luke
hostA/tmp 96K 29.6G 96K /tmp
hostA/usr 288K 29.6G 96K /usr
hostA/usr/ports 96K 29.6G 96K /usr/ports
hostA/usr/src 96K 29.6G 96K /usr/src
hostA/var 644K 29.6G 96K /var
hostA/var/audit 96K 29.6G 96K /var/audit
hostA/var/crash 96K 29.6G 96K /var/crash
hostA/var/log 148K 29.6G 148K /var/log
hostA/var/mail 112K 29.6G 112K /var/mail
hostA/var/tmp 96K 29.6G 96K /var/tmp
zroot 993M 8.23G 96K /zroot
zroot/ROOT 991M 8.23G 96K none
zroot/ROOT/default 990M 8.23G 990M /
zroot/home 224K 8.23G 96K /home
. . .
root@hostB:~ # zfs list -t snap
NAME USED AVAIL REFER MOUNTPOINT
hostA@2025-08-08-deep 0B - 96K - <<<< NOTE
hostA/ROOT@2025-08-08-deep 0B - 96K -
hostA/ROOT/2025-08-08-deep@2025-08-08-deep 0B - 1006M -
hostA/ROOT/default@2025-08-08-14:30:56-0 0B - 1006M -
hostA/ROOT/default@2025-08-08-deep 0B - 1006M -
hostA/home@2025-08-08-deep 0B - 96K -
hostA/home/han@2025-08-08-deep 0B - 128K -
hostA/home/jpb@2025-08-08-deep 0B - 128K -
hostA/home/luke@2025-08-08-deep 0B - 132K -
hostA/tmp@2025-08-08-deep 0B - 96K -
hostA/usr@2025-08-08-deep 0B - 96K -
hostA/usr/ports@2025-08-08-deep 0B - 96K -
hostA/usr/src@2025-08-08-deep 0B - 96K -
hostA/var@2025-08-08-deep 0B - 96K -
hostA/var/audit@2025-08-08-deep 0B - 96K -
hostA/var/crash@2025-08-08-deep 0B - 96K -
hostA/var/log@2025-08-08-deep 0B - 152K -
hostA/var/mail@2025-08-08-deep 0B - 112K -
hostA/var/tmp@2025-08-08-deep 0B - 104K -
root@hostB:~ #
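The whole backup sequence above can be collected into one small script. This is only a sketch under this article's assumptions (pool zroot, backup pool hostA, hostB at 192.168.1.151 port 2223). As written it defaults to a dry run that just prints the commands; run it with RUN= (empty) on the real system:

```shell
#!/bin/sh
# backup-be.sh - sketch of the backup steps: deep BE, recursive snapshot,
# replicate to the backup host. Defaults to a dry run (RUN=echo); set RUN=
# (empty) to execute for real.
RUN="${RUN-echo}"
STAMP="$(date +%Y-%m-%d)-deep"
BACKUP_HOST="${BACKUP_HOST:-192.168.1.151}"
SSH_PORT="${SSH_PORT:-2223}"

$RUN bectl create -r "$STAMP"                  # deep boot environment
$RUN zfs snapshot -r "zroot@$STAMP"            # recursive snapshot of the pool
# Replicate, renaming the top-level dataset to "hostA" on the receive side:
$RUN sh -c "zfs send -R zroot@$STAMP | ssh -p $SSH_PORT root@$BACKUP_HOST zfs receive -Fu hostA"
```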
We now take down hostA.
On hostA:
root@hostA:~ # poweroff
OK, we have saved our boot environment and are ready to restore onto a fresh system. To avoid any cache issues, we will use a new QEMU instance for the restore.
jpb@jpbtl:~ $ qemu-img create -f qcow2 -o preallocation=full hostC.qcow2 8G
Start up the hostC instance:
jpb@jpblt:~ $ sudo /usr/local/bin/qemu-system-x86_64 -monitor none \
-boot order=cd,menu=on \
-serial telnet:localhost:4474,server=on,wait=off \
-cpu qemu64 \
-vga cirrus \
-m 8192 \
-cdrom ./fbsd.iso \
-drive if=none,id=drive0,cache=none,aio=threads,format=qcow2,file=./hostC.qcow2 \
-device virtio-blk,drive=drive0 \
-netdev tap,id=nd0,ifname=tap16,script=no,downscript=no \
-device e1000,netdev=nd0,mac=02:49:be:ad:ba:cc \
-name "hostC" &
You must use the serial console for the restore operation.
On the QEMU console, select the serial & console option (5) and boot with option 1.
Telnet to localhost port 4474 for the serial console on hostC.
On the serial console, select VT100 and Shell (not Install).
On the serial console:
Start up networking with DHCP (or manually set the IP address if required):
# dhclient em0
Set up the primary boot disk:
# gpart create -s gpt vtbd0
# gpart add -a 4k -l gptboot0 -t freebsd-boot -s 512k vtbd0
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd0
# gpart add -a 1m -l swap0 -t freebsd-swap -s 2147483648b vtbd0
# gpart add -a 1m -l zfs0 -t freebsd-zfs vtbd0
Load ZFS:
# kldload zfs
Create the pool:
# zpool create -o altroot=/mnt -O mountpoint=/ zroot vtbd0p3
Transfer the ZFS boot environment snapshot back from hostB:
# ssh -p 2223 root@192.168.1.151 "zfs send -R hostA@2025-08-08-deep" | \
zfs receive -v -Fu zroot
The authenticity of host '[192.168.1.151]:2223 ([192.168.1.151]:2223)' can't be established.
ED25519 key fingerprint is SHA256:EdgJObZFG/hzdwQ1J03tQ/Dncq/RpojPzKdbz71NOt8.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Could not create directory '/root/.ssh' (Read-only file system).
Failed to add the host to the list of known hosts (/root/.ssh/known_hosts).
(root@192.168.1.151) Password for root@hostB:
receiving full stream of hostA@hostA-2025-08-08-deep into zroot@hostA-2025-08-08-deep
received 46.0K stream in 0.35 seconds (130K/sec)
receiving full stream of hostA/usr@hostA-2025-08-08-deep into zroot/usr@hostA-2025-08-08-deep
received 46.0K stream in 0.25 seconds (183K/sec)
receiving full stream of hostA/usr/ports@hostA-2025-08-08-deep into zroot/usr/ports@hostA-2025-08-08-deep
received 46.0K stream in 0.24 seconds (192K/sec)
receiving full stream of hostA/usr/src@hostA-2025-08-08-deep into zroot/usr/src@hostA-2025-08-08-deep
received 46.0K stream in 0.20 seconds (233K/sec)
receiving full stream of hostA/ROOT@hostA-2025-08-08-deep into zroot/ROOT@hostA-2025-08-08-deep
received 46.0K stream in 0.24 seconds (188K/sec)
receiving full stream of hostA/ROOT/default@2025-08-08-12:31:18-0 into zroot/ROOT/default@2025-08-08-12:31:18-0
. . .
#
Set the bootfs and mountpoint properties for the newly installed dataset.
# zpool set bootfs=zroot/ROOT/2025-08-08-deep zroot
# zfs set mountpoint=/ zroot/ROOT/2025-08-08-deep
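For next time, the entire restore sequence can be kept as one script. Again, this is only a sketch under this article's assumptions (disk vtbd0, hostB at 192.168.1.151 port 2223, the 2025-08-08-deep snapshot and boot environment). It defaults to a dry run that prints the commands; run it with RUN= (empty) from the live-CD shell to execute:

```shell
#!/bin/sh
# restore-be.sh - sketch of the restore: partition, pool, receive, bootfs.
# Defaults to a dry run (RUN=echo); set RUN= (empty) to execute for real.
RUN="${RUN-echo}"
DISK="${DISK:-vtbd0}"
BACKUP_HOST="${BACKUP_HOST:-192.168.1.151}"
SSH_PORT="${SSH_PORT:-2223}"
SNAP="${SNAP:-hostA@2025-08-08-deep}"
BE="${BE:-zroot/ROOT/2025-08-08-deep}"

$RUN gpart create -s gpt "$DISK"
$RUN gpart add -a 4k -l gptboot0 -t freebsd-boot -s 512k "$DISK"
$RUN gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 "$DISK"
$RUN gpart add -a 1m -l swap0 -t freebsd-swap -s 2147483648b "$DISK"
$RUN gpart add -a 1m -l zfs0 -t freebsd-zfs "$DISK"
$RUN kldload zfs
$RUN zpool create -o altroot=/mnt -O mountpoint=/ zroot "${DISK}p3"
$RUN sh -c "ssh -p $SSH_PORT root@$BACKUP_HOST 'zfs send -R $SNAP' | zfs receive -v -Fu zroot"
$RUN zpool set bootfs="$BE" zroot
$RUN zfs set mountpoint=/ "$BE"
```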
And we're done!
Poweroff and restart:
# poweroff
The hard disk is now prepped for booting, and since the boot order (order=cd) tries
the hard disk first, the system should boot as usual:
jpb@jpblt:~ $ /bin/sh hostC.sh
Booting FreeBSD:
. . .
Starting background file system checks in 60 seconds.
Fri Aug 8 15:17:54 EDT 2025
FreeBSD/amd64 (hostA) (ttyu0)
login:
Log in and test all users, files, packages, etc.
There is one caveat: the MAC address of the restored VM (hostC) should be adjusted to match that of the original hostA system. You can edit the MAC address for any of these scripts by changing the line containing the “mac” parameter near the bottom of the script. Set the MAC address as desired.
. . .
-device e1000,netdev=nd0,mac=02:49:be:ad:ba:cc \
-name "hostC" &
Enjoy!
PS - There is also a “Hard Way(TM)” to do this, which is more granular. I’ll post that version at a later time.