Crash Recovery with ZFS

August 8, 2025

Well…

I got caught with my pants down last month. I made a change to my FreeBSD website and rebooted, which I do from time to time. But … it crashed. And not only did it crash, but it trashed the boot code and would not boot at all.

Not for nothing, but (insert whiny voice) this was not my fault!

Regardless, I had to do the laborious rounds of installation, restore, test, restore, test, restore, test, etc. Wash, rinse, repeat. If you are in the tech world, this will not surprise you at all.

It got me thinking about what I could do for boot block recovery next time, and I began to get very deep into ZFS send/receive for saving and restoring ZFS datasets from a backup server.

Here’s how I managed to get it working.

NOTE: Please read and apply understanding. The following code may cause your system to work poorly, or stop working altogether. Do not cut and paste without understanding what the code does.

Recommended reading:

  • zfsconcepts(7) - overview of ZFS concepts
  • zfsprops(7) - native and user-defined properties of ZFS datasets
  • zpool(8) - configure ZFS storage pools
  • zfs(8) - configure ZFS datasets
  • zfs-send(8) - generate backup stream of ZFS dataset
  • zfs-receive(8) - create snapshot from backup stream
  • zfs-snapshot(8) - create snapshots of ZFS datasets
  • gpart(8) - control utility for the disk partitioning GEOM class
  • if_bridge(4) - network bridge device
  • tap(4) - Ethernet tunnel software network interface

For the boot environment version:

  • bectl(8) - utility to manage boot environments on ZFS

QEMU Setup

Below, I describe two virtual machines hostA and hostB. HostA is the source machine, and hostB is the backup machine that will hold ZFS snapshots.

See Tapping Into Qemu for a more detailed explanation of a very similar QEMU setup.

HostA uses telnet port 4470 for serial access and hostB uses telnet port 4472. For example, use telnet localhost 4470 in a separate window to access the serial port of hostA.

HostA has one disk (vtbd0) and hostB has two - vtbd0 for boot and vtbd1 for storing snapshots. The VMs are connected via a virtual bridge, with tap12 and tap13 being assigned for hostA and hostB, respectively.

You can use static addressing by editing /etc/rc.conf, or you can use DHCP if the host computer network interface is added to the bridge configuration. The VM configurations below assume the FreeBSD boot ISO is linked to fbsd.iso and that the VM disk images and startup scripts are all in the current directory.


Create the QEMU disk image for hostA:
jpb@jpblt:~ $ qemu-img create -f qcow2 -o preallocation=full \
hostA.qcow2 8G

Create the QEMU disk images for hostB:
jpb@jpblt:~ $ qemu-img create -f qcow2 -o preallocation=full \
hostB.qcow2 8G

jpb@jpblt:~ $ qemu-img create -f qcow2 -o preallocation=full \
hostB_bigDisk.qcow2 32G

On both installations, add the default user to the wheel group.

Note - you may find it convenient to put each of the QEMU commands below
in its own shell script: hostA.sh and hostB.sh

jpb@jpblt:~ $ sudo /usr/local/bin/qemu-system-x86_64 -monitor none \
  -serial telnet:localhost:4470,server=on,wait=off \
  -cpu qemu64 \
  -vga cirrus \
  -m 8192 \
  -boot order=cd,menu=on \
  -cdrom ./fbsd.iso \
  -drive if=none,id=drive0,cache=none,aio=threads,format=qcow2,file=./hostA.qcow2 \
  -device virtio-blk,drive=drive0 \
  -netdev tap,id=nd0,ifname=tap12,script=no,downscript=no \
  -device e1000,netdev=nd0,mac=02:49:be:ad:ba:be \
  -name "hostA"  &

In the hostB install below, select only the first disk (vtbd0) for installation.

jpb@jpblt:~ $ sudo /usr/local/bin/qemu-system-x86_64 -monitor none \
 -serial telnet:localhost:4472,server=on,wait=off \
 -cpu qemu64 \
 -vga cirrus \
 -m 8192 \
 -boot order=cd,menu=on \
 -cdrom ./fbsd.iso \
 -drive if=none,id=drive0,cache=none,aio=threads,format=qcow2,file=./hostB.qcow2 \
 -device virtio-blk,drive=drive0 \
 -drive if=none,id=drive1,cache=none,aio=threads,format=qcow2,file=./hostB_bigDisk.qcow2 \
 -device virtio-blk,drive=drive1 \
 -netdev tap,id=nd0,ifname=tap13,script=no,downscript=no \
 -device e1000,netdev=nd0,mac=02:49:be:ad:ba:dd \
 -name "hostB"  &
 
After the installation completes, restart hostB and set up the second
disk (vtbd1, backed by hostB_bigDisk.qcow2).
As root, perform the following:

root@hostB:~ # gpart create -s gpt vtbd1
vtbd1 created
root@hostB:~ # 
root@hostB:~ # gpart add -a 1m -t freebsd-zfs vtbd1
vtbd1p1 added
root@hostB:~ # 
root@hostB:~ # zpool create hostA vtbd1p1
root@hostB:~ # 
root@hostB:~ # zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
hostA  31.5G   408K  31.5G        -         -     0%     0%  1.00x    ONLINE  -
zroot  5.50G   924M  4.60G        -         -     0%    16%  1.00x    ONLINE  -
root@hostB:~ # 

The big disk is now ready for use.
 

In the text below hostA is at 192.168.1.134 and hostB is at 192.168.1.151.

For both machines, enable the serial console in /boot/loader.conf.


On hostA:
root@hostA:~ # sysrc -f /boot/loader.conf console="comconsole"

And on hostB:
root@hostB:~ # sysrc -f /boot/loader.conf console="comconsole"

On reboot:

For hostA, open a terminal window and enter:
jpb@jpblt:~ # telnet localhost 4470

For hostB, open a terminal window and enter:
jpb@jpblt:~ # telnet localhost 4472

When using hostC, use port 4474.
 

Both machines should be installed with a recent version of FreeBSD; the examples here use FreeBSD 14.3. The examples use the name “hostA” for the zpool on hostB that will hold the snapshots for hostA, but you can name it whatever you want.

On hostA add an extra user or two, add some packages, and create some test data, all so that you can verify it once the system is recovered.
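
One way to make that later verification less of an eyeball exercise is to record checksums of the test data before the backup and compare them after the restore. The sketch below is only an illustration: the function names make_manifest and verify_manifest are assumptions made for this example, and it relies on the POSIX cksum utility rather than anything ZFS-specific.

```shell
#!/bin/sh
# Record and verify checksums for a directory of test data.
# make_manifest and verify_manifest are hypothetical helper names.

# make_manifest DIR
# Print "checksum size path" for every regular file under DIR, sorted by path.
make_manifest() {
    (cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort -k 3)
}

# verify_manifest DIR MANIFEST
# Succeed only if the files under DIR still match the saved MANIFEST.
verify_manifest() {
    make_manifest "$1" | diff - "$2" >/dev/null
}
```

For example, run make_manifest /home/luke > /root/luke.manifest before taking the snapshot, then verify_manifest /home/luke /root/luke.manifest on the restored system. Keep the manifest outside the directory being checked so it does not end up listing itself.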

You will also want to set up hostB to allow the root user on hostA to log in without a password. I usually do this by creating a separate sshd config file, /etc/ssh/sshd_config.2223, with these changes:


root@hostB:/etc/ssh # diff sshd_config sshd_config.2223
16c16
< #Port 22
---
> Port 2223
35c35
< #PermitRootLogin no
---
> PermitRootLogin Yes
root@hostB:/etc/ssh # 
 

and then run this command to start up a separate sshd process on port 2223:


root@hostB:~ # /usr/sbin/sshd -f /etc/ssh/sshd_config.2223
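
Note that PermitRootLogin by itself still leaves root logging in with a password, as the restore transcript later in this article shows. Key-based login is one common way to get a truly passwordless transfer. The sketch below assumes a key pair has already been generated on hostA (for example with ssh-keygen -t ed25519) and that the public key has been copied to hostB; the helper name add_authorized_key and the file paths in the usage note are hypothetical.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file, but only once.
# add_authorized_key is a hypothetical helper name for this sketch.

# add_authorized_key PUBKEY_FILE AUTH_FILE
add_authorized_key() {
    pub=$1
    auth=$2
    dir=$(dirname "$auth")
    # sshd is strict about permissions on the directory and the file.
    mkdir -p "$dir" && chmod 700 "$dir"
    touch "$auth" && chmod 600 "$auth"
    # Add the key only if that exact line is not already present.
    grep -qxF -- "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
}
```

On hostB this might be called as add_authorized_key /tmp/hostA_root.pub /root/.ssh/authorized_keys, where /tmp/hostA_root.pub is wherever you placed hostA's public key. Running it twice leaves a single entry.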
 

We are now ready to test crash recovery with ZFS!

The Easy Way(TM) with bectl(8)

ZFS supports boot environments that allow for creating ZFS datasets which function as clones of the host bootable datasets.

Follow the procedure below to test with ZFS boot environments:


First, create a "deep" boot environment.  See bectl(8) for details on
"deep" vs. "shallow" boot environments.

root@hostA:~ # bectl create -r 2025-08-08-deep

Next, take a snapshot of the top level dataset, here zroot.
You can name the snapshot whatever you want, but having a date stamp is helpful
for retrieving a specific one later.

root@hostA:~ # zfs snapshot -r zroot@2025-08-08-deep

root@hostA:~ # zfs list -r
NAME                         USED  AVAIL  REFER  MOUNTPOINT
zroot                       1010M  4.34G    96K  /zroot
zroot/ROOT                  1007M  4.34G    96K  none
zroot/ROOT/2025-08-08-deep     8K  4.34G  1006M  /           <<<<<<<  NOTE
zroot/ROOT/default          1007M  4.34G  1006M  /
zroot/home                   484K  4.34G    96K  /home
zroot/home/han               128K  4.34G   128K  /home/han
zroot/home/jpb               128K  4.34G   128K  /home/jpb
zroot/home/luke              132K  4.34G   132K  /home/luke
zroot/tmp                     96K  4.34G    96K  /tmp
zroot/usr                    288K  4.34G    96K  /usr
zroot/usr/ports               96K  4.34G    96K  /usr/ports
zroot/usr/src                 96K  4.34G    96K  /usr/src
zroot/var                    716K  4.34G    96K  /var
zroot/var/audit               96K  4.34G    96K  /var/audit
zroot/var/crash               96K  4.34G    96K  /var/crash
zroot/var/log                212K  4.34G   152K  /var/log
zroot/var/mail               112K  4.34G   112K  /var/mail
zroot/var/tmp                104K  4.34G   104K  /var/tmp
root@hostA:~ #

Listing the snapshot shows the included "deep" environment:
root@hostA:~ # zfs list -t snap
NAME                                         USED  AVAIL  REFER  MOUNTPOINT
zroot@2025-08-08-deep                          0B      -    96K  -  <<<<  NOTE
zroot/ROOT@2025-08-08-deep                     0B      -    96K  -
zroot/ROOT/2025-08-08-deep@2025-08-08-deep     0B      -  1006M  -
zroot/ROOT/default@2025-08-08-14:30:56-0       0B      -  1006M  -
zroot/ROOT/default@2025-08-08-deep             0B      -  1006M  -
zroot/home@2025-08-08-deep                     0B      -    96K  -
zroot/home/han@2025-08-08-deep                 0B      -   128K  -
zroot/home/jpb@2025-08-08-deep                 0B      -   128K  -
zroot/home/luke@2025-08-08-deep                0B      -   132K  -
zroot/tmp@2025-08-08-deep                      0B      -    96K  -
zroot/usr@2025-08-08-deep                      0B      -    96K  -
zroot/usr/ports@2025-08-08-deep                0B      -    96K  -
zroot/usr/src@2025-08-08-deep                  0B      -    96K  -
zroot/var@2025-08-08-deep                      0B      -    96K  -
zroot/var/audit@2025-08-08-deep                0B      -    96K  -
zroot/var/crash@2025-08-08-deep                0B      -    96K  -
zroot/var/log@2025-08-08-deep                 60K      -   152K  -
zroot/var/mail@2025-08-08-deep                 0B      -   112K  -
zroot/var/tmp@2025-08-08-deep                  0B      -   104K  -
root@hostA:~ #
 

On hostA we have created a boot environment and a snapshot of the entire “deep” environment.
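
Since the boot environment and the snapshot share the same date-stamped name, it can be handy to generate that name instead of typing it twice. Here is a minimal sketch that only prints the two commands for review; the helper names be_name and snap_name and the "deep" suffix variable are assumptions chosen to match the naming used in this article.

```shell
#!/bin/sh
# Build the date-stamped names used for the boot environment and snapshot.
# be_name and snap_name are hypothetical helper names.
POOL="zroot"
SUFFIX="deep"

# be_name [YYYY-MM-DD]  ->  e.g. 2025-08-08-deep (defaults to today's date)
be_name() {
    printf '%s-%s\n' "${1:-$(date +%Y-%m-%d)}" "$SUFFIX"
}

# snap_name [YYYY-MM-DD]  ->  e.g. zroot@2025-08-08-deep
snap_name() {
    printf '%s@%s\n' "$POOL" "$(be_name "$1")"
}

# Print the commands instead of running them, so they can be checked first.
echo "bectl create -r $(be_name)"
echo "zfs snapshot -r $(snap_name)"
```

Passing an explicit date, as in snap_name 2025-08-08, reproduces the names shown in the listings above.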

Now save it to hostB. Note that on the ZFS receive, we name the incoming dataset “hostA” to avoid confusion.

On hostA:

root@hostA:~ # zfs send -R zroot@2025-08-08-deep  |  \
ssh -p 2223 root@192.168.1.151 zfs receive -Fu hostA   <<<<< NOTE NAMING

The authenticity of host '[hostb.attlocal.net]:2223 ([2600:1700:3901:4940:49:beff:fead:badd]:2223)' can't be established.
ED25519 key fingerprint is SHA256:EdgJObZFG/hzdwQ1J03tQ/Dncq/RpojPzKdbz71NOt8.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[hostb.attlocal.net]:2223' (ED25519) to the list of known hosts.
receiving full stream of zroot@2025-08-08-deep into hostA@2025-08-08-deep
received 46.0K stream in 0.21 seconds (217K/sec)
receiving full stream of zroot/ROOT@2025-08-08-deep into hostA/ROOT@2025-08-08-deep
received 46.0K stream in 0.24 seconds (191K/sec)
receiving full stream of zroot/ROOT/default@2025-08-08-14:30:56-0 into hostA/ROOT/default@2025-08-08-14:30:56-0
. . .

This is a large dataset and will take some time to send and receive.

If needed, use "netstat -I em0 -w 1" on hostB to see the live transfer stats.
When stats are zero (0 bytes), the transfer is done.

. . .
            input            em0           output
   packets  errs idrops      bytes    packets  errs      bytes colls
      5229     0     0    7907626       2645     0     229054     0
      1837     0     0    2763818        939     0      82698     0
        77     0     0     113418         40     0       3432     0
         4     0     0        344          7     0        926     0
         4     0     0        440          4     0        468     0
         0     0     0          0          0     0          0     0
         0     0     0          0          0     0          0     0
         0     0     0          0          0     0          0     0
^C

The received snapshot on hostB shows the entire list of datasets:

On hostB:

root@hostB:~ # zfs list
NAME                               USED  AVAIL  REFER  MOUNTPOINT
hostA                              925M  29.6G    96K  /zroot
hostA/ROOT                         922M  29.6G    96K  none
hostA/ROOT/default                 922M  29.6G   922M  /
hostA/ROOT/hostA-2025-08-08-deep     0B  29.6G   922M  /
hostA/home                         512K  29.6G    96K  /home
hostA/home/han                     140K  29.6G   140K  /home/han
hostA/home/jpb                     136K  29.6G   136K  /home/jpb
hostA/home/luke                    140K  29.6G   140K  /home/luke
hostA/tmp                           96K  29.6G    96K  /tmp
hostA/usr                          288K  29.6G    96K  /usr
hostA/usr/ports                     96K  29.6G    96K  /usr/ports
hostA/usr/src                       96K  29.6G    96K  /usr/src
hostA/var                          644K  29.6G    96K  /var
hostA/var/audit                     96K  29.6G    96K  /var/audit
hostA/var/crash                     96K  29.6G    96K  /var/crash
hostA/var/log                      148K  29.6G   148K  /var/log
hostA/var/mail                     112K  29.6G   112K  /var/mail
hostA/var/tmp                       96K  29.6G    96K  /var/tmp
zroot                              993M  8.23G    96K  /zroot
zroot/ROOT                         991M  8.23G    96K  none
zroot/ROOT/default                 990M  8.23G   990M  /
zroot/home                         224K  8.23G    96K  /home
. . .

root@hostB:~ # zfs list -t snap
NAME                                         USED  AVAIL  REFER  MOUNTPOINT
hostA@2025-08-08-deep                          0B      -    96K  -  <<<< NOTE
hostA/ROOT@2025-08-08-deep                     0B      -    96K  -
hostA/ROOT/2025-08-08-deep@2025-08-08-deep     0B      -  1006M  -
hostA/ROOT/default@2025-08-08-14:30:56-0       0B      -  1006M  -
hostA/ROOT/default@2025-08-08-deep             0B      -  1006M  -
hostA/home@2025-08-08-deep                     0B      -    96K  -
hostA/home/han@2025-08-08-deep                 0B      -   128K  -
hostA/home/jpb@2025-08-08-deep                 0B      -   128K  -
hostA/home/luke@2025-08-08-deep                0B      -   132K  -
hostA/tmp@2025-08-08-deep                      0B      -    96K  -
hostA/usr@2025-08-08-deep                      0B      -    96K  -
hostA/usr/ports@2025-08-08-deep                0B      -    96K  -
hostA/usr/src@2025-08-08-deep                  0B      -    96K  -
hostA/var@2025-08-08-deep                      0B      -    96K  -
hostA/var/audit@2025-08-08-deep                0B      -    96K  -
hostA/var/crash@2025-08-08-deep                0B      -    96K  -
hostA/var/log@2025-08-08-deep                  0B      -   152K  -
hostA/var/mail@2025-08-08-deep                 0B      -   112K  -
hostA/var/tmp@2025-08-08-deep                  0B      -   104K  -
root@hostB:~ # 
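
Before powering off the source machine, it may be worth confirming that every snapshot on hostA has a counterpart on hostB rather than comparing the two listings by eye. The sketch below works on two plain text files of snapshot names, one per line, such as those produced by zfs list -H -o name -t snapshot -r zroot on hostA and the same command with -r hostA on hostB. The function name missing_snapshots and the file names are hypothetical.

```shell
#!/bin/sh
# Report source snapshots that have no counterpart in the backup pool.
# missing_snapshots is a hypothetical helper name for this sketch.

# missing_snapshots SRC_POOL DST_POOL SRC_FILE DST_FILE
# SRC_FILE and DST_FILE hold snapshot names, one per line.
missing_snapshots() {
    src_pool=$1
    dst_pool=$2
    src_file=$3
    dst_file=$4
    while IFS= read -r name; do
        # Swap the leading pool name: zroot/var@x becomes hostA/var@x.
        rest=${name#"$src_pool"}
        want="$dst_pool$rest"
        # Print the expected name if the backup list does not contain it.
        grep -qxF -- "$want" "$dst_file" || echo "$want"
    done < "$src_file"
}
```

With the two lists saved as src.txt and dst.txt on the same machine, missing_snapshots zroot hostA src.txt dst.txt prints nothing when the backup is complete and otherwise prints the names that are absent on hostB.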


We now take down hostA.

On hostA:
root@hostA:~ # poweroff

 

Ok, we have saved our boot environment and are ready to proceed with restoring on a fresh system. To avoid any cache issues, we will use a new QEMU instance to do the restore.


jpb@jpblt:~ $ qemu-img create -f qcow2 -o preallocation=full hostC.qcow2 8G

Start up the hostC instance:

jpb@jpblt:~ $ sudo /usr/local/bin/qemu-system-x86_64 -monitor none \
 -boot order=cd,menu=on \
 -serial telnet:localhost:4474,server=on,wait=off \
 -cpu qemu64 \
 -vga cirrus \
 -m 8192 \
 -cdrom ./fbsd.iso \
 -drive if=none,id=drive0,cache=none,aio=threads,format=qcow2,file=./hostC.qcow2 \
 -device virtio-blk,drive=drive0 \
 -netdev tap,id=nd0,ifname=tap16,script=no,downscript=no \
 -device e1000,netdev=nd0,mac=02:49:be:ad:ba:cc \
 -name "hostC"  &

You must use the serial console for the restore operation.

On the QEMU console, use option 5 to select the serial and console setting, then boot with option 1.

Telnet to localhost 4474 for the serial console on hostC.

On the serial console, select VT100 and then Shell (not Install).

On the serial console:

Start up networking with DHCP (or manually set the IP address if required):
# dhclient em0

Set up the primary boot disk:

# gpart create -s gpt "vtbd0"
# gpart add -a 4k -l gptboot0 -t freebsd-boot -s 512k "vtbd0"
# gpart bootcode -b "/boot/pmbr" -p "/boot/gptzfsboot" -i 1 "vtbd0"
# gpart add -a 1m -l swap0 -t freebsd-swap -s 2147483648b "vtbd0"
# gpart add -a 1m -l zfs0 -t freebsd-zfs "vtbd0"
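
The swap size of 2147483648b in the commands above is simply 2 GiB written out in bytes. The sketch below shows that arithmetic and prints the same five gpart commands for a given disk name, so the plan can be reviewed before anything is typed into the live shell. Nothing is executed; the helper names swap_bytes and partition_plan are assumptions for this example.

```shell
#!/bin/sh
# Print (not run) the partitioning commands for a disk.
# swap_bytes and partition_plan are hypothetical helper names.

# swap_bytes GIB  ->  size in bytes, e.g. 2 gives 2147483648
swap_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# partition_plan DISK SWAP_GIB
partition_plan() {
    disk=$1
    swap=$(swap_bytes "$2")
    echo "gpart create -s gpt $disk"
    echo "gpart add -a 4k -l gptboot0 -t freebsd-boot -s 512k $disk"
    echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk"
    echo "gpart add -a 1m -l swap0 -t freebsd-swap -s ${swap}b $disk"
    echo "gpart add -a 1m -l zfs0 -t freebsd-zfs $disk"
}

# Same disk and swap size as used in this article.
partition_plan vtbd0 2
```

Because the last partition is added without a size, it takes the remaining space on the disk, which is why only the swap size needs to be chosen.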

Load ZFS:
# kldload zfs

Create the pool:
# zpool create -o altroot=/mnt -O mountpoint=/ zroot  vtbd0p3

Transfer the ZFS boot environment snapshot back from hostB:
# ssh -p 2223 root@192.168.1.151 "zfs send -R hostA@2025-08-08-deep" | \
zfs receive -v -Fu zroot

The authenticity of host '[192.168.1.151]:2223 ([192.168.1.151]:2223)' can't be established.
ED25519 key fingerprint is SHA256:EdgJObZFG/hzdwQ1J03tQ/Dncq/RpojPzKdbz71NOt8.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Could not create directory '/root/.ssh' (Read-only file system).
Failed to add the host to the list of known hosts (/root/.ssh/known_hosts).
(root@192.168.1.151) Password for root@hostB:
receiving full stream of hostA@hostA-2025-08-08-deep into zroot@hostA-2025-08-08-deep
received 46.0K stream in 0.35 seconds (130K/sec)
receiving full stream of hostA/usr@hostA-2025-08-08-deep into zroot/usr@hostA-2025-08-08-deep
received 46.0K stream in 0.25 seconds (183K/sec)
receiving full stream of hostA/usr/ports@hostA-2025-08-08-deep into zroot/usr/ports@hostA-2025-08-08-deep
received 46.0K stream in 0.24 seconds (192K/sec)
receiving full stream of hostA/usr/src@hostA-2025-08-08-deep into zroot/usr/src@hostA-2025-08-08-deep
received 46.0K stream in 0.20 seconds (233K/sec)
receiving full stream of hostA/ROOT@hostA-2025-08-08-deep into zroot/ROOT@hostA-2025-08-08-deep
received 46.0K stream in 0.24 seconds (188K/sec)
receiving full stream of hostA/ROOT/default@2025-08-08-12:31:18-0 into zroot/ROOT/default@2025-08-08-12:31:18-0
. . .
#

Set the bootfs and mountpoint properties for the newly installed dataset.
# zpool set bootfs=zroot/ROOT/2025-08-08-deep zroot
# zfs set mountpoint=/ zroot/ROOT/2025-08-08-deep

And we're done!

Poweroff and restart:
# poweroff

QEMU knows that the hard disk is now prepped for booting and it should
boot as usual:
jpb@jpblt:~ # /bin/sh hostC.sh

Booting FreeBSD:
. . .
Starting background file system checks in 60 seconds.

Fri Aug  8 15:17:54 EDT 2025

FreeBSD/amd64 (hostA) (ttyu0)

login: 
 

Login and test all users, files, packages etc.

There is one caveat - the MAC address for the restored VM should be adjusted to match that of the original hostA system. You can change the MAC address in any of these scripts by editing the line containing the “mac” parameter near the bottom of the script. Set the MAC address as desired.

. . .
 -device e1000,netdev=nd0,mac=02:49:be:ad:ba:cc \
 -name "hostC"  &
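
A typo in the MAC address is easy to make and annoying to track down, so a quick format check before editing the script can help. This is a minimal sketch; the function name is_mac is hypothetical, and the address used is hostA's original one from the hostA start command earlier in this article.

```shell
#!/bin/sh
# Check a MAC address before pasting it into a VM start script.
# is_mac is a hypothetical helper name for this sketch.

# is_mac STRING
# Succeed if STRING is six colon-separated pairs of hex digits.
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# The original hostA address used in this article.
MAC="02:49:be:ad:ba:be"

if is_mac "$MAC"; then
    # Print the option text to use on the -device line of the script.
    printf ' -device e1000,netdev=nd0,mac=%s\n' "$MAC"
else
    echo "not a valid MAC address: $MAC" >&2
fi
```

The check only covers the format; it does not tell you whether the address is already in use on the bridge.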
 

Enjoy!

PS - There is also a “Hard Way(TM)” to do this, which is more granular. I’ll post that version at a later time.