FreeNAS USB Flash Boot Drives: Mirroring For Fault Tolerance.

FreeNAS LogoFreeNAS encourages the use of USB flash drives as the operating system boot drive. This allows FreeNAS to dedicate all of the motherboard SATA connectors for data storage drives. I didn’t think commodity USB flash drives are trustworthy enough to hold the operating system, but I was willing to experiment and be proven wrong.

The very first night, I got worrying news from the nightly system check:

  pool: freenas-boot
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
  scan: scrub repaired 3K in 0h10m with 0 errors

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0    11

errors: No known data errors

Looking on the bright side, “No known data errors” is comforting, as is the “repaired […] with 0 errors”. It’s nice FreeNAS was able to repair whatever was wrong with my USB stick. I suspect inexpensive commodity USB flash drives frequently encounter errors that are silently corrected by the operating system. Still, an error is an error and it’ll only be a matter of time before I run into a serious problem.

Fortunately, FreeNAS authors had the foresight to make sure a bad boot device does not become a single point of failure. A second one can be added to the system act as a mirror to the boot device. If either of them fails, the other can take over.

Much to my dismay, the second USB stick I tried also encountered a data checksum error. I didn’t have much luck figuring out how to interpret the checksum error code, but I did learn that it is supposed to be zero. The first stick returned 21, the second 26.

I tried a third USB stick and was relieved to finally see a zero checksum. The output below was generated when I ran ‘zpool status’ while the third stick is in the middle of replacing the second stick.

  pool: freenas-boot
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress

        NAME             STATE     READ WRITE CKSUM
        freenas-boot     ONLINE       0     0     0
          mirror-0       ONLINE       0     0    21
            da0p2        ONLINE       0     0    21
            replacing-1  ONLINE       0     0     0
              da1p2      ONLINE       0     0    26
              da2p2      ONLINE       0     0     0  (resilvering)

errors: No known data errors

I also found a fourth USB stick that was checksum error-free, so I had it take the place of the first one.

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h29m with 0 errors

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0  ONLINE       0     0    21
            da1p2   ONLINE       0     0     0
            da2p2   ONLINE       0     0     0

errors: No known data errors        

Now both boot drives in the mirror set have zero checksum error, but the mirror volume overall still has checksum error 21 from the first USB stick. I’m still learning if that means anything (bad) and what it would take to reset that to zero.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s