Setting Up ZFS Scrubbing with Cron Jobs

 

I use one ZFS pool as my main Network Attached Storage (NAS) and a second pool as its backup. To make sure the data doesn’t get corrupted, it’s important to verify that the pools maintain their integrity. The simplest way to check data integrity is to initiate an explicit scrub of all data within the pool. This operation traverses all the data in the pool once and verifies that every block can be read.

The easiest way to do this is to run the following on the command line of your device:

sudo zpool scrub NAS

where NAS is the name of your zpool. To check on the status or results, run:

zpool status
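
After a scrub, the part of the status output that matters most is the line beginning with scan:. A small helper along these lines could pull it out. The function name scrub_summary is made up for illustration, and the sketch assumes the usual zpool status layout, where that line starts with scan: after some leading spaces. It only returns the first line of the scan section, so any continuation lines for a scrub in progress are left out.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): print the text that follows
# "scan:" in zpool status output read from standard input.
scrub_summary() {
    # -n suppresses normal output; the trailing p prints only lines
    # where the substitution matched, i.e. the scan: line.
    sed -n 's/^[[:space:]]*scan:[[:space:]]*//p'
}

# Usage on a real system (NAS is the pool name used in this post):
#   zpool status NAS | scrub_summary
```

The real command is left as a comment so the helper can be pasted into a script without touching any pool.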

Now, instead of doing this manually, you can set up a cron job to do it automatically at any interval you choose. Because the command is run with sudo, this opens root’s crontab for editing:

sudo crontab -e

Once here, enter each job on its own line in the following format (as you can see in the image above): * * * * * command to execute

Each of the five stars stands for one time field, in this order: minute, hour, day of the month, month, and day of the week. A star is a wildcard that matches every value of its field. So if you wanted to run a job every hour on the 35th minute of the hour, it would look like: 35 * * * * command. If you wanted to run a command once a day at 1:45am, it would look like: 45 1 * * * command. You can use a comma to list multiple values (so 1:55am and 2:55am would be: 55 1,2 * * * command), and a hyphen to indicate a range of values (so 2:15am on Monday through Friday would be: 15 2 * * 1-5 command).
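
To make the field order concrete, here is a minimal sketch that splits a crontab line into its labelled parts. The function name describe_cron is hypothetical and only for illustration. It does not validate the schedule, it just shows which value lands in which field.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): label the five schedule
# fields and the command of a single crontab line.
describe_cron() {
    # read splits on whitespace; the last variable receives the rest of
    # the line, so a command containing spaces stays in one piece.
    read -r minute hour dom month dow command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow" "$command"
}

describe_cron '0 2 1 * * zpool scrub NAS'
# prints: minute=0 hour=2 day-of-month=1 month=* day-of-week=* command=zpool scrub NAS
```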

For my purpose, I will be running two commands (with comments on top, indicated by the #):

# zpool scrub NAS every month at 2am on the 1st of the month
0 2 1 * * zpool scrub NAS
# zpool scrub NASbackup every month at 2am on the 2nd of the month
0 2 2 * * zpool scrub NASbackup

And that’s it! Now cron will run those commands automatically, and you can run zpool status at any time to see the results.

On my NAS, I also have two other jobs running. One runs every 30 minutes, checks the zpool status, and turns off the Samba share if the status is anything other than healthy. The other runs every morning at 5am to replicate NAS to NASbackup. Those entries look like this:

# Main replication job runs at 5am every day
0 5 * * * sudo /home/gio/zfsreplication.sh
# Runs every 30 min to see status of pools and disable samba share if there is an error
*/30 * * * * sudo /home/gio/zfsalert.sh

The .sh files are Bash shell scripts, which are essentially a series of Linux commands saved in a file. They are useful when you want a single cron job to run a whole set of commands.
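
A status check like the 30-minute job might be sketched as follows. This is not the actual zfsalert.sh, which isn’t shown here. It is a minimal illustration that relies on zpool status -x printing "all pools are healthy" when nothing is wrong. The function names are hypothetical, and the Samba service name smbd is an assumption that varies between distributions, so the command that would stop it is left commented out.

```shell
#!/bin/sh
# Hypothetical sketch of a pool health check; not the author's script.

# Succeeds only when the text is exactly the message that
# `zpool status -x` prints when no pool has a problem.
pools_healthy() {
    [ "$1" = "all pools are healthy" ]
}

# Report what should happen for a given status text.
react_to_status() {
    if pools_healthy "$1"; then
        echo "ok"
    else
        echo "stop-samba"
        # Assumption: Samba runs as the systemd unit smbd.
        # systemctl stop smbd
    fi
}

# Usage from cron on a real system (commented out to avoid side effects):
#   react_to_status "$(zpool status -x)"
```

Keeping the decision in its own function makes it possible to try the logic with sample text before letting it act on a live share.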