ZFS (Zettabyte File System)
ZFS is a storage system that treats data integrity as its top priority. Every piece of data it stores gets a checksum (a mathematical fingerprint), and ZFS verifies that fingerprint every time data is read. If it detects corruption (from a failing drive, bad cable, or memory error), it automatically repairs it from a redundant copy. It also supports instant snapshots (freeze a point in time, undo mistakes), built-in compression, and flexible storage pools.
ZFS is a combined filesystem and logical volume manager originally developed by Sun Microsystems (2005), now maintained as OpenZFS. It provides end-to-end data integrity, pooled storage, snapshots, and built-in redundancy.
Key features:
- Copy-on-write (CoW): never overwrites existing data; writes to a new location and updates the pointer. Enables atomic transactions and instant snapshots.
- Checksumming: every data and metadata block carries a 256-bit checksum (fletcher4 by default; SHA-256 for stronger guarantees), stored in the parent block pointer rather than next to the data. Detects silent data corruption (bit rot) that traditional filesystems miss.
- Self-healing: with redundant vdevs (mirror, RAIDZ), ZFS automatically repairs corrupted blocks using the checksum-verified good copy.
- Snapshots: instantaneous, space-efficient point-in-time copies. Only consume space as the active data diverges. Used for backups, rollbacks, and replication.
- Compression: transparent inline compression (LZ4 default, ZSTD for higher ratios). Often improves performance by reducing I/O.
- ARC (Adaptive Replacement Cache): sophisticated read cache in RAM that adapts to workload patterns.
- Send/Receive: efficient incremental replication of datasets between pools (local or remote). Foundation for backup strategies.
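Snapshots and send/receive combine naturally into dated automatic snapshots. A minimal naming sketch (the `tank/data` dataset and `auto-` prefix are illustrative; the `zfs` commands need a real pool, so this only prints them):

```shell
# Build dated snapshot names for an incremental send (illustrative names;
# we print the commands rather than run them, since they require a real pool).
DATASET="tank/data"
TODAY=$(date +%Y-%m-%d)
# GNU date first, BSD date as fallback
YESTERDAY=$(date -d "yesterday" +%Y-%m-%d 2>/dev/null || date -v-1d +%Y-%m-%d)
echo "zfs snapshot ${DATASET}@auto-${TODAY}"
echo "zfs send -i ${DATASET}@auto-${YESTERDAY} ${DATASET}@auto-${TODAY}"
```

Run daily from cron, a scheme like this yields a chain of `@auto-` snapshots where each send transfers only one day of changes.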
Redundancy levels (vdev types):
| Type | Drives | Fault Tolerance | Usable Capacity |
|---|---|---|---|
| Mirror | 2+ | N-1 drives | 1/N (50% for a 2-way mirror) |
| RAIDZ1 | 3+ | 1 drive failure | (N-1)/N |
| RAIDZ2 | 4+ | 2 drive failures | (N-2)/N |
| RAIDZ3 | 5+ | 3 drive failures | (N-3)/N |
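The usable-capacity column is simple arithmetic: a RAIDZ vdev keeps (N - parity) / N of its raw space. A quick sketch (the drive count and size are made-up figures, and a real pool loses a little more to metadata and padding):

```shell
# Usable capacity of a RAIDZ vdev: (N - parity) / N of raw space.
# Hypothetical example: six 10 TB drives in RAIDZ2 (parity = 2).
drives=6
size_tb=10
parity=2
raw=$(( drives * size_tb ))
usable=$(( (drives - parity) * size_tb ))
echo "RAIDZ2, ${drives}x${size_tb}TB: ${raw}TB raw, ${usable}TB usable"
# prints: RAIDZ2, 6x10TB: 60TB raw, 40TB usable
```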
Critical rule: never use hardware RAID controllers with ZFS. ZFS needs direct access to drives to perform checksumming and self-healing. Use HBA (Host Bus Adapter) in IT/JBOD mode.
ZFS pool and dataset management
# Create a mirrored pool (in production, prefer stable /dev/disk/by-id paths)
$ sudo zpool create tank mirror /dev/sda /dev/sdb
# Create datasets with compression
$ sudo zfs create -o compression=lz4 tank/data
$ sudo zfs create -o compression=zstd tank/backups
# Take a snapshot before changes
$ sudo zfs snapshot tank/data@before-migration
# List snapshots
$ sudo zfs list -t snapshot
NAME                         USED  AVAIL  REFER  MOUNTPOINT
tank/data@before-migration   128K      -  24.5G  -
# Rollback if something goes wrong
$ sudo zfs rollback tank/data@before-migration
# Send snapshot to remote backup server
$ sudo zfs send -i tank/data@yesterday tank/data@today | \
ssh backup-server sudo zfs receive backup/data
# Check pool health
$ sudo zpool status tank
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 02:15:30
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0

ZFS is the filesystem of choice for anyone serious about data integrity. TrueNAS (both CORE and SCALE) is built entirely around ZFS, providing a web UI for pool management, snapshots, and replication. Proxmox VE natively supports ZFS for VM storage, combining copy-on-write snapshots with virtualization for instant VM rollbacks. In enterprise settings, ZFS powers high-reliability storage for databases, media servers, and backup repositories.

The standard homelab ZFS setup is a mirror or RAIDZ1 pool with automatic scrubs (integrity checks) scheduled weekly. The zfs send/receive pipeline is the foundation for automated off-site backups: take an incremental snapshot, then send only the changes to a remote server. Regular scrubs are essential: they proactively read every block, verify checksums, and trigger self-healing before a second drive failure compounds the problem.
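The weekly scrub mentioned above is usually just a scheduled `zpool scrub`. A crontab fragment as a sketch (the pool name and schedule are illustrative; many distro packages ship a similar stock job, and systemd-based setups often use a timer instead):

```
# /etc/crontab fragment (illustrative): scrub the "tank" pool every Sunday at 03:00
0 3 * * 0  root  /sbin/zpool scrub tank
```

Check the result afterwards with `zpool status tank`, which reports what the last scrub repaired and when it finished.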