Docker on ZFS, In the Cloud and On-Prem - The Why (part 1/2)

This is a series of two articles: the why and the how.

Why run Docker on ZFS?

TL;DR: Backups are faster and easier. Data is better protected. ARC really shines on block storage. And you achieve all this with open-source tools that shield you from cloud provider screw-ups and hyperscaler lock-in, and that make rebuilding your clusters from scratch easier.

Unique benefits of ZFS - across the board

The unique benefits of ZFS apply to cloud-native workloads just as well.

  • Fast, easy backups with zfs send & zfs receive
  • … made even easier with tools such as syncoid and sanoid
  • Copy-on-Write (COW) Snapshots
  • Native encryption for data at rest
  • Data compression
  • Data deduplication (caution!)
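To make the first bullets a bit more concrete, here is a minimal sketch of what a snapshot followed by an incremental zfs send / zfs receive over SSH looks like. It only builds and prints the commands, it does not run them. The dataset names (tank/docker, backup/docker), the snapshot names and the host backup-host are made up for illustration; tools such as syncoid handle this, plus snapshot bookkeeping, for you.

```python
import shlex


def snapshot_cmd(dataset, snap):
    """Build the command that takes a copy-on-write snapshot of a dataset."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def incremental_send_pipeline(dataset, prev_snap, new_snap, ssh_host, target_dataset):
    """Build a shell pipeline that streams only the blocks that changed
    between two snapshots to a remote pool over SSH."""
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", ssh_host, "zfs", "receive", target_dataset]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


# Hypothetical names, for illustration only.
print(shlex.join(snapshot_cmd("tank/docker", "tuesday")))
print(incremental_send_pipeline("tank/docker", "monday", "tuesday",
                                "backup-host", "backup/docker"))
```

Because the stream is incremental, only the data written since the previous snapshot crosses the wire, which is where most of the "fast" in "fast, easy backups" comes from.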

When running on-prem

  • Data integrity with built-in checksumming
  • Advanced RAID features (two-way and three-way mirrors)

Those two aren’t as relevant when leveraging block storage in the cloud. Good-quality cloud block storage usually ensures data integrity on its own, with or without multi-AZ. OVH, for instance, leverages Ceph for its block storage infrastructure, and data is triple-replicated even on the cheapest tier.

When running in the cloud

  • Hyperscalers also screw up sometimes. You do not want to blindly rely on them doing their job when it comes to securing and backing up your data.

Here’s a great article that drives the point home better than I could: Lessons from the OVH fire: disaster recovery plans are not a work of fiction.

ZFS does great with cloud-native deployments on block storage

When using ZFS with block storage in the cloud, the Adaptive Replacement Cache (ARC) is a key feature that can significantly contribute to the performance of your cloud-native application.

The ARC is a sophisticated caching mechanism in ZFS designed to keep frequently accessed data in RAM, providing faster access times and reducing the need to retrieve data from slower block storage.

This will translate into:

  • Faster Access to Frequently-Used Data
  • Reduced Dependency on Block Storage (usually slower than local storage)
  • Improved I/O Performance
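One way to check whether the ARC is actually earning its keep is to look at its hit ratio. The sketch below assumes OpenZFS on Linux, where ARC counters are exposed in /proc/spl/kstat/zfs/arcstats as "name type data" rows. The sample values are invented for illustration, not measurements.

```python
def parse_arcstats(text):
    """Parse 'name type data' rows from arcstats into a dict of integers.
    Header lines are skipped because their last field is not a plain number
    or they do not have exactly three fields."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2].isdigit():
            stats[fields[0]] = int(fields[2])
    return stats


def arc_hit_ratio(stats):
    """Fraction of ARC lookups served from RAM rather than from storage."""
    hits = stats.get("hits", 0)
    misses = stats.get("misses", 0)
    total = hits + misses
    return hits / total if total else 0.0


# Made-up sample in the arcstats layout; on a real host you would read
# /proc/spl/kstat/zfs/arcstats instead.
SAMPLE = """\
name                            type data
hits                            4    900
misses                          4    100
size                            4    1073741824
"""

ratio = arc_hit_ratio(parse_arcstats(SAMPLE))
print(f"ARC hit ratio: {ratio:.1%}")
```

A high ratio means most reads never touch the block device at all. A low one suggests the working set does not fit in the RAM the ARC is allowed to use.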

In practice

ZFS pool setup

On-premises, on raw drives

In the cloud, on block storage

Pick your poison (hyperscaler).

AWS

Scaleway

OVH

Docker daemon setup

ZFS backups