Linux Don’t Need No Stinkin’ ZFS: BTRFS Intro & Benchmarks

ZFS may be locked into the Solaris operating system but "Butter FS" is on the horizon and it's boasting more features and better performance.

Feature Breakdown

Btrfs is a very ambitious project with tons of new features. While the more important of these features have already been listed, it is worthwhile to go into more depth about some of the key features.

Dynamic inode allocation:
This feature may sound boring but it is actually quite important when creating a file system (or extending one). Dynamic inode allocation means that only a few inodes are created when the file system is made; after that, the file system creates inodes on the fly as they are needed. The practical benefit is that creating or extending a file system is extremely fast. Contrast this with ext3, which allocates all of its inodes up front and can take minutes to create a large file system.

Snapshots:
Btrfs allows you to create snapshots of your file system or sections of your file system. You can then use these snapshots to create backups or as a fast emergency copy of existing data. You can also use a snapshot to grab a section of the file system and dump it to an archive (after archiving you go back and erase the snapshot and the original data). Usually no modifications are made to a snapshot since it’s used for critical functions such as backups, repairs, or archiving. But in some cases you may want to write to the snapshot, and btrfs allows you to do this. For example, you might take a snapshot of a directory before you start working in it, and then write selected changes into the snapshot as well so that you keep an up-to-date copy for various purposes. Btrfs also allows you to take a snapshot of a snapshot. There is a great deal of flexibility in btrfs’ snapshot functionality.
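As a rough sketch of the archive workflow described above, the script below prints (rather than runs) the three steps: take a read-only snapshot, archive it, and delete the snapshot. It assumes the later "btrfs" command-line tool ("btrfs subvolume snapshot" and "btrfs subvolume delete"), which may not match the tools that ship with v0.18. The paths and the helper names "run" and "snapshot_and_archive" are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch of a snapshot-then-archive workflow.
# Assumption: the later "btrfs" command-line tool is installed; paths are hypothetical.
DRY_RUN=1   # set to 0 to really run the commands (needs root and a mounted btrfs)

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_and_archive() {
    subvol=$1    # subvolume to protect
    snap=$2      # path where the snapshot will appear
    archive=$3   # tarball to write
    run btrfs subvolume snapshot -r "$subvol" "$snap"   # read-only snapshot
    run tar -czf "$archive" -C "$snap" .                # archive the frozen view
    run btrfs subvolume delete "$snap"                  # drop the snapshot afterwards
}

snapshot_and_archive /mnt/data_btrfs/projects /mnt/data_btrfs/projects-snap /tmp/projects.tar.gz
```

Because the archive is taken from the snapshot rather than the live directory, files can keep changing while tar runs without ending up half-updated in the tarball.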

Copy On Write:
Copy on write is a technique in which data is never overwritten in place: when a piece of data is written, the new version goes to a different location and the old version is left intact (so for a time there are two copies). This has all kinds of uses in btrfs. For example, btrfs uses it in conjunction with snapshots, or even snapshots of snapshots, which lets them share unchanged data and be updated easily. Btrfs also uses copy on write in conjunction with logging. This makes the file system even more resilient because the previous copy of the data or metadata is kept in case of a power failure (basically you can keep a copy of the journal until the file system is absolutely sure that the journal has been committed and the data or metadata is on the disk and correct).
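One way to see copy-on-write behavior from the command line is a "reflink" copy. This is my own illustration, not something from the btrfs documentation for this release: it assumes a GNU cp new enough to support the --reflink option, and on a file system without copy-on-write support it simply falls back to an ordinary copy. The file names are hypothetical.

```shell
#!/bin/sh
# Sketch: a reflinked clone shares data blocks with the original until one side is written.
# Assumption: GNU cp with --reflink; otherwise we fall back to a plain copy.
workdir=$(mktemp -d)
printf 'version 1\n' > "$workdir/original.txt"

# On btrfs this creates a second file that points at the same data blocks.
cp --reflink=auto "$workdir/original.txt" "$workdir/clone.txt" 2>/dev/null \
    || cp "$workdir/original.txt" "$workdir/clone.txt"

# Writing to the clone puts the new data in fresh blocks;
# the blocks belonging to the original are left untouched.
printf 'version 2\n' > "$workdir/clone.txt"

original=$(cat "$workdir/original.txt")
clone=$(cat "$workdir/clone.txt")
echo "original: $original"
echo "clone:    $clone"

rm -rf "$workdir"
```

The visible result is the same as with a normal copy; the difference on a copy-on-write file system is that the clone costs almost no space or time until it is modified.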

Subvolumes:
Btrfs has the capability of taking part of a file system and mounting it as the root of its own internal file system (a subvolume). This is terribly useful if you want to limit user access to a certain portion of a directory structure. For example, if there is a subdirectory that users need to access without being allowed access to other parts of the main directory, then the user subdirectory can be mounted as a subvolume, and to the user it appears as the root file system for that data.

Multiple Devices:
With current Linux file systems, if we want to create a RAID-0 or RAID-1 or any other RAID level, then we typically have to use lvm to create the volumes and then use a hardware RAID card or software RAID (md) to combine the volumes into a device that can then be formatted by the file system. This can make things difficult in terms of management. But btrfs can do multiple devices (RAID) as part of the file system. At this time, it can do RAID-0, RAID-1, and RAID-10, and other RAID levels are planned. Btrfs also allows you to add devices (disks) to the file system once the file system has been formatted (dynamic inode allocation is a big help here as well) and allows you to remove devices from the file system (all while it’s mounted).
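To make the add/remove idea concrete, here is a small sketch that only prints the commands for growing a mounted file system by one disk and then retiring another. It assumes the later "btrfs" command-line tool ("btrfs device add", "btrfs balance start", "btrfs device delete"); the v0.18 tools used in this article predate it and may use different command names. The device names, mount point, and the "plan_device_swap" helper are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: add a device to a mounted btrfs, rebalance, then remove another device.
# Nothing is executed; each step is only printed.
# Assumption: the later "btrfs" tool; device names and mount point are hypothetical.
plan_device_swap() {
    mountpoint=$1
    new_dev=$2
    old_dev=$3
    echo "btrfs device add $new_dev $mountpoint"      # add a disk while mounted
    echo "btrfs balance start $mountpoint"            # spread existing data over all disks
    echo "btrfs device delete $old_dev $mountpoint"   # move data off a disk, then drop it
}

plan_device_swap /mnt/data_btrfs /dev/sdc1 /dev/sda1
```

Note that every step takes the mount point as an argument, which reflects the point made above: the file system stays mounted and usable throughout.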

Fsck and Defragmentation Enhancements:
Fsck can be the bane of an administrator’s existence because you usually have to take the file system offline, run fsck to repair it, which can take a great deal of time, and then remount the file system (assuming everything was fine). Btrfs allows you to perform an fsck on a mounted file system that is actually in use. While the performance of a file system undergoing an fsck is not spectacular, you can still use the file system. In addition, despite the best efforts of file system developers, fragmentation happens and can severely impact performance. To “defrag” a file system you usually also have to take it offline, perform the defragmentation, and then remount the file system. Btrfs allows you to defrag the file system while it’s mounted and in use.

Encryption:
Encryption of file systems is becoming an ever more popular topic, especially for corporate systems such as laptops, which are stolen from time to time (you see the occasional article about a laptop that is lost or stolen with sensitive information on the hard drive). There are encryption add-ons that can be used with existing file systems to provide encryption. Btrfs plans to build encryption into the file system itself, although this is not yet available in the current code (remember that it’s a work in progress).

Compression:
Btrfs can also compress data transparently to save space and, in some cases, improve performance. Currently it uses the zlib capabilities built into the kernel.

Btrfs – Coming Soon?

When a new feature for Linux is discussed there are many people who take that to mean that the feature is fully baked and ready to use. However, the truth is usually that the feature is not yet ready and still requires a great deal of testing, debugging, and development. These new features are almost always initially developed outside the kernel. Once the feature reaches a certain point it is sometimes added to the kernel and marked as “experimental”. This is done so that the feature gets more exposure and so it can be developed as part of the kernel rather than trying to hoist a potentially large external code base into the kernel later.

This is exactly the process being followed in the development of btrfs. Initially the code base was developed outside the kernel. Recently, in the 2.6.29 kernel, btrfs was added with the “experimental” label. The goal is to get much wider testing and exposure while making the development process easier because btrfs is in the kernel. So btrfs is not really ready for prime time use, but it is ready for testing. If you want to help Linux development and don’t have the skills for kernel development (like me), you can make a definite contribution by testing something like btrfs and reporting any problems to the btrfs mailing list.

Creating btrfs File Systems and Benchmarking

As with the previous article on ext4, it is always good to start with a quick introduction to commands for a new file system. Btrfs is no exception.

The btrfs wiki contains a set of instructions for getting started with btrfs. Assuming that one follows these directions, diving into creating file systems is fairly easy.

A simple way to start is by using,

% mkfs.btrfs /dev/sda1

This happens so fast that I ran the “date” command before and after creating the filesystem to see how long the command took. Here is the output from the command:

[root@test64 ~]# ./test.sh
Sat Apr 18 15:47:50 EDT 2009

WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sda1
        nodesize 4096 leafsize 4096 sectorsize 4096 size 465.76GB
Btrfs Btrfs v0.18
Sat Apr 18 15:47:51 EDT 2009

As you can see, it took about 1 second to create a file system on a 465GB partition. This is a direct result of the dynamic inode allocation.
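The contents of test.sh are not shown here, so the following is only a guess at what such a wrapper might look like: a small function that prints the date, runs whatever command it is given, and prints the date again. The function name "timed" is made up, and the mkfs.btrfs call is left commented out because it destroys any data on the device.

```shell
#!/bin/sh
# Hypothetical reconstruction of a timing wrapper in the spirit of test.sh.
# timed: print the date, run the given command, print the date again,
# and pass the command's exit status back to the caller.
timed() {
    date
    status=0
    "$@" || status=$?
    date
    return "$status"
}

# Harmless demonstration.
timed echo "mkfs.btrfs would run here"

# Intended use (wipes the device, so it is left commented out):
#   timed mkfs.btrfs /dev/sda1
```

For finer resolution than one second, the shell's "time" keyword does the same job, but the two date lines are easier to paste into an article.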

Once the file system is created it is easy to mount.

% mount -t btrfs /dev/sda1 /mnt/data_btrfs

It is really simple to create a file system from multiple devices. At this time, by default, the metadata is mirrored across all of the devices and the data is striped across all available devices. In addition, btrfs allows you to define the metadata behavior. You can lay out the metadata in the following ways:

  • RAID-0 (metadata is striped across all devices present)
  • RAID-1 (metadata is mirrored across all devices present)
  • RAID-10 (metadata is striped and mirrored across all devices present)
  • single (metadata is kept on a single device, with no striping or mirroring)

An example of building btrfs with multiple devices is,

% mkfs.btrfs /dev/sda1 /dev/sdb1

Then you can mount the file system as before using either /dev/sda1 or /dev/sdb1. Note that using multiple devices doesn’t slow down the creation of the file system compared to a single device.
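If you want something other than the default layout, mkfs.btrfs takes a -m option for the metadata profile and a -d option for the data profile. The sketch below only assembles and prints such a command line after checking the profile names against the list given earlier; it never formats anything. The helper name "btrfs_mkfs_cmd" is made up, and the device names are just the ones used in this article.

```shell
#!/bin/sh
# Sketch: build a mkfs.btrfs command line with explicit metadata (-m) and data (-d) profiles.
# The helper only prints the command; running it is left to the reader.
btrfs_mkfs_cmd() {
    metadata=$1
    data=$2
    shift 2
    # Accept only the profile names discussed in the article.
    for profile in "$metadata" "$data"; do
        case $profile in
            raid0|raid1|raid10|single) ;;
            *) echo "unknown profile: $profile" >&2; return 1 ;;
        esac
    done
    echo "mkfs.btrfs -m $metadata -d $data $*"
}

# Mirrored metadata, striped data, on two devices.
btrfs_mkfs_cmd raid1 raid0 /dev/sda1 /dev/sdb1
```

Mirroring the metadata while striping the data is a reasonable compromise: losing a disk still loses data, but the file system structures themselves have a second copy.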
