Using the principles of good benchmarking, we explore the metadata performance of four Linux file systems using a simple benchmark, fdtree.
In a previous article, the case was made for how low file system benchmarks have fallen. Benchmarks have become a marketing tool to the point where they are mere numbers of little real use. The article reviewed a paper that examined nine years of storage and file system benchmarking and made some excellent observations. The paper also made some recommendations about how to improve benchmarks.
This article isn’t so much about benchmarks as a product; rather, it is an exploration looking for interesting observations or trends, or the lack thereof. In particular, this article examines the metadata performance of several Linux file systems using a specific micro-benchmark. Fundamentally, it is an exploration to understand whether there are any metadata performance differences between four Linux file systems (ext3, ext4, btrfs, and nilfs) using a metadata benchmark called fdtree. So now it’s time to eat our own dog food and do benchmarking with the recommendations previously mentioned.
Start at the Beginning – Why?
The previous article made several observations about benchmarking, one of which is that storage and file system benchmarks seldom, if ever, explain why they are being performed. This is a point that is not to be underestimated. Specifically, if the reason why the benchmark was performed cannot be adequately explained, then the benchmark itself becomes suspect (it may just be pure marketing material).
Given this point, the reason the benchmark in this article is being performed is to examine or explore if, and possibly how much, difference there is between the metadata performance of four Linux file systems using a single metadata benchmark. The goal is not to determine which file system is best, since only a single benchmark, fdtree, is used. Rather, it is to search for differences and contrast the metadata performance of the file systems.
Why is examining metadata performance a worthwhile exploration? Glad that you asked. There are a number of applications, workloads, and classes of applications that are metadata intensive. Mail servers can be very metadata intensive applications because of the need to read and write very small files. Some database workloads also involve a great deal of reading and writing of small files. In the world of technical computing, many bioinformatics applications, such as gene sequencing codes, perform a great deal of small reads and writes.
The metadata benchmark used in this article is called fdtree. It is a simple bash script that stresses the metadata aspects of the file system using standard *nix commands. While it is not the best-known benchmark in the storage and file system world, it is a bit better known in the HPC (High Performance Computing) world.
An Examination of fdtree
Before jumping into the results, it is appropriate and highly recommended to examine the benchmark itself. fdtree is a simple bash script that performs four different metadata tests:
- Directory creation
- File creation
- File removal
- Directory removal
It creates a specified number of files of a given size (in 4 KiB blocks) in a top-level directory. It then creates a specified number of sub-directories, each of which is in turn recursively populated with sub-directories down to a specified number of levels, with files created at every level.
This phase of the benchmark begins by creating the specified number of directories in the main directory using the simple “mkdir” command in a bash function, “create_dirs”. Bash variables specify the details of the directory names. The next step is to call the “create_dirs” function recursively with a different “base name” (directory) to create all of the required directories.
create_dirs $((nl-1)) $base_name"L"$nl"D"$nd"/"
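The recursion can be sketched as follows. This is a simplified reconstruction, not fdtree’s exact code; the values of nd (sub-directories per level) and nl (number of levels) are kept small for illustration:

```shell
#!/bin/bash
# Simplified sketch of fdtree's recursive directory creation.
# nd = sub-directories created per directory, nl = number of levels.
nd=2
nl=3

create_dirs () {
    local lvl=$1        # levels remaining
    local base=$2       # directory in which to create sub-directories
    local d dir_name
    [ "$lvl" -le 0 ] && return
    for d in $(seq 1 $nd); do
        dir_name=$base"L"$lvl"D"$d"/"
        mkdir -p "$dir_name"
        # Recurse one level down with the new directory as the base name
        create_dirs $((lvl-1)) "$dir_name"
    done
}

create_dirs $nl "./fdtree_demo/"
```

With nd=2 and nl=3 this creates 2 + 4 + 8 = 14 sub-directories under ./fdtree_demo, mirroring the tree structure fdtree builds at larger scale.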
This step of the benchmark creates the required number of files using the “dd” command in a bash function, “create_files”.
dd if=/dev/zero bs=4096 count=$fsize of=$file_name > /dev/null 2>&1
To create files in the subdirectories, the bash function is called recursively. As part of the benchmark, the number of 4 KiB blocks is specified ($fsize).
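The file-creation step can be sketched the same way (again a simplified reconstruction; the file-naming scheme and the helper’s exact signature are assumptions):

```shell
#!/bin/bash
# Simplified sketch of fdtree's file creation: write $nf files of
# $fsize 4 KiB blocks into a given directory using dd.
nf=5        # files per directory
fsize=1     # file size in 4 KiB blocks

create_files () {
    local dir=$1
    local f file_name
    for f in $(seq 1 $nf); do
        file_name=$dir"file"$f
        dd if=/dev/zero bs=4096 count=$fsize of=$file_name > /dev/null 2>&1
    done
}

mkdir -p ./fdtree_files
create_files "./fdtree_files/"
```

With fsize=1, each resulting file is exactly one 4,096-byte block.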
The third function in the benchmark is to remove the files that were created. This is done with the standard “rm” command in a function called “remove_files”.
rm -f $file_name
The function “remove_files” is called recursively to remove all of the files.
The fourth and final function in the benchmark is to remove the directories. This is done in a bash function, “remove_dirs”, using the *nix command “rmdir $dir_names”.
The function “remove_dirs” is called recursively to remove all of the directories.
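The two removal phases can be sketched together (the directory layout and helper bodies below are illustrative; fdtree itself walks its own tree recursively). Note that “rmdir” only removes empty directories, so files must go first and directories must be removed from the deepest level up:

```shell
#!/bin/bash
# Illustrative sketch of fdtree's removal phases.
# Build a tiny two-level tree with one file per directory.
mkdir -p ./fdtree_rm/L1D1
touch ./fdtree_rm/file1 ./fdtree_rm/L1D1/file1

remove_files () {
    local f
    for f in "$1"file*; do
        rm -f "$f"          # no recursive flags; one entry at a time
    done
}

remove_dirs () {
    rmdir "$1"              # only succeeds on an empty directory
}

# Files first, then directories from the deepest level up.
remove_files "./fdtree_rm/L1D1/"
remove_files "./fdtree_rm/"
remove_dirs "./fdtree_rm/L1D1"
remove_dirs "./fdtree_rm"
```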
Overall the script uses standard *nix commands for the benchmark. It does not use any recursive options for any of the *nix commands. It stresses the metadata capabilities of the file system because of the potentially large number of files and directories.
One interesting quirk of the test is that it rounds its results, both times and rates, to integer values. Consequently, the reported time for a test can be 0 seconds; that is, the test ran in less than 1 second.
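The effect is easy to demonstrate with bash’s built-in SECONDS counter (an illustration of integer-second timing in general, not fdtree’s actual timing code):

```shell
#!/bin/bash
# Integer-second timing: a phase that finishes in well under a second
# almost always reports an elapsed time of 0.
start=$SECONDS
mkdir -p ./quick_dir && rmdir ./quick_dir   # completes in microseconds
elapsed=$((SECONDS - start))
echo "elapsed: $elapsed seconds"
```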
Running the benchmark
In the benchmark exploration in this article, fdtree was used in four different approaches to stressing the metadata capability:
- Small files (4 KiB)
- Shallow directory structure
- Deep directory structure
- Larger files (4 MiB)
- Shallow directory structure
- Deep directory structure
The two file sizes, 4 KiB (1 block) and 4 MiB (1,000 blocks), were used to get some feel for a range of performance as a function of the amount of data. The two directory structures were used to stress the metadata in different ways to discover if there is any impact on the metadata performance. The shallow directory structure means that there are many directories but not very many levels down. The deep directory structure means that there are not many directories at a particular level but that there are many levels.
To create the specific parameters for fdtree used in the exploration, there were three overall goals:
- Keep the total run time to approximately 10-12 minutes at a maximum
- Keep the total data for the two directory structures approximately the same
- Keep the run time for each of the four functions greater than 1 minute if possible
Not all four functions always ran for at least 1 minute; some ran for only a few seconds. These cases will be noted in the results.
The command lines for the four combinations are:
Small Files – Shallow Directory Structure
./fdtree.bash -d 20 -f 40 -s 1 -l 3
This command creates 20 sub-directories from each upper-level directory at each level ("-d 20") and there are 3 levels ("-l 3"). It's a basic tree structure. This is a total of 8,421 directories. In each directory there are 40 files ("-f 40"), each sized at 1 block (4 KiB), denoted by "-s 1". This is a total of 336,840 files and 1,347,360 KiB total data.
Small Files – Deep Directory Structure
./fdtree.bash -d 3 -f 4 -s 1 -l 10
This command creates 3 sub-directories from each upper-level directory at each level ("-d 3") and there are 10 levels ("-l 10"). This is a total of 88,573 directories. In each directory there are 4 files each sized at 1 block (4 KiB). This is a total of 354,292 files and 1,417,168 KiB total data.
Medium Files – Shallow Directory Structure
./fdtree.bash -d 17 -f 10 -s 1000 -l 2
This command creates 17 sub-directories from each upper-level directory at each level ("-d 17") and there are 2 levels ("-l 2"). This is a total of 307 directories. In each directory there are 10 files each sized at 1,000 blocks (4 MiB). This is a total of 3,070 files and 12,280,000 KiB total data.
Medium Files – Deep Directory Structure
./fdtree.bash -d 2 -f 2 -s 1000 -l 10
This command creates 2 sub-directories from each upper-level directory at each level ("-d 2") and there are 10 levels ("-l 10"). This is a total of 2,047 directories. In each directory there are 2 files each sized at 1,000 blocks (4 MiB). This is a total of 4,094 files and 16,376,000 KiB total data.
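As a sanity check on the four sets of totals above: with d sub-directories per directory and l levels, fdtree touches 1 + d + d^2 + ... + d^l directories, each holding f files of s blocks (4 KiB per block). A small script (pure arithmetic, not part of fdtree) reproduces the numbers:

```shell
#!/bin/bash
# Compute total directories, files, and KiB for an fdtree run
# given -d (dirs/level), -f (files/dir), -s (blocks/file), -l (levels).
fdtree_totals () {
    local d=$1 f=$2 s=$3 l=$4
    local dirs=1 level=1 i
    for i in $(seq 1 $l); do
        level=$((level * d))        # directories at this depth
        dirs=$((dirs + level))      # running total, including the top level
    done
    local files=$((dirs * f))
    local kib=$((files * s * 4))    # each block is 4 KiB
    echo "$dirs directories, $files files, $kib KiB"
}

fdtree_totals 20 40 1 3      # small files, shallow
fdtree_totals 3 4 1 10       # small files, deep
fdtree_totals 17 10 1000 2   # medium files, shallow
fdtree_totals 2 2 1000 10    # medium files, deep
```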