
Tuning CFQ – What Station is That?

The last article was a quick overview of the 4 schedulers in the Linux kernel. This article takes a closer look at the Completely Fair Queuing (CFQ) scheduler and how you can tune it.

The point of this article is not to perform a detailed tuning exercise, but rather to show an example of how you can go about tuning. In particular, this article changes the value of the parameter quantum, which is the maximum number of requests dispatched to the device queue at one time. Increasing the value of this parameter allows the IO scheduler to execute more IO operations in the allotted time slice. For this simple tuning example, the default value of quantum (quantum=4) is tested, and then the value is increased to 32 so that the impact of a larger number of dispatched requests can be understood.
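
For reference, both the active scheduler and the quantum value can be changed at run time through sysfs. Here is a minimal sketch, assuming the drive under test is /dev/sdb (as in the test setup below) and that the commands are run as root:

cat /sys/block/sdb/queue/scheduler             # list available schedulers; the active one is shown in brackets
echo cfq > /sys/block/sdb/queue/scheduler      # make CFQ the active scheduler for sdb
cat /sys/block/sdb/queue/iosched/quantum       # the default value is 4
echo 32 > /sys/block/sdb/queue/iosched/quantum # the increased value tested in this article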

The goal of this tuning (increasing the number of dispatched requests) is to improve throughput. If the application is doing synchronous IO (reads and/or writes), then it is presumed that increasing the number of dispatches could improve throughput.

One of the most important aspects of tuning is to define how you are going to measure performance before and after the tuning. This allows you to understand the impact of varying parameters. In the interest of time, only two benchmarks will be used to measure performance changes – fdtree and IOzone – and only the ext4 file system will be tested.

As with the tests in previous articles, each fdtree and IOzone test was run 10 times on ext4. The test system used a stock CentOS 5.3 distribution but with a 2.6.30 kernel (from kernel.org), and e2fsprogs was upgraded to the latest version as of the writing of this article, 1.41.9. The tests were run on the following system:


  • GigaByte MAA78GM-US2H motherboard
  • An AMD Phenom II X4 920 CPU
  • 8GB of memory
  • Linux 2.6.30 kernel
  • The OS and boot drive are on an IBM DTLA-307020 (20GB drive at Ultra ATA/100)
  • /home is on a Seagate ST1360827AS
  • There are two drives for testing. They are Seagate ST3500641AS-RK with 16 MB cache each. These are /dev/sdb and /dev/sdc.

Only the first Seagate drive, /dev/sdb, was used for all of the tests.
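
The article does not state how the test file system was created. A plausible preparation, assuming ext4 is built directly on the device with default mkfs options and mounted at a hypothetical mount point /mnt/test, would look something like this:

mkfs.ext4 /dev/sdb        # default options assumed; the actual mkfs options are not stated in the article
mkdir -p /mnt/test        # hypothetical mount point
mount /dev/sdb /mnt/test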

Fdtree Results

A previous article used a simple bash script, called fdtree, for measuring some basic metadata performance metrics. fdtree performs four different metadata tests:


  • Directory creation
  • File creation
  • File removal
  • Directory Removal

It creates a specified number of files of a given size (in blocks) in a top-level directory. It then creates a specified number of subdirectories, and those subdirectories are in turn recursively created down to a specified number of levels, each populated with files.

As in the previous article there are four tests for stressing the metadata capability:


  • Small files (4 KiB)
    • Shallow directory structure
    • Deep directory structure
  • Medium files (4 MiB)
    • Shallow directory structure
    • Deep directory structure

The two file sizes, 4 KiB (1 block) and 4 MiB (1,000 blocks), were used to get a feel for a range of performance as a function of the amount of data. The two directory structures were used to stress the metadata in different ways to discover if there is any impact on metadata performance. The shallow directory structure means that there are many directories but not many levels of depth. The deep directory structure means that there are not many directories at any particular level but there are many levels.

The command lines for the four tests are:
Small Files – Shallow Directory Structure

./fdtree.bash -d 20 -f 40 -s 1 -l 3

This command creates 20 sub-directories from each upper level directory at each level ("-d 20") and there are 3 levels ("-l 3"). It’s a basic tree structure. This is a total of 8,421 directories. In each directory there are 40 files ("-f 40") each sized at 1 block (4 KiB), denoted by "-s 1". This is a total of 336,840 files and 1,347,360 KiB total data.

Small Files – Deep Directory Structure

./fdtree.bash -d 3 -f 4 -s 1 -l 10

This command creates 3 sub-directories from each upper level directory at each level ("-d 3") and there are 10 levels ("-l 10"). This is a total of 88,573 directories. In each directory there are 4 files each sized at 1 block (4 KiB). This is a total of 354,292 files and 1,417,168 KiB total data.

Medium Files – Shallow Directory Structure

./fdtree.bash -d 17 -f 10 -s 1000 -l 2

This command creates 17 sub-directories from each upper level directory at each level ("-d 17") and there are 2 levels ("-l 2"). This is a total of 307 directories. In each directory there are 10 files each sized at 1,000 blocks (4 MiB). This is a total of 3,070 files and 12,280,000 KiB total data.

Medium Files – Deep Directory Structure

./fdtree.bash -d 2 -f 2 -s 1000 -l 10

This command creates 2 sub-directories from each upper level directory at each level ("-d 2") and there are 10 levels ("-l 10"). This is a total of 2,047 directories. In each directory there are 2 files each sized at 1,000 blocks (4 MiB). This is a total of 4,094 files and 16,376,000 KiB total data.
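
The directory, file, and data totals quoted above for the four runs follow directly from the tree geometry: each level multiplies the directory count by the "-d" value, every directory (including the top-level one) receives "-f" files, and each file is "-s" blocks of 4 KiB. A quick bash check of those totals (not part of fdtree itself) looks like this:

for params in "20 3 40 1" "3 10 4 1" "17 2 10 1000" "2 10 2 1000"; do
    set -- $params
    d=$1; l=$2; f=$3; s=$4
    dirs=1; level=1                 # start with the single top-level directory
    for i in $(seq 1 $l); do
        level=$((level * d))        # directories created at this level
        dirs=$((dirs + level))      # running total of directories
    done
    files=$((dirs * f))
    kib=$((files * s * 4))          # each block is 4 KiB
    echo "-d $d -l $l -f $f -s $s: $dirs directories, $files files, $kib KiB"
done

This reproduces the 8,421 / 88,573 / 307 / 2,047 directory counts and the corresponding file and data totals given above.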

The first combination tested was small files (4 KiB) with a shallow directory structure. Table 1 below lists the wall clock times for the four operations as averages over the 10 runs, with the standard deviation in parentheses.

Table 1 – Benchmark Times Small Files (4 KiB) – Shallow Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (secs.)      10.60 (0.92)                 11.10 (1.04)
File Create (secs.)           327.20 (4.89)                332.10 (3.81)
File Remove (secs.)           58.10 (1.87)                 57.70 (1.55)
Directory Remove (secs.)      1.40 (0.92)                  1.30 (0.64)

Our good benchmarking habits tell us that only the File Create and File Remove tests run long enough to be considered in the performance evaluation.

Table 2 below lists the actual performance results, again as averages with the standard deviation in parentheses. Recall that we’re only considering the File Create and File Remove tests.

Table 2 – Performance Results of Small Files (4 KiB) – Shallow Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (Dirs/sec)   800.00 (69.88)               764.80 (70.66)
File Create (Files/sec)       1,029.10 (15.21)             1,013.70 (11.42)
File Create (KiB/sec)         4,118.30 (60.59)             4,057.20 (46.03)
File Remove (Files/sec)       5,803.40 (111.90)            5,841.70 (157.17)
Directory Remove (Dirs/sec)   7,368.30 (2,157.37)          7,428.50 (1,989.89)

The results are about the same for the two values of quantum (4 and 32). Actually, the increased value of quantum decreases performance slightly, but the differences are within the standard deviation, so it’s very difficult to say that one value of quantum is better than the other.

The second combination tested was small files (4 KiB) with a deep directory structure. Table 3 below lists the benchmark times as averages over the 10 runs, with the standard deviation in parentheses.

Table 3 – Benchmark Times Small Files (4 KiB) – Deep Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (secs.)      187.00 (11.22)               189.30 (12.45)
File Create (secs.)           443.20 (7.69)                454.10 (5.58)
File Remove (secs.)           192.50 (12.51)               204.80 (13.79)
Directory Remove (secs.)      73.30 (42.09)                54.50 (9.72)

All four tests can be considered for comparison since the average times are fairly large. Table 4 below lists the performance results, again as averages with the standard deviation in parentheses.

Table 4 – Performance Results of Small Files (4 KiB) – Deep Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (Dirs/sec)   475.00 (29.05)               469.20 (29.61)
File Create (Files/sec)       799.10 (13.45)               779.80 (9.53)
File Create (KiB/sec)         3,198.00 (53.73)             3,120.80 (38.00)
File Remove (Files/sec)       1,848.00 (124.31)            1,737.30 (115.78)
Directory Remove (Dirs/sec)   1,539.60 (201.76)            1,675.40 (286.21)

The performance results for the two values of quantum are almost the same, and the differences are within the standard deviation of the measurements, making it difficult to say that one is better than the other.

The third combination tested was medium files (4 MiB) with a shallow directory structure. Table 5 below lists the benchmark times as averages over the 10 runs, with the standard deviation in parentheses.

Table 5 – Benchmark Times Medium Files (4 MiB) – Shallow Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (secs.)      0.20 (0.40)                  0.40 (0.49)
File Create (secs.)           156.80 (4.75)                151.20 (2.96)
File Remove (secs.)           11.80 (2.99)                 12.10 (2.91)
Directory Remove (secs.)      0.20 (0.40)                  0.10 (0.30)

The only measure that can be considered for comparison is File Create, because it is the only test that takes over 60 seconds (our minimum amount of time). Table 6 below lists the performance results, again as averages with the standard deviation in parentheses.

Table 6 – Performance Results of Medium Files (4 MiB) – Shallow Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (Dirs/sec)   61.40 (122.80)               122.80 (150.40)
File Create (Files/sec)       18.90 (0.54)                 19.80 (0.40)
File Create (KiB/sec)         78,393.20 (2,252.90)         81,246.90 (1,555.68)
File Remove (Files/sec)       278.30 (75.69)               268.60 (64.72)
Directory Remove (Dirs/sec)   61.40 (122.80)               30.70 (92.10)

The File Create results for the two values of quantum are fairly close, but quantum=32 is a little faster, and the difference is just outside the standard deviation, so we can say that quantum=32 is a bit faster.

The fourth and final combination tested was medium files (4 MiB) with a deep directory structure. Table 7 below lists the benchmark times as averages over the 10 runs, with the standard deviation in parentheses.

Table 7 – Benchmark Times Medium Files (4 MiB) – Deep Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (secs.)      3.20 (0.75)                  2.50 (0.81)
File Create (secs.)           219.50 (1.12)                215.60 (1.50)
File Remove (secs.)           13.40 (4.72)                 11.60 (2.84)
Directory Remove (secs.)      1.20 (0.40)                  2.50 (1.12)

As with the previous case, the only measure that has a reasonable amount of testing time is File Create. Table 8 below lists the performance results, again as averages with the standard deviation in parentheses.

Table 8 – Results of Medium Files (4 MiB) – Deep Directory Structure

Test                          ext4 (default, quantum=4)    ext4 (quantum=32)
Directory Create (Dirs/sec)   671.70 (147.98)              888.50 (213.13)
File Create (Files/sec)       18.10 (0.30)                 18.50 (0.50)
File Create (KiB/sec)         74,607.50 (380.54)           75,958.60 (526.70)
File Remove (Files/sec)       331.50 (112.06)              371.10 (77.73)
Directory Remove (Dirs/sec)   1,842.20 (409.60)            1,910.40 (639.40)

The File Create values are about the same for the two values of quantum. However, as with the previous case, the difference is just outside the standard deviation of the two measures, which means quantum=32 is a bit faster.

IOzone Results

Fdtree tests metadata performance, while IOzone is used to test throughput. As in the previous article, the IOzone tests run are:


  • Write
  • Re-write
  • Read
  • Re-read
  • Random Read
  • Random Write
  • Backwards Read
  • Record Rewrite
  • Strided Read
  • Fwrite
  • Frewrite
  • Fread
  • Freread

For this article, as with the last, only four record sizes are tested: (1) 1MB, (2) 4MB, (3) 8MB, and (4) 16MB. For a file size of 16GB that results in (1) 16,000 records, (2) 4,000 records, (3) 2,000 records, (4) 1,000 records.

The command line for the first record size (1MB) is,

./IOzone -Rb spreadsheet_output_1M.wks -s 16G -r 1M > output_1M.txt

The command line for the second record size (4MB) is,

./IOzone -Rb spreadsheet_output_4M.wks -s 16G -r 4M > output_4M.txt

The command line for the third record size (8MB) is,

./IOzone -Rb spreadsheet_output_8M.wks -s 16G -r 8M > output_8M.txt

The command line for the fourth record size (16MB) is,

./IOzone -Rb spreadsheet_output_16M.wks -s 16G -r 16M > output_16M.txt

As mentioned previously, there are 13 tests that are each run 10 times and there are 4 record sizes. This makes a total of 520 tests that were run per file system.
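
The article does not show the wrapper used to automate these runs. A minimal sketch that repeats the four command lines above ten times each (the _run suffix on the output file names is an addition here, just to keep individual runs from overwriting each other) might look like this:

for rec in 1M 4M 8M 16M; do
    for run in $(seq 1 10); do
        ./IOzone -Rb spreadsheet_output_${rec}_run${run}.wks -s 16G -r ${rec} > output_${rec}_run${run}.txt
    done
done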

Because of the large number of tests that are run, the results are split into two tables. The first table is for the write tests: write, re-write, random write, record re-write, fwrite, and frewrite. The second table is for the read tests: read, re-read, random read, backwards read, strided read, fread, and freread. Each record size (1MB, 4MB, 8MB, 16MB) gets its own pair of tables, so there are 8 tables of results.

The first two tables of IOzone results are for the 1MB record size. Table 9 below presents the throughput in KB/s for the file systems for the 6 write tests.

Table 9 – IOzone Write Performance Results with a Record Length of 1MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Write               109,339.90 (370.61)          109,216.70 (406.24)
Re-write            103,843.50 (8,348.59)        108,078.00 (102.16)
Random write        66,683.50 (445.16)           68,563.10 (498.10)
Record re-write     2,980,795.70 (42,895.82)     2,978,084.80 (37,124.70)
fwrite              109,147.50 (180.18)          108,992.10 (124.25)
frewrite            108,184.30 (165.20)          108,163.70 (75.60)

In comparing the two sets of results, we see the following:


  • Random write is a little faster for quantum=32 (the difference is outside the standard deviation)
  • fwrite for quantum=32 is a little slower (the difference is outside the standard deviation)
  • The standard deviation of re-write is much smaller for quantum=32 (2 orders of magnitude)

Table 10 below presents the throughput in KB/s for the file systems for the 7 read tests for a record length of 1MB.

Table 10 – IOzone Read Performance Results with a Record Length of 1MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Read                97,969.20 (46.97)            97,904.50 (54.41)
Re-read             98,050.60 (55.21)            97,992.00 (41.19)
Random read         52,620.10 (104.39)           52,597.90 (100.18)
Backwards read      75,419.50 (135.84)           75,570.90 (91.59)
Strided read        51,048.90 (35.03)            51,107.10 (27.70)
fread               98,039.90 (15.35)            97,947.30 (61.78)
freread             98,041.30 (54.90)            98,011.80 (32.61)

In comparing the two results, the only difference is that fread for quantum=32 is a little slower (the difference is outside the standard deviation).

The next two tables of results are for the 4MB record size. Table 11 below presents the throughput in KB/s for the file systems for the 6 write tests.

Table 11 – IOzone Write Performance Results with a Record Length of 4MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Write               109,407.30 (142.35)          109,375.40 (59.60)
Re-write            101,720.10 (9,752.95)        108,058.70 (77.88)
Random write        76,935.90 (406.46)           86,134.50 (123.38)
Record re-write     2,496,937.70 (91,639.95)     2,541,886.20 (84,327.10)
fwrite              109,115.20 (220.57)          109,089.60 (164.59)
frewrite            105,865.70 (6,536.20)        107,721.60 (982.00)

In comparing the two results, the following observations are made:


  • Re-write for quantum=32 is faster (the difference is outside the standard deviation)
  • Random write for quantum=32 is much faster – about 12% (this is larger than the standard deviation so it’s a valid comparison)
  • The standard deviation for re-write is two orders of magnitude smaller for quantum=32

Table 12 below presents the throughput in KB/s for the file systems for the 7 read tests for a record length of 4MB.

Table 12 – IOzone Read Performance Results with a Record Length of 4MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Read                97,929.80 (43.56)            97,877.10 (42.77)
Re-read             98,041.80 (44.83)            97,979.90 (15.40)
Random read         86,095.00 (108.35)           86,134.50 (123.38)
Backwards read      102,674.20 (88.16)           102,508.70 (89.58)
Strided read        88,711.90 (113.00)           88,821.60 (108.66)
fread               98,012.70 (45.39)            97,965.30 (25.21)
freread             98,054.10 (37.33)            97,981.10 (45.03)

In comparison the results seem to be about the same (within the standard deviations) for both quantum=4 and quantum=32.

The next two tables of results are for the 8MB record size. Table 13 below presents the throughput in KB/s for the file systems for the 6 write tests.

Table 13 – IOzone Write Performance Results with a Record Length of 8MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Write               109,327.10 (171.36)          109,317.50 (156.27)
Re-write            104,004.90 (8,327.54)        108,123.20 (106.00)
Random write        79,617.30 (647.18)           81,868.40 (792.33)
Record re-write     1,532,668.10 (33,213.47)     1,536,628.60 (30,775.62)
fwrite              109,296.60 (95.56)           109,002.10 (346.58)
frewrite            107,530.10 (1,743.57)        108,053.30 (242.40)

In comparing the two results, the following observations are made:

  • Re-write is faster for quantum=32 but the difference is very close to the standard deviation.
  • The standard deviation for re-write is two orders of magnitude smaller for quantum=32 than for quantum=4
  • The standard deviation for frewrite is much smaller for quantum=32 than quantum=4

Table 14 below presents the throughput in KB/s for the file systems for the 7 read tests for a record length of 8MB.

Table 14 – IOzone Read Performance Results with a Record Length of 8MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Read                97,985.40 (40.32)            97,965.50 (35.45)
Re-read             98,031.20 (55.91)            97,974.30 (54.67)
Random read         98,299.90 (195.36)           98,549.50 (64.85)
Backwards read      108,635.40 (159.98)          108,622.50 (92.20)
Strided read        100,060.90 (128.50)          100,270.40 (128.13)
fread               97,968.30 (43.66)            97,896.40 (44.47)
freread             98,053.30 (18.49)            97,990.40 (18.50)

The results for quantum=32 and quantum=4 are about the same (i.e. the differences are smaller than the standard deviations).

The final two tables of results are for the 16MB record size. Table 15 below presents the throughput in KB/s for the file systems for the 6 write tests.

Table 15 – IOzone Write Performance Results with a Record Length of 16MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Write               109,348.90 (141.87)          109,240.70 (156.27)
Re-write            103,807.50 (8,477.20)        108,082.10 (66.94)
Random write        81,682.40 (1,304.42)         84,538.60 (671.46)
Record re-write     1,505,397.20 (41,948.58)     1,508,204.20 (22,140.12)
fwrite              109,129.30 (201.36)          109,064.10 (289.74)
frewrite            101,880.90 (9,679.37)        108,204.40 (96.89)

In comparing the two results, the following observations are made:

  • Re-write for quantum=32 is faster than quantum=4 but the differences are somewhat close to the standard deviation of quantum=4
  • Random write is faster for quantum=32 than the default
  • The frewrite performance is faster for quantum=32 than quantum=4 but the differences are somewhat close to the standard deviation of quantum=4
  • The standard deviation of re-write is two orders of magnitude smaller for quantum=32
  • The standard deviation of frewrite is two orders of magnitude smaller for quantum=32

Table 16 below presents the throughput in KB/s for the file systems for the 7 read tests for a record length of 16MB.

Table 16 – IOzone Read Performance Results with a Record Length of 16MB and a File Size of 16GB

All values in KB/s; averages over 10 runs with the standard deviation in parentheses.

Test                ext4 (default, quantum=4)    ext4 (quantum=32)
Read                97,943.60 (31.41)            97,887.60 (38.79)
Re-read             98,040.20 (39.20)            97,981.10 (37.36)
Random read         105,651.60 (151.52)          105,807.90 (74.27)
Backwards read      111,708.70 (219.17)          111,775.60 (124.08)
Strided read        106,597.60 (127.14)          106,723.40 (53.60)
fread               97,944.60 (53.76)            97,873.70 (31.77)
freread             98,053.00 (22.69)            97,980.70 (19.45)

The performance for quantum=32 and quantum=4 are about the same.

Summary of Results

The metadata performance measured by fdtree didn’t change much when the CFQ parameter quantum (the maximum number of dispatched requests) was changed from 4 to 32. For the medium file case (4 MiB), quantum=32 was slightly faster, but not appreciably so.

In the case of throughput performance as measured by the specific IOzone tests, the read performance between quantum=4 (default) and quantum=32 is about the same (i.e. within the standard deviation). However there is a bit more of an impact on write performance.


  • Re-write is a bit faster for quantum=32 than the default of quantum=4
  • Random write is faster for quantum=32 than the default. At a record size of 4MB, it is 12% faster.
  • In some cases, frewrite is a little bit faster for quantum=32 but it’s not consistent across the record sizes tested.
  • One of the most pronounced changes is in the standard deviation. For the re-write test, the standard deviation in performance for quantum=32 is two orders of magnitude smaller than for the default case.
  • While not as pronounced, the standard deviation for frewrite decreases as the record size increases for quantum=32 to the point where at a record size of 16MB it is two orders of magnitude smaller than the default case.

Overall, there was not much change in performance from increasing quantum from 4 to 32. However, this is just a simple experiment to show how you can go about tuning CFQ and to illustrate that it’s not always easy: you have to be sure to run your performance tests before and after changing the IO scheduler settings. In addition, there are a number of parameters in CFQ that can be changed to affect performance, so changing one parameter at a time may not produce the performance change you want.
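
For reference, the other CFQ tunables live alongside quantum in sysfs, so listing that directory (again assuming the test drive /dev/sdb) shows what else can be experimented with. On a 2.6.30 kernel the entries should include parameters such as slice_idle, slice_sync, slice_async, slice_async_rq, fifo_expire_sync, fifo_expire_async, back_seek_max, and back_seek_penalty, although the exact set depends on the kernel version:

ls /sys/block/sdb/queue/iosched/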
