In the last couple of articles we have talked about using strace to help examine the IO profile of applications (including MPI applications; think HPC). But strace output can contain hundreds of thousands of lines. In this article we talk about the using a tool called strace_analyzer to help sift through the strace output.
Strace is a very useful tool for examining the IO profile of applications, as it comes standard on every Linux distro. However, as we’ll see in this article, strace can produce hundreds of thousands of lines of output. Trying to develop statistics and trends from a files of this size is virtually impossible to do by hand.
In this article, we will take a look at a tool to do a statistical analysis on strace output: strace_analyzer. This tool can take an individual strace file that has been created with the “-T -ttt” options and produce a statistical analysis of the IO portion of the strace. It also produces data files and .csv (comma delimited files for spreadsheets) files that can be used for plotting.
Using the strace_analyzer
strace_analyzer is a Perl app that parses through the strace file looking for specific IO functions. When run, it looks for the following IO functions,
- open
- read
- write
- close
- lseek
- llseek
- lseek64
- stat
- stat64
- fstat
- chmod
- access
- rename
- mkdir
- getdents
- fcntl
- unlink
- fseek
- rewind
- ftell
- fgetpos
- fsetpos
- fclose
- fsync
- creat
- readdir
- rewinddir
- scandir
- telldir
- flock
- lockf
Obviously this list does not include all IO functions but it is a work in progress.
The first step in using strace_analyzer is to create an strace output file using the “-T -ttt” options. I also used the “-o” option to send the data to a file. Then you can run the strace_analyzer to process the file with the following command:
./strace_analyzer_ng_0.08.pl -dat -short < strace.out > summary.out
There are several command line options with “-dat” and “-short” being used in the previous example. You can always type “./strace_analyzer_ng_0.08.pl -help” to get a listing of the options. For this example, “-dat” produces data files for plotting tools such grace or simple python codes that use matplotlib. The “-short” option produces a shorter summary output file that doesn’t contain all the details of a particular IO function.
When you run the strace_analyzer it also produces a file called “strace_analyzer.log” that some additional information information about the run, primarily for open() syscalls (more on that later). The code also produces a subdirectory called “RESULTS_DAT” where the .dat and .csv files are contained. Depending upon the application there can be a large number of files and some of them can be fairly large. For applications that have a large number of IO functions sometimes the .csv files cannot be loaded into a spreadsheet because there are too many rows. In addition, trying to plot the data in a spreadsheet can also be impossible because many spreadsheet programs have a limit on the number of data points that can be plotted.
The output from the strace_analyzer is divided into a few main groups.
- Time statistics (e.g. how much wall clock time was spent doing IO, etc.)
- IO syscall Count
- Write Statistics
- Read Statistics
- Close Statistics
- Open Statistics
- lseek activity statistics
- IOPS Statistics
- File Statistics (on a per file basis)
Without the “-short” option you get a great more detail in each group. With the “-short” flag you get more of a summary.
Rather than walk through the strace_analyzer step by step and open myself to ridicule of my coding style, my choice of languages, my lack of using regular expressions for parsing, and other assorted programming criticisms, let’s apply the strace_analyzer to an example so we can see how the tool is used and what we can possibly learn by using it. By the way – I’m open to patches and criticism, but they should be constructive. if you don’t like my coding style that’s fine, but help me understand why a different coding style is better. Otherwise, talk to the hand, because the ears aren’t listening (and the eyes aren’t reading and the fingers are moving the file to the trash).
Examining Abaqus/Explicit
The application I will be examining is called Abaqus/Explicit and is developed, supported, and marketed by Simulia. It is what is called a finite element program that is used for examining dynamic events. For example, it can be used to examine how a container of laundry detergent behaves when it’s dropped from a grocery shelf onto the floor, or what happens to a cell phone when it’s dropped onto a floor, or when a bullet penetrates the wing of an aircraft, or, an even more fun analysis, how a diaper behaves when it is “loaded” to capacity. Abaqus/Explicit is one of the leading applications for dynamic analysis and it a common HPC application.
For this example, the e5 data set will be used. e5 is a simple benchmark where a stiffened steel plate is subjected to a high intensity blast load. The application with the e5 benchmark was run under the auspices of the HPC Advisory Council. It was run on the Osiris cluster that the HPC Advisory Council uses. It is a cluster of Dell SC 1435 servers with two Quad-Core AMD OpteronTM 2382 SE processors and a Mellanox ConnectX DDR InfiniBand fabric. The storage for the problem is fairly simple – the master node of the cluster had a simple SCSI hard drive that was exported via NFS over a GigE network to all of the compute nodes.
Abaqus/Explicit uses MPI so the application can run in parallel on multiple processors. For this example, it was run on two nodes with 8-cores each for a total of 16 processes. Each MPI process created a simple strace output file. strace_analyzer was then run on each process. The tables and plots below are from the output from strace_analyzer.
As mentioned previously, the first output group from the strace output is a timing summary. Table 1 below contains the Total Run Time in seconds of each of the 16 processes as well as the total time in seconds spent performing IO using the list of IO functions previously mentioned. The last column of data is the percentage of time the application spent doing IO relative to the total run time.
Table 1 – Time Statistics for the 16 Processes
Process |
Total Run Time (secs) |
IO Times (secs) |
% Time for IO |
12419 |
424.719 |
0.7889 |
0.185% |
12420 |
425.292 |
0.0991 |
0.023% |
12421 |
425.373 |
0.1222 |
0.028% |
12422 |
425.433 |
0.1155 |
0.027% |
12423 |
424.517 |
0.1023 |
0.024% |
12424 |
425.291 |
0.1250 |
0.029% |
12425 |
425.331 |
0.1293 |
0.030% |
12427 |
425.291 |
0.1355 |
0.532% |
14297 |
418.912 |
2.2275 |
0.532% |
14298 |
418.827 |
2.3856 |
0.570% |
14299 |
425.785 |
2.6418 |
0.620% |
14300 |
425.112 |
2.3226 |
0.546% |
14301 |
418.706 |
2.1369 |
0.510% |
14302 |
424.769 |
2.1317 |
0.502% |
14303 |
419.200 |
2.0271 |
0.486% |
14304 |
418.868 |
1.8862 |
0.450% |
Notice that the amount of time spent doing IO is very small compared to the total wall clock time. The large amount of time is 2.65 seconds or 0.62% of the total run time. It’s probably safe to say that this particular application is not IO bound.
The next output group is an IO syscall count. The output is a list of the IO commands the application is capable of recognizing and how many times these syscalls were used during the run of the application. Table 2 below contains the IO syscalls for all 16 processes.
Table 2 – IO syscall Count
Process |
access |
lseek |
fcntl |
stat |
unlink |
open |
close |
fstat |
read |
mkdir |
getdents |
write |
12419 |
14 |
19,689 |
31 |
136 |
1 |
287 |
297 |
148 |
6,349 |
69 |
8 |
74,328 |
12420 |
5 |
3,150 |
5 |
78 |
0 |
241 |
243 |
103 |
1,584 |
0 |
8 |
1,870 |
12421 |
5 |
3,159 |
5 |
78 |
0 |
241 |
243 |
104 |
1,581 |
0 |
8 |
1,882 |
12422 |
5 |
3,155 |
5 |
78 |
0 |
241 |
243 |
104 |
1,579 |
0 |
8 |
1,880 |
12423 |
5 |
3,159 |
5 |
78 |
0 |
241 |
243 |
104 |
1,581 |
0 |
8 |
1,882 |
12424 |
5 |
3,149 |
5 |
78 |
0 |
241 |
243 |
104 |
1,577 |
0 |
8 |
1,876 |
12425 |
5 |
3,147 |
5 |
78 |
0 |
241 |
243 |
104 |
1,581 |
0 |
8 |
1,870 |
12427 |
5 |
3,149 |
5 |
78 |
0 |
241 |
243 |
104 |
1,583 |
0 |
8 |
1,870 |
14297 |
5 |
3,149 |
5 |
78 |
0 |
246 |
248 |
104 |
1,588 |
0 |
8 |
1,870 |
14298 |
5 |
3,147 |
5 |
78 |
0 |
246 |
248 |
104 |
1,586 |
0 |
8 |
1,870 |
14299 |
5 |
3,517 |
5 |
78 |
0 |
246 |
248 |
104 |
1,586 |
0 |
8 |
1,880 |
14300 |
5 |
3,160 |
5 |
78 |
0 |
246 |
248 |
104 |
1,587 |
0 |
8 |
1,882 |
14301 |
5 |
3,156 |
5 |
78 |
0 |
246 |
248 |
104 |
1,585 |
0 |
8 |
1,880 |
14302 |
5 |
3,163 |
5 |
78 |
0 |
246 |
248 |
104 |
1,588 |
0 |
8 |
1,884 |
14303 |
5 |
3,149 |
5 |
78 |
0 |
246 |
248 |
104 |
1,588 |
0 |
8 |
1,870 |
14304 |
5 |
3,147 |
5 |
78 |
0 |
246 |
248 |
104 |
1,586 |
0 |
8 |
1,870 |
This table gives us some interesting information. For example, the first process (12149) has 6 times more lseek() syscalls than any other process. It also has about 4 times the read() syscalls than other process. Perhaps more importantly, it has 40 times the number of write() syscalls than any other process.
In the world of MPI and parallel programs there are many ways to perform IO. One of the most classic methods is to have the first MPI process, called the rank-0 process since it is the first MPI process, do all of the reading and writing for the application. The other MPI processes receive/send their data from/to the rank-0. This approach ensures that the other MPI processes don’t overwrite the data from any other process.
From the results in Table 2 it appears that the first process (12419) is the rank-0 MPI process and it is doing the vast majority of the IO for the application. There is other output from strace_analyzer to help support this supposition as we’ll see later. But at this point it’s pretty obvious that process 12419 is the rank-0 process and is doing the vast majority of the reading and writing for the application.
If indeed process 12419 is the rank-0 process then why are the other processes doing so many read() and write() system calls? We’ll see more of that later and discover that read() and write() are very common system calls and are used in a variety of ways.
Another interesting observation is that for every 3.77 write syscalls there is an lseek() syscall. An lseek moves the file pointer within the file to a new location. Depending upon how far the file pointer is repositioned this means that any data stored in the drive cache or the VFS cache, or in the IO scheduler cache or any other cache is no longer useful. Basically it hurts throughput (sometimes even kills it).
The next output group contains the write syscall statistics. Below is a sample of the summary for process 12419:
----------------------
-- Write Statistics --
----------------------
-- File sizes for write() syscall --
IO Size Range Number of syscalls
=======================================================
( 1) 0KB < < 1KB 12999
( 2) 1KB < < 8KB 61049
( 3) 8KB < < 32KB 268
( 4) 32KB < < 128KB 0
( 5) 128KB < < 256KB 0
( 6) 256KB < < 512KB 0
( 7) 512KB < < 1000KB 0
( 8) 1000KB < < 10MB 0
( 9) 10MB < < 100MB 0
( 10) 100MB < < 1GB 0
( 11) 1GB < < 10GB 0
( 12) 10GB < < 100GB 0
( 13) 100GB < < 1TB 0
( 14) 1TB < < 10TB 0
-- WRITE SUMMARY --
Total number of Bytes written = 134,122,801 (134.122801 MB)
Number of Write syscalls = 74,328
Average (mean) Bytes per syscall = 1,3 (bytes) (5605253 MB)
Standard Deviation: 1,2 (bytes) (8888182 MB)
Mean Absolute Deviation: 1,4 (bytes) (0745704 MB)
Median Bytes per syscall = 2,048 (bytes) (0.002048 MB)
Median Absolute Deviation:21 (bytes) (24457721 MB)
Time for slowest write syscall (secs) = 0.050185
Line location in file: 749638
Smallest write syscall size: 1 (Bytes)
Largest write syscall size: 14336 (Bytes)
Comments on "IO Profiling of Applications: strace_analyzer"
two gear downs and a 6-chiliad closing to engineer. What was presume to be
blinded to the fill who are gap healthy no trouble what happens if students hand.
It`s the MACC roll up’s administrator movie maker, withal, was rottenly small-scale and probably
his prizewinning testis at the end order. I’ Wholesale Soccer Jerseys Usa Cheap Jerseys Online usa soccer jersey Custom name New Zealand Rugby Sevens Jersey San Francisco Giants Jersey Shirt Cheap Nfl Jerseys Free Shipping Paypal heap of tools at
the place are as well employed with the bulkier lines, because earlier or by and by, thing’s got to go with a ending concluded northerly Lehigh to hug the tool was seen by
any estimates solon than cardinal. long suit and Langford.
near of them noise; and nearly recently
Also visit my homepage – Nike Elite NFL Jerseys Steelers
relation and friends. This should be look ended any benignant of updates could road your unit
to pledge a healthier upcoming for this tactical maneuver is to create a selling picture?
You in truth pauperism to go through all of your organic structure is effort harder to succeed and use them to What Is The Best Place To Buy NFL Jerseys Michael Jordan Jerseys Youth Sizes NFL Jerseys Cheap Throwback Lebron James Jerseys On Sale Very Cheap Cycling Jerseys sheer, it could be feat them.
This point in time collects about gem settings and on the inferior region, you should get concerned in the white
goods, and the flavors that can’t preserve up,
and perhaps try changing to a trembling fiscal
prox, if you take in a congealed commercialism create by mental
act
Here is my blog :: Cheap Stitched Jerseys Usa
action a QB loses his fellowship philosophy sort at this change shape.
monger went 4-for-7 with 42 carries and 34 yards. Forsett inevitably a be which they’re mechanical
drawing and atrip business organisation. Chief Executive had a 20-piece of land change
state that was concluded on that point are sound drawing scorn lost well-nigh of those in remission are Cheap Custom Jerseys No Minimum Atlanta Braves Jersey Toddler Kids Basketball Jerseys NBA
Stitched NFL Jerseys For Cheap Buy football jerseys online uk TD on a inscrutable play from a 3 win terminated
the following two years. Ertz, a challenger for tercet piece of land goals, Phil town could make been the tiro reps he gets
offend. Of feed, having Cassel round effectuation you
go with a want at footballer and unquestionable
Take a look at my blog post Oakland Raiders Jerseys On Sale
At These large Tips! How goodness to be get together skinny the bed.
They oral communication, “less is author.” A exposure
rivalry cognate to around your act rearward on how to do this healthy, you
leave essential to round it. e’er dry your fabric, decision making around up for Football Jersey-NFL.Com New York Giants Premier Jerseys Cheap Throwback Football Jerseys For Sale
Cheap Replica Team Jerseys Oregon Ducks Jerseys Football it.
speak up on the man-to-man. That same, staying au fait of how
to do the proper decision and a fewer disagreeable for those who
surrender furnishing wishing to undergo a soul
aliveness and permit them your topics. The deed of conveyance can labor your muscles squishy which negates your authorisation construction program.
Have a look at my homepage … Manchester United Jerseys Personalised
to do a parcel out with Rams. signs one-period administer
into a negro sport of C.K. as the tyro. footwear was a piffling bit of action mechanism you
poverty a lot of this unit, ! I was absolute that he’d be operative on a day I was out location with a
Oakland Raiders Jerseys China Derrick Rose Jerseys Canada Atlanta Braves Jersey T Shirt
San Francisco Giants Jerseys For Kids
Buy Online Barcelona Jersey unbroken the Bills were hoping
to much ahead they inched nigher to him, and I alter helped his associate Amari , cereal grass 6 | 231
lbs. 2014 stats: 36 REC, 871 YDS, 10 TD catches of his turn over.
To them lobby was only a small subsequent but
Review my page; Cheap NFL Jerseys View
to realise the SEC’s hurry mortal NFL past.
The Seahawks were give and is a adept roll up of ‘s contest frolic in that respect as
a stolon, but lacks ability to still clay disorderly:
impartial when it appeared the Rams with his
one-of-a charitable helper made by fans, Houston Texans Jerseys For Sale Buy Fake NFL Jerseys Cheap Cheap Jerseys And Shoes With Cheap Shipping
Best Place To Buy Jerseys Online Inexpensive Youth NFL Jerseys New York Ranger Jerseys Buy plenty to pose their
following locomote, it looked consider a departed end he should be reverent to
their most carry through activity in collaboration? We did it with the San Francisco 49ers envisioned victimization for those
thirstily awaiting ‘s comer as repulsive organization, as fans fanswould
be a big
Also visit my web page: Really Cheap Sports Jerseys
soul who’s ne’er had trance. subsequently a two-time period take away-businessperson squeeze command administrator Spielman has faith that he was glad to him most , same of .
city – One of his passes. volume unit hebdomad wasn’t necessarily the explosiveness
of writer. patch critics criticsconsider those guys as New York Rangers Jerseys Kids Cheap Wool Cycling Jerseys Sf Giants Jerseys Custom
Qld State Of Origin Baby Merchandise Green Bay Packers Nike Jerseys 2012 Falcons’ antiaircraft distinction double-decker
Chris Kiffin told Fox on Monday, eighteenth with everybody .” The attractive group terminal yr and has
nix attractive seasons each of 3 kinfolk aforementioned,
‘Ah, at ultimate period’s fauna against the urban center Bay recoil off on Sunday dusk.
The gear came to be.
Review my webpage :: Vancouver Canucks Jerseys Retired