SarCheck(TM): Automated Analysis of Linux data

(English text version 6.01.00)


This is an analysis of the data contained in the file 2005q1. The data was collected from 2005/01/21 to 2005/03/21 from system 'localhost'. There were 4853 data records, collected over 51 days, used to produce this analysis. The operating system kernel version was 2.2.16-22. The number of processors present could not be determined, so it was estimated from the CPU statistics; 1 processor is assumed to be present. 125 megabytes of memory are present.

The date format used in this report is yyyy/mm/dd. The date format was set in the sarcheck_parms file.

Data collected by the ps -elf command during 51 days between 2005/01/21 and 2005/03/21 will also be analyzed. This program will attempt to match the starting and ending times of the ps -elf data with those of the report file named 2005q1.

The default GRAPHDIR was changed with the -gd switch to /tmp.


DIAGNOSTIC MESSAGE: The number of disks in SarCheck's disk table is 1 and the table is 0.5 percent full. The number of entries in SarCheck's ps table is -1 and the table is -0.0 percent full.

Command line used to produce this report: analyze -ptoo -dtoo -png -gd /tmp -hgd ./ -html -t -diag 2005q1

SUMMARY

When the data was collected, no CPU bottleneck could be detected. No memory bottleneck was seen and the system has sufficient memory. A change has been recommended to at least one tunable parameter. Recommendations can be found in the Recommendations Section of this report.

Some of the defaults used by SarCheck's rules have been overridden using the sarcheck_parms file. See the Custom Settings section of the report for more information.

RECOMMENDATIONS SECTION

All recommendations contained in this report are based solely on the conditions which were present when the performance data was collected. It is possible that conditions which were not present at that time may cause some of these recommendations to result in worse performance. To minimize this risk, analyze data from several different days and implement only regularly occurring recommendations.

Change the bdflush parameter 'nfract' from 40 to 52. This is the percentage of dirty buffers allowed in the buffer cache before the kernel flushes some of them.

Change the bdflush parameter 'ndirty' from 500 to 250. This is the number of dirty blocks written to disk at one time when the bdflush daemon wakes up.

Change the bdflush parameter 'nrefill' from 64 to 128. This will allow the operating system to obtain more clean buffers when refill_freelist() is called.

Change the bdflush parameter 'nref_dirt' from 256 to 512. This is recommended in order to keep its value at 4 times the value of nrefill.

To change the value of the bdflush parameters immediately as described in the above recommendations, use the following command:

    echo "52 250 128 512 500 3000 500 1884 2" > /proc/sys/vm/bdflush

With some kernels, this will not work because the file /proc/sys/vm/bdflush is read-only and you may not be able to change its permissions. If you are able to make this change and it improves performance, you can make the change permanent by adding the command to the /etc/rc.d/rc.local file.
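
As a brief sketch of both steps (assuming the echo command shown above is accepted by your kernel), the current values can be checked first and the change made permanent afterwards:

    # Display the current nine bdflush values (nfract is the first field)
    cat /proc/sys/vm/bdflush

    # If the recommended values improve performance, reapply them at every boot
    echo 'echo "52 250 128 512 500 3000 500 1884 2" > /proc/sys/vm/bdflush' >> /etc/rc.d/rc.local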

RESOURCE ANALYSIS SECTION

23 reboots were detected and the first one occurred between 16:20:00 on 2005/01/21 and 08:30:00 on 2005/01/24. Data collected during reboot periods are ignored by this tool.

Average CPU utilization was only 0.4 percent. This indicates that spare capacity exists within the CPU. If any performance problems were seen during the monitoring period, they were not caused by a lack of CPU power. CPU utilization peaked at 18.44 percent from 11:30:00 to 11:40:00 on 2005/02/01. A CPU upgrade is not recommended because the current CPU has significant unused capacity.

Graph of CPU utilization

The average amount of free memory was 8154.7 pages or 31.9 megabytes. The minimum amount of free memory was 568 pages or 2.22 megabytes at 09:40:01 on 2005/02/21.

Graph of megabytes of free memory remaining

The above graph has been zoomed in to show the relationship between the size of the free list and the values of the freepages parameters.

The freepages.min value was 255 pages or 1.0 megabyte. The freepages.low value was 510 pages or 2.0 megabytes. The freepages.high value was 765 pages or 3.0 megabytes. If the system's free list drops below freepages.high, the kernel will start gently swapping. No significant memory bottleneck was seen. The number of pages of free memory occasionally dipped below the value of freepages.high but was never less than freepages.low.
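
No change to these thresholds is recommended here, but purely as an illustration (the 'min low high' layout of /proc/sys/vm/freepages is assumed for this 2.2-era kernel, and the values below are hypothetical), they can be inspected and adjusted as follows:

    # Show the current freepages.min, freepages.low and freepages.high values
    cat /proc/sys/vm/freepages

    # Hypothetical example only: raise the thresholds to 512, 1024 and 1536 pages
    echo "512 1024 1536" > /proc/sys/vm/freepages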

The value of nfract, the dirty buffer threshold used to wake up bdflush, was set to 40 percent. The goal of tuning nfract is to keep it low enough that dirty buffers cannot accumulate to the point where they degrade performance, but high enough to allow as many dirty buffers as possible to remain in the cache. In this case a recommendation was made to increase the value to 52 percent.

The value of ndirty was set to allow bdflush to write 500 buffers to the disk at one time. The recommended decrease to 250 should make I/O less bursty and will save a small amount of memory.

The nrefill parameter was set to 64 buffers. This parameter controls the number of buffers to be added to the free list whenever bdflush calls refill_freelist(). A recommendation was made to increase this value to 128, which should result in fewer calls to refill_freelist(). The nref_dirt parameter was set to allow refill_freelist() to wake up bdflush whenever it found more than 256 dirty buffers. A recommendation was made to increase this value to 512 to keep it properly aligned with nrefill.

The interval parameter in /proc/sys/vm/bdflush controls how frequently the kernel update daemon runs, and it was set to 500 jiffies. A jiffy is one clock tick; on x86 systems there are 100 jiffies per second, so an interval of 500 jiffies means the update daemon runs every 5 seconds.

The age_super parameter was set to write dirty metadata buffers to disk when they were 500 jiffies (5 seconds) old.

The value of the page_cluster parameter was 4. This means that 2^4 = 16 pages are read from swap at once. Values of 4 or 5 are better for large systems that perform non-interactive jobs using sequential I/O. There may be an advantage in lowering this value if response times need to be improved during heavy I/O, but I/O-bound jobs may suffer as a result.
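
As an illustration (assuming the parameter is exposed as /proc/sys/vm/page-cluster on this kernel; the value of 3 below is hypothetical and is not a recommendation of this report):

    # page_cluster is a power of two: a value of 4 means 2^4 = 16 pages per swap read
    cat /proc/sys/vm/page-cluster

    # Hypothetical example only: read 2^3 = 8 pages at a time to favor response times
    echo 3 > /proc/sys/vm/page-cluster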

The kswapd parameter tries_base was set to 512. This controls the number of pages that kswapd will try to free each time it runs. The kswapd parameter tries_min was set to 32. This controls the number of times that kswapd tries to free a page of memory when it's called. The kswapd parameter swap_cluster was set to 32. This controls the number of pages that kswapd will try to write when it is called.
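
As an illustration (assuming these three values are exposed together in /proc/sys/vm/kswapd, which is typical of 2.2-era kernels), they can be inspected or rewritten in one operation:

    # The three fields are tries_base, tries_min and swap_cluster
    cat /proc/sys/vm/kswapd

    # Example: rewrite the same values that were observed above
    echo "512 32 32" > /proc/sys/vm/kswapd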

The average page in rate was 0.201 per second. Page ins peaked at 27.34 per second from 09:30:00 to 09:40:01 on 2005/02/21. The average page out rate was 0.531 per second. Page outs peaked at 19.00 per second from 11:30:00 to 11:40:00 on 2005/02/01.

Graph of swap out rate

The average swap in rate was greater than zero but less than 0.01 per second. Swap ins peaked at 1.46 per second from 15:10:00 to 15:20:00 on 2005/03/18. The average swap out rate was 0.02 per second. Swap outs peaked at 4.03 per second from 12:00:00 to 12:10:00 on 2005/02/21.

Graph of swap space used

The amount of swap space in use peaked at 37.32 megabytes from 14:20:00 to 14:30:00 on 2005/03/18. The average amount of swap space in use was 28.36 megabytes. The size of swap space was 70.56 megabytes. The peak amount of swap space in use was 52.89 percent of the total.

There was one swap partition seen in /proc/swaps. The rate of swap operations peaked at 4.45 per second from 09:30:00 to 09:40:01 on 2005/02/21.
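
The swap configuration itself can be reviewed directly; either of the following commands lists each swap area with its size and current usage:

    cat /proc/swaps
    swapon -s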

There were 5 superblocks in use and a maximum of 256 superblocks were available. There is plenty of room for growth here.
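
On 2.2-era kernels these counters are normally exposed under /proc/sys/fs (an assumption; the exact file names are not stated in this report):

    # Superblocks currently in use, and the maximum the kernel will allow
    cat /proc/sys/fs/super-nr
    cat /proc/sys/fs/super-max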

According to data collected from /proc/partitions, the system-wide disk I/O rate averaged 0.44 per second and peaked at 26.15 per second from 09:30:00 to 09:40:01 on 2005/02/21. The read rate averaged 0.11 per second and peaked at 21.70 per second from 09:30:00 to 09:40:01 on 2005/02/21. The write rate averaged 0.33 per second and peaked at 4.88 per second from 11:20:00 to 11:30:00 on 2005/02/02.

Graph of systemwide disk I/O rate

The -dtoo switch has been used to format disk statistics into the following table.

Disk Device Statistics

Disk Device  Average  Peak     Average  Peak     Average   Average
             %busy    %busy    IO/sec   IO/sec   read/sec  write/sec
hda          0.13     15.06    0.44     26.15    0.11      0.33

The I/O rate on disk device hda averaged 0.44 per second and peaked at 26.15 per second from 09:30:00 to 09:40:01 on 2005/02/21. The read rate averaged 0.11 per second. The write rate averaged 0.33 per second. This disk was busy for an average of 0.13 percent of the time and was 15.06 percent busy at peak times.

Graph of disk percent busy

The noatime option was specified on at least one of the mounted filesystems. Because non-trivial levels of disk activity were seen, you may want to consider whether mounting additional filesystems with this option would be helpful. Disk activity on the busiest disk device, hda, peaked at 15.06 percent busy from 09:00:00 to 09:10:00 on 2005/02/03.
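
As a hypothetical example (the device and mount point below are illustrative only), a filesystem can be mounted with noatime either through /etc/fstab or by remounting it in place:

    # Illustrative /etc/fstab entry: suppress access-time updates on /home
    /dev/hda2   /home   ext2   defaults,noatime   1 2

    # Or remount an already-mounted filesystem without rebooting
    mount -o remount,noatime /home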

The value of the ctrl-alt-del parameter was 0. The value of 0 is better in almost all cases because it prevents an immediate reboot if the ctrl, alt, and delete keys are pressed simultaneously.
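
The current setting can be confirmed directly:

    # 0 = pass ctrl-alt-del to init for a clean shutdown; nonzero = reboot immediately
    cat /proc/sys/kernel/ctrl-alt-del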

There were an average of 135.90 interrupts per second and the peak interrupt rate seen was 440.06 per second from 09:30:00 to 09:40:01 on 2005/02/21. The following graph shows the total interrupt rate during the monitoring period.

Graph of Interrupt rate

During multiple time intervals on 2005/03/17, ps -elf data indicated a peak of 86 processes present. This was the largest number of processes seen with ps -elf, but it is not likely to be the absolute peak because the operating system does not store the true "high-water mark" for this statistic. An average of 83.3 processes were present.

Graph of the number of processes present

No runaway processes, memory leaks, or suspiciously large processes were detected in the data contained in the ps data files. No table was generated because no unusual resource utilization was seen in the ps data.

CAPACITY PLANNING SECTION

This section is designed to provide the user with a rudimentary linear capacity planning model and should be used for rough approximations only. These estimates assume that an increase in workload will affect the usage of all resources equally. They should be based on days when the load is heaviest, to determine approximately how much spare capacity remains at peak times.

NOTE: Since the capacity planning algorithms used by SarCheck are linear and use resource utilization during the peak interval, they may underestimate the remaining capacity of some resources when multiple days of data are analyzed at once.

Based on the data available, the system should be able to support a moderate increase in workload at peak times, and memory is likely to be the first resource bottleneck. See the following paragraphs for additional information.

Graph of remaining room for growth

The CPU can support an increase in workload of at least 100 percent at peak times. The system was able to meet its memory needs with minimal activity to reclaim pages of memory, and should be able to handle a moderate increase in workload. The busiest disk can support a workload increase of at least 100 percent at peak times. For more information on peak CPU and disk utilization, refer to the Resource Analysis section of this report.
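
As a rough worked example of the linear model (an approximation of the approach described above, not necessarily SarCheck's exact algorithm): peak CPU utilization was 18.44 percent, so the remaining headroom is (100 - 18.44) / 18.44, or roughly 4.4 times the peak workload, which is why the report can state "at least 100 percent" with confidence.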

CUSTOM SETTINGS SECTION

The default SYSUSR threshold was changed in the sarcheck_parms file from 2.5 to 2.8.

The default HSIZE was changed in the sarcheck_parms file from 0.75 to 1.20.

The date format of yyyy/mm/dd was set in the parms file.

The PS keyword was found in the parms file and has been used to specify the analysis of ps -elf data.

Please note: In no event can Aptitune Corporation be held responsible for any damages, including incidental or consequential damages, in connection with or arising out of the use or inability to use this software. All trademarks belong to their respective owners. This software is licensed exclusively for use on a single system by: Your Company. This software expires on 2005/06/19 (yyyy/mm/dd). Code version: SarCheck for Linux 6.01.00. Serial number: 77777888.

This software is updated frequently. For information on the latest version, contact the party from whom SarCheck was originally purchased, or visit our web site.

(c) Copyright 2003-2005 by Aptitune Corporation, Plaistow NH 03865, USA, All Rights Reserved. http://www.sarcheck.com/

Statistics for system: localhost
                                      Value                          Start of peak  End of peak  Date of peak
                                                                     interval       interval     interval
Statistics collected from:            2005/01/21
Statistics collected until:           2005/03/21
MAC Address:                          00:02:B3:3A:41:58
Average combined CPU utilization:     0.37%
Average user CPU utilization:         0.25%
Average sys CPU utilization:          0.12%
Average 'nice' CPU utilization:       0.00%
Peak combined CPU utilization:        18.44%                         11:30:00       11:40:00     2005/02/01
Average page out rate:                0.53/sec
Peak page out rate:                   19.00/sec                      11:30:00       11:40:00     2005/02/01
Average swap out rate:                0.02/sec
Peak swap out rate:                   4.03/sec                       12:00:00       12:10:00     2005/02/21
Average swap space in use:            28.36 megabytes
Peak swap space in use:               37.32 megabytes                14:20:00       14:30:00     2005/03/18
Average amount of free memory:        8155 pages or 31.9 megabytes
Minimum amount of free memory:        568 pages or 2.22 megabytes    09:40:01                    2005/02/21
Average system-wide I/O rate:         0.44/sec
Peak system-wide I/O rate:            26.15/sec                      09:30:00       09:40:01     2005/02/21
Average read rate:                    0.11/sec
Peak read rate:                       21.70/sec                      09:30:00       09:40:01     2005/02/21
Average write rate:                   0.33/sec
Peak write rate:                      4.88/sec                       11:20:00       11:30:00     2005/02/02
Disk device w/highest peak:           hda
Avg pct busy for that disk:           0.13%
Peak pct busy for that disk:          15.06%                         09:00:00       09:10:00     2005/02/03
Average Interrupt rate:               135.90/sec
Peak Interrupt rate:                  440.06/sec                     09:30:00       09:40:01     2005/02/21
Avg number of processes seen by ps:   83.3
Max number of processes seen by ps:   86
Approx CPU capacity remaining:        100%+
Approx I/O bandwidth remaining:       100%+
Can memory support add'l load:        Moderate