chuckmaginessfabals
New member
Hi. I'm logging performance counters on a 2008 R2 server, and I'm after a bit of advice on interpreting the results from the disk counters, if anyone can help. Sorry if what follows seems convoluted, but I'm confused by a lack of clear rules by which to interpret disk counters reliably.
Most of the MS stuff I've found just reiterates, and elaborates on, the 'Explain'/'Description' text, but one non-MS posting I found casts doubt on the reliability of seemingly key counters like % Disk Time, Current Disk Queue Length and Avg. Disk Queue Length. For example:
% Disk Time
===========
I've read that this counter is 'capped' and therefore does 'not actually measure disk utilization'.
I also find it returns figures of several hundred per cent where RAID is involved, and I'm not sure it's as simple as dividing that by the number of disks in an array to get a meaningful figure.
Current Disk Queue Length
=========================
This counter is, apparently, unreliable, because, 'If requests are queued in the hardware, which is usual for SCSI disks and RAID controllers, the Current Disk Queue Length Counter will show a value of 0, even though requests are queued.'
Avg. Disk Queue Length
======================
This counter, I read, is derived from Avg. Disk sec/Transfer and Disk Transfers/sec, and requires an 'equilibrium assumption' to be factored in, namely, 'that the arrival rate equals the completion rate over the measurement interval. Otherwise, the calculation is meaningless.'
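To check I've understood that derivation, here is a minimal sketch in Python of what I take it to be (essentially Little's Law: average number in the system = arrival rate x average time in the system). The function name and the sample figures are my own, purely for illustration; they are not taken from my logs or from any Microsoft documentation.

```python
def avg_disk_queue_length(avg_sec_per_transfer, transfers_per_sec):
    """Estimate Avg. Disk Queue Length from the two counters it is said to be derived from.

    Assumption: the estimate only holds if arrivals roughly equal completions
    over the interval (the 'equilibrium assumption' quoted above).
    """
    return avg_sec_per_transfer * transfers_per_sec


# Hypothetical sample: 20 ms per transfer at 150 transfers/sec.
estimate = avg_disk_queue_length(0.02, 150)
print(f"Estimated Avg. Disk Queue Length: {estimate:.2f}")
```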
The corollary of this, apparently, is that the Avg. Disk Queue Length counter value should not be accepted as reliable except where the current value of Current Disk Queue Length is the same as the previous value of Current Disk Queue Length.
In a recent log, the only instances of this were where the current and previous values for Current Disk Queue Length were 0 (though other values were recorded at other times). Given that 0 is supposedly an unreliable value for Current Disk Queue Length, does this render the Avg. Disk Queue Length values for these intervals meaningless?
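For what it's worth, this is roughly how I picked out those intervals. It's only a sketch in Python, assuming the Current Disk Queue Length samples have already been exported from the log into a plain list in time order; the function name and the sample values are made up for illustration.

```python
def equilibrium_intervals(current_queue_lengths):
    """Return the indices of samples whose Current Disk Queue Length
    equals that of the previous sample.

    Assumption: equal start and end queue lengths are taken as the sign that
    arrivals matched completions over that interval, per the rule quoted above.
    """
    return [
        i
        for i in range(1, len(current_queue_lengths))
        if current_queue_lengths[i] == current_queue_lengths[i - 1]
    ]


# Hypothetical samples, one per logging interval.
samples = [0, 0, 3, 1, 0, 0, 2]
print(equilibrium_intervals(samples))  # prints [1, 5]
```

In this made-up series, as in my real log, the only matching intervals are the ones where both values are 0, which is exactly the case I'm unsure about.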
Any advice on how to interpret these (and any other) disk counters to get meaningful figures on disk performance (specifically, whether the disk is a likely bottleneck) would be greatly appreciated.