Musing on Metrics column
For those STC chapters that hold one, one of the biggest projects of the year can be the Publications Competition. I'm an experienced hand at the Boston competition, having participated over the years as an entrant, judge, senior judge, Best of Show judge, member of the chapter competitions committee, and judge trainer. I've also judged for the International Technical Publications Competition, open to chapter-level winners of the highest award level (Distinction).

I'll bet even some writers who believe documentation quality can't be measured enter the competition. Why? Because they know their work will be judged by a team of seasoned working professionals. Even writers who balk at the notion of objective quality can accept the integrity and judgment of their peers.

This is how the Boston competition works. The judges assign an overall rating to each entry on a 1-to-5 scale. The competition overseers want to help judges render objective and consistent opinions, so we provide a checklist of attributes for them to consider. The checklist for printed entries consists of 46 attributes grouped into seven major categories:

1. Audience and purpose (five attributes)
2. Organization (eight attributes)
3. Content (seven attributes)
4. Writing and editing (six attributes)
5. Illustrations/graphics (seven attributes)
6. Layout and design (seven attributes)
7. Production and integration (six attributes)

For example, the writing and editing attributes are:
After using it on thousands of entries, we're confident that the checklist covers everything that contributes to development of an outstanding document. By and large, we're happy with the results: the good documents win.

Does the checklist reflect ideas of documentation quality? Definitely! Half the attributes touch on appropriateness, consistency, or suitability, so the overall spirit is Juran's fitness for use.

I can think of easy metrics for nine attributes:
Another ten attributes, such as a clear audience definition, are yes/no questions. Finally, some attributes, such as consistency of terminology and phrasing, are measurable with difficulty.

Admittedly, some attributes (such as appropriateness of tone and style) seem beyond measurement. But even in those areas, I think tallying takes place. For example, judges note instances of inappropriate tone and style and, past some threshold, may decide it's a problem and downgrade the score.

Overall, I believe all the attributes are measurable, and if we could define metrics for all 46 attributes, we could derive a single score that reflects the quality of an entry. This may give some readers pause. But this is exactly what judges do now. They assign ratings of 1 to 5 to each attribute, then each category, and finally each document. We don't ask for an exact rank-ordering of all entries, but we do expect entries with higher overall ratings to win higher-level awards. The process seems mysterious only when a person does not fully understand how judges arrive at their individual assessments. I would argue that the better we can understand the process, the more objective we can make it.

There are shortcomings to my approach, but they reflect limitations of the competition itself. Each category (print, online, and art) has subcategories, within which the questions have different weights. (For example, an index is critical for a reference manual, but unnecessary for a quick-reference document.) Even within a single subcategory (say, GUI-based user's guides), there is no industry-wide agreement on attribute weighting; much work remains to be done. Some critical quality attributes remain beyond our ability to judge. Without access to the products themselves, we cannot judge timeliness and accuracy. But on the whole, evaluation by measurement can accurately model the actual judging process.
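
To make the roll-up from attributes to categories to a single entry score concrete, here is a minimal Python sketch. It is not the competition's actual procedure: the simple averaging, the per-category weights, and the rule that converts a tally of noted problems into a rating are all assumptions chosen for illustration, and the function names (`category_rating`, `entry_score`, `rating_from_tally`) are hypothetical. Only the seven category names and their attribute counts come from the printed-entry checklist described above.

```python
# Sketch only: the averaging rule, the weights, and the tally rule are
# illustrative assumptions, not the competition's actual procedure.

# Categories and attribute counts for printed entries, as listed in the column.
PRINT_CATEGORIES = {
    "Audience and purpose": 5,
    "Organization": 8,
    "Content": 7,
    "Writing and editing": 6,
    "Illustrations/graphics": 7,
    "Layout and design": 7,
    "Production and integration": 6,
}


def category_rating(attribute_ratings):
    """Average one category's attribute ratings (each on the 1-to-5 scale)."""
    if not attribute_ratings:
        raise ValueError("a category needs at least one attribute rating")
    for rating in attribute_ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-to-5 scale")
    return sum(attribute_ratings) / len(attribute_ratings)


def entry_score(ratings_by_category, weights=None):
    """Weighted average of category ratings; the result stays on the 1-to-5 scale.

    `weights` maps a category name to a relative weight (hypothetical values,
    since no agreed weighting exists); categories left out default to 1.0.
    """
    weights = weights or {}
    total = 0.0
    total_weight = 0.0
    for name, ratings in ratings_by_category.items():
        weight = weights.get(name, 1.0)
        total += weight * category_rating(ratings)
        total_weight += weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return total / total_weight


def rating_from_tally(instances, threshold, top=5, floor=1):
    """Turn a count of noted problems into a rating.

    Assumed rule: no penalty for up to `threshold` instances, then one point
    off for each further full or partial block of `threshold` instances.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    penalty = max(0, instances - 1) // threshold
    return max(floor, top - penalty)


if __name__ == "__main__":
    # A made-up entry rated 4 on every attribute, except one writing attribute
    # whose rating comes from a tally of seven noted problems.
    entry = {name: [4] * count for name, count in PRINT_CATEGORIES.items()}
    entry["Writing and editing"][0] = rating_from_tally(instances=7, threshold=3)
    print(round(entry_score(entry), 2))
```

Passing a different `weights` mapping for each subcategory is one way to express the point that, for instance, an index matters more for a reference manual than for a quick-reference document. The weights themselves would still have to be agreed on by the judges.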
Next time: Prove your quality
About the Author: Steven Jong, STC Senior Member, Boston Chapter