Message Severity Levels: How Much Is Enough?

By Martin Schwirzke and Mayuresh Ektare

This article describes how we investigated software message severity levels using surveys in a series of usability tests, and how the results helped us create a standard set of severity levels. These findings can also be applied to other messages.

Introduction

Cadence Design Systems is a supplier of electronic design automation software, used to design semiconductors and other electronics-based products. Each of Cadence Design Systems' software products uses a different system for error messages, a practice that generates inconsistent messages and results in a high volume of customer support calls. To improve customer satisfaction and reduce support costs, the Cadence Error Message Improvement Project (EMIP) team developed message writing standards and a unique ID-based message handling system. The message system and writing standards require a standard set of severity levels to help enforce consistency. Our work supported the development of this standard.

We decided to perform a usability test to evaluate severity level granularity and to identify a set of severity levels that software development teams will use consistently. We hypothesized that increasing the number of severity levels, or granularity, does not improve a user's ability to interpret the problem described in the message.

Method

Twenty-three test subjects, all Cadence employees, volunteered to participate in three different online surveys. The purpose of the surveys was to determine how many severity levels were needed to help subjects identify the type of problem in the message scenarios. The surveys allowed us to explore this issue from different perspectives: subjects either rated pre-assigned severity levels or assigned their own levels. We expected that subjects would need only a small number of severity levels (Error and Warning) to identify problems in the messages.
Each survey used the same six message scenarios. The messages adhered to our message writing standards. In the first survey, four subjects rated the appropriateness of the Fatal, Error, and Warning severity levels pre-assigned to the message texts. The second survey was a fill-in survey in which three subjects assigned severity levels to the same message texts. In the third survey (based on a modified version of the fill-in survey), 16 subjects were allowed to define their own severity levels.

Results

Pre-Assigned Levels Survey

In the first survey, test subjects agreed with the appropriateness of the three pre-assigned severity levels 71% of the time, but this agreement was unevenly distributed across the message types. Respondents agreed with the severity levels assigned to Error messages 88% of the time, but only 62% of the time for Warning and Fatal messages (Figure 1).

Respondents rated the severity level and the text in the message scenario as equally important to resolving the problem 67% of the time. In the remaining eight cases, nearly twice as many respondents rated the message text (21%) as more informative than the severity level (12%). The message text was more informative because it contained detailed information about the problem. The severity level acts more like a message ID, because it allows users to identify the type of problem.
Fill-In Survey

When asked to assign levels to the messages themselves, the subjects assigned the correct severity level only 50% of the time, and the results differed greatly by level. All of the Fatal messages were incorrectly assigned the Error severity level. The results suggest that message severity levels are extremely subjective: a subject's prior experience with other messages influences his or her perception of message severity levels. (The survey instructions can be ruled out as a reason for this finding, because subjects were asked to assign their own severity levels; severity levels were not provided in the fill-in survey.) Although the severity level was correctly assigned to Error messages 83% of the time, only half of the Warning messages were correctly assigned (Figure 2).
Follow-Up Survey

In the follow-up survey, subjects assigned the correct severity level to the messages 40% of the time. Subjects used five different terms to describe severity levels: Fatal (4 times), Failure (2), Error (55), Warning (25), and Info (1). An interesting result was that they assigned the Error severity level to Fatal messages 73% of the time. (One survey was incomplete, so six ratings were not used, and several subjects did not provide a rating in three more cases; in total, nine ratings were missing from this study.) The severity level was correctly assigned to Error messages 64% of the time, while Warning messages (33%) and Fatal messages (4%) received far fewer correct assignments (Figure 3).
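As a quick consistency check on the follow-up survey figures, the reported term counts can be tallied against the number of usable ratings. The sketch below is illustrative Python, not part of the original study. The variable names are hypothetical, and it assumes that each of the 16 subjects was asked to rate all six message scenarios.

```python
from collections import Counter

# Term counts as reported for the follow-up survey.
term_counts = Counter({"Fatal": 4, "Failure": 2, "Error": 55, "Warning": 25, "Info": 1})

# Assumption: every subject was asked to rate every scenario.
SUBJECTS = 16
SCENARIOS = 6
MISSING = 9  # six ratings from the incomplete survey plus three skipped ratings

expected_ratings = SUBJECTS * SCENARIOS - MISSING
recorded_ratings = sum(term_counts.values())

# Share of ratings that used only the Error or Warning terms.
error_or_warning = term_counts["Error"] + term_counts["Warning"]
share = error_or_warning / recorded_ratings

print(f"expected usable ratings: {expected_ratings}")
print(f"recorded ratings:        {recorded_ratings}")
print(f"Error or Warning share:  {share:.0%}")
```

Under that assumption the reported counts account for every usable rating, and Error and Warning together make up the large majority of the terms that subjects chose in this survey.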
Conclusions

Subjects perceived all of the Fatal messages to be Error messages in the fill-in survey. The results of the follow-up survey reinforce the finding that subjects tend to rate Fatal messages the same as Error messages. When asked to provide severity levels for the messages, subjects consistently used two levels: Warning and Error (fill-in survey). This supports the argument that increasing the granularity of severity levels is unnecessary.

Across all three surveys, the combination of message text and severity level was rated most important to solving the problem. In the remaining cases, message text was rated as more informative than the severity level by a margin of almost 2:1.

The severity levels listed in our message writing standards were used in 93% of the survey responses (that is, Error and Warning; the Information and Question levels were not tested). The observation that subjects did not increase the granularity of the severity levels when rating the messages suggests that a smaller number of severity levels is sufficient. The benefits of using fewer severity levels include: (1) more consistent application by software development teams and (2) less confusion for users.

Recommendations

As a result of our survey, we recommend using the following severity levels:

Note: The Information and Question severity levels were not tested, but they are included in the recommended set because messages with Information and Question severity levels are easily identified by users, and they are used consistently at Cadence Design Systems.