CHI '95 Proceedings
Short Papers

User Action Graphing Effort (UsAGE)

Dana L. Uehling

Code 522.1
NASA / Goddard Space Flight Center
Greenbelt, MD 20771
(301) 286-3375, Dana.Uehling@gsfc.nasa.gov

Karl Wolf

Century Computing, Inc.
1014 West Street
Laurel, MD 20707
(301) 953-3330, kwolf@cen.com

© ACM

Abstract

This paper describes a prototype usability test tool which will automate detection of serious usability problems. The tool records the actions that a user makes while performing a predefined application task. Currently the tool supports only user interfaces created with TAE Plus.

Prior to a usability testing session, an "expert" user is recorded performing a task. The recording becomes a performance baseline. Later, during actual usability testing, a "novice" user is recorded performing the same task. The action recordings of the two users are then compared by the tool and the comparison results are shown graphically. The hypothesis is that by graphically comparing the actions of an expert user with those of a novice user, a usability analyst can quickly identify where usability problems (e.g. confusion with menu choices) arise with the user interface.

Keywords:

Usability testing, user interface design, TAE Plus.

Introduction

It is well known that there are many benefits to evaluating a user interface for usability. One method of evaluating a user interface is through usability testing. The testing involves observing a typical user performing predefined tasks with a system. Various types of information may be recorded including the time it takes to perform the task, the number and types of errors made, and the user's rating of the system. Often video recordings of the user sessions are also made. This data is then analyzed to identify problem areas in the user interface. This analysis is largely a manual process and can be quite time consuming.

As part of the usability testing method we have been using at NASA's Goddard Space Flight Center, we record one or more developers performing the usability tasks and use their performance as a baseline for defining the desired performance of a novice user [3]. Since the developers of the system typically know the optimum way to perform a task, a large deviation from this optimum performance may signify a usability problem. A tool to automate this comparison of performance could be useful and cost effective.

A paper by Asahi and Iseki [1] describes a tool that analyzes logs of user "events" combined with state information about the application and produces various graphical representations of the analysis. Here, the authors were analyzing a finite-state-machine application for a fax machine user interface. The number of states was small and well defined, and each logged event contained this state information.

Several questions were raised by this paper: Given that our applications typically are not written as finite state machines, can a tool like this be developed to help with our usability evaluations? How can we apply similar recording and analysis techniques to our applications? What is the best way to graphically show the matching of user actions?

DEVELOPMENT of UsAGE

Early in the development cycle we spent some time researching efforts by others to automate the analysis of usability testing data. While there have been efforts to automate the collection of data during a usability test [2], much of the analysis of this data is still a manual process. Our goal is to automate this process as much as possible.

The quest for automation forced us to answer many fundamental questions: What is a measurable action? What information is logged in the event record and how do we best use it? How do we compare actions? What criteria do we use? What is the best way to display the comparison results? How do you get an analyst involved? How can the analyst provide feedback to the tool so the tool can make improved comparisons?

DESCRIPTION of UsAGE

We based our tool, UsAGE, on TAE Plus, a user interface development and management system [4], since TAE already contained the ability to record user actions. A utility called usage_collect was developed to record and store the actions of the users during a usability test session. The utility also enables the usability test administrator to record time-stamped comments during the test session.

After the usability testing is complete, the usability test analyst runs the usage_analyze utility. This utility allows the analyst to specify two record files to compare, an "expert" file and a "novice" file. The files are passed to a third utility called usage_compare, a Perl script, which performs the comparison. Several comparison techniques are available for matching the novice action nodes to the expert action nodes: forward from last matched node, forward from first unmatched node, nearest neighbor to last matched node, and best fit. For example, the forward-from-last-matched-node method tries to compare the next novice node to the expert node that is after the last matched expert node. If that node does not match the novice node, it will try to match the novice node with the next expert node in the series. It continues searching through the expert nodes until a match is found or until it has exceeded the analyst-specified event match window.
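The forward-from-last-matched-node method described above can be sketched as follows. This is a minimal illustration in Python, not the usage_compare Perl script itself. It assumes that two actions match when their event names are equal and that the match window counts the expert nodes examined after the last match; the function name, the event names, and both of these rules are hypothetical.

```python
from typing import List, Optional


def match_forward_from_last(expert: List[str], novice: List[str],
                            window: int) -> List[Optional[int]]:
    """For each novice action, return the index of the matched expert
    action, or None if no match is found within the window.

    Assumption: a match means the event names are equal.
    """
    matches: List[Optional[int]] = []
    last_matched = -1  # index of the last matched expert node
    for action in novice:
        found = None
        start = last_matched + 1
        # Examine at most `window` expert nodes after the last matched one.
        for i in range(start, min(start + window, len(expert))):
            if expert[i] == action:
                found = i
                break
        if found is not None:
            last_matched = found
        matches.append(found)
    return matches


# Hypothetical event names, for illustration only.
expert = ["open_menu", "select_file", "press_ok", "close"]
novice = ["open_menu", "press_help", "select_file", "close"]
print(match_forward_from_last(expert, novice, window=2))
# prints [0, None, 1, 3]
```

In this example the novice's "press_help" action has no counterpart in the expert sequence, and the novice skips "press_ok", so the final "close" is matched only because it still falls inside a window of two expert nodes.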

After the comparison is performed, usage_analyze displays the results in a graph of action nodes (Figure 1). The "expert" series of actions is displayed linearly across the top of the graph. The novice series of actions is displayed in comparison to the expert's actions, with arrows denoting the order in which the actions were performed. Out-of-sequence actions are represented by arced connecting arrows pointing in the forward or reverse direction as necessary. Unmatched actions taken by the novice appear as nodes placed vertically below previously matched expert nodes. Currently, the nodes are labeled with the TAE event name.

[FIGURE 1: UsAGE graph of novice nodes vs. expert nodes.]

In addition to the graph, usage_analyze displays some metrics about the comparison results including percentage of expert nodes matched to novice nodes, ratio of novice to expert nodes, and percentage of unmatched novice nodes.
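The three metrics listed above could be computed from a match result along these lines. This is a sketch under stated assumptions: the match result is represented as one entry per novice action (an expert index, or None when unmatched), and the function and key names are hypothetical rather than taken from usage_analyze.

```python
from typing import Dict, List, Optional


def comparison_metrics(matches: List[Optional[int]],
                       n_expert: int) -> Dict[str, float]:
    """Summarize a novice-to-expert match result.

    `matches` holds one entry per novice action: the index of the matched
    expert action, or None. Both sequences are assumed to be non-empty.
    """
    n_novice = len(matches)
    # Distinct expert nodes that at least one novice action matched.
    matched_expert = {m for m in matches if m is not None}
    unmatched_novice = sum(1 for m in matches if m is None)
    return {
        "pct_expert_matched": 100.0 * len(matched_expert) / n_expert,
        "novice_to_expert_ratio": n_novice / n_expert,
        "pct_novice_unmatched": 100.0 * unmatched_novice / n_novice,
    }


metrics = comparison_metrics([0, None, 1, 3], n_expert=4)
print(metrics)
```

For the sample input, three of four expert nodes are matched (75 percent), the novice performed as many actions as the expert (ratio 1.0), and one of four novice actions is unmatched (25 percent).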

During early prototype testing of UsAGE, we found that it was hard for the analyst to conceptually match the nodes on the graph with the actual operations of the application. As a result, we added the capability to play back selected portions of the graphed actions using the real application user interface. This allows further investigation of the actions of the novice user. When the problems found are reported, the playback feature can be used to illustrate the problem areas to the application developers.

FUTURE DIRECTIONS

Most of the effort so far has been on exploring matching techniques and adding basic functionality. We are still exploring how to graphically represent the time spent between actions, with the knowledge that a large pause between actions may represent time the user spent trying to decide what action to try next and therefore may indicate a usability problem.
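One simple way to surface such pauses before deciding how to draw them is to flag gaps between consecutive time-stamped actions that exceed a threshold. The sketch below assumes each recorded action carries a timestamp in seconds; the record layout, the function name, and the threshold value are hypothetical and are not part of UsAGE.

```python
from typing import List, Tuple


def long_pauses(events: List[Tuple[float, str]],
                threshold: float) -> List[Tuple[str, str, float]]:
    """Return (previous action, next action, gap in seconds) for every
    pair of consecutive actions separated by more than `threshold`."""
    flagged = []
    for (t0, a0), (t1, a1) in zip(events, events[1:]):
        gap = t1 - t0
        if gap > threshold:
            flagged.append((a0, a1, gap))
    return flagged


# Hypothetical recording: (seconds since start, event name).
events = [(0.0, "open_menu"), (1.5, "select_file"),
          (14.0, "press_ok"), (15.0, "close")]
print(long_pauses(events, threshold=10.0))
# prints [('select_file', 'press_ok', 12.5)]
```

A suitable threshold would likely depend on the task; it could, for instance, be derived from the expert's own timing rather than fixed in advance.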

Another feature we would like to add is the ability to alter the matching of the action nodes. If a task involves similar actions, UsAGE may incorrectly match certain novice nodes to certain expert nodes. The usability analysts, based on observations during the testing and their knowledge of the tasks, may know that the novice node actually matched a different expert node. The analyst would then want to be able to specify the particular matching and have UsAGE reevaluate the other parts of the action sequence.

Other ways to graph the nodes and other metrics derived from the node data also need to be explored. For example, if we can display more than one novice compared to an expert on one graph, this would provide the ability to see if each novice deviates from the expert's path at the same point in the series of actions. This could suggest a common usability problem.
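As a rough illustration of the multi-novice idea, the point at which each novice first leaves the expert's path can be tallied across sessions. The sketch assumes each session is a list with one entry per novice action (the matched expert index, or None when unmatched) and defines the deviation point as the last expert node matched before the first unmatched action. Both the representation and this definition are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional


def first_deviation(matches: List[Optional[int]]) -> Optional[int]:
    """Return the expert index last matched before the first unmatched
    novice action, -1 if the very first action is unmatched, or None if
    every novice action was matched."""
    last = -1
    for m in matches:
        if m is None:
            return last
        last = m
    return None


def common_deviations(per_novice: Dict[str, List[Optional[int]]]) -> Counter:
    """Count how many novices first deviate after each expert node."""
    points = (first_deviation(m) for m in per_novice.values())
    return Counter(p for p in points if p is not None)


# Hypothetical match results for three novice sessions.
sessions = {
    "novice_a": [0, None, 1, 3],
    "novice_b": [0, None, None, 1],
    "novice_c": [0, 1, 2, 3],
}
print(common_deviations(sessions))
# prints Counter({0: 2})
```

Here two of the three novices first deviate immediately after expert node 0, which is the kind of shared departure point that might indicate a common usability problem.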

Finally, we need to apply this tool as part of a usability testing process and evaluate its effectiveness at locating usability problems and its own usability. This work and analysis is key to knowing if the hypothesis is valid.

Acknowledgments

The authors would like to acknowledge NASA's Marti Szczur and Sylvia Sheppard, who provided much support for this task.

References

1. Asahi, T. and Iseki, O. UI-tester: A Tool for Measuring Usability. HCI International '93 Abridged Proceedings, p. 216, 1993.

2. Hoiem, Derek E. and Sullivan, Kent D. Designing and using integrated data collection and analysis tools: Challenges and considerations. Behaviour & Information Technology, 1994, Vol. 13, Nos. 1 and 2 (Jan. - April), 160-170.

3. Szczur, Martha. Usability testing - on a budget: a NASA usability test case study. Behaviour & Information Technology, 1994, Vol. 13, Nos. 1 and 2 (Jan. - April), 106-118.

4. Szczur, M. and Sheppard, S. TAE Plus: Transportable Applications Environment Plus: A User Interface Development Environment. ACM Transactions on Information Systems, Vol. 11, No. 1, January, 1993.