




Casey Boyd
This work explores two categories for evaluating and measuring virtual
environment (VE) interfaces. One category concerns characteristics of
the interface, such as its complexity and abstractness. The other
category concerns the human capacities for understanding and using
three-dimensional input/output devices. The results may help us
predict the usability of VE interfaces and help us to design
interfaces that are well matched to their intended users.
Interfaces to virtual environment systems are awkward. Part of that
comes from having to wear cumbersome hardware devices. Technical
developments will someday make the hardware small and lightweight, but
we will still have serious problems in interacting with VEs. The
persistent problems arise from characteristics of the interfaces we
use to perform different tasks in VEs and from the perceptual,
cognitive, and sensory-motor abilities of the human users, not from
limitations of the hardware.
A VE system can offer its users many different interaction metaphors
because of its freedom from physical constraints. A user could choose
between a skateboard, a car, a boat, an airplane or helicopter, and a
rocket. As powerful as each of those metaphors is, most people would
crash immediately. Operator skills for complex and powerful vehicles
are difficult to learn. Yet those vehicles might afford a navigation
metaphor that is well suited to some tasks.[3]
The NAVE (Navigating and Acting in Virtual Environments) research
group has developed some VE systems (figure 1) with several
different models of navigation. Hundreds of visitors have tried the
systems during a number of open houses. Individuals varied widely in
their success at learning the interface controls. However, people in
general seem quick to learn the simpler interfaces. What slows them
down with the others is the complexity of the command language and the
indirectness of the mapping from user input to system response.
Simple interfaces constrain the capabilities of any system. An example
of a simple VE interface is a walking model. A complex interface can
add whole dimensions of control and interaction, but requires learning
a command language and understanding the implications of the added
power. Examples of more complex and powerful interfaces are an
automobile simulator and an aircraft simulator.
Three-dimensional interface metaphors vary widely in their purpose and
style. We can walk virtually inside an architectural model that is
rendered natural looking with 3D computer graphics.[2, 4] People know
how to walk from point A to point B and how to turn their head to
orient themselves along the way. An interface can provide a walking
metaphor by directly mapping the user's input to motion in the VE. A
forward step in the real world is tracked and causes forward motion in
the VE. A walking metaphor for navigating satisfies the requirements
of an architectural walkthrough task.
Other tasks require other metaphors. Metaphors less direct than
walking will use functional transformations between the input and
control system layers. A user's input will be transformed somehow into
a control sequence for moving the ego location through the VE. What
will the transformation be? It may be more or less expressive,
requiring the user to learn a command language of some complexity. It
will occupy some point on a continuum of abstraction, depending on the
navigation metaphor. It could simulate a realistic, natural motion or
some new virtual motion free from physical constraints.
The Entity VE system used in this work implements several models of
navigation.[1] The simplest one maps the sensor position and
orientation directly into ego position and orientation in the
VE. After a couple of minutes of play with the system, practically
every user can successfully perform a simple task immediately after I
suggest it. But a direct mapping model limits movement to the volume
in the VE that corresponds to the sensor volume. No matter how much
improvement we get in the range of tracking systems, we still won't be
able to walk everywhere. We will need navigation models with greater
range and speed.
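The direct mapping model and its range limit can be illustrated with a
minimal C++ sketch. This is not Entity's code. The names Pose,
TrackedVolume, directMap, and reachable, the fixed offset that places
the tracked volume in the VE, and the storage of orientation as three
angles are all assumptions made for illustration.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

// Sensor or ego pose. Storing orientation as three angles is an assumption.
struct Pose {
    Vec3 position;
    Vec3 orientation;
};

// Axis-aligned box, in sensor coordinates, that the tracker can cover (hypothetical).
struct TrackedVolume {
    Vec3 min;
    Vec3 max;
};

// Direct mapping: the ego pose is the sensor pose, shifted by the fixed
// placement of the tracked volume inside the VE. Orientation passes through.
inline Pose directMap(const Pose& sensor, const Vec3& veOffset) {
    return Pose{{sensor.position.x + veOffset.x,
                 sensor.position.y + veOffset.y,
                 sensor.position.z + veOffset.z},
                sensor.orientation};
}

// A VE point can be visited under direct mapping only if it falls inside
// the image of the tracked volume.
inline bool reachable(const Vec3& target, const TrackedVolume& vol,
                      const Vec3& veOffset) {
    assert(vol.min.x <= vol.max.x && vol.min.y <= vol.max.y &&
           vol.min.z <= vol.max.z);
    const Vec3 local{target.x - veOffset.x, target.y - veOffset.y,
                     target.z - veOffset.z};
    return local.x >= vol.min.x && local.x <= vol.max.x &&
           local.y >= vol.min.y && local.y <= vol.max.y &&
           local.z >= vol.min.z && local.z <= vol.max.z;
}
```

In this sketch, any target for which reachable returns false lies
outside the sensor volume, which is the case that calls for a
navigation model with greater range.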
One of the more powerful models resembles a jet backpack, but without
momentum or inertia (figure 2). Once learned, it is fast and
accurate over a long range and useful in close quarters because of
sensitivity in the underlying model. Its cognitive cost is that the
user must learn a simple, but non-trivial, control method. A
three-dimensional joystick is activated by pressing a button held in
the hand. This establishes a neutral origin point for the joystick. As
hand motion is tracked, a 3D vector from the origin to the hand
changes in direction and magnitude. The virtual movement rate is slow
near the joystick origin and accelerates non-linearly as the user
increases the vector magnitude.
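The joystick behavior described above can be sketched in C++ as
follows. This is an illustrative reading, not the Entity
implementation. The class name Joystick3D, the gain and exponent
parameters, and the power-law form of the speed curve are assumptions;
the text states only that the rate is slow near the origin and grows
non-linearly with the vector magnitude.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

inline Vec3 operator-(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}
inline Vec3 operator+(const Vec3& a, const Vec3& b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}
inline Vec3 operator*(const Vec3& v, double s) {
    return {v.x * s, v.y * s, v.z * s};
}
inline double length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Hypothetical virtual 3D joystick driven by a tracked hand position.
class Joystick3D {
public:
    // speed = gain * |displacement|^exponent (assumed curve; exponent > 1
    // keeps motion slow near the origin and faster farther out).
    Joystick3D(double gain, double exponent)
        : gain_(gain), exponent_(exponent) {
        assert(gain > 0.0 && exponent > 1.0);
    }

    // Button press: the current hand position becomes the neutral origin.
    void activate(const Vec3& hand) { origin_ = hand; active_ = true; }
    void release() { active_ = false; }

    // Ego velocity for the current hand position. It depends only on the
    // present displacement, so there is no momentum or inertia.
    Vec3 velocity(const Vec3& hand) const {
        if (!active_) return {0.0, 0.0, 0.0};
        const Vec3 d = hand - origin_;
        const double m = length(d);
        if (m == 0.0) return {0.0, 0.0, 0.0};
        const double speed = gain_ * std::pow(m, exponent_);
        return d * (speed / m);  // unit direction scaled by speed
    }

private:
    double gain_;
    double exponent_;
    Vec3 origin_{0.0, 0.0, 0.0};
    bool active_ = false;
};

// Advance the ego position by one frame of duration dt.
inline Vec3 advance(const Vec3& ego, const Vec3& vel, double dt) {
    return ego + vel * dt;
}
```

Each call to activate resets the origin, so the same hand position
can produce different motion depending on where the button was last
pressed.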
The three-dimensional joystick requires at least some pragmatic
understanding of 3D vectors and the realization that the position of
the joystick origin is reset each time the user activates it. Informal
protocols indicate that some people use that feature without realizing
it consciously. Others have trouble learning it even after it is
explained to them.
One of my goals in this research is to explore which cognitive burdens
of three-dimensional interfaces make them harder to learn and use.
Subjects perform a standard task using three different
three-dimensional VE interfaces. They move from a fixed starting point
through some distance to the front end of a box and look inside it to
see and identify a symbol pasted on the far end (figure 3). The
box is made long and narrow to require fine control of position and
orientation after the initial coarse motion and orientation. The time
to complete the task with each different interface is recorded.
One interface uses a head-mounted display (HMD) and head tracking with
a direct mapping or walking metaphor. The two other interfaces display
the visual output on a workstation monitor and the viewpoint is
projected into the subject's hand. Ego motion is tied to hand motion
by tracking a hand-held 3D mouse. One of these two interfaces mimics
the direct mapping motion paradigm of the head-tracked interface. The
other one uses mouse buttons to implement a command language that
includes a virtual 3D joystick for navigation.
Operating quickly and effectively in virtual environments sometimes
requires complex and abstract interface techniques. However, people
differ in their speed of learning new interaction techniques and their
ability to use spatial visualization to make decisions. What are the
cognitive and physical differences between people that affect their
success in using VE systems?
For users, 3D interfaces are unfamiliar and puzzling. Performance at
solving novel problems is measured as fluid intelligence with the
Raven progressive matrices test.[5] The ability to understand spatial
abstractions and visualize spatial relationships is tested with the
Shepard-Metzler-Vandenberg mental rotation test.[6, 7] For use of the
ambient visual system, performance on dynamic visual acuity has been
found to correlate with driving skill. Field dependence/independence,
in its original interpretation, reflects differences in the ability to
orient an object depending on one's own orientation and visual
context.[8, 9, 10]
The psychological and VE task results will be analyzed to see how well
the tests predict a subject's performance with the different
interfaces. The conclusions will be extended to the general population
by comparing test results with previously published results from the
same psychological tests administered to large and diverse subject
groups, such as the Hawaii family study.[7]
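The text does not name the statistic used to relate test scores to
task performance. As one possible measure, and purely as an assumption,
the Pearson correlation between a psychological test score and the
task completion time could be computed as in the sketch below. The
function name pearson is hypothetical, and no study data are shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation coefficient between paired samples x and y,
// for example one test score and one completion time per subject.
inline double pearson(const std::vector<double>& x,
                      const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double n = static_cast<double>(x.size());

    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= n;
    meanY /= n;

    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - meanX;
        const double dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    // The coefficient is undefined if either variable is constant.
    assert(sxx > 0.0 && syy > 0.0);
    return sxy / std::sqrt(sxx * syy);
}
```

A value near -1 for a given test and interface would suggest that
higher scores go with shorter completion times; a value near 0 would
suggest the test has little predictive value for that interface.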
Entity, the VE system used in this study, incorporates
three-dimensional input (six degrees of freedom), stereoscopic visual
display (HMD), binaural auditory output from a simple 3D acoustic
model, and physical simulations for objects and their interactions
(figure 4). It is written in C++, using an object-oriented
design. It uses the SGI Graphics Library on Indigo workstations for
graphical display. Three-dimensional input is provided by a Logitech
ultrasonic tracking system.
The Entity system records task completion times and records all of a
subject's input for playback at a later time. The save/replay feature
lets the experimenter record the experiment without videotaping the
subject.
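A save/replay facility of this kind can be sketched in C++ as a log of
timestamped input samples. The layout below is an assumption, not
Entity's actual format: InputSample, SessionRecorder, and their members
are hypothetical names, and completion time is taken to be the span
between the first and last recorded samples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One tracker reading: six degrees of freedom plus button state (assumed layout).
struct InputSample {
    double time;            // seconds since the start of the trial
    double position[3];
    double orientation[3];
    unsigned buttons;       // bit mask of pressed buttons
};

class SessionRecorder {
public:
    // Append a sample; samples must arrive in non-decreasing time order.
    void record(const InputSample& s) {
        assert(samples_.empty() || s.time >= samples_.back().time);
        samples_.push_back(s);
    }

    std::size_t size() const { return samples_.size(); }

    // Elapsed time between the first and last recorded samples.
    double completionTime() const {
        return samples_.empty()
                   ? 0.0
                   : samples_.back().time - samples_.front().time;
    }

    // Feed every recorded sample, in order, to a handler that drives the
    // VE exactly as live input would.
    template <typename Handler>
    void replay(Handler&& handle) const {
        for (const InputSample& s : samples_) handle(s);
    }

private:
    std::vector<InputSample> samples_;
};
```

Because replay passes the recorded samples through the same handler as
live input, a logged session can be reviewed later without a separate
video record.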
The difficulties that some people have with some interfaces will not
all go away when the input/output hardware becomes lightweight and
natural to use. As VE systems become more realistic, interface
limitations may converge towards individual limitations. But still, a
VE can open ways to creatively augment our abilities and afford us
ways to overcome some of our natural limitations and
disabilities. Knowing the relevant dimensions of personal abilities
and knowing their distribution in the general population will help
systems designers make appropriate decisions to support different
tasks and different user populations.
2. Brooks, Frederick P. Walkthrough - A dynamic graphics
system for simulating virtual buildings, in Interactive 3D
Graphics (October 23-24, 1986).
3. Gibson, James J. The Ecological Approach to Visual Perception,
Houghton-Mifflin, 1979.
4. Henry, Daniel and Tom Furness. Spatial Perception in
Virtual Environments: Evaluating an architectural application, in
Proc. IEEE Virtual Reality Annual International Symposium
(September 18-22, 1993, Seattle, WA).
5. Raven, J. C. Progressive Matrices: A perceptual test of
intelligence, Individual Form, London: H. K. Lewis, 1938.
6. Vandenberg, Steven G. and Allan R. Kuse. A group test of
three-dimensional spatial visualization. Perceptual and
Motor Skills, 47, (1978), 599-604.
7. Wilson, James R. et al. Cognitive Abilities: Use of family
data as a control to assess sex and age differences in two
ethnic groups. Int'l J. Aging and Human Development, 6,3,
(1975), 261.
8. Cronbach, Lee J. Essentials of Psychological Testing, 3rd
ed., Harper and Row, 1970.
9. Witkin, Herman A. and Donald R. Goodenough. Cognitive Styles,
International Universities Press, 1981.
10. Willerman, Lee. The Psychology of Individual and
Group Differences, W. H. Freeman, 1979.
Abstract
Keywords: virtual environments, evaluation, navigation
Introduction
Figure 1. The hand holding a ball and throwing it through a window.
INTERFACE CHARACTERISTICS
Figure 2. Flying through the yard with the backpack, controlled by the 3D joystick.
INTERFACE TESTS
Figure 3. Approaching the mailbox.
HUMAN CHARACTERISTICS
VIRTUAL ENVIRONMENT SYSTEM
Figure 4. The hand dropping a ball that bounces off another ball.
CONCLUSION
Acknowledgments
The NAVE project team, especially James Cox and Michael Romberg, made
many valuable contributions to the various systems. We are grateful
to Clayton Lewis for his help in establishing the NAVE lab.
References
1. Boyd, Casey. Navigating and Acting in Virtual Environments
(NAVE) Research Group Home Page,
http://www.cs.colorado.edu/~cboyd/Home.html, 1994.