Abstract
An experimental evaluation of video support for shared
work-space software is described. Groups of two users
worked simultaneously and cooperatively on a problem
using Aspects™ on Macintosh computers in each of four
scenarios. Each of these scenarios provided a different form
of supplementary communication: audio only, reduced
frame-rate video, standard video, and full face-to-face
communication. Although the audio link had been found to
be essential in an earlier pilot study, in this experiment
there was no discernible difference in performance between
any of the four scenarios. Nevertheless, users indicated that
they were more comfortable with the face-to-face situation.
Keywords:
CSCW, video, evaluation, shared work-space
Introduction
In recent years a great deal of attention has been given to
video-conferencing in a variety of forms, and in particular to
the integration of video-conferencing facilities with
computer software and workstations [2, 4, 9, 11]. Much of
this work tends to involve the development or innovation
of new systems, and then the observation of how users
behave when provided with these facilities. This aspect of
CSCW research is still largely at the first of Card's four
growth stages [1], that is, the design and implementation of
illustrative systems. There has, however, been some work
carried out towards stage 2, that of evaluating, comparing
and reviewing existing systems in order to understand the
dimensions affecting success or failure [7, 10].
The research described in this paper is an attempt to
determine whether the addition of a video communication
channel produces any significant benefit when using shared
work-space software. In 1992, Hollan and Stornetta [5]
challenged the notion that the goal of CSCW developments
should be to attempt to emulate face-to-face
communication, generally by the use of video. Earlier work
by Ochsman and Chapanis [8] on non-computer-based
cooperative problem solving showed that the addition of an
audio communication channel was very beneficial, but that
little further benefit accrued from the addition of video. A
small study by Minneman and Bly [7] in the CSCW
context also suggested that, while an audio link was
essential, there was no evidence of significant improvement
with the addition of video. They did suggest, however, that
distinctions between audio-only and video-plus-audio
communication might only be observable in the longer
term, not just when concentrating on a single task.
The experiment described here has utilised Aspects™ [3]
shared work-space software on communicating Macintosh
computers. Aspects can provide common text or drawing
work-spaces for simultaneous operation by multiple users.
It includes facilities for floor control, and each user can
identify themselves with a personalised pointer. The
software also provides a text-based "chat-box" facility for
interactive message exchange between users. A preliminary
pilot study [6] showed that an audio communication
channel was vital for effective collaboration, confirming the
result from Ochsman and Chapanis [8].
It has been suggested that various communication options
can be ranked on an axis of "social presence", as shown in
Figure 1 [5]. This experiment was proposed to establish the
"distance" between audio-only and face-to-face on this scale,
and to determine the position of video+audio between these
two, when using shared work-space software. Further, given
that many video links used in CSCW support operate at a
reduced frame rate, the distance between reduced frame rate
video and full motion video was also to be established.
FIGURE 1: The suggested ranking of communication
options.
THE EXPERIMENT
The primary objective of this experiment was to compare
the benefits of face-to-face communication, full-motion
video, reduced frame-rate video, and audio-only links when
used in conjunction with a computer-based shared work-space
system, with two subjects working cooperatively to
solve a specific problem. Four different experimental
scenarios were utilised. For the face-to-face scenario, the
two subjects were seated in front of their computers, facing
one another across a table. Four video cameras were used to
capture the upper body view that each had of the other, and
the views of each screen. These four video signals, plus the
audio exchange between the two subjects, were merged
through a four quadrant video-mixer and recorded on a single
video tape.
For the two video scenarios, subjects were placed in
separate rooms, but each was provided with a video monitor
showing the upper body view of the other, and each was
able to communicate verbally with the other via audio
headsets. In one case, the monitors showed normal full-motion
video (25 frames/second) and in the other, a reduced
frame-rate picture (5 frames/second) was displayed. In both cases,
four cameras were used, and the two upper body views and
two screen views were recorded on video tape as described
above. The audio-only scenario was similar, but no monitor
was provided to show each subject a view of the other.
Twelve pairs of subjects carried out the experiment, and
each pair worked in all four scenarios. The sequence of the
scenarios was different for each pair, and all possible
orderings of the first two scenarios were used to
minimise learning effects.
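Taking the first two scenarios as an ordered pair, four scenarios give
4 x 3 = 12 possibilities, which matches the twelve pairs of subjects. The
following is a minimal sketch of how such a set of sequences could be
generated. The function name and scenario labels are illustrative, and the
order of the last two scenarios in each sequence is an assumption, since it
is not reported here.

```python
from itertools import permutations

# Scenario labels follow the text; the identifiers are illustrative only.
SCENARIOS = (
    "face-to-face",
    "full-motion video",
    "reduced frame-rate video",
    "audio only",
)


def counterbalanced_sequences(scenarios=SCENARIOS):
    """Build one four-scenario sequence per ordered choice of the first two.

    ASSUMPTION: the order of the last two scenarios is not reported in the
    text; here they simply follow in their listed order.
    """
    sequences = []
    for first, second in permutations(scenarios, 2):
        # The two scenarios not yet used, kept in their listed order.
        remaining = [s for s in scenarios if s not in (first, second)]
        sequences.append((first, second, *remaining))
    return sequences


if __name__ == "__main__":
    for pair_number, sequence in enumerate(counterbalanced_sequences(), start=1):
        print(f"pair {pair_number:2d}: " + " -> ".join(sequence))
```

Under this scheme every scenario is met first by three pairs and second by
three pairs, so no scenario is systematically favoured by early-session
learning.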
In order to avoid the requirement for contextual knowledge,
four jigsaw puzzles were used as the problems to be solved.
Subjects worked in the drawing environment of Aspects,
and were required to manipulate the randomly arranged
pieces of a jigsaw in order to correctly assemble the final
picture. The four puzzles were of similar difficulty, and
were tackled in the same sequence by all twelve groups.
Each puzzle typically took 30 minutes for a group to solve
cooperatively.
The subjects were male and female third- and fourth-year
computer science and information systems students;
although they were all experienced computer users, few had
encountered Aspects before.
RESULTS AND CONCLUSIONS
As mentioned above, the progress of each session was
recorded on video tape comprising upper body views of each
of the two subjects, views of each of the two screens, and
the audio record of their conversation. In addition, subjects
completed a questionnaire at the beginning of each of the
four sessions to establish their expectations, another at the
end of each session to record their reactions, and then a
further questionnaire when they had completed all four
sessions which asked them to rank the four scenarios
according to a number of different factors.
Observational results suggested that there was little
discernible difference between the four scenarios. Although
subjects communicated verbally, and used gestures
frequently, they focused on their computer screens and
hardly ever looked at each other, let alone established eye
contact, even in the face-to-face scenario. A detailed analysis
of the video record for style and content of the interaction is
still underway, but so far has yielded no parameter for
which there is any significant difference between any of the
four scenarios. However, an analysis of the questionnaires
does show that the users' expectations of the four scenarios
did vary. They felt that they would be happier working in a
face-to-face environment, and after the event, found that this
was the most satisfactory.
Clearly, in the context of this experiment, that is, the use of
shared work-space software, the only tangible benefit of
increased social presence is in user satisfaction. Once a full
audio link is provided, there is no change in observed
interaction or work patterns as the presence is increased to
full face-to-face communication. Further work, however,
needs to be done to determine the effects of the nature of the
problem, the number of participants, and the duration of the
collaboration, on these results.
References
1. Card, S. (1991). Presentation on the theories of HCI,
at NSF workshop on human-computer interaction,
Washington DC.
2. Gaver, W., Moran, T., MacLean, A., Lövstrand, L.,
Dourish, P., Carter, K. & Buxton, W. (1992).
Realizing a video environment: Europarc's RAVE
system. In CHI'92 Conference Proceedings, 27-35.
3. Group Technologies, Inc. (1990). Aspects: The first
simultaneous conference software for the Macintosh.
4. Harrison, B., Mantei, M., Beirne, G. & Narine, T.
(1994). Communicating about communicating:
Cross-disciplinary design of a media space interface. In
CHI'94 Conference Proceedings, 124-130.
5. Hollan, J. & Stornetta, S. (1992). Beyond being there.
In CHI'92 Conference Proceedings, 119-125.
6. Masoodian, M., Apperley, M. & Frederikson, L.
(1994). The impact of human-to-human
communication modes in CSCW environments. In
OZCHI'94 Conference Proceedings (Melbourne,
Australia, 28 November - 1 December, 1994),
Ergonomics Society of Australia, Canberra, 193-198.
7. Minneman, S.L. & Bly, S.A. (1991). Managing à
trois: a study of a multi-user drawing tool in
distributed design work. In CHI'91 Conference
Proceedings, 217-224.
8. Ochsman, R.B. & Chapanis, A. (1974). The effects of
10 communication modes on the behaviour of teams
during co-operative problem-solving. International
Journal of Man-Machine Studies, 6, 579-619.
9. Pedersen, E.R., McCall, K., Moran, T.P. & Halasz,
F.G. (1993). Tivoli: An electronic whiteboard for
informal workgroup meetings. In InterCHI'93
Conference Proceedings, 391-398.
10. Sellen, A.J. (1992). Speech patterns in video-mediated
conversations. In CHI'92 Conference Proceedings, 49-59.
11. Tang, J.C. & Rua, M. (1994). Montage: Providing
teleproximity for distributed groups. In CHI'94
Conference Proceedings, 37-43.