

The Representation of Agents:

Anthropomorphism, Agency, and Intelligence

William Joseph King*, Jun Ohya**

*Human Interface Technology Laboratory

Fluke Hall; FJ-15

University of Washington

Seattle, Washington 98195, USA

jking@hitl.washington.edu

**Communication Systems Research Laboratories

Advanced Telecommunications Research Institute

Seika-cho, Soraku-gun

Kyoto, 619-02, Japan

ohya@atr-sw.atr.co.jp


ABSTRACT

Agents have become a predominant area of research and development in human interfaces. A major issue in the development of these agents is how to represent them and their activities to the user. Anthropomorphic forms have been suggested, since they provide a great degree of subtlety and afford social interaction. However, these forms may be problematic, since they may be inherently interpreted as having a high degree of agency and intelligence. An experiment is presented which supports these contentions.

KEYWORDS:

agents, anthropomorphism, facial expression, user interface design

INTRODUCTION

In recent years, research into autonomous agents has increased dramatically. These agents are meant to carry out tasks for the user and serve as another layer of mediation within the system [1]. This layer of mediation employs artificial intelligence techniques to make decisions and to take certain actions within the system. This proactive agent acts autonomously or with permission from the user; therefore, it relieves the user of the burden of having to carry out tedious or repetitive actions [1,2].

A great challenge in designing the human interface to a system using autonomous agents is representing the state of the agents. One can assume that the user probably needs to know, at some high level of granularity, a number of specifics about the agent's operations. These specifics may include the stage the agent is at within a task, the certainty the agent has of its own decisions within the task, and the movement toward the goal of finishing the task. One can also assume that, at the highest level, the user simply needs to know how much trust to invest in the agent.

Trust, in this situation, is the relative ability level which the human user attributes to a certain software entity. Learning theory would indicate that this level of trust may be developed over time through the user's observation of the success and failure of a given agent. However, this may take quite a long time since many agents depend upon establishing a history of the user's actions and preferences to achieve their optimum performance [2]. In the short term, especially during the initial exposures of the user to the agent, the interface will have to provide some other information to aid the user in appraising the agent.

Anthropomorphic representations for agents have been suggested [1, 3, 4, 5]. The reasoning behind this suggestion is that the anthropomorphic representation allows for a rich set of easily identifiable behaviors and for social interaction. However, these representations do not tend to afford a quick assessment of the agent's capabilities, especially during initial exposure to the agent. If anything, these representations may make the agent seem more intelligent, capable of a higher level of agency, and more trustworthy than it actually is. Empirical data are necessary to make any conclusive statements.

SUBJECTS

Eighteen adult subjects participated in the experiment. The subject pool was chosen to represent a range of expertise in the use of computers; however, none of the subjects had prior exposure to this experimental system. Half of the subjects were from “Western” philosophical traditions and half of the subjects were from “Eastern” philosophical traditions. Half of the subjects were male.

REPRESENTATIONS

Twenty stimuli were chosen for the experiment. These stimuli ranged from simple geometric shapes (e.g., sphere) to three-dimensional, fully articulated human forms. The set included Chernoff faces, with both positive and negative facial expressions, and stylized facial caricatures. It included very high resolution human forms in two and three dimensions and in wire frame. Some of the faces exhibited random eye blinking. The set also included a variety of non-facial forms of equal geometric complexity to balance the experiment. Each of the 20 stimuli was presented in both a static and a dynamic condition. In the dynamic presentation, the stimulus would move into the subject's field of view from above until it reached the center of the display.

MEASURES

The measure used in this experiment was a questionnaire containing 40 sets (each stimulus was presented in two different conditions) of three questions. The first question was a multi-dimensional scale developed to assess the type of phenomenon that was presented. The scale was composed of a triangle with the vertices labeled “object”, “agent”, and “event”. Subjects were asked to place an “X” in the triangle at the place which best described the phenomenological properties of the stimulus presented.
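The paper does not say how the marks on the triangle were scored. One plausible approach, offered here only as an assumption, is to convert each mark into barycentric coordinates, which give one weight per vertex, with the three weights summing to 1. The vertex positions, the names OBJECT, AGENT, and EVENT, and the function barycentric below are hypothetical and are not taken from the original questionnaire.

```python
from typing import Tuple

Point = Tuple[float, float]

# Hypothetical vertex positions for an equilateral triangle on the answer sheet.
OBJECT: Point = (0.0, 0.0)
AGENT: Point = (1.0, 0.0)
EVENT: Point = (0.5, 3 ** 0.5 / 2)


def barycentric(
    mark: Point, a: Point = OBJECT, b: Point = AGENT, c: Point = EVENT
) -> Tuple[float, float, float]:
    """Return the weights (w_a, w_b, w_c) of `mark` relative to triangle abc.

    The weights sum to 1. A mark on a vertex gives that vertex a weight of 1,
    and a mark at the centroid gives each vertex a weight of 1/3.
    """
    (x, y), (xa, ya), (xb, yb), (xc, yc) = mark, a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    if det == 0:
        raise ValueError("triangle vertices are collinear")
    w_a = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    w_b = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    w_c = 1.0 - w_a - w_b
    return w_a, w_b, w_c
```

Under this scoring, a mark placed close to the “agent” vertex yields an agent weight near 1, which would allow the triangle responses to be compared numerically across stimuli.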

The second and third questions were forced choice. The second question asked the subject to again appraise the stimulus; this time the subject had to make a forced choice among the three phenomenon types (object, agent, or event). In the third question, subjects were asked to rate the “intelligence” or “potential intelligence” of the stimulus shown. This rating was given on a standard ten-point scale, in which “1” represented no intelligence and “10” represented high intelligence. All of the questions and instructions were given in either English or Japanese.

DESIGN AND PROCEDURES

Each subject was given written instructions before the trial began. The instructions included definitions of each type of phenomenon. They also included sample answers to the questions. The subjects were allowed to ask questions about the instructions and experiment at this time.

The subjects were seated approximately four feet in front of a large, wall-sized lenticular display. A reflective marker was placed on the subject's chin. The system was then calibrated so that the subject's head could be tracked by remote infrared cameras. The lenticular display presented the stimuli in three-dimensional monochrome; the stimuli were stabilized so that the subject could move his or her head and upper body to see different perspectives.

Once the subject was comfortable with the experimental configuration and the system was calibrated, the lights were dimmed, and the experiment began. The trial proceeded at the subject's own pace. In each presentation, the subject was exposed to a representation (stimulus) for 15 seconds. At the conclusion of the presentation, the subject completed the three questions associated with it and signaled for the next presentation. This sequence continued until all presentations had been made; the presentations were given in random order.
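The paper states only that the presentations were made in random order. As a minimal sketch of one way such a schedule could be generated, the following crosses the 20 stimuli with the two presentation conditions and shuffles the resulting 40 trials. The function name presentation_order, the numeric stimulus labels, and the optional seed are assumptions for illustration; whether the order was re-randomized for each subject is not stated in the paper.

```python
import random
from itertools import product
from typing import List, Optional, Tuple

N_STIMULI = 20                      # number of stimuli, from the paper
CONDITIONS = ("static", "dynamic")  # the two presentation conditions


def presentation_order(seed: Optional[int] = None) -> List[Tuple[int, str]]:
    """Return all 40 (stimulus, condition) trials in shuffled order.

    Stimuli are labeled 1..20 here purely for illustration. Passing a seed
    makes the order reproducible, which is an assumed convenience rather
    than something the paper describes.
    """
    rng = random.Random(seed)
    trials = list(product(range(1, N_STIMULI + 1), CONDITIONS))
    rng.shuffle(trials)
    return trials
```

Each trial in the returned list would then correspond to one 15-second presentation followed by one set of three questions.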

RESULTS

Preliminary results indicate that fully articulated human forms were appraised as agents significantly more often than other anthropomorphic forms and all of the other stimuli presented, F(3,14)=18.43, p<.05. Subjects rated the intelligence of the human forms to be significantly higher than that of either the caricatures, t(17)=7.05, p<.001, and t(17)=6.01, p<.001, or the Chernoff faces, t(17)=8.44, p<.001, and t(17)=8.83, p<.001.
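The paper does not name the test behind the t values. The reported degrees of freedom, t(17), are consistent with a within-subject comparison across the 18 subjects (df = n - 1), so a paired t statistic is one reasonable reading. The sketch below computes such a statistic from two equal-length lists of ratings; the function name paired_t and its arguments are hypothetical, and no data from the experiment are reproduced here.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, df) for a paired comparison of ratings a and b.

    a[i] and b[i] are assumed to be two ratings from the same subject,
    for example one for a human form and one for a caricature.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

With one pair of ratings per subject, 18 subjects give 17 degrees of freedom, matching the form of the statistics reported above.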

Within the human forms, the three-dimensional representations were significantly more likely to be appraised as agents rather than objects. Appraised level of agency and intelligence was found to be similar across the more symbolic caricatures and Chernoff faces. However, all of the anthropomorphic representations, including the more symbolic ones, were judged to be more intelligent and capable of higher agency than the other stimuli presented (e.g., geometric shapes and objects). The blinking human forms were appraised to have significantly higher agency and to be more intelligent than any of the other representations, including the identical non-blinking forms.

DISCUSSION

The designer wants the software entity to be appraised with a certain level of agency and intelligence. While human facial displays and anthropomorphic representations may seem appropriate, the results indicate that these representations are judged to inherently have a high degree of agency and intelligence. Subtle behavioral displays (e.g., eye blinking) can have a great effect on the user's appraisal of these capabilities.

The designer of agent-based interfaces must be careful to choose appropriate representations and behaviors. These choices will have a dramatic effect on the user's judgment of the agent and on the trust which the user places in the agent. Unless the agent is capable of human-like actions and decisions, the designer may be wise to avoid human-like representations.

ACKNOWLEDGEMENTS

The authors wish to thank Fumio Kishino and Nobuyoshi Terashima for providing funding and a laboratory where this research could be conducted.

REFERENCES

  1. Maes, P. Agents that Reduce Work and Information Overload, Communications of the ACM, 37, 7, 31-40.
  2. Kozierok, R. & Maes, P. A Learning Interface Agent for Scheduling Meetings, Proceedings of the 1993 International Workshop on Intelligent User Interfaces, New York: ACM Press, 81-96.
  3. Laurel, B. Interface Agents: Metaphors with Character. In B. Laurel (Ed.). The Art of Human-Computer Interface Design, Reading, MA: Addison-Wesley Publishing Company, 1990.
  4. Oren, T., et al. Guides: Characterizing the Interface. In B. Laurel (Ed.). The Art of Human-Computer Interface Design, Reading, MA: Addison-Wesley Publishing Company, 1990.
  5. Nass, C., et al. Computers are Social Actors, Proceedings of the 1994 Conference on Human Factors in Computing Systems (CHI '94), New York: ACM Press, 72-77.