A User Interface for Accessing
3D Content on the World Wide Web



Mike Mohageg, Rob Myers, Chris Marrin,
Jim Kent, David Mott, and Paul Isaacs


Silicon Graphics, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA 94039-7311
(415) 960-1980
{michaelm,rob,cmarrin,jkent,mott,pauli}@sgi.com


Abstract

A strategy for accessing and viewing three-dimensional data on the World Wide Web is introduced. Factors driving the user interface design of a 3D web browser are presented. The interface for the initial implementation of Silicon Graphics' WebSpace Navigator, the first commercially available 3D Web browser, is given, with close attention paid to design issues. Usability lessons learned from this interface are described, and it is shown how they affected the second-generation browser interface design.

Keywords

User interface design, three dimensional (3D) navigation, World Wide Web (WWW), Virtual Reality Modeling Language (VRML)

Introduction

Users of the World Wide Web typically access documents composed of text and images, with access to audio and video. We believe that 3D environments are equally viable and compelling "documents" of interest on the web, with applicability to architecture, education, engineering, advertising, entertainment, and more. For instance, web-based 3D allows a web user to enter 3D models of places that are too dangerous to visit physically, historical locations that no longer exist (e.g., Pompeii), or remote and physically restricted locations. A user might enter a re-creation of an ancient Aztec city, for example, and walk through it to investigate this historic site. The site could establish links between objects in the 3D environment and multimedia content describing each object's role in Aztec culture.

In a more commercial application, a 3D museum model allows users to walk through the rooms of the museum and view art exhibitions. The user can click on a piece of art to hear information about the piece, or be taken to a Hyper Text Markup Language (HTML) page that provides textual information and a movie about the piece. Conversely, standard HTML pages can contain links to 3D content. Users can investigate a 3D world then return to their HTML documents.

For more detailed discussion and motivation for using 3D data on the web, see [7,10].

Viewing and interacting with 3D data can be significantly more complicated than viewing 2D text and graphics. Users need to examine 3D objects from different viewpoints and navigate through scenes to gain a better understanding of the data. An interface for the exploration of 3D data must provide adequate controls for the user to inspect objects and scenes. We have developed a user interface for this purpose.

WebSpace is a viewer for 3D content delivered over the web [3]. An emerging standard for web-based 3D content is the Virtual Reality Modeling Language (VRML) [8]. WebSpace is a viewer for VRML data as Netscape, for instance, is a viewer for HTML data.

WebSpace 1.0 is the first commercially available VRML viewer. Designing the user interface presented some interesting challenges; in particular, we needed to combine 3D user interface know-how with emerging conventions on the web. This paper will present the initial design of WebSpace 1.0, with emphasis on what we learned about web-based 3D user interfaces, and how we incorporated these lessons into the WebSpace 1.1 user interface.

Factors Driving the Design

There were four main driving factors in designing the UI for WebSpace 1.0:

  1. The user's primary tasks
  2. Existing body of design work for 3D
  3. Burgeoning conventions for the web
  4. Meeting the needs of both novice and experienced users

The User's Primary Tasks

The two main tasks for users surfing the web are content retrieval and content viewing.

The popularity of HTML browsers is due to the ability to follow links quickly and easily, retrieving information with a single mouse click. We wanted linking to be the primary task afforded the user in WebSpace. Links constitute a mechanism not only for retrieving information of interest, but also for navigation among 3D scenes and perhaps within scenes. Users, we surmised, would be busy linking from one cyber-place to another. Mouse clicks in the main viewing area of WebSpace were thus reserved for following links.

The other main task for 3D users is viewing data from different points of view. In contrast to the passive activity of reading text in an HTML browser, possibly using a scrollbar to display information that exceeds the window size, 3D content is often best understood when the data can be viewed interactively from different points of view. Viewing controls had to be easily accessible in the WebSpace interface.

In general, two viewing paradigms are used to view 3D data: navigation and examination. In navigation, the interaction dynamics are as though the 3D scene is a fixed, immovable world, and the user maneuvers through the world by walking, flying, or similar means. Conversely, with examination, the dynamics of the user interaction are as though the user is stationary and at a certain distance away from the center of an object. The user can tumble the object to view different sides. This type of viewer is best suited for tasks where the user's objective is to view an object as though s/he were holding it, such as viewing a model of a coffee mug. Because of the varied 3D content on the Web, WebSpace would need to provide user interfaces for both viewing paradigms.

Existing Body of 3D UI Design

A source of design concepts was existing 3D user interface work [2,11,12]. Our previous experience in 3D applications allowed for direct transfer of many UI elements [4,5,6]. For instance, we had a good understanding of the basic functionality needed to navigate and view in 3D. Our challenge was to make these functions accessible and usable to web users with little 3D user interface experience.

We were also able to draw from the interfaces of popular 3D games, such as Doom. Many computer users move through 3D worlds on a daily basis and have attained a certain savvy for navigating with simple keypad commands and mouse gestures. We wanted to incorporate aspects of these interfaces to meet expectations and facilitate the transition from 3D in a game to 3D on the web.

Burgeoning Conventions From the Web

Despite the web's infancy, a few de facto conventions have emerged. Links embedded in content are a mainstay of HTML web pages, as is single clicking to follow a link. Presenting the Uniform Resource Locator (URL) of a link destination when the cursor is over an anchor is another useful convention. Users can quickly get more information about the destination before they traverse a link. Our goal was to replicate the most useful features and conventions of HTML browsers in our VRML viewer where appropriate.

Varying User Experience Levels

Providing a modeless interface that met the needs of both novice and experienced users was a major goal for WebSpace 1.0. We presumed that most users would be novices at 3D navigation and that we would "bias" our UI in favor of novices. This meant making all important functions easily accessible and usable. With novices there was also an element of enticement. A subgoal was to provide a UI that would encourage first-time users to "try out" our product. We wanted users to experiment with WebSpace and eventually become immersed in a 3D world.

Conversely, there are 3D knowledgeable users who find the trappings of a novice interface too cumbersome. We did not want to diminish the 3D experience in any way for these expert users; thus, we needed to provide the same functionality in a fashion that was less obtrusive, but still easy to use.

Design of WebSpace 1.0

Figure 1 presents the user interface for WebSpace 1.0. The areas of most interest are the viewing window, where the 3D content appears, and a dashboard to manage interactive viewing.

[Figure 1: The WebSpace 1.0 user interface (mfm_fg1.gif)]

Viewing Window

All 3D scenes appear in the viewing window. We believed that finding and following links would be the most critical tasks performed in this window. We tailored our design accordingly, by reserving clicking in the viewing window while the cursor was over a linked object for following links (as in HTML browsers). Where HTML allows for text and images to be link anchors, VRML allows 3D objects in the scene to be link anchors.

HTML browsers often draw link objects in a different color so that the user can easily find them, and we provided two functions to make certain that users could find links in a VRML world. First, when the cursor passes over a link anchor, we automatically highlight the object by brightening its appearance. Following the lead of HTML browsers, we also display the URL for the link at the bottom of the WebSpace window. Additionally, we provided a function to highlight all the links, allowing users to immediately view all links in the visible area of the 3D scene.

Dashboard

WebSpace 1.0 presents a graphical dashboard across the bottom of the viewing window, to provide ready access to the most important navigation tools. The dashboard contains 3 fixtures: the joystick, the pan control, and the seek tool, as shown in Figure 1. A similar dashboard was created for the examination paradigm with the joystick replaced by a virtual trackball.

The paradigm for the navigation tool is that of a user "walking" around a virtual world. The walk paradigm constrains the up direction of the scene and the height of the camera above a plane perpendicular to that direction. The primary functions needed for this walk paradigm are:

All of this functionality is mapped onto a single affordance on the dashboard. The joystick was selected as it is representative of devices with which the naive user would be familiar. Since it supports all of the primary functions, it is positioned in the center of the dashboard. As the user interacts with it, it gives feedback by moving/turning in the direction of motion.

The primary viewing functions map to the joystick as follows:

Also, to assist navigation, "dead" zones were added around the initial mouse click point. These zones consist of a narrow vertical and horizontal stripe of pixels centered at the click point. If the mouse remains in the vertical zone while moving toward the top/bottom of the window, the camera moves forward/backward with no turning. If the mouse remains in the horizontal zone while moving toward the left/right of the window, the camera rotates with no forward/backward motion.
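The dead-zone behavior described above can be illustrated with a minimal sketch. It is not the WebSpace implementation: the function name, the stripe half-width `dead_zone`, and the `gain` factor are hypothetical, and a simple linear gain is assumed purely to keep the example short.

```python
def joystick_motion(dx, dy, dead_zone=4, gain=0.01):
    """Map a mouse offset from the initial click point to (forward, turn).

    dx, dy: pixel offset from the click point; dy > 0 means toward the
    top of the window. dead_zone (stripe half-width in pixels) and gain
    are hypothetical values chosen for illustration.
    """
    # Inside the vertical stripe (small |dx|): no turning at all.
    turn = 0.0 if abs(dx) <= dead_zone else gain * dx
    # Inside the horizontal stripe (small |dy|): no forward/backward motion.
    forward = 0.0 if abs(dy) <= dead_zone else gain * dy
    return forward, turn


# A drag straight up that wobbles 2 pixels sideways still walks in a straight line.
print(joystick_motion(2, 100))
# A drag outside both stripes produces compound walking and turning.
print(joystick_motion(50, 50))
```

The point of the stripes is that small, unintended sideways drift during a vertical drag is discarded rather than turned into rotation.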

In addition to the primary functions, additional capabilities are provided by the other two dashboard affordances. The pan control on the right side of the dashboard allows the user to move the camera in the plane perpendicular to the viewing direction, providing sidling and height adjustment functionality. The control is oriented in the plane of the window and pivots about its center up/down and left/right. Like the joystick, there are dead zones about the initial click point that allow the camera to be moved straight up (down) by moving the mouse toward the top (bottom) of the window and staying in the vertical dead zone. Similarly, moving the mouse toward the right (left) of the window while staying in the horizontal dead zone causes the camera to move directly right (left). Straying outside of the dead zone allows for compound up/down, left/right movement. The speed of movement is controlled by the distance the mouse is moved away from the initial click point.

The seek tool provides an alternate "click and go" navigation method. In this paradigm, the user clicks the mouse on an object of interest and the camera is smoothly moved toward that object. In WebSpace 1.0, clicking on the seek tool puts the interface into a temporary mode where the next mouse click in the viewing window causes the camera to move toward the point clicked. The camera is moved half the distance to the object and tilts so that the point clicked is in the center of the viewing window. A modal approach was used to preserve the primary interpretation of clicking in the viewing window as following a link.
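A minimal sketch of the WebSpace 1.0 seek movement follows, assuming positions are plain (x, y, z) tuples. The function name, the number of animation steps, and the smoothstep easing are assumptions made for illustration; the text above states only that the camera moves smoothly and covers half the distance. The tilt that centers the clicked point is not modeled here.

```python
def seek_path(camera, target, fraction=0.5, steps=10):
    """Return intermediate camera positions for a seek toward `target`.

    The camera ends `fraction` of the way to the clicked point (0.5 in
    WebSpace 1.0). The easing curve and step count are hypothetical.
    """
    end = tuple(c + fraction * (t - c) for c, t in zip(camera, target))
    path = []
    for i in range(1, steps + 1):
        s = i / steps
        s = s * s * (3 - 2 * s)  # smoothstep: slow start, slow finish
        path.append(tuple(c + s * (e - c) for c, e in zip(camera, end)))
    return path


path = seek_path((0.0, 0.0, 0.0), (10.0, 0.0, -20.0))
print(path[-1])  # final camera position, halfway to the clicked point
```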

Other Relevant Interface Items

VRML allows content authors to create specific interesting vantage points in the world and to provide a descriptive name for each. These vantage points are accessed via the Viewpoints menu. Selecting a viewpoint from the menu smoothly animates the camera from its current position to the selected vantage point.

Various mouse and keyboard shortcuts are provided to allow 3D knowledgeable users to bypass the dashboard. For example, the primary functions of moving forward and back, and turning left and right, can be initiated by pressing the keyboard arrow keys.

The buttons along the top of the window are similar to those found in most HTML browsers. These buttons provide for moving forward and backward between scenes, going to the user's home scene, reloading content, and other common web browser functionality. Our intent was to foster transfer of learning from HTML browsers to WebSpace.

To provide additional feedback concerning the destination of a link, author-supplied descriptions of link destinations are displayed above the viewing area. The descriptions are significantly less cryptic than a URL, and can be useful in determining whether to follow a link.

Evaluations and What We Learned

We performed evaluations before and after releasing WebSpace 1.0. Informal usability testing was conducted prior to the release and user observations were completed after the release. The informal usability testing was intended to investigate specific navigation issues. We were confident that the examination user interface was simple and clear, so we chose to spend our limited time evaluating navigation.

Informal Usability Testing

One of our biggest concerns during the design process was making certain that users could use the dashboard to navigate through a scene. It was unclear whether users would be able to easily maneuver to get to a location of interest in a scene. We were also interested in identifying whether users understood the controls we had provided on the dashboard (seen in Figure 1).

Therefore, the usability evaluation was designed to collect information on:

Procedure and Results

Users were presented with two scenes: a 3D model of an exhibition booth (with many stations to visit), and a scene with more abstract content, containing an island with irregular surfaces and structures emanating from it.

Testing was performed in an informal fashion. Six users inexperienced with 3D used WebSpace to perform the navigation and "explanation" tasks. The tasks were given to the participants by the experimenter. Verbal protocols were used to collect user feedback. For instance, the experimenter asked participants to navigate to a particular location in the booth. The participant would then describe and verbalize thoughts on the process and aspects of the navigation UI that s/he found confusing, problematic, or useful. The experimenter also monitored the participants' actions closely and recorded any subtle problems and issues.

The usability testing uncovered 3 major issues: poor control/display relationship, the need for an overview, and the need for navigation speed controls.

Control/Display Relationship

The control/display relationship between mouse input and movement through the scene was very poor at the frame rates that average workstations could achieve when rendering complex 3D scenes. The original control/display relationship was linear; that is, the speed of navigation through the scene increased in constant proportion to joystick (mouse) movement. This relationship led to considerable navigational difficulty because users, especially novices, tend to make gross gestures with the mouse during navigation. For example, a user who is headed to the left of the intended target will move the mouse to the right to correct her course. However, the mouse drag to the right often has a high amplitude. With a linear relationship the mouse drag to the right leads to a sudden and large over-correction to the right. The user then corrects back to the left, but again over-corrects. This pattern continues and users become quite frustrated since they "can't even move in a straight line." With a workstation that can render the scene at 30 fps, the linear relationship will work well; however, with typical rates of 8-12 fps the linear relationship leads to significant navigational difficulty.

The solution was to introduce damping by using an exponential control/display relationship rather than linear. With an exponential relationship the user was free to move the mouse with fairly large gestures without having very large impact on the speed and direction of movement. The exact nature of the curve was fine tuned to create a relationship that was responsive but not difficult to control.
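The difference between the two relationships can be sketched as follows. The exact curve used in WebSpace is not given here, so the normalized exponential below, its steepness `k`, and the function names are all assumptions chosen to show the general shape: moderate mouse displacements produce much less speed than under the linear mapping, while full displacement still reaches the maximum.

```python
import math


def linear_speed(d, max_speed=1.0):
    """Linear control/display mapping; d is displacement normalized to [0, 1]."""
    d = max(0.0, min(1.0, d))
    return max_speed * d


def exponential_speed(d, max_speed=1.0, k=3.0):
    """Hypothetical exponential mapping, normalized so that d = 1 gives max_speed.

    k controls how strongly small and medium gestures are damped.
    """
    d = max(0.0, min(1.0, d))
    return max_speed * math.expm1(k * d) / math.expm1(k)


# Compare the two mappings over a range of normalized mouse displacements.
for d in (0.25, 0.5, 0.75, 1.0):
    print(d, round(linear_speed(d), 3), round(exponential_speed(d), 3))
```

Under this assumed curve, a gesture covering half the available range yields well under half of the maximum speed, which is the kind of damping that makes over-correction less likely at low frame rates.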

Need For an Overview

As realistic as computing technology can make it seem, navigating a computer-based 3D scene does not provide all the cues that navigating in the real world provides. In particular, the overall context available in real world navigation (or "wayfinding") is not available in computer-based navigation [1,9]. Therefore, it can be difficult for users to develop an overall mental model of a computer-based scene, unless they have prolonged exposure. Our testing showed this to be an issue in WebSpace as well.

While participants could navigate to desired locations, they often commented on not being completely sure of their current location in relation to other locales/landmarks in the scene.

The standard solution to the "lost in space" problem is to provide a map or "bird's eye view" of the world to help users orient themselves. However, we could not automatically create such maps based on the 3D scene information alone. Providing such an aid seemed to be the responsibility of the content creator. Therefore, we took no special measures to handle this issue, except to encourage content developers to provide such navigational aids.

Controlling Speed

3D scenes are of different sizes and shapes; there are no restrictions on how large/small they can be. It is part of the content creator's responsibility to set a travel speed appropriate for the size of the scene being created. For instance, users will want to move very quickly through a model of New York City, but much slower when navigating a studio in Greenwich Village.

VRML allows content creators to set the speed for the scenes they create. However, our test participants helped uncover two situations where the speed was not what they were expecting:

  1. The user wanted to travel slower/faster than the author defined speed (even though the author has specified a reasonable speed)
  2. The author had failed to provide a speed or specified an inappropriate speed.

In both cases users need to have a degree of control over the speed of their movement in the scene. We provided three menu items to set speed: one increased the speed, a second decreased it, and a third returned the speed to that originally specified by the author (the "default" speed).
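The three menu items can be modeled with a small sketch. The class name, the multiplicative step of 2, and the fallback speed used when the author supplies none are hypothetical; the text above specifies only that speed can be increased, decreased, and reset to the author's default.

```python
class TravelSpeed:
    """Tracks the user's travel speed relative to the author-defined default."""

    def __init__(self, author_speed=None, fallback=1.0, step=2.0):
        # If the author failed to provide a speed, use a hypothetical fallback.
        self.default = author_speed if author_speed is not None else fallback
        self.current = self.default
        self.step = step

    def faster(self):
        self.current *= self.step

    def slower(self):
        self.current /= self.step

    def reset(self):
        self.current = self.default


speed = TravelSpeed(author_speed=4.0)
speed.faster()
print(speed.current)  # doubled from the author's speed
speed.reset()
print(speed.current)  # back to the author's speed
```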

User Observations

We had several opportunities to observe novices using WebSpace 1.0. Most observations were made at ACM SIGGRAPH '95, where attendees used the viewer at kiosks dispersed throughout the convention center. There was a VRML model of the SIGGRAPH exhibition floor with all the booths. Users could use WebSpace to move through the show floor and access information on any booth of interest. Also, chat rooms were created where users could enter and see others (represented as avatars) as well as interact with them.

We also observed individual users at their personal workstations (mostly at Silicon Graphics' Mountain View, CA campus). These users were free to access and navigate any content they preferred and provide us with feedback on the positives and negatives of the product.

User Expectations

The most interesting discovery was that users were not very motivated to employ UI fixtures to navigate a scene. Users expected to simply click in the viewing window and have some sort of navigation take place automatically. Note that this is in direct conflict with our intent to be consistent with web conventions by reserving clicking in the viewing window for traversing links. It was very clear, however, that many novices expected to click on an object or area of interest and be taken there. They seemed to want to use the cursor as a pointer that would whisk them to their location of interest within the scene. This is the functionality we had defined for the Seek Tool, but many users wanted this behavior to be a primary navigation mechanism. Our challenge now was to support this seeking behavior without violating the link following convention.

For those users who were interested in using the dashboard for navigation, most were unable to easily determine how to use the joystick. The intent of the joystick was to offer users a fixture with an inviting physical quality that would compel users to "grab" and push. However, most novice users expected a single click on navigation affordances to result in some form of navigation. We were confident that the interaction dynamics of grabbing and pushing a control forward was the correct approach, but it seemed that we needed to also support the expectation of clicking to invoke behavior.

Impact of Content Attributes

A common mechanism for navigating within an HTML document is following internal links. These are links that position the browser at a different part of the current document. Following internal links is commonly used for coarse navigation, such as accessing the general topic the user is interested in. Once there, users rely on the scroll bar (or potentially text searching) to perform fine navigation to access the exact information they seek. Most users of WebSpace 1.0 drew on this model in interacting with 3D worlds as well. Many anticipated being able to click on an in-scene object to take them to places of interest. The viewpoints menu contains points of interest; however, users prefer to maintain focus on the content rather than access a menu outside the main viewing window. This was a direct instance where content attributes, which were beyond the control of the WebSpace UI, had significant impact on the usability of the product. Most scenes contain links, all of which lead to other 3D scenes or HTML documents. Therefore, the user expectation of being able to click on an in-scene object to move to a location of interest in the scene was not commonly supported by content developers.

Another aspect of content attributes was the subtle, yet important, issue of setting users' navigational expectations. Every scene has an entry point, which authors are free to define. Many scenes defined the entry view to be a point above the content, a "bird's eye" view of sorts. While these views help to orient users, they also suggest a flying navigational paradigm. That is, if users enter a scene in the sky (or at some high elevation), they will immediately assume that they can fly around a scene in order to navigate.

However, WebSpace 1.0 supports walk navigation, where users can move freely through a scene, but their elevation is fixed. The sidling fixture can be used to change elevation, but once the user starts to move forward the elevation is fixed. This is the most common type of navigation paradigm and is the correct functionality for WebSpace. Nonetheless, content often set users' expectations that they were flying. Thus their interaction with the navigation dashboard could lead to dissonance since they believed they were controlling a flying, rather than walking, user interface.

Design Changes for WebSpace 1.1

After observing the problems encountered with the initial user interface, we redesigned it to both overcome the shortcomings and to improve the look of the UI (See Figure 2). We moved the dashboard from the scene into a separate area below the viewing window to keep it from obscuring portions of the scene, while maintaining the illusion that the user is looking over the dashboard into the scene. A separate window also allowed us to photorealistically render components of the dashboard while not interfering with rendering performance of the main viewing window. This increases the association to real world devices for the user.

[Figure 2: The redesigned WebSpace 1.1 user interface (mfm_fg2.gif)]

The importance of navigation, especially seek, was underestimated in WebSpace 1.0. But to allow the novice to follow links and to keep consistency with HTML browsers, we needed to maintain the simple "click to link" behavior. We decided to allow users to indicate their preference for single clicking in the scene: seeking or following links. Buttons on the dashboard allow users to select the current preference. Note that these are not modes; the user always has access to navigating and following links. The preference simply sets the primary functionality of the cursor.

With the "click to link" preference, the user clicks on a linked object to go to that destination, as in an HTML browser. Navigation is done with the dashboard or in the viewing window with keyboard modifiers. The novice starts with this preference since all functions are obvious.

With the "click to navigate" preference, the user can hold down one or more mouse buttons and drag in the scene to navigate (walk, pan, or tilt in the Walk Viewer; spin, pan, or dolly in the Examiner Viewer). Clicking in the scene seeks to the object clicked on, moving close enough so that the object fills the viewing window (rather than moving halfway to the object, as in WebSpace 1.0). Double clicking on a linked object follows that link. This gives a very powerful navigation model to experienced users, allowing modeless navigation and link following.
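The way the two preferences reinterpret a click in the viewing window can be summarized in a short sketch. The enum, function name, and returned action strings are hypothetical. The handling of cases the text does not describe (a click on empty space under "click to link", or a double click on an unlinked object) is an assumption.

```python
from enum import Enum


class Preference(Enum):
    CLICK_TO_LINK = "link"
    CLICK_TO_NAVIGATE = "navigate"


def interpret_click(preference, on_link, double_click=False):
    """Return the action for a click in the viewing window (no drag)."""
    if preference is Preference.CLICK_TO_LINK:
        # As in an HTML browser: a click on a linked object follows the link.
        # Assumption: a click elsewhere does nothing.
        return "follow_link" if on_link else "none"
    # "Click to navigate": a double click on a linked object follows the link.
    if double_click and on_link:
        return "follow_link"
    # Otherwise a click seeks to the object under the cursor.
    return "seek"


print(interpret_click(Preference.CLICK_TO_LINK, on_link=True))
print(interpret_click(Preference.CLICK_TO_NAVIGATE, on_link=True))
print(interpret_click(Preference.CLICK_TO_NAVIGATE, on_link=True, double_click=True))
```

Both actions remain reachable under either preference (via the dashboard, keyboard modifiers, or double clicking), which is why the text describes these as preferences rather than modes.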

Our observations of WebSpace 1.0 caused us to place one additional component on the dashboard. The Viewpoint menu had proven to be a very effective navigation tool so we put it on the dashboard to make it more accessible. The current viewpoint is displayed, with controls for stepping through the viewpoints, and a lighted button tells the user if s/he is currently at the displayed viewpoint.

All functions on the dashboard have a prompt message displayed when the cursor is over them. These prompts provide information about the behavior of the control (e.g., "Joystick: drag to walk or turn"). Cursor shapes appropriate to the function are shown as well. In "click to navigate" the seek cursor is normally shown. Clicking in the main window to seek changes this to a "seek in progress" cursor, while clicking and dragging changes it to a walk cursor. In "click to link" the cursor is a pointing finger (chosen to match HTML browsers) when the user is over a link.

Reflections on the Design Process

There are two main items to share concerning UI design/development for WebSpace 1.0 and 1.1. First, it would have been useful to perform more rigorous usability testing with a wider range of users and more varied scenes prior to the 1.0 release. However, the requirements of an extremely tight delivery schedule and our naivete concerning the types of scenes that would be available in the future restricted our ability to do rigorous testing. Nonetheless, it is clear that obtaining feedback from a representative sample of users would have positively influenced the UI or at least better prepared us to anticipate our post-1.0 findings.

Secondly, UI practitioners must always temper their passions for the user interface with the fact that other factors also contribute to the success of a product. In the case of WebSpace, factors such as rendering performance, full support for the VRML specification, and "peaceful" coexistence with HTML browsers were also critical. Given limited development resources, all of these success factors needed to receive attention and the UI is but one of these factors. As a result, not all of the UI changes suggested from post-1.0 evaluations were implemented in 1.1. The next release of the product will incorporate more of the UI enhancements. This is yet more data suggesting that it is very difficult to have a perfectly well balanced and complete UI in the first release of a product.

Acknowledgements

Our thanks to Rikk Carey, under whose direction this project was developed. Thanks to Gavin Bell, who spearheaded the technical specification of VRML. Thanks to Delle Maxwell for her artistic talents and help in creating the WebSpace user interface. And thanks to the entire Silicon Graphics Open Inventor team for their technical expertise and support.

References

1. L. T. Kozlowski and J. B. Kendall. Sense of direction, spatial orientation, and cognitive maps. Journal of Experimental Psychology: Human Perception and Performance (1977), vol. 3, no. 4, 590-598.

2. J. D. Mackinlay, S. K. Card, and G. G. Robertson. Rapid controlled movement through a virtual 3D workspace. Computer Graphics (SIGGRAPH '90 Proceedings) (Aug. 1990), F. Baskett, Ed. vol. 24, 171-176.

3. J. Muchowski. WebSpace User's Guide, 1995, Silicon Graphics. http://www.sgi.com/Products/WebFORCE/WebSpace/help/help.html

4. J. Muchowski and D. Myers. IRIS Annotator 1.0 User's Guide, 1995, Silicon Graphics.

5. D. Myers. IRIS Showcase 3.0 User's Guide, 1993, Silicon Graphics.

6. D. Myers and E. Deeth. InPerson 2.0 User's Guide, 1995, Silicon Graphics.

7. M. Pesce. Building and Browsing Cyberspace. New Riders Publishing, Indianapolis, IN, 1995.

8. M. Pesce, G. Bell, and A. Parisi. VRML 1.0 Specification, August 26, 1995. http://vrml.wired.com/vrml.tech/vrml10-3.html

9. E. K. Sadalla, W. J. Burroughs, and L. J. Staplin. Reference points in spatial cognition. Journal of Experimental Psychology: Human Learning and Memory (1980), vol. 6, no. 5, 516-528.

10. Silicon Graphics. VRML Library Archive http://webspace.sgi.com/Archive/Scenarios/

11. P. S. Strauss and R. Carey. An object-oriented 3D graphics toolkit. Computer Graphics (SIGGRAPH '92 Proceedings) (July 1992), E. E. Catmull, Ed., vol. 26, 341-349.

12. R. C. Zeleznik, K. P. Herndon, D. C. Robbins, N. Huang, T. Meyer, N. Parker, and J. F. Hughes. An interactive 3D toolkit for constructing 3D widgets. Computer Graphics (SIGGRAPH '93 Proceedings) (Aug. 1993), J. T. Kajiya, Ed., vol. 27, 81-84.