CHI '95 Proceedings

Dynamic Stereo Displays

Colin Ware

Faculty of Computer Science
University of New Brunswick
P.O. Box 4400, Fredericton
New Brunswick, Canada E3B 5A3
Email: cware@UNB.ca

© ACM

Abstract

Based on a review of the facts about human stereo vision, a case is made that the stereo processing mechanism is highly flexible. Stereopsis seems to provide only local additional depth information, rather than defining the overall 3D geometry of a perceived scene. New phenomenological and experimental evidence is presented to support this view. The first demonstration shows that kinetic depth information dominates stereopsis in a depth cue conflict. Experiment 1 shows that dynamic changes in effective eye separation are not noticed if they occur over a period of a few seconds. Experiment 2 shows that subjects who are given control over their effective eye separation can comfortably work with larger than normal eye separations when viewing a low relief scene. Finally, an algorithm is presented for the generation of dynamic stereo images designed to reduce the normal eye strain that occurs due to the mis-coupling of focus and vergence cues.

Keywords:

Stereo displays, Virtual reality, 3D displays.

Introduction

The stereoscopic depth cue consists of relative differences or disparities between parts of the images available to the two eyes. In normal circumstances this information is effective only for objects less than 25 meters away, and it is optimal for objects that are much closer.

In computer graphics, some objects have no inherent spatial size, while others may be representations of mountains or microscopic entities. One way of obtaining an appropriate stereo view is to scale the scene and bring it to an appropriate viewing distance. Another is to change the effective eye separation dynamically. An interesting research question is whether it is possible to create a system in which the stereo disparity is changed dynamically so as to create near optimal disparities for perceiving depth information, whether the graphical object is at a great distance or close. The question addressed here is the extent to which changing disparities in real-time is perceptually disturbing. Two new experiments and two demonstrations are reported that show that human perception of depth through stereopsis is highly flexible: large distortions of the correct perspective geometry are possible, and these distortions may be changed dynamically without undue perceptual ill effects.

An algorithm is presented which is designed to take advantage of this perceptual flexibility to allow the real-time adjustment of stereo disparity values as the user moves through the image space. The goal is to create a system in which the disparity values are always comfortable.

Stereo Vision

First some terminology and basic facts relating to stereo vision. Figure 1 illustrates the simplest possible stereo display. The eyes are fixated on the vertical line a. A second line b is closer to a in the right eye's image than in the left eye's image. The brain resolves this discrepancy by perceiving the lines as being at different depths as shown.

Retinal disparity is the difference between the angular separation of a and b in the two eyes.
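The definition of retinal disparity can be made concrete with a small calculation. The following is a minimal sketch, assuming a top-down 2D view with the eyes on the x axis and depth along z; the coordinates of a and b, the function names and the 6.3 cm default eye separation are illustrative assumptions rather than values taken from Figure 1.

```python
import math

def angle_between(eye, p, q):
    """Signed angle (radians) at `eye` from the direction of p to the direction of q.

    Points are (x, z) pairs in metres: x is lateral position, z is depth.
    """
    ax, az = p[0] - eye[0], p[1] - eye[1]
    bx, bz = q[0] - eye[0], q[1] - eye[1]
    return math.atan2(ax * bz - az * bx, ax * bx + az * bz)

def retinal_disparity(a, b, eye_sep=0.063):
    """Difference between the angular separation of a and b in the two eyes."""
    left = (-eye_sep / 2.0, 0.0)   # assumed eye positions, centred on the origin
    right = (eye_sep / 2.0, 0.0)
    return angle_between(left, a, b) - angle_between(right, a, b)

a = (0.0, 1.0)    # fixated line, 1 m straight ahead (illustrative)
b = (0.05, 0.8)   # second line, nearer and slightly to the right (illustrative)

sep_left = abs(angle_between((-0.0315, 0.0), a, b))
sep_right = abs(angle_between((0.0315, 0.0), a, b))
d = retinal_disparity(a, b)
```

With these illustrative positions b is closer to a in the right eye's view than in the left eye's view, matching the situation described for Figure 1. The disparity vanishes when the eye separation is zero and grows roughly in proportion to it.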

Vergence is the degree to which the two eyes converge to fixate a target (this is also called phoria).

If the disparity between the two images becomes too great then diplopia occurs. This is the appearance of the doubling of part of the image. Another way of putting this is that the images are no longer fused. The area within which two images can be fused is called Panum's fusion area. However, the size of Panum's fusion area is highly dependent on a number of visual display parameters such as the exposure duration of the images and the size of the targets. It is also true that depth judgments can be made despite diplopia, in other words, outside of the fusion area, although these are less accurate. For an excellent introductory review of stereo vision from a human factors perspective see [7].

Eye Separation

In stereo photogrammetry and in certain kinds of range finders it is common to create stereo images which have an effective eye separation much larger than any actual eye separation [5]. The reason for this is obvious; human eyes are only placed approximately 6.3 cm apart, which means that stereo information is only a useful depth cue up to 30 meters or so. However, if we can effectively change the eye separation then far more distant objects can be resolved by stereopsis. In viewing a mountain 10 km distant a virtual eye separation of 1 km might be appropriate. If viewing an object at 1 cm (as in a stereo microscope) a virtual eye separation of 1 mm will be more suitable.
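The effect of scaling the eye separation can be illustrated with the vergence angle subtended at a target. This is a minimal sketch, assuming a target straight ahead and a symmetric viewing geometry; the function name is hypothetical, and the distances are the examples given above.

```python
import math

def vergence_deg(eye_sep, distance):
    """Vergence angle (degrees) for a target straight ahead at `distance`.

    Both arguments must use the same unit (metres here).
    """
    return math.degrees(2.0 * math.atan(eye_sep / (2.0 * distance)))

natural = vergence_deg(0.063, 10_000.0)    # unaided eyes, mountain at 10 km
mountain = vergence_deg(1_000.0, 10_000.0) # virtual separation of 1 km
microscope = vergence_deg(0.001, 0.01)     # 1 mm separation, object at 1 cm
```

The two virtual separations give the same ratio of separation to distance (0.1) and therefore the same vergence angle, whereas the unaided eyes subtend a vanishingly small angle at 10 km.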

FIGURE 1. An illustration of some of the basic geometry relating to stereoscopic viewing.

Cue Conflict

Occlusion is one of the major depth cues. It is a perceptual rule that says that closer objects always occlude (i.e. cover up) further objects. The problem is that when disparity information causes an object to appear in front of a screen display, the edge of the screen may appear to occlude that object. Since occlusion is the stronger depth cue, the conflict is resolved perceptually in favor of occlusion, destroying the illusion of depth.

A more subtle depth cue conflict can arise if we try to change the stereo separation dynamically while moving through a scene. One of the most important depth cues comes from the dynamic flow of information across the retina, and evidence suggests that this is more important to 3D space perception than stereopsis [1,2]. When we are driving along a highway, we have a very strong sense of space yet almost all objects are likely to be outside the range at which stereo disparity is an effective depth cue. Figure 2 illustrates the kind of visual flow field that results from forward motion. This depth cue is also called motion parallax or, in some cases, the kinetic depth effect.

FIGURE 2. The kind of visual flow field that results from forward motion through a 3D environment.

Dynamically changing disparities should cause changes in the relative depths in a scene. However, if other depth cues are stronger this effect may not be apparent. There is some evidence that the other dynamic depth cues, such as motion parallax, so dominate space perception that altering the effective disparities will be invisible. The perceptual question is, can we fly around a scene dynamically changing the effective eye separation without the users perceiving a rubbery distortion of the scene? Distortion should occur if the brain is a perfect geometry processor. Also, if rubbery distortion does appear, is the effect disturbing or is it an acceptable price to pay for optimal stereopsis?

A piece of indirect evidence for the relative weakness of the stereo depth cue comes from a paper by Wallack [10]. In his study Wallack used a telestereoscope that more than doubled the effective eye separation of the subjects. His subjects viewed a rotating wire object. The point that is relevant here is that before the actual experiment Wallack had to discard half his subjects because they failed to perceive any size distortion of the object as it rotated, whereas the disparity-vergence information should have made the object appear to stretch greatly in depth as it rotated. Clearly, for those subjects that were discarded, the kinetic depth effect (perhaps combined with object rigidity assumptions) completely dominated the percept. What is more, after a short period of exposure the shape distortion of the object appeared much reduced even for those subjects who passed the initial test.

The Vergence Focus Problem

When we fixate objects at different depths, two things happen: the degree of convergence of the eyes changes (called vergence) and the focal length of the lens in the eye changes to create a sharp image on the retina. The vergence and the focus mechanism are known to be coupled in the human visual system. In fact if one eye is covered the vergence of that covered eye changes as the uncovered eye focuses on objects at different distances.

On a screen all objects lie in the same focal plane no matter what their apparent depth. However, the eye may be fooled into thinking that they are at different depths by means of a stereo display that provides accurate disparity and vergence information. The problem is that in screen based stereo displays vergence information is provided correctly but focus information is not.

There is some evidence that the failure to correctly change focus information causes a form of eye strain [5]. A recent Japanese study showed that after watching 3D images for a while the eyes lose their ability to refocus quickly [6]. This problem is present in all current generations of stereoscopic head mounted displays and with monitor based stereo displays.

In another study it has been shown that the coupling of accommodation and vergence can be changed [4] and that this change can persist for some time. There appears to be considerable flexibility in the visual system regarding the coupling of focus and vergence. Anytime that a person dons a pair of reading glasses her visual system is forced to make an adjustment to a fixed change in focal length of her eye. This forces a change in the focus vergence relationship. With bifocals this re-adjustment must be continuously effected.

In view of the above observations how may we reduce the problems associated with the decoupling of focus and vergence in stereo display? One solution that seems obvious is to try to make images lie in the vicinity of the monitor screen, to reduce the parallax. This will minimize the focus vergence discrepancy. Valyrus [8] found experimentally that the accommodation vergence discrepancy should not be more than 1.6 degrees. He proposed a guideline based on parallax, which he defined as the spatial discrepancy on the screen between homologous image points from the two eyes. His guideline states that

P <= 0.03D

where P is the parallax and D is the viewing distance; otherwise diplopia will occur. Veron et al. [9] used this formula to derive the guideline that screen based stereo displays should be placed 2.3 metres from the viewer to give an image that can always be fused. They assumed that the virtual object would always be placed behind the screen.
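The parallax guideline can be checked numerically. The sketch below assumes a point on the viewer's midline, a screen perpendicular to the line of sight, and an eye separation of 6.3 cm; the function names are hypothetical, and the formula for screen parallax follows from similar triangles rather than from the cited sources.

```python
PARALLAX_LIMIT = 0.03  # guideline P <= 0.03 * D

def max_parallax(view_dist):
    """Largest screen parallax allowed by the guideline at viewing distance D."""
    return PARALLAX_LIMIT * view_dist

def screen_parallax(eye_sep, view_dist, point_dist):
    """Screen parallax of a point at `point_dist` from the viewer.

    Zero for a point in the screen plane, positive for a point behind it,
    and approaching `eye_sep` as the point recedes to infinity.
    """
    return eye_sep * (point_dist - view_dist) / point_dist

def within_guideline(eye_sep, view_dist, point_dist):
    """True if the point's parallax satisfies P <= 0.03 * D."""
    return abs(screen_parallax(eye_sep, view_dist, point_dist)) <= max_parallax(view_dist)

# Smallest viewing distance at which even a point at infinity behind the
# screen (parallax equal to the eye separation) satisfies the guideline.
min_view_dist = 0.063 / PARALLAX_LIMIT
```

Under these assumptions the minimum viewing distance comes out at about 2.1 m, which is of the same order as the 2.3 m figure quoted above, although the assumptions behind that figure are not stated here.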

Based on a different analysis of the problem Williams and Parrish [12] concluded that a practical viewing volume falls between -25% and +60% of the viewer to screen distance. They proposed a method whereby objects at different depths can optimally use the available disparity range and show how objects at two or more different distances can be brought into the useful viewing volume. Their scheme parcels out the available disparity so that certain depth ranges are enhanced stereoscopically, while others are reduced in terms of the stereo depth. For example, in a scene with two objects, the distance between the front and back of each object is allocated a large disparity range, while the empty space between them is made devoid of disparity. What is interesting here is that this approach assumes that disparity is more important for seeing the local 3D shape of the individual objects than for seeing the 3D relationship between the two objects. Disparity becomes only a local depth cue and not a global depth cue. Whether or not this is appropriate, it assumes that the brain can tolerate inconsistencies between disparity information and other depth cues.

Summary Of Major Points So Far

A reasonable working hypothesis is that most of our understanding of 3D space comes from depth cues such as occlusion, motion parallax and linear perspective. Stereo disparity provides additional, rather local information about relative depths. Therefore it may be reasonable to devise algorithms to dynamically adjust disparity information so that it is optimal for a particular situation, because the resulting depth cue conflicts are unlikely to be noticed.

The following series of demonstrations and experimental studies were all devised to test the validity of this hypothesis and explore the usefulness of dynamic stereo adjustments.

Changing Effective Eye Separations

It is possible to change the effective eye separation by a number of means.

1) By changing the eye separation parameter in a computer graphics stereo display the scene can be flattened or depth enhanced. Given the correct viewing position it is possible to construct a stereo view of a 3D scene such that the images presented to the two eyes are correct for an object in the vicinity of the monitor [2]. We assume that an eye separation parameter is set to 1.0 for "correct viewing" and 0.0 when both eyes get the same image, as in single viewpoint graphics. Clearly, it is also possible to set this parameter to intervening values, or even outside of the range 0-1. If it is negative then depth relationships are inverted (which is not likely to be useful); if it is greater than 1.0 then stereo depth is enhanced. Thus this parameter has the effect of flattening or depth enhancing the image in terms of disparity cues.

2) By scaling an object and bringing it closer or moving it further away, the separation of the eyes relative to the object's size can be changed. Thus if a 5 km mountain viewed at 10 km distance is shrunk to 0.5 meter and moved to a viewing distance of 1 meter, the eyes will be effectively 10,000 times further apart relative to the mountain's original size.

3) By use of mirrors or prisms it is possible to actually change the optical separation of the pupils of the eyes. We will not be concerned with this method here.
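The first two methods can be sketched in a few lines. This is an illustration only, assuming the eyes lie on a horizontal axis centred on the head position; the function and parameter names are hypothetical, and the numbers are the mountain example from method 2.

```python
def eye_positions(head_x, eye_sep, sep_param):
    """Lateral positions of the left and right virtual eyes.

    sep_param = 1.0 gives "correct viewing", 0.0 gives the same image to
    both eyes, values above 1.0 enhance stereo depth, and negative values
    swap the eyes and so invert depth relationships.
    """
    half = 0.5 * eye_sep * sep_param
    return head_x - half, head_x + half

def separation_gain(original_size, displayed_size):
    """How many times further apart the eyes are relative to the original object."""
    return original_size / displayed_size

left, right = eye_positions(0.0, 0.063, 1.0)
flat = eye_positions(0.0, 0.063, 0.0)

# Method 2: a 5 km mountain at 10 km shrunk to 0.5 m at a distance of 1 m.
size_gain = separation_gain(5_000.0, 0.5)
distance_gain = separation_gain(10_000.0, 1.0)
```

Because the size and the viewing distance are reduced by the same factor, the monocular appearance of the mountain is preserved while the eyes become effectively 10,000 times further apart relative to its original size.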

Equipment Used

All of the studies, except for the last, used an Indigo Extreme with the Cyberscope(TM). The Cyberscope consists of a hood that can be placed over a small monitor allowing for the stereo viewing of properly constructed images. The Cyberscope uses front surface mirrors to displace and rotate the images presented to the two eyes, as shown in Figure 3.

FIGURE 3. The Cyberscope optically rotates the images from the two halves of the screen, 90 degrees clockwise and counter clockwise respectively, and superimposes them. This is done using front surface mirrors to provide perfect optical clarity.

DEMONSTRATION 1: Does motion parallax dominate stereo disparity when they are in conflict?

This study was directed at the relationship between the spatial cues provided by optical flow and those provided by stereo disparity. A special display was created in which the flow of visual information was consistent either with a continuously approaching surface, or with an inflating surface. This scene was constructed to be self similar at all scales of resolution, and it was designed to be constantly expanding about a center at the plane of the screen (illustrated in Figure 4 and in Color Plate 1, Ware). Constantly inflating objects are not common but an expanding flow field is present whenever we move forward through the environment. Thus, based on experience, observers might be expected to perceive movement of the scene towards them.

However, the scene was viewed in stereo and the stereo depth cues should have been enough to tell the observers that the scene was in fact expanding away from them.

FIGURE 4. A schematic cross section of the recursively defined scene. The truncated pyramid in the center of each group of three, has a group of three truncated pyramids on top of it. The scene is self similar through scale transformations about the dot. The scene was viewed in stereo from above while continuously expanding.

The issue was, would subjects perceive the scene as inflating - which was the only geometrically consistent way of perceiving the pattern - or would they perceive something constantly coming towards them, as would be more consistent with everyday experience.

Observers were asked to look at this display and comment on what they saw. The general consensus on observing this display is that it shows a scene "coming up towards me". This impression lessens somewhat with time, and sometimes the rate of advance appears to slow and speed up, depending on the stage in the animation cycle. However, none of the observers reported expansion, and none of them reported seeing the scene moving away from them, as was in fact happening. This suggests a very powerful dominance of the optical flow information over stereo information.

EXPERIMENT 1: How fast can we change disparity cues without the effect being noticeable?

In the introduction to this paper a case was made that the perceptual motor system is capable of recalibrating the disparity depth cue mechanism in the presence of other depth cues, such as motion parallax. Another way of describing this is that the disparity mechanism is insensitive to low frequency change. This study asks how fast this recalibration can take place by measuring the frequency of disparity changes that are just detectable.

Method

In order to investigate this problem a scene was constructed in which a moving carpet dotted with truncated pyramids moved perpetually towards the observer. An image configured for the Cyberscope is illustrated in Color Plate 2 (Ware).

The scene was viewed in stereo and the effective eye separation was changed sinusoidally with an accelerating frequency. To describe this transformation it is useful to examine the extremes. With zero eye separation we have the same images presented to the two eyes but kinetic information consistent with a 3D scene (like looking at a moving television picture). A separation of 6.3 cm is normal and results in a correct stereo display for which disparity cues and the motion parallax are consistent with a 3D scene (making the normal perceptual assumptions about rigidity). Sinusoidal changes in eye separation should result in a sensation of oscillating depth if the brain were to rely primarily on disparity information but this would be in conflict with the rigidity assumption given the linear perspective and motion flow information.

On each trial the change in separation was started slowly and gradually sped up until it became noticeable. This speedup was such that after 50 seconds the eye separation was being changed at 1 Hz. At 100 seconds the frequency would be 2 Hz. There was also a random offset in time to the start of oscillation so that subjects could not anticipate this in their responses. The actual eye separation did not oscillate through the full range but varied with different amplitudes under different conditions. The amplitudes of oscillation were 10%, 20% and 30%. The base eye separations were also varied: 6.3 cm, 4.2 cm and 2.1 cm. Thus there were nine viewing conditions given by the product of these sets of settings.
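The stimulus described above can be sketched as a sinusoid whose frequency rises over time. This is a reconstruction under stated assumptions, not the code used in the experiment: the frequency is assumed to ramp linearly from zero (consistent with 1 Hz at 50 s and 2 Hz at 100 s), the amplitude is assumed to be a fraction of the base eye separation, and all names are hypothetical.

```python
import math
from itertools import product

RAMP_HZ_PER_S = 1.0 / 50.0  # assumed linear ramp: 1 Hz at 50 s, 2 Hz at 100 s

def frequency_hz(t):
    """Instantaneous oscillation frequency t seconds after oscillation starts."""
    return RAMP_HZ_PER_S * t

def phase(t):
    """Phase in radians: 2*pi times the integral of frequency_hz from 0 to t."""
    return math.pi * RAMP_HZ_PER_S * t * t

def eye_separation(t, base_cm, amplitude, start_offset=0.0):
    """Effective eye separation (cm) at trial time t.

    `amplitude` is a fraction of `base_cm` (an assumption); `start_offset`
    models the random delay before the oscillation begins.
    """
    elapsed = max(0.0, t - start_offset)
    return base_cm * (1.0 + amplitude * math.sin(phase(elapsed)))

# The nine viewing conditions: every base separation with every amplitude.
conditions = list(product((6.3, 4.2, 2.1), (0.10, 0.20, 0.30)))
```

Using the integrated phase rather than sin(2*pi*f(t)*t) keeps the instantaneous frequency equal to the intended ramp, which matters when the frequency at the moment of detection is the quantity being recorded.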

One of the problems with this study was the difficulty in describing to subjects what they were supposed to look for. Subjects do not report sinusoidal depth changes. Instead a kind of paradoxical sideways movement is perceived; it is paradoxical because it is going in both directions at once. Subjects had to be trained to be able to see this phenomenon. Once this was achieved they were instructed to push a mouse button as soon as the oscillation became noticeable. This had the effect of recording the result and initiating the next trial.

Nine subjects who were all undergraduate or graduate students were used as observers.

Results

The results showing the mean frequency at which subjects detected the paradoxical motion are plotted in Figure 5. They show that the oscillation frequency that is detectable varies inversely with the amplitude of oscillation, and inversely with the effective eye separation. Both of these are to be expected since both eye separation and amplitude increase the amount of disparity change over time. The worst case was for the maximum amplitude and the maximum eye separation, in which case the average detected frequency was 0.3 Hz. This is a remarkably rapid rate of adaptation to changing disparity ratios.

FIGURE 5. Results from Experiment 1 showing how the frequency at which oscillating eye separations are detected varies with different amplitudes and base values.

In practical terms what the results mean is that with a moving pattern eye separation can be changed dynamically as long as it is done gradually, taking several seconds to smoothly change. In this case viewers are unlikely to notice that anything unusual has happened.

EXPERIMENT 2: How do observers adjust their eye separation?

The second experiment was initially designed to address the issue of whether or not there is an obviously correct eye separation setting that is consistent with the geometry of a scene. However, in our pilot study we found that subjects had very little idea of what the "correct setting" was. We therefore changed the task and asked the users to create "the maximum comfortable setting" in terms of eye separation. Subjects were given control over the effective eye separation and instructed to increase the eye separation until diplopia occurred and then move it back to a comfortable value. The moving carpet display was used again for this study.

FIGURE 6. The moving carpet of truncated pyramids was presented in stereo and at different angles to the vertical plane of the monitor screen.

Subjects were given controls that allowed them to adjust the eye separation by depressing one of two keys, one of which increased the eye separation, the other of which decreased the eye separation. Subjects did this with the computer graphics model of the moving carpet set at 8 different angles with respect to the monitor (as shown in Figure 6) and they repeated the procedure twice to provide two settings at each angle.

FIGURE 7. The results for the 10 subjects participating in Experiment 2.

The results are shown in Figure 7. They indicate that subjects could comfortably tolerate much greater disparities with scenes having little depth in them, and only small disparities with scenes that contained a lot of depth. This suggests that automatically changing the effective eye separation according to information about depth in a scene is probably a good idea even if it means breaking the rules of consistent geometry to do so. It is also clear that there are large individual differences with respect to the amount of disparity that can be tolerated, suggesting that users of stereo displays should be able to customize a disparity parameter for their own comfort.

ALGORITHM FOR DYNAMIC STEREO ADJUSTMENT

The following algorithm was created to allow for the viewing of any scene with automatic adjustment so the stereo values would be in a reasonable range and could change dynamically. This algorithm has three steps.

Step 1: measure the closest portion of the displayed image. (This can be done by sampling the Z buffer).

Step 2: scale the scene about a mid point between the observer's two eyes in such a way that the closest point lies just behind the screen.

Step 3: render this modified scene in stereo using the normal methods for constructing off axis perspective views [3].

The transformation is illustrated in Figure 8.

FIGURE 8. Schematic illustration of the effects of the stereo adjustment algorithm.
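The first two steps of the algorithm can be sketched as follows. This is a minimal illustration under several assumptions: scene points are given in a coordinate frame whose origin is the mid point between the eyes with z increasing away from the viewer, the nearest scene point stands in for the Z buffer sample, and the 2 cm margin behind the screen is an arbitrary choice. The function and parameter names are hypothetical, and Step 3 (off axis stereo rendering) is not shown.

```python
def nearest_depth(points):
    """Step 1: depth of the closest point (stand-in for sampling the Z buffer)."""
    return min(z for _, _, z in points)

def scale_factor(z_near, screen_dist, margin=0.02):
    """Step 2: scale that moves the closest point to just behind the screen."""
    if z_near <= 0.0:
        raise ValueError("closest point must lie in front of the viewer")
    return (screen_dist + margin) / z_near

def adjust_scene(points, screen_dist, margin=0.02):
    """Scale the scene uniformly about the mid point between the eyes (the origin)."""
    s = scale_factor(nearest_depth(points), screen_dist, margin)
    return s, [(s * x, s * y, s * z) for x, y, z in points]

# Illustrative scene: two points 5 m and 10 m away, screen at 0.6 m.
scene = [(0.0, 0.0, 5.0), (1.0, 0.5, 10.0)]
s, adjusted = adjust_scene(scene, screen_dist=0.6)
```

A uniform scaling about the mid point between the eyes leaves the ratios x/z and y/z of every point unchanged, so the view from that mid point is unaffected and only the stereo disparities change. Since Experiment 1 indicates that gradual changes go unnoticed, an implementation would presumably also smooth the scale factor over time, although that is not part of the three steps listed above.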

This algorithm achieves the following things.

DEMONSTRATION 2: Stereo adjustment algorithm with a large screen

Representatives of the UNB Ocean Mapping Group showed several thousand people the dynamic stereo display using a large format, 60 in diagonal screen, Electrohome projector at the CeBIT trade show in Germany. They all viewed a digital elevation map showing a line of undersea volcanoes in the South Pacific. This is shown as a monocular image in Color Plate 3, Ware. The eye separation was set to be considerably larger than usual, about 24 cm, in order to get appropriate disparities given the approximately 3 meter viewing distance. The interface allowed people to "fly" over the terrain using a six-degree of freedom input device [11]. In general viewers were very impressed by the large format, high resolution stereo display. None reported that the scene appeared to be expanding and contracting as they moved through the artificial landscape.

CONCLUSION

All of the evidence presented here is consistent with the hypothesis that the disparity depth cue is a highly flexible depth enhancement, rather than the primary determinant of 3D space perception. What this means is that, in the absence of evidence to the contrary, hyper stereo adjustments are a useful tool in information display. We apparently do not need to be careful about matching the stereo geometry with the actual eye geometry. Rather, what is important is to create stereo displays which maximize disparity gradients while maintaining them at a level below that at which diplopia sets in. Given this interpretation it seems to be worth artificially changing scenes so that the stereo information about relative depths is optimized, even though this stereo information may be in conflict with the other depth cues available, such as linear perspective and motion flow. The two advantages of such manipulations will be that disparities can be optimized for depth discrimination in a given scene and that vergence-focus conflicts can be reduced, which has the effect of reducing long term eye strain.

References

  1. Arthur, K., Booth, K.S. and Ware, C. (1993) Evaluating 3D Task Performance for Fish Tank Virtual Worlds. ACM Transactions on Information Systems, 11(3), 239-265.
  2. Cutting, J.E. (1986) Perception with an eye for motion. MIT Press, Cambridge, Mass.
  3. Deering, M. (1992) High Resolution Virtual Reality. Proceedings of SIGGRAPH '92. Computer Graphics, 26(2) (July 1992), 195-202.
  4. Judge, S.J. and Miles, F.A. (1985) Changes in the coupling between accommodation and vergence eye movements induced in human subjects by altering the effective interocular separation. Perception, 14, 617-629.
  5. Lippert, T.M. and Benser, E.T. (1987) Photointerpreter Evaluation of Hyperstereographic Forward Looking Infrared (FLIR) Sensor Imagery.
  6. Noro, Kageyu (1993) Industrial Application of Virtual Reality and Possible Health Problems. Japanese Journal of Ergonomics, 29, 126-129.
  7. Patterson, Robert (1992) Human Stereopsis. Human Factors, 34(2), 669-692.
  8. Valyrus, N.A. (1966) Stereoscopy. Focal Press, London.
  9. Veron, H., Southard, D.A., Leger, J.R. and Conway, J.L. (1990) Stereoscopic Displays of Terrain Database Visualization. Proceedings of SPIE Stereoscopic Displays and Applications, Santa Clara, 124-135.
  10. Wallack, H. and Karsh, E. (1963) The modification of stereoscopic depth perception based on oculomotor cues. Perception and Psychophysics, 11, 110-116.
  11. Ware, C. and Slipp, L. (1991) Using Velocity Control to navigate 3D graphical environments: a comparison of three interfaces. Proceedings of Human Factors Society Meeting, San Francisco, Sept., 35, 300-304.
  12. Williams, S.P. and Parrish, R.V. (1990) New computational control techniques and increased understanding for stereo 3D displays. Proceedings of SPIE Stereoscopic Displays and Applications, Santa Clara, 73-82.