Abstract
Guiding a remote vehicle when real-time image transmission is not possible is an important problem in the
field of teleoperation. In such a situation it is impractical for an operator to attempt to directly steer the
vehicle using a steering wheel. In semi-autonomous teleoperation, an operator designates the path that the
vehicle should follow in an image of the scene transmitted from the vehicle, and the vehicle autonomously
follows this path. Previous techniques for semi-autonomous teleoperation require stereo image data, or
inaccurately track paths on non-planar terrain. STRIPE (Supervised TeleRobotics using Incremental
Polyhedral-Earth geometry) is a new method that I am developing for accurate semi-autonomous
teleoperation using monocular image data. This paper provides a summary of the work I am doing for my
thesis. This includes the development of the STRIPE robotic system, user studies to empirically measure the
accuracy of the STRIPE method under various conditions and with different user interfaces, as well as
measurement of baseline data for traditional steering-wheel-based teleoperation under low-bandwidth and
high-latency conditions.
Keywords:
remotely operated vehicles, low-bandwidth teleoperation, semi-autonomous
teleoperation, user interfaces, interfaces for novice users, robotics.
Introduction
Traditional methods of vehicle teleoperation require an operator to be in direct control of the remote vehicle
at all times. Such systems require a continuous stream of images to be transmitted from the vehicle to the
operator workstation, and thus the transmission link must be able to carry a large amount of data very
quickly.
In many situations, such a high-bandwidth and low-latency communication link is unavailable or even
technically impossible to provide. As the frequency and immediacy of images transmitted to the operator
decrease, the operator's ability to directly control the vehicle diminishes. At some point the operator's
accuracy becomes so low that it is not practical to try to control a vehicle in such a manner.
When remote driving is required, but direct control is not possible, a "semi-autonomous" approach must be
used. In semi-autonomous control, the operator provides the high-level decisions about where the vehicle
should go, and the vehicle takes over the task of actually controlling the steering. Because there is limited
bandwidth and potential delays in the transmission system, the operator should be able to direct the vehicle
based on a single image of the scene, and the vehicle should accurately follow the prescribed path for some
distance while a new image is being transmitted. Other semi-autonomous teleoperation systems require
stereo image data [2], or stray from the intended path on non-planar terrain [3]. I am developing the
STRIPE (Supervised TeleRobotics using Incremental Polyhedral Earth geometry) system [1], which uses
only monocular image data and accurately follows a designated path on varying terrain. STRIPE allows a
vehicle to be remotely operated even when the transmission bandwidth and latency properties are such that
it may be tens of minutes or more between images.
Anecdotal evidence suggests that STRIPE operators need extensive practice before being able to accurately
control the system in any situation in which the images do not show an obvious path along which to travel.
Many applications of the STRIPE system require that a novice user be able to quickly learn to use the
system. My thesis work involves the development of both the robot vehicle side of STRIPE and the
operator interface to the system.
THE CURRENT STRIPE SYSTEM
In STRIPE, a single image from a camera mounted on the vehicle is transmitted to the operator workstation.
FIGURE 1. The basic STRIPE setup.
The operator uses a mouse to pick a series of "waypoints" in the image that define a path that the vehicle
should follow. These 2D waypoints are then transmitted back to the vehicle. To compute the
appropriate steering directions, the vehicle must convert the 2D image waypoints into 3D
world waypoints.
Rather than attempting to compute the 3D locations of all the waypoints in advance, the STRIPE vehicle
module only computes waypoints as it needs them to steer. When the 2D waypoints are transmitted to the
vehicle, they are initially projected onto the vehicle's current groundplane. The resulting 3D waypoints are
used to initiate steering of the vehicle, and the vehicle starts to move. Several times a second the vehicle
re-estimates the location of its current groundplane by measuring vehicle position and orientation. The original
image waypoints are then projected onto the new groundplane to produce new 3D waypoints and the
steering direction is adjusted appropriately. This reproject-and-drive procedure is repeated until the last
waypoint is reached, or new waypoints are received.
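The reprojection step described above can be sketched as a ray-plane intersection. This is only a minimal
illustration under stated assumptions, not the STRIPE implementation: it assumes a pinhole camera, a world
frame with x forward, y left, and z up, and a groundplane given as a point and a normal. All function names,
parameters, and numeric values are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def pixel_ray(u, v, focal_px, cx, cy, cam_to_world):
    """World-frame ray direction through image pixel (u, v).

    Assumes a pinhole camera whose own frame has x right, y down, z forward.
    cam_to_world is a 3x3 rotation given as a tuple of rows.
    """
    d_cam = ((u - cx) / focal_px, (v - cy) / focal_px, 1.0)
    return tuple(dot(row, d_cam) for row in cam_to_world)

def intersect_plane(origin, direction, plane_point, plane_normal):
    """Point where the ray meets the plane, or None if it never does."""
    denom = dot(plane_normal, direction)
    if abs(denom) < 1e-9:          # ray is parallel to the plane
        return None
    offset = tuple(q - p for q, p in zip(plane_point, origin))
    t = dot(plane_normal, offset) / denom
    if t <= 0:                     # intersection lies behind the camera
        return None
    return tuple(p + t * d for p, d in zip(origin, direction))

def reproject_waypoints(pixels, focal_px, cx, cy, cam_origin, cam_to_world,
                        plane_point, plane_normal):
    """Project the original 2D waypoints onto the latest groundplane estimate."""
    return [
        intersect_plane(cam_origin,
                        pixel_ray(u, v, focal_px, cx, cy, cam_to_world),
                        plane_point, plane_normal)
        for u, v in pixels
    ]

# Hypothetical camera pose at the moment the image was taken:
# 2 m above the ground, looking straight ahead along world x.
CAM_TO_WORLD = ((0.0, 0.0, 1.0),
                (-1.0, 0.0, 0.0),
                (0.0, -1.0, 0.0))
CAM_ORIGIN = (0.0, 0.0, 2.0)
FOCAL_PX, CX, CY = 500.0, 320.0, 240.0
WAYPOINTS_PX = [(320.0, 340.0)]    # one waypoint, 100 pixels below the image center

# Initial estimate: flat ground under the vehicle.
flat = reproject_waypoints(WAYPOINTS_PX, FOCAL_PX, CX, CY, CAM_ORIGIN, CAM_TO_WORLD,
                           (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))

# Later estimate: the vehicle has reached x = 5 m and measures a 10% upward slope.
uphill = reproject_waypoints(WAYPOINTS_PX, FOCAL_PX, CX, CY, CAM_ORIGIN, CAM_TO_WORLD,
                             (5.0, 0.0, 0.0), (-0.1, 0.0, 1.0))
```

In this sketch the same image waypoint lands about 10 m ahead on the flat estimate but closer and slightly
higher once the uphill plane is used. In a drive loop, the vehicle would repeat the reprojection with each new
plane estimate and re-steer toward the updated 3D waypoints.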
STRIPE has been implemented on the Carnegie Mellon Navlab II, an Army HMMWV that has been
reconfigured to be completely computer controllable, and has been successfully demonstrated on a variety
of terrain types. However, several difficulties remain with respect to the user interface.
Operators seem to need extensive practice in choosing waypoints to accurately operate the vehicle in all but
the most obvious of scenes.
FIGURE 2. An "easy" road and a "difficult" intersection.
The left image in Figure 2 shows a scene in which the intended path is easy for novice operators to
discern. However, in off-road situations, or certain on-road situations (e.g., the intersection in the right
image), novices seem to have a very difficult time deciding where to place waypoints.
Anecdotal evidence further suggests that novice users have even more difficulty when the orientation of the
camera relative to the vehicle is allowed to be varied. The fact that the images are discrete, and that there is
a significant delay between them, seems to prohibit operators from using the image flow to get a feel for the
camera orientation. Without knowing the orientation of the camera relative to the vehicle, one does not
know where any given point in the image is relative to the vehicle.
USER STUDIES
In order to attempt to decrease the time a novice user needs to practice the STRIPE task before achieving a
reasonable proficiency, part of my thesis work involves a series of user studies to analyze the effect of the
factors described above. I have developed a series of tests that should provide some important empirical
data for this field.
I have designed three vehicle test courses for measuring differences in accuracy among the different systems.
The first course is a slalom-type course made up of a series of cones. The second course involves following
a computer generated map to a goal, while avoiding obstacles not noted on the map. The final course
consists of driving along an "obvious" dirt path. Accuracy will be determined by considering the time taken
to complete the test course, the distance travelled along the test course, and whether the requirements of the
test were completed correctly.
Another important factor is the system that controls the vehicle: STRIPE vs. a steering wheel interface
vs. a driver in the vehicle (i.e., no teleoperation). While I expect that operators using the steering wheel
interface will perform very poorly under any but the best transmission conditions, no empirical studies of
this method of teleoperation exist, and I feel that this data is very important to the field.
Another very important pair of factors is bandwidth and latency. Under certain conditions, bandwidth can
be increased and latency decreased (with a potentially higher financial cost). It is important to study the
effects of varying bandwidth and latency on operator performance.
Other factors can also influence an operator's ability to accurately pick waypoints. An image with a wider
field of view may give the operator a better idea of where the vehicle is, but it also causes the world
waypoints to correspond to a larger patch of ground (thus reducing accuracy). Conversely, accuracy can be
increased when the field of view is decreased, but this provides the operator with less visual information
about the scene.
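The field-of-view trade-off can be made concrete with a rough estimate of how much ground a single image row
covers. The sketch below is an approximation under stated assumptions: flat ground, a camera at a known height,
and the vertical field of view divided evenly in angle across the image rows. The function name and all numeric
values are hypothetical.

```python
import math

def pixel_ground_footprint(height_m, depression_deg, vfov_deg, rows):
    """Approximate length of ground (in meters) covered by one image row.

    height_m:       camera height above flat ground
    depression_deg: angle below the horizon at which the row looks
    vfov_deg:       vertical field of view of the camera
    rows:           number of image rows sharing that field of view
    """
    row_angle = math.radians(vfov_deg) / rows      # angular extent of one row
    theta = math.radians(depression_deg)
    if theta - row_angle / 2 <= 0:
        raise ValueError("row reaches the horizon; footprint is unbounded")
    near = height_m / math.tan(theta + row_angle / 2)
    far = height_m / math.tan(theta - row_angle / 2)
    return far - near

# Same camera height, viewing angle, and image size; only the field of view differs.
narrow = pixel_ground_footprint(2.0, 10.0, 30.0, 480)
wide = pixel_ground_footprint(2.0, 10.0, 60.0, 480)
```

With these assumed values, doubling the field of view roughly doubles the ground patch behind each row, so a
waypoint picked in the wider image is correspondingly less precise.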
In addition, there is a trade-off between image quality and speed of transmission. An operator could be
provided with one high-resolution image every minute, or four lower-resolution images in the same time
period. Color images can also be used, but typically use three times as much data as a standard monochrome
image.
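The arithmetic behind this trade-off is simple enough to sketch. The example assumes uncompressed images and
ignores protocol overhead and latency; the link rate and image sizes are hypothetical values chosen so that one
full-resolution monochrome image takes exactly one minute.

```python
def transmit_seconds(width, height, channels, bits_per_channel, link_bps):
    """Time to send one uncompressed image over a link of link_bps bits per second."""
    return width * height * channels * bits_per_channel / link_bps

LINK_BPS = 40960  # hypothetical link rate in bits per second

full_mono = transmit_seconds(640, 480, 1, 8, LINK_BPS)   # one image per minute
half_mono = transmit_seconds(320, 240, 1, 8, LINK_BPS)   # half the width and height
full_color = transmit_seconds(640, 480, 3, 8, LINK_BPS)  # three channels instead of one
```

Under these assumptions, halving both image dimensions cuts the data to a quarter, which is what allows four
images in the time of one, while a three-channel color image takes three times as long as the monochrome one.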
Clearly, attempting to implement a detailed study of all of the independent variables described above would
be a mammoth task. Therefore, I plan to initially do small-scale tests of the various independent variables to
determine which seem to have the most dramatic effect on an operator's ability to accurately control a
remote vehicle. Following these tests, I will pick the two most influential variables, along with the vehicle
test course, three alternate STRIPE interfaces, and bandwidth vs. latency, and perform a more detailed series
of empirical tests.
CONCLUSION
The STRIPE system allows accurate teleoperation across hilly terrain over low-bandwidth, high-latency
links using only a single camera. This research also begins a much-needed empirical study of
low-bandwidth, high-latency teleoperation.
ACKNOWLEDGEMENTS
This work is sponsored in part by ARPA DAAE07-90-C-R059 and DACA76-89-C0014, and by a NASA
graduate student research fellowship.
References
1. Kay, Jennifer, and Thorpe, Charles. STRIPE: Supervised TeleRobotics Using Incremental
Polygonal-Earth Geometry, in Proc. of IAS3 Intelligent Autonomous Systems (Pittsburgh, PA, February 1993)
pp. 399-406.
2. Lescoe, Paul, Lavery, David, and Bedard, Roger. Navigation of Military and Space Unmanned Ground
Vehicles in Unstructured Terrains, in Proc. of Military Robotic Applications 3 (September 1991) pp. 61-67.
3. Rahim, Wadi. Feedback Limited Control System on a Skid-Steer Vehicle, in Proc. of ANS Robotics and
Remote Systems 5 (Knoxville, TN, April 1993) pp. 37-44.