1 IS Lab, Hiroshima University
1-4-1 Kagamiyama
Higashi-Hiroshima-shi, 739 Japan
(824) 247669
2 HIT Lab, University of Washington
Box 352142
Seattle, WA, 98195
USA
(206) 685-3215
{poup, ichikawa}@isl.hiroshima-u.ac.jp {poup, grof, weghorst}@hitl.washington.edu
We present results of a formal evaluation of three direct manipulation interaction techniques for picking and positioning objects in VEs: the "classical" virtual hand, ray-casting, and the Go-Go techniques. Our goal was to assess and compare the two most basic metaphors for selection and manipulation in VEs: virtual pointer and virtual hand. The main variables of interest were distance to and size of objects, interaction technique, and visual feedback. The results of the studies suggest that within the user-centered coordinate system, virtual pointing is an essentially two-dimensional metaphor, while the virtual hand is a three-dimensional metaphor. We also found that for some applications the "classical" virtual hand technique appears to be obsolete and may be replaced by the ray or Go-Go techniques without a reduction in user performance. The paper reports these and other experimental results and discusses their implications for the design of VEs.
With the rapid increase in the performance of high-end computer graphics systems, as well as the transition of 3D graphics to fast and inexpensive PC platforms, virtual environment (VE) interfaces have become feasible enough to be used by practitioners in areas such as industrial design, data visualization, training, and others [1]. Development of useful VE applications, however, requires optimization of the most basic interactions, in particular object manipulation, so users can concentrate on high-level tasks rather than on low-level motor activities [2].
Currently, there is little understanding of how manipulation interfaces should be designed to maximize user performance in immersive environments [3]. Research that systematically investigates the human factors of immersive manipulation tasks, 3D devices, interaction metaphors and techniques, and their design implications remains sparse [2, 4, 5]; consequently VE designers have had to rely on their intuition and common sense, rather than on research results. However, as Professor Brooks has noted [6], "in watching many awful interfaces being designed ... I observed that the uninformed and untested intuition of the designer is almost always wrong."
In this paper we present results of a formal experimental study that evaluates three direct manipulation interaction techniques for picking and positioning objects in VEs: the "classical" virtual hand, ray-casting, and the Go-Go interaction techniques [3, 7]. The goal of the work is to assess and compare usability characteristics of the two most basic 3D selection and manipulation metaphors: a virtual pointer metaphor that allows users to interact with objects by pointing at them, and a virtual hand metaphor that allows users to grab and manipulate objects with the virtual representation of their hand. Although object manipulation is among the most ubiquitous human-computer interactions in spatial 3D user interfaces, we are not aware of any formal studies that evaluate and categorize interaction techniques and metaphors for picking and positioning objects in VEs. Prior research relates primarily to user performance as affected by various input and output devices and their characteristics [8, 9]. In contrast, the focus of this study is on the human factors characteristics of different mappings between user input, captured by input devices, and resulting actions in VEs - i.e., interaction techniques [3].
Object selection and positioning are among the most fundamental interactions occurring between humans and environments, whether the environment is the "desktop" of a 2D direct manipulation interface, a 3D virtual environment, or the physical world [10, 11]. Prior research on manipulation in VEs relates primarily to assessment of user performance as a function of input and display devices and their properties. For example, a pioneering study by Ware [12] demonstrated the applicability and ease of use of a 3D input device for a six degree of freedom (6DOF) placement task. A study by Zhai and Milgram [8], comparing isometric and isotonic devices in various conditions of spatial manipulation, suggests that isometric input devices are better for rate control and isotonic devices for position control. Studies of stereoscopic versus monoscopic displays suggest that stereoscopy improves performance for complex manipulation tasks [13]. The effects of system performance characteristics (such as lag and frame rate) on user manipulation performance have also been studied [9, 14].
Investigation of the human factors of input and output devices has considerable value; however, the lack of systematic research on manipulation interaction techniques, which map the user's actions captured by input devices into resulting actions in VEs [3], can significantly limit their appropriate use in VE design. Interaction techniques essentially define the "look and feel" of VEs; a wide variety of techniques can be implemented using the same input devices, and quite a few techniques for spatial manipulation have been demonstrated [7, 15, 16, 17]. Still, there have been few attempts to formally evaluate techniques for manipulation in VEs, assess their functional capabilities, and compare their relative strengths and weaknesses. A number of surveys have summarized and classified various approaches for designing techniques for spatial input and identified problems and possible solutions [3, 18]. A formal study by Hinckley [19] evaluated and compared several spatial rotational techniques. More relevant here is the pioneering usability study reported by Bowman and Hodges [11] that evaluated several VE techniques for manipulation at a distance. Although this study was somewhat informal and no quantitative data were collected, it provided useful preliminary observations of the techniques.
Starting with early techniques that simply mapped position and orientation of the user's hand onto position and orientation of manipulated objects [20], the field has been expanding with more sophisticated techniques such as the flashlight, aperture, Go-Go, and World-In-Miniature techniques [7, 11, 15, 17, 21], among others. This variety of techniques, however, is also a source of difficulty. How do all these techniques relate to each other? Which interaction techniques should be chosen for particular task conditions? Which among the parameters of interaction techniques, tasks, and environments should be considered to design efficient VE interfaces? These questions persist and merit careful scrutiny by researchers and practitioners.
Straightforward evaluation and comparison of manipulation techniques is difficult. There are a multitude of different techniques; their performance varies depending on the particular implementation design; and studies of a particular technique implementation may not be readily generalized to other implementations of the same technique, thus significantly limiting their external validity.
On the other hand, many techniques are clearly related to each other and share many common properties. For example, there are more similarities between the ray-casting and flashlight techniques than between ray-casting and techniques that use non-linear mappings to extend the user's area of reach (as in Go-Go [7]). While evaluation of the ray-casting technique might provide insight into techniques similar to ray-casting, such as flashlight, it probably would not help in understanding techniques like the Go-Go. A taxonomy of techniques, classifying them according to their common properties, can be instrumental in understanding relations between techniques and directing their design and experimental evaluation.
Analysis of current VE manipulation techniques suggests that most of them are based on a few interaction metaphors. Each of these basic metaphors forms the fundamental mental model of a technique and defines what users can do (affordances) and what they can not do (constraints) when using the technique [22]. Particular techniques are essentially implementations of the basic metaphors, often extending them in order to overcome some of the metaphor's shortcomings and constraints. For example, the flashlight technique enhances ray-casting by using a spotlight to ease selection of small objects [15]. These improvements often result in new constraints; for example, with the flashlight technique an ambiguity might occur if several small objects fall into the spotlight [17].
In Figure 1 we present a simple classification of VE manipulation techniques according to their basic interaction metaphors. We divide the whole variety into exocentric and egocentric techniques. Originating in studies of cockpit displays [23], these terms are now used to distinguish between two fundamental frames of reference for user interaction with VEs. With exocentric interaction, also known as the God's eye viewpoint, users interact with VEs from the outside (the outside-in world referenced display [23]). An example is the World-In-Miniature technique, which allows manipulation of objects by interacting with their representations in a miniature model of the environment held by the user [21]. Although the exocentric techniques are interesting and important, their evaluation is outside the scope of this work.
With the egocentric interaction, which is the most common for immersive VEs, the user is interacting with VEs from inside the environment - i.e., the VE embeds the user [23]. Currently there are two basic metaphors for egocentric manipulation: virtual hand and virtual pointer [3, 11, 16]. With the virtual hand, users can grab and position objects by "touching" and "picking" them with a virtual representation of their real hand. A choice of input devices and mappings between real hand's position and orientation and virtual hand's position and orientation are some of the major design factors that define particular techniques. For example a "classical" virtual hand technique provides one-to-one mapping between the real and virtual hands, while the Go-Go technique employs non-linear mapping functions to extend the user's area of reach [7].

With the virtual pointer metaphor, the user selects and manipulates objects by pointing at them: when the vector emanating from the virtual pointer intersects with an object, it can be picked and manipulated [3]. The major design aspects distinguishing techniques based on this metaphor are the definition of the virtual pointer direction, the shape of the pointer (selection volume), and the method of disambiguating the object the user wants to select. In the simplest case, the direction of the virtual pointer is defined by the orientation of the user's virtual hand, the virtual pointer is a "laser ray," and no disambiguation is provided [24] (Figure 2). Some techniques define the direction of the virtual pointer using two points: the position of the user's dominant eye and the location of the tracker manipulated by the user [16, 17].
For this study we elect to evaluate those techniques that implement the basic egocentric metaphors, virtual hand and virtual pointer, as closely as possible. Investigation of the basic metaphors allows us to generalize our results beyond the specific implementation of the techniques and apply them to other techniques derived from investigated metaphors. In this section we describe the implementations of techniques to be evaluated in this study.


We use the ray-casting technique for evaluation of the virtual pointer metaphor. The direction of the virtual pointer is defined by the orientation of the virtual hand. The working volume of the technique is an invisible infinite ray emanating from the user's hand (Figure 2); a short segment of the ray is attached to the hand to indicate the direction of pointing. To select an object, the user points at it and presses a button on the button device. Two variations of the technique have been evaluated: with and without visual feedback. When visual feedback is applied, the color of an object changes when the virtual ray intersects it.
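The selection test described above can be sketched as a ray-object intersection check. This is only an illustration, not the testbed's code: it assumes stimuli are approximated by bounding spheres, and the function names and the nearest-hit rule in `pick` are hypothetical choices (the simplest form of the technique provides no disambiguation).

```python
import math

def ray_hits_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the first hit, or None on a miss."""
    # Normalize the pointing direction taken from the hand orientation.
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]
    # Vector from the hand (ray origin) to the sphere center.
    oc = [c - o for c, o in zip(center, origin)]
    # Distance along the ray to the point of closest approach.
    t_closest = sum(a * b for a, b in zip(oc, d))
    # Squared distance from the sphere center to the ray's supporting line.
    dist_sq = sum(a * a for a in oc) - t_closest ** 2
    if dist_sq > radius ** 2:
        return None
    half_chord = math.sqrt(radius ** 2 - dist_sq)
    t_near, t_far = t_closest - half_chord, t_closest + half_chord
    if t_far < 0:
        return None  # the sphere lies entirely behind the hand
    return t_near if t_near >= 0 else 0.0  # 0.0 when the hand is inside

def pick(origin, direction, objects):
    """objects maps a name to (center, radius); return the nearest hit name or None."""
    best_name, best_t = None, math.inf
    for name, (center, radius) in objects.items():
        t = ray_hits_sphere(origin, direction, center, radius)
        if t is not None and t < best_t:
            best_name, best_t = name, t
    return best_name
```

In a feedback condition, the object returned by `pick` would be the one whose color changes; the button press then confirms the selection.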
Two variations of the virtual hand metaphor were investigated: the classical virtual hand technique and the Go-Go interaction technique. With both techniques the user is provided with a virtual hand whose position and orientation are controlled by the tracker attached to the user's real hand (Figure 3). To select and pick a virtual object, the user intersects the object with the virtual hand and presses a button on the button device. While the virtual hand essentially simulates the way we manipulate objects in the real world (one-to-one mapping), the Go-Go technique uses a non-linear mapping to translate the measured distance to the user's real hand into the controlled distance to the virtual one [7]. The non-linear mapping allows significant expansion of the user's area of reach. As with the virtual pointer, two variations of both hand techniques have been evaluated: with and without visual feedback. With visual feedback the object changes color when the virtual hand intersects it.
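The difference between the two hand mappings can be sketched as follows. The piecewise form (linear close to the body, quadratic growth beyond a threshold) follows the mapping described for Go-Go in [7]; the threshold of two thirds of the arm's reach, the coefficient `k`, and the helper `k_for_max_reach` are illustrative assumptions and are not the values used in this study. Distances are expressed as fractions of the user's maximum reach.

```python
def classical_hand_distance(real_dist):
    """One-to-one mapping: the virtual hand is exactly where the real hand is."""
    return real_dist

def gogo_distance(real_dist, threshold=2.0 / 3.0, k=0.5):
    """Go-Go style mapping from real hand distance to virtual hand distance.

    Below `threshold` the mapping is linear (identical to the classical hand);
    beyond it the virtual hand moves away quadratically. Both parameter
    defaults are assumptions chosen for illustration only.
    """
    if real_dist < threshold:
        return real_dist
    return real_dist + k * (real_dist - threshold) ** 2

def k_for_max_reach(max_virtual_reach, threshold=2.0 / 3.0):
    """Hypothetical helper: the k that maps a fully extended arm (1.0) to the given reach."""
    return (max_virtual_reach - 1.0) / (1.0 - threshold) ** 2
```

For example, `gogo_distance(1.0, k=k_for_max_reach(6.0))` places the virtual hand about six arm lengths away when the real arm is fully extended, while any hand position closer than the threshold behaves exactly like the classical virtual hand.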
Experiments that evaluated and compared three interaction techniques for selection and positioning objects in VEs were conducted within the framework of a Virtual Reality Manipulation Assessment Testbed (VRMAT) [25]. The VRMAT is a tool that facilitates rapid design and implementation of a variety of studies of immersive manipulation. It provides definitions of tasks and their properties, suggests experimental procedures including relevant independent and dependent variables, defines metrics and units for their measurements, and so on. In this paper we describe only those aspects of the VRMAT that are relevant to the studies.
Selection and positioning tasks were investigated in these studies. Experimental tasks required subjects to select or position virtual test objects (stimuli) using the interaction technique under investigation. Stimuli for the selection task are solitary test objects located in the user's field of view (Figure 2). After being selected, the test object disappears, informing the subject that the task was completed.
The positioning task requires the subject to place a test object on top of a terminal object indicated by a different color (Figure 3). The positioning of the stimulus can be performed using iterative movements, i.e., subjects can pick, move, and release the test object several times until the task is accomplished. The task is accomplished when the test object is positioned on the terminal with the precision specified a priori by the experimenter. The shapes for both test and terminal objects are cylinders with equal radii, providing a visual indicator of positional accuracy - the better a test object is aligned on top of the terminal, the higher the task accuracy. After successful positioning, both objects disappear, cueing the subject that the task is finished. The next test trial is then presented to the user.
The main independent variables of interest for the selection task were distance to the object, object size, interaction technique, and visual feedback. Objects' positions and sizes are defined in a user-centered coordinate system similar to that used in Kennedy's classic study of the reaching and grasping envelope of seated U.S. Air Force operators [26]. The position of a stimulus in the VE is defined by the length d and the orientation (pitch and yaw angles) of the vector pointing from the user's chest to the object (Figure 4). Distance d from user to stimulus is defined in terms of virtual cubits, a new unit of distance introduced in the VRMAT [25]. One virtual cubit is equivalent to the length of the user's maximum reach (Figure 4); it is named after the cubit - a unit of measurement used in ancient Rome, equal to the distance between the elbow and the tip of the middle finger. The advantage of using virtual cubits is the ease of generalizing results from experimental studies to practical VE development: a stimulus located at a distance of one virtual cubit in the test environment would be located on the boundary of the user's reach for any user and in any other VE, independently of the computational platform and software used. Virtual cubits also eliminate bias due to anthropometric differences between subjects.
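A minimal sketch of the user-centered coordinates described above is given below. The axis convention (x to the right, y up, negative z straight ahead) and the function names are assumptions made for illustration; the paper specifies only that position is a distance in virtual cubits plus an orientation measured from the user's chest.

```python
import math

def to_virtual_cubits(distance_m, max_reach_m):
    """Express a metric distance as a multiple of the user's calibrated maximum reach."""
    return distance_m / max_reach_m

def stimulus_position(d_cubits, pitch_deg, yaw_deg, max_reach_m, chest=(0.0, 0.0, 0.0)):
    """Convert (distance, pitch, yaw) relative to the chest into Cartesian metres.

    Assumed convention: yaw rotates about the vertical axis, pitch raises the
    vector above the horizontal plane, and zero pitch and yaw point straight ahead (-z).
    """
    d = d_cubits * max_reach_m
    pitch, yaw = math.radians(pitch_deg), math.radians(yaw_deg)
    x = d * math.cos(pitch) * math.sin(yaw)
    y = d * math.sin(pitch)
    z = -d * math.cos(pitch) * math.cos(yaw)
    return (chest[0] + x, chest[1] + y, chest[2] + z)
```

With this scaling, a stimulus at 2 virtual cubits sits 1.4 m away for a user whose reach is 0.7 m and 1.6 m away for a user whose reach is 0.8 m, which is how the unit removes anthropometric differences from the distance variable.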

Size of the stimulus is defined as its non-occluded visual size: the vertical and horizontal angles that the object occupies in the user's field of view (Figure 4) [25]. Like virtual cubits, visual angles are also user-centered units: two objects may have the same visual size even if they are located at different distances and have different geometrical sizes. The benefit of visual angles is the separation of the influence of distance and object size on user performance: if the object's size is defined in terms of visual angles, then varying the distance to the object does not affect its visual size. Visual angles also allow for easy generalization of results beyond the particular test VE.
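The relation between visual size and geometric size can be illustrated with the standard visual-angle formula. The sketch assumes the object is centered on, and perpendicular to, the line of sight and that distance is measured from the viewpoint; the function names are hypothetical.

```python
import math

def size_for_visual_angle(angle_deg, distance):
    """Geometric extent needed to subtend `angle_deg` at the given distance."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)

def visual_angle(size, distance):
    """Angle in degrees subtended by an extent `size` seen from `distance`."""
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))
```

Under these assumptions, keeping a stimulus at a fixed visual size means scaling its geometric size in proportion to distance: an object that subtends 6 degrees must be twice as large at 4 virtual cubits as at 2.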
The main independent variables of interest for manipulation tasks were initial distance to the stimulus, distance to the terminal position, required accuracy of positioning, and interaction technique. Both initial and final distances are defined in terms of virtual cubits; required accuracy is defined as percentage of the terminal object being overlapped by the manipulated object. Higher percentage of overlap means higher required accuracy of positioning.
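One way to make the overlap criterion concrete is sketched below. It assumes that overlap is measured as the shared area of the two circular cross-sections seen from above, which is possible because both cylinders have equal radii; the paper does not state how the testbed computes overlap, so this is an assumption, as are the function names.

```python
import math

def overlap_fraction(radius, offset):
    """Fraction of the terminal circle covered by an equal circle whose center is `offset` away."""
    if offset <= 0:
        return 1.0
    if offset >= 2 * radius:
        return 0.0
    # Area of the lens formed by two intersecting circles of equal radius.
    lens_area = (2 * radius ** 2 * math.acos(offset / (2 * radius))
                 - 0.5 * offset * math.sqrt(4 * radius ** 2 - offset ** 2))
    return lens_area / (math.pi * radius ** 2)

def is_positioned(radius, offset, required=0.8):
    """True when the overlap meets the required accuracy (for example 0.8 or 0.9)."""
    return overlap_fraction(radius, offset) >= required
```

Under this assumption the criterion tightens quickly: with unit radius, a center offset of 0.1 still gives roughly 94% overlap, while an offset of 0.5 gives only about 69%.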
Completion time, the time taken to successfully accomplish the task, is used as the primary performance criterion. For a selection task this is the time from the moment the stimulus appears until the moment it is successfully selected. For positioning tasks, completion time is measured from the moment the user picks a test object until the moment it is positioned with the required accuracy. Because positioning tasks allow iterative manipulation, we also measure the time of "net" manipulation, i.e., excluding the time required for each selection. Subjective criteria, such as subject satisfaction, are assessed through post-experimental questionnaires.
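The two timing measures for positioning can be sketched from a log of pick and release events. The event format and the assumption that a trial ends at its final release are hypothetical; the sketch only illustrates how "net" manipulation time excludes the intervals spent re-selecting the object.

```python
def positioning_times(events):
    """Return (completion_time, net_manipulation_time) for one positioning trial.

    `events` is a time-ordered list of (timestamp_seconds, kind) pairs with
    kind "pick" or "release". Both the format and the end-of-trial rule
    (the final release) are assumptions made for this illustration.
    """
    first_pick = None
    held_since = None
    last_release = None
    net = 0.0
    for t, kind in events:
        if kind == "pick":
            if first_pick is None:
                first_pick = t
            held_since = t
        elif kind == "release" and held_since is not None:
            net += t - held_since  # time the object was actually held
            held_since = None
            last_release = t
    if first_pick is None or last_release is None:
        raise ValueError("a trial needs at least one pick and one release")
    return last_release - first_pick, net
```

A trial with two iterations, holding the object for 2.0 s and then 1.5 s with a 1.5 s re-selection in between, yields a completion time of 5.0 s and a net manipulation time of 3.5 s.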
Two groups of subjects were recruited from the laboratory subject pool. Ten males and three females served as subjects for the selection task experiments; eight males and four females served as subjects for the positioning task experiments. Subjects ranged in age from 19 to 32; all subjects were right-handed, as determined by the Edinburgh inventory.
A balanced within-subject (repeated measures) design was used for each task. Subjects were immersed in an environment consisting of a ground plane and a virtual representation of their hand. They wore a 6D tracking sensor on their dominant hand and held a button device (used for picking targeted objects) in the other hand. After donning the HMD subjects were asked to momentarily extend their tracked hand to its full natural reach for "virtual cubit" calibration. The environment then was re-calibrated according to the length of the virtual cubit.
Following a two-minute demonstration and explanation of the interaction techniques and test tasks, subjects had approximately three minutes to practice tasks. During studies of the selection task each subject completed 18 sessions: six sessions for each interaction technique, three sessions with and three sessions without visual feedback. Fifteen conditions were defined in each session manipulating each of three different object sizes (4, 6 and 9 degrees) and five different distances (0.7, 1, 2, 4 and 6 virtual cubits). Studies of the positioning task consisted of nine sessions: three sessions for each interaction technique. Nine conditions were defined for each session: four conditions for positioning at constant distances (0.8, 2.2, 3.5, and 6 virtual cubits), four conditions for positioning with changing distances to the object (from 0.8 to 1, from 1 to 0.8, from 3.5 to 6 and from 6 to 3.5 virtual cubits) and one condition for positioning at a constant distance (0.8 virtual cubits) with high required accuracy (90% of overlap). The rest of the conditions were defined with 80% accuracy.
The order of conditions presented in the experimental sessions was randomized to control for order effects. They were presented one after the other, with a four-second delay between them, until all conditions had been tested. In addition to the on-line performance data, an informal questionnaire was administered after completion of the tasks to assess subjects' preferences and opinions.
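The structure of one selection session can be sketched as a fully crossed, randomly ordered list of conditions. The sizes and distances are those reported above; the function name and the use of a seed for reproducibility are assumptions made for illustration.

```python
import itertools
import random

SIZES_DEG = (4, 6, 9)                  # visual sizes, in degrees
DISTANCES_CUBITS = (0.7, 1, 2, 4, 6)   # distances, in virtual cubits

def selection_session(seed=None):
    """Return the 15 size-by-distance conditions of one session in random order."""
    conditions = list(itertools.product(SIZES_DEG, DISTANCES_CUBITS))
    rng = random.Random(seed)  # seeding is optional and only aids reproducibility
    rng.shuffle(conditions)
    return conditions
```

With 18 sessions per subject in the selection study, this gives 18 x 15 = 270 selection trials per subject.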
The VRMAT testbed used for the experiments was implemented using a custom VR software toolkit developed as an extension of the Sense8 World Toolkit. An SGI Onyx RE2 workstation, equipped with a Virtual Research VR4 head-mounted display and Polhemus Fastrak 6DOF sensors, is currently used. A mouse is used as a button device for selection. The frame update rate is controlled at 15Hz.
Figure 5 and Figure 6 summarize the effects of visual feedback, distance and object size on selection time performance while using the ray-casting technique. The box plots represent the distribution of the five subjects' scores around the median for each condition (collapsing over the orthogonal factor).
Figure 5 Box plots for selection times of objects located at various distances using ray-casting with and without visual feedback (collapsed over object size).
Figure 6 Box plots for selection time of objects of different sizes using ray-casting with and without visual feedback (collapsed over object distance).
Figure 7 Box plots for selection time of objects located at various distances using the Go-Go with and without visual feedback (collapsed over object size).
Figure 8 Box plots for selection time of objects of different sizes using the Go-Go with and without visual feedback (collapsed over object distance).
Figure 9 Mean selection times for objects at different distances using Go-Go and ray-casting techniques with visual feedback applied (collapsed over object size).
Figure 10 Box plots for selection time of objects of different sizes using Go-Go and ray-casting techniques with visual feedback (collapsed over object distance).
As shown in Figure 5, object selection time systematically increases with distance. This trend is supported by ANOVA (with visual feedback: F4,48 = 15.978, p < 0.0001; without visual feedback: F4,48 = 23.869, p < 0.0001). ANOVA reveals no treatment effect for visual feedback at close and medium distances (0.7, 1 and 2 virtual cubits, F1,12 = 3.016, p < 0.108). Visual feedback does, however, improve user performance at far distances (4 and 6 virtual cubits, F1,12 = 18.306, p < 0.001).
Collapsing over object distance (Figure 6) reveals that the time it takes to select objects using ray-casting systematically decreases with object size (with visual feedback: F2,24 = 51.784, p < 0.0001; without visual feedback: F2,24 = 30.688, p < 0.0001). ANOVA also reveals a significant effect of visual feedback on user performance for all object sizes (F1,12 = 18.306, p < 0.0001).
Similarly, Figure 7 and Figure 8 summarize the effects of distance and object size on selection time performance using the Go-Go technique. As with ray-casting, collapsing over object size reveals a strong effect due to distance (with visual feedback: F4,48 = 28.253, p < 0.0001; without visual feedback: F4,48 = 35.681, p < 0.0001); collapsing over object distance reveals a systematic decrease in object selection time with increasing size (with visual feedback: F2,24 = 22.701, p < 0.0001; without visual feedback: F2,24 = 48.761, p < 0.0001). ANOVA reveals no significant main effect of visual feedback on user performance for the Go-Go technique (F1,12 = 2.690, p < 0.127).
Figure 9 compares mean selection performance times for ray casting and Go-Go interaction techniques for various object distances. Visual feedback data is used to compare both techniques on their peak performance. ANOVA reveals that the ray-casting technique results in better performance on distances close to the user (F1,12 = 9.355, p < 0.01); with increased distance both Go-Go and ray casting provide similar performance (F1,12 = 0.008, p < 0.936). Although the ray casting seems to be faster on far distances (6 virtual cubits), ANOVA reveals no treatment effect for technique at this distance (F1,12 = 0.948, p < 0.350).


Comparing selection times for two techniques across size conditions (Figure 10), two-tailed paired t-tests reveal significantly better performance for ray-casting in both the big and medium conditions (big: t = 4.541, df = 12, p < 0.001; medium: t = 3.109, df = 12, p = 0.009). In contrast, the Go-Go technique results in better selection performance for small objects (t=-3.026, df = 12, p<0.01).
Finally, we compared Go-Go and ray casting techniques with a traditional virtual hand technique for selection of objects of different sizes close to the user (0.7 virtual cubits). ANOVA reveals that the Go-Go technique results essentially in the same performance as the virtual hand (F1,12 = 0.22, p < 0.648) for all object sizes. Similarly, ANOVA does not reveal performance differences between ray casting and virtual hand techniques for selection of small and medium size objects (4 and 6 degrees of visual field; F1,12 = 0.38, p < 0.849). However, ray casting results in better performance for selection of large objects (9 degrees of visual field; F2,24 = 7.96, p < 0.002).
Positioning objects from a close to a far distance and vice versa is difficult using the ray technique. The implementation tested does not allow subjects to change the ray length, so they can position objects only through iterative movements. In pilot studies this method required an average of 10 iterations with a mean "net movement" time (i.e., with selection time subtracted) of 33.66 sec., compared to 4 iterations and 11.89 sec. for the Go-Go.
However, the ray-casting technique can be efficient if object repositioning does not require changing the object's distance from the user. Figure 11 compares performance of the Go-Go and ray-casting techniques for object positioning. ANOVA does not reveal performance differences between ray casting and Go-Go for medium and far distances (F1,11 = 1.44, p < 0.711). Moreover, mean comparisons using the two-tailed paired t-test reveal better performance for ray-casting at a close distance (t = 2.55, df = 11, p < 0.027).
Finally, we compared the ray casting, Go-Go and virtual hand techniques for positioning occurring within the area of user reach (Figure 12). According to the box plot presentation, all three interaction techniques result in similar performance in those conditions which require the user to move the manipulated object closer or farther (F1,11 = 1.359, p < 0.28). When object positioning does not require changing the distance to the object, the ray casting and virtual hand techniques result in better performance than the Go-Go (F2,22 = 8.8, p < 0.002). Generally, the ray casting technique results in poorer performance when a change in distance is required (F2,22 = 17.786, p < 0.0001), while Go-Go seems to be equally effective in all conditions (F2,22 = 1.309, p < 0.29). There is also a significant treatment effect of required accuracy on positioning performance (F1,11 = 103.243, p < 0.0001). An increase in required positioning accuracy, from 80% (lower accuracy) to 90% (high accuracy) target object overlap, results in a decrease in performance for all techniques.
None of the subjects had difficulties in using any of the Go-Go, ray casting, or virtual hand techniques. The Go-Go technique was rated as most enjoyable and intuitive, with ray casting second. This finding is supported by Bowman et al. [11]. Three subjects, however, preferred the plain hand, reporting that it was more familiar and simulated the way they interact in the real world. All subjects were dissatisfied with the performance of ray casting at far distances and for selection of small objects, and all of them noted that visual feedback was helpful in these conditions. Some subjects noted difficulties in using the Go-Go at close distances - in particular, at the distance where the linear mapping switches to non-linear. Subjects reported that one of the main sources of difficulty in positioning objects at a distance was limited visual cues, rather than shortcomings of the techniques themselves. Subjects simply could not see if the object was being positioned correctly.
Our findings for ray-casting performance suggest that it is essentially a two-dimensional technique, defined in terms of the user-centered distance/pitch/yaw coordinates used in this study, rather than world-centered Cartesian (x/y/z) coordinates. For object selection within the user field of view and for repositioning of objects at a constant distance from the user, ray casting is an efficient and effective technique. Ray casting is far less useful as a repositioning technique when change in distance is required. Indeed, even within the area of user reach, ray-casting performs better for positioning at constant distance. The Go-Go, on the other hand, resulted in essentially the same performance for all those conditions.
Adding visual feedback does not necessarily improve user performance. For selection of solitary objects located relatively close to the user, ray casting provides essentially the same performance with or without visual feedback. The Go-Go technique resulted in essentially the same performance for most of the task conditions with or without visual feedback. This could be due to the fact that with techniques based on the virtual hand metaphor, the user can see when the hand intersects the object; thus visual feedback is an inherent part of the technique. Visual feedback, however, improves the user's performance for boundary conditions - for example, in selecting small objects located far away using the ray-casting technique. Also, under certain conditions such as selection of occluded objects or objects within a group, enhancing the techniques with visual feedback might improve user performance.
For both interaction techniques we see that as object size decreases, the "target" object is increasingly harder to "hit." This finding is consistent with expectations and appears to represent a "Fitts Law" phenomenon. Similarly to Go-Go, ray-casting selection exhibits a performance falloff due to object distance. Reports by subjects suggest that the decrease in performance of ray casting with increased distance may be due to difficulties with hand-eye coordination and tracker noise. With ray-casting, the influence of distance on user performance decreases with the increase of object size; for a large object the selection time is essentially the same in all tested distances.
Overall, ray-casting and Go-Go seem to provide essentially similar performance for selection at medium distances, while ray casting is more effective at close distances and in selection of big objects, and Go-Go is more effective in selection of small objects. For the positioning task both techniques seem to result in the same performance when the task does not require a change in distance; however, the Go-Go technique is superior for those tasks requiring changes in distance to the object. Other aspects of the manipulation task may interact with these main effects. Object occlusion and density, for instance, may differentially affect interaction performance with the two techniques. The higher performance of the Go-Go for selection of small objects can also be an advantage for selection of partially occluded objects, due to their diminished visual size.
Finally, both ray-casting and Go-Go techniques provided essentially the same or better performance than a classical virtual hand interaction for the conditions of immersive manipulation selected for these studies.
Our findings in this study are consistent with the notion that selection and positioning of solitary objects can be either a 2D or 3D task, depending on whether the object distance is manipulated. Within the user-centered coordinate system used in these studies, the virtual pointer seems to be essentially a 2D manipulation metaphor, while the virtual hand seems to be a 3D manipulation metaphor. The 2D nature of the ray technique implies that we may be able to apply well-developed guidelines and techniques from 2D graphical user interface design for development of immersive ray-based interaction dialogs. Furthermore, the "classical" virtual hand technique appears to be obsolete and may be replaced by the ray and Go-Go techniques without a reduction in user performance. By affording manipulation at a distance, both the ray and Go-Go techniques provide more functionality relative to the plain virtual hand.
The results of the study suggest that each technique provides its best performance within a certain area of effective manipulation and for certain object sizes. Improvements to the techniques, such as introduction of visual feedback, do not significantly affect user performance for these "standard" conditions, but rather extend the limits by improving user performance at the boundary conditions (Figure 5). Therefore, development of better manipulation techniques, which would allow for effective manipulation at further distances and for smaller object sizes, may not be the only way to build efficient VE interfaces. Instead of improving techniques, developers can take another route: improving spatial design of virtual environments to allow the existing techniques their best performance. For example, the ray casting technique can provide a satisfactory performance even without visual feedback if objects are located within 3 virtual cubits or have sufficient visual size (more than 4 degrees of visual fields). If these conditions are satisfied, the most generic implementation of ray-casting would perform well, resulting in simpler user interfaces.
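The layout guideline above can be sketched as a simple check that a designer might run over the objects in a scene. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, object size and distance are assumed to be expressed in the same unit (virtual cubits), and visual size is assumed to be the angle subtended by the object's width at the user's viewpoint. The thresholds of 3 virtual cubits and 4 degrees are the ones quoted in the text.

```python
import math


def visual_angle_deg(size: float, distance: float) -> float:
    """Angle (in degrees) subtended by an object of the given width
    seen face-on from the given distance (same unit for both).
    Assumption: this is what "visual size" means in the guideline."""
    return math.degrees(2.0 * math.atan2(size / 2.0, distance))


def plain_ray_casting_suffices(size: float, distance: float,
                               max_distance: float = 3.0,
                               min_angle_deg: float = 4.0) -> bool:
    """Hypothetical helper: True if the object meets either condition
    quoted in the text (within 3 virtual cubits, or visual size of
    more than 4 degrees)."""
    within_reach = distance <= max_distance
    large_enough = visual_angle_deg(size, distance) > min_angle_deg
    return within_reach or large_enough


if __name__ == "__main__":
    # (width, distance) pairs in virtual cubits; example values only.
    for size, distance in [(0.1, 2.0), (0.1, 10.0), (1.0, 10.0)]:
        ok = plain_ray_casting_suffices(size, distance)
        print(f"size={size}, distance={distance}: "
              f"{visual_angle_deg(size, distance):.2f} deg, ok={ok}")
```

Objects that fail the check are candidates for being moved closer, enlarged, or supported with additional visual feedback.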
Certainly for some applications it is not possible to design the VE around the techniques. In this case alternative approaches can be investigated, such as combinations of flying and manipulation or applications of the exocentric techniques. Nevertheless, there are many application domains where designers do have freedom to fit the environment to the interface - for example, VEs for information visualization VEs.
The growing acceptance of VE technology will require more attention to optimizing the immersive interaction in order to maximize user performance. The research reported here is just a small step toward understanding the human factors behind manipulation in VEs and their design implications. Future studies of VE manipulation should further investigate the design aspects of the techniques and their influence on user performance; assess usability of the techniques in other conditions of manipulation tasks; evaluate combinations of manipulation techniques with navigation techniques; and explore possible ways to integrate various techniques into seamless and intuitive interaction dialogues.
This research was partially sponsored by the Air Force Office of Scientific Research (contract #92-NL-225) and a grant from the HIT Lab Virtual Worlds Consortium. The authors want to especially thank Edward Miller for his comments and suggestions as well as his help with the VRMAT development. We would also like to thank Sisinio Baldis, Jennifer Feyma, Prof. Masahito Hirakawa, Jerry Prothero, Atsuo Yoshitaka, and all subjects who participated in the experiments.
Göbel, M., Industrial applications of VEs. IEEE Computer Graphics & Applications, 1996. 16(1): pp. 10-13.
Mine, M., Virtual environment interaction techniques. UNC Chapel Hill CS Dept.: TR95-018. 1995.
Herndon, K., A. van Dam, M. Gleicher, The challenges of 3D interaction: a CHI'94 workshop. SIGCHI Bulletin, 1994. 26(4): pp. 36-43.
Durlach, N., A. Mavor, eds. Virtual reality: scientific and technological challenges. 1995, National Academy Press: WA. pp. 542.
Stanney, K., Realizing the full potential of virtual reality: human factors issues that could stand in the way. Proceedings of VRAIS'95. 1995. ACM. pp. 28-34.
Brooks, F., Grasping reality through illusion - interactive graphics serving science. Proceedings of CHI'88. 1988. ACM. pp. 1-11.
Poupyrev, I., M. Billinghurst, S. Weghorst, T. Ichikawa, Go-Go Interaction Technique: Non-Linear Mapping for Direct Manipulation in VR. Proceedings of UIST'96. 1996. ACM. pp. 79-80.
Zhai, S., P. Milgram, Human performance evaluation of manipulation schemes in virtual environments. Proceedings of VRAIS'93. 1993. IEEE. pp. 155-61.
Watson, B., V. Spaulding, N. Walker, W. Ribarsky, Evaluation of the effects of frame time variation on VR task performance. Proceedings of VRAIS'96. 1996. IEEE. pp. 38-52.
Foley, D., V. Wallace, V. Chan, The human factors of computer graphics interaction techniques. IEEE Computer Graphics & Applications, 1984(4): pp. 13-48.
Bowman, D., L. Hodges, An evaluation of techniques for grabbing and manipulating remote objects in immersive virtual environments. Proceedings of Symposium on Interactive 3D Graphics. 1997. ACM. pp. 35-38.
Ware, C., Using hand for virtual object placement. Visual Comp., 1990. 5(6): pp. 245-253.
Spain, E., K. Holzhauzen, Stereoscopic versus orthogonal view displays for performance of a remote manipulation task. Proceedings of Stereoscopic Displays and Applications II. 1991. SPIE. pp. 103-110.
MacKenzie, I., C. Ware, Lag as a determinant of human performance on interactive systems. Proceedings of INTERCHI'93. 1993. ACM. pp. 488-493.
Liang, J., JDCAD: A Highly Interactive 3D Modeling System. Computers and Graphics, 1994. 18(4): pp. 499-506.
Pierce, J., A. Forsberg, M. Conway, S. Hong, R. Zeleznik, M. Mine, Image plane interaction techniques in 3D immersive environments. Proceedings of Symposium on Interactive 3D Graphics. 1997. ACM.
Forsberg, A., K. Herndon, R. Zeleznik, Aperture based selection for immersive virtual environment. Proceedings of UIST'96. 1996. ACM. pp. 95-96.
Hinckley, K., R. Pausch, J. Goble, N. Kassell, A survey of design issues in spatial input. Proceedings of UIST'94. 1994. ACM. pp. 213-22.
Hinckley, K., J. Tullio, R. Pausch, D. Proffitt, N. Kassel, Usability analysis of 3D rotation techniques. Proceedings of ACM UIST'97. 1997.
Ware, C., D.R. Jessome, Using the bat: a six-dimensional mouse for object placement. IEEE Computer Graphics & Applications, 1988. 8(6): pp. 65-70.
Stoakley, R., M. Conway, R. Pausch, Virtual reality on a WIM: interactive worlds in miniature. Proceedings of CHI'95. 1995. pp. 265-272.
Erickson, T., Working with interface metaphors. In Readings in human-computer interaction: toward the year 2000, R. Baecker, J. Grudin, and W. Buxton, Editors. 1995, Morgan Kaufmann Publishers, Inc.: San Francisco, CA. pp. 147-151.
Wickens, C.D., P. Baker, Cognitive Issues in Virtual Reality. In Virtual Environments and Advanced Interface Design, T.A. Furness and W. Barfield, Editors. 1995, Oxford University Press: New York, NY. pp. 514-542.
Jacoby, R., M. Ferneau, J. Humphries, Gestural Interaction in a Virtual Environment. Proceedings of Stereoscopic Display and Virtual Reality Systems: The Engineering Reality of Virtual Reality. 1994. SPIE. pp. 355-364.
Poupyrev, I., S. Weghorst, M. Billinghurst, T. Ichikawa, A framework and testbed for studying manipulation technique for immersive VR. Proceedings of VRST'97. 1997. ACM. pp. 21-28.
Kennedy, K., Reach capability of the USAF population: Phase 1. The outer boundaries of grasping-reach envelopes for the short-sleeved, seated operator. USAF, AMRL: TDR 64-56. 1964.