How real is virtual reality? The answer may come from measuring how participants remember the experience. Virtual Reality Monitoring is the decision process by which people distinguish between real, virtual, and imagined events, as represented in memory. It offers an intrinsic method that addresses the quality of the virtual experience in the context of the technology's human goal: convincing the participant that the experience is real. The success of this endeavor can be discerned by whether a memory of a virtual event is attributed to a real event. Memory becomes the mechanism by which the quality of the experience, and the technology that contributes to that experience, are evaluated.
How can the quality of the virtual experience be characterized? The degree of sensory realism would be one method. Some taxonomies have defined this realism extrinsically, based on a description of how the technology generates the experience. For instance, in describing technologically-mediated experience, Robinett (1992) chose to describe the types of input devices. There are several Display types (e.g., helmet-mounted displays, headphones, force-feedback devices) that supply input to the physical channels which humans can sense (e.g., vision, audition, and touch). Phenomena that cannot be detected directly may be captured and perceived with the addition of Sensor types (e.g., UV detector for ultraviolet light, gas chromatograph for chemical composition). Additionally, other devices can capture output from human motor channels (e.g., head trackers for head movement) so that realistic actions may be performed within the synthetic environment (e.g., changing the point-of-view). Presumably, more devices make for a better experience. Sheridan (1992) makes this assumption explicit: the amount of sensory input may be important in achieving the sense of presence in virtual reality. Likewise, Zeltzer (1992) believes that presence depends on the number and fidelity of available input and output channels. Moreover, the quality of the virtual experience may depend on the ability of the computational model to act and react to simulated events and stimuli. In summary, in these taxonomies, a description of the quality of the virtual experience is embedded in a description of the components of the technology and the quality of the physical interface that they generate.
However, a description of the technology may be insufficient to describe the subjective experience of a virtual reality. There is not necessarily a one-to-one isomorphism between the physical interface and the experience. Physical dimensions, such as time and space, can be distorted in their internal representation (Robinett, 1992). They can be aligned, displaced, scaled differently, or otherwise transformed in ways that distort the actual dimensions. Moreover, humans have the ability to interpret incoming physical signals. From an information-processing perspective, it is commonly held that people may recognize patterns of stimuli, at least in part, due to their expectations (Solso, 1991; Finke, 1989).1 These expectations are task dependent (Sheridan, 1992; Zeltzer, 1992) and can lead to two interpretations of the same physical signals. For instance, the Necker cube (Figure 1.1) can be perceived as showing either the top or bottom face of the cube.
Although the technology and the physical signals it generates are necessary for recreating reality, is the realism sought actually achieved? How does the person actually experience the interface created with this technology? Since experience is not isomorphic with reality, these signals could have been interpreted in a number of ways. The only way to determine whether the goal of realism is achieved is to acknowledge and consider the human factor.
Figure 1.1: The Necker cube.
What is reality? This question has challenged philosophers for centuries, and its answer is perhaps the cornerstone for judging the quality of the virtual interface. Loomis (1992) distinguishes between the physical and phenomenal worlds. Contact with the physical world is mediate. The result of this mediation is the phenomenal world of which we are perceptually aware; it is a construction of our senses. We are aware of the physical world indirectly through inference and reasoning based on percepts.
With this perspective, the original goal of sensory realism becomes impractical. The goal of virtual reality is not to recreate reality, but to convince someone they are in a reality. The human factor is how someone judges the intrinsic qualities of the virtual interface. The participant must be convinced that the phenomenal world from this experience represents their expectations of the real world. In effect, reality becomes a property of the phenomenological experience.
Several models are available which describe some of these expectations and/or how people abstract their own realities.
The concept of presence has often been associated with the sense of spatial reality desired from the technology (Sheridan, 1992; Zeltzer, 1992; Heeter, 1992; Slater & Usoh, 1993; Hendrix, 1994). Presence can be defined as "the state of being in one place and not elsewhere" (Webster's Third New International Dictionary, 1966). With respect to virtual reality, the technology should convince the participant that they are somewhere other than where they actually are. As Heeter (1992) states:
". . . the yardstick to measure presence is applied not to assessing how closely a virtual world mimics real world sensations, but instead to analyzing the kinds of evidence a virtual experience provides to participants that help convince them they are there. Sensory realism is subsumed within this perspective, as one of the means that contributes to the experience of presence."
Presence is also often used synonymously with the term immersion. The concept of immersion embellishes the description of presence by specifying that the sense of being should include the sense of being completely surrounded by a three-dimensional space of which the participant is a part (Barfield, Zeltzer, Sheridan, & Slater, 1995; Kalawsky, 1993; Wells, 1992; Lavroff, 1992). Moreover, the experience should be interpreted as an egocentric point-of-view (Wickens, 1992; Slater & Usoh, 1993).
Similar to the presence model, Ellis (1991) suggests that an environment is the proper metaphor for interpreting the sensory experience produced by the technology's head-coupled, stereoscopic displays; it is the extension of the two-dimensional desktop metaphor used for the current two-dimensional, static interfaces. A participant should abstract a three-dimensional representation of an environment surrounding them from these surrogate percepts. It should be similar to one that would be obtained from viewing the natural world.
Ellis proposes that the degree of similarity is determined not by the percepts but by the abstraction process. We are "predisposed to process incoming information in ways that normally result in a correct interpretation of the external environment . . ." We construct our environment and our sense of reality from the point of view of the "self" using the content, geometry, and dynamics of the information that is presented. If any of this information is incomplete, our a priori knowledge about the structure of these elements fills in the gaps. The success of this illusion in virtual reality "depends on the extent to which all of these constructive processes are triggered." Virtualization is the result of this success.
"Virtualization may be defined as the process by which a human viewer interprets a patterned sensory impression to be an extended object in an environment other than that in which it physically exists. A classical example would be that of a virtual image as defined in geometrical optics. A viewer of such an image sees the rays emanating from it as if they originated from a point that could be computed by the basic lens law rather than from their actual location. Virtualization, however, extends beyond the objects to the spaces in which they themselves may move."
Loomis (1992) similarly believes that our representation of reality is determined by a constructive process. Reality (i.e., presence) is an abstraction that is inherently intrinsic and can be understood in the context of the process of distal attribution. Distal attribution (or externalization) is a phenomenon where our perceptual experience, though originating with stimulation of our sense organs, is referred to external space beyond the limits of the sensory organs. Through this process, we identify the contents of the phenomenal world with percepts that originate either externally or internally to ourselves (i.e., nonself vs. self). The mechanism is modeled in the representation of the linkage between efference (outgoing motor commands) and afference (incoming sensory signals). According to Loomis, the contents of perception are attributed distally, to the phenomenologically external, when efferent and afferent signals are lawfully related, or match. Identification is facilitated through interaction, when multiple matches can be made over the course of time.
Conversely, failures to externalize information, which is consequently attributed to the self, occur when efference and afference do not match or are completely independent of each other. As Loomis describes, there are several phenomena where percepts originating from the external world are actually internalized. For example, positive visual afterimages originate due to the hyperpolarization of the receptors on the retina (Levine & Shefner, 1991). They stay fixed in the visual field rather than shifting as a stable external object would with movements of the eyes and/or head, even though the original stimulus is no longer in the same position of the visual field. Consequently, they appear as ghosts and tend to have a subjective quality.
Another example is the intracranial localization of binaural sounds, a failure to localize sounds emanating from the outside world. People normally expect interaural intensity and timing differences (Levine & Shefner, 1991) in order to localize sounds in the environment. Head movements facilitate this localization by providing multiple cues for the same object over time. These sounds are perceived as emanating from within the head when the phenomenological cues are insufficient to match the cues that are normally expected. Devices such as stereo headphones deprive the participant of these cues and inhibit distal attribution.
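The role of head movement in this example can be illustrated with a small numerical sketch. It is not taken from the sources cited above: it assumes Woodworth's spherical-head approximation for the interaural time difference (ITD), an assumed average head radius, and hypothetical function names. It shows that the timing cue from an external source changes lawfully when the head turns, whereas a signal delivered over headphones stays fixed relative to the head.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius (illustrative value)
SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees Celsius

def woodworth_itd(azimuth_deg):
    """Approximate ITD in seconds for a distant source.

    Uses Woodworth's spherical-head formula, ITD = (a / c) * (theta + sin(theta)),
    for azimuths between -90 and 90 degrees (0 = straight ahead).
    """
    theta = math.radians(azimuth_deg)
    sign = math.copysign(1.0, theta)
    theta = abs(theta)
    return sign * (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

def itd_after_head_turn(source_azimuth_deg, head_turn_deg, headphones=False):
    """ITD after the listener turns the head by head_turn_deg (hypothetical helper)."""
    if headphones:
        # The signal is fixed to the head, so the cue ignores the head turn.
        return woodworth_itd(source_azimuth_deg)
    # An external source shifts relative to the head by the amount of the turn.
    return woodworth_itd(source_azimuth_deg - head_turn_deg)

if __name__ == "__main__":
    for turn in (0, 15, 30):
        free = itd_after_head_turn(30, turn)
        phones = itd_after_head_turn(30, turn, headphones=True)
        print(f"turn={turn:>2} deg  external={free * 1e6:7.1f} us  headphones={phones * 1e6:7.1f} us")
```

Under these assumptions, turning to face the external source drives its ITD to zero, while the headphone cue never changes; that missing covariation is the kind of cue whose absence the paragraph above associates with intracranial localization.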
In order to achieve a convincingly "real" experience in a virtual reality, participants should externalize the percepts that the technology generates and attribute them to distal sources. If the participant's expectations are not met, then the abstraction could have a subjective quality, perhaps similar to the failures described above. This judgment may lead them to attribute the experience to an internal origin instead.
The attribution of reality has also been addressed in the literature from cognitive psychology. Reality Monitoring (Johnson & Raye, 1981) explains how memories for perceived events (i.e., external sources) are discriminated from memories for thoughts and imaginations (i.e., internal sources). Thus, reality may also be determined based on one's memory of events. This occurs frequently in our everyday lives, for instance:
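The matching of efference and afference that underlies distal attribution can be sketched as a toy computation. This is an illustration only, not Loomis's model: it assumes that a "lawful relation" can be approximated by a strong linear correlation between motor commands and the resulting sensory changes, and the function names, threshold, and sample values are all hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def attribute(efference, afference, threshold=0.9):
    """Return 'external' when the signals covary strongly, otherwise 'self'.

    The threshold is an arbitrary illustrative value, not an empirical one.
    """
    r = pearson(efference, afference)
    return "external" if abs(r) >= threshold else "self"

if __name__ == "__main__":
    head_turns = [5, -3, 8, -6, 2]        # hypothetical motor commands (degrees)
    scene_shift = [-5, 3, -8, 6, -2]      # stable scene shifts opposite to each turn
    afterimage_shift = [0, 0, 0, 0, 0]    # afterimage does not shift in the visual field
    print(attribute(head_turns, scene_shift))       # external
    print(attribute(head_turns, afterimage_shift))  # self
```

Accumulating several command and response pairs before deciding mirrors the point that identification is facilitated when multiple matches can be made over time.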