An Exploration of Virtual Auditory Shape Perception


1. Impetus

1.1. Synthetic Environments & the Reality Superset

One of the primary objectives of virtual environment technology is to maximize the information bandwidth from the environment to the brain. We often speak of the elusive phenomenon of presence as the ideal: if we can interact with a virtual environment in the same way that we do with the natural world, we will have established a seamless and intuitive I/O channel the width of the entire sensory-motor system.

Our sensory systems evolved to process the types of stimuli important to the survival of our species. Evolution is the process of accidental engineering. There is no reason to expect evolved systems to be especially efficient or concisely suited to the environment. More likely the end result is an intricate contrivance of overkill and epicycles. With virtual environments we have the opportunity to optimize reality to suit our perceptual systems. Thus ultimate realism is not necessarily the ultimate state for virtual reality. Although realism is a particularly useful and familiar set of constraints, virtual environments need not be so limited.

1.2. Sensory Transcendence and the Impetus for Auditory Shapes

Our eyes are completely blind to the rear half of the world. Our ears have no such limitation. Because of this vulnerability of vision, it has been said that the purpose of the ears is to direct the eyes. If we hear that something is behind us, we know that we can find out what it looks like by turning around, and thereafter we may be able to discover what it is. I suggest that in a virtual environment it is possible to control the audible properties of objects in such a way that our ears will allow us to perceive not only that an object exists, but something of what it looks like as well.

Even in our normal, physical environment, sound sources are rarely point-sources. Mechanical sounds, like the cough of a furnace igniting, the screech of deforming metal, the hum of a turbine coming up to speed, and the thrumming of the wires with the approach of an electric train, all have an "extent" to them. More natural sounds, such as the sounds of wind, rain, ocean waves, and the creaking of trees under the weight of snow, also have size, and in some sense shape as well. Only rarely do we ascribe much importance to the size and shape of a sound. Other than evaluating the threat of the large and dangerous beast that pursues us through the darkness, we generally rely on our eyes to perceive size.

In audition the sensory mechanism consists of two unidimensional arrays of unidimensional detectors: the basilar membranes of each ear. These mechanisms supply a rather weak coupling between our internal three-dimensional model and the external three-dimensional[1] environment. Many of the details must be inferred and restored during post-processing. With careful control of the input stimuli, and special attention to specific perceptual pathways, we might be able to synthesize unusual perceptions.

Unintentionally, we do this all the time. For example, a stereo system equipped with headphones is an extremely artificial way of perceiving sound. In a natural environment, all the sounds would have source locations (although these locations may at times be ambiguous, or difficult to decipher). With the headphones we effectively detach the sounds from their sources. If you set your amplifier to "mono" something truly extraordinary occurs: the sound appears to be coming from the inside of your head! How many have experienced this as a naturally occurring phenomenon and lived to tell the tale? This odd illusion is not especially useful (in fact audio engineers go to great pains in order to avoid it). If, however, we can hear voices inside of our heads, there may be other more useful perceptions available elsewhere. Where do we look?

If we take sight as the epitome of spatial perception, we might ask the question: is our internal spatial representation for audition the same perceptual space we access with our visual systems? Auerbach and Sperling [1974] addressed this question by postulating separate perceptual spaces and then looking for evidence of a transfer process when auditory and visual spatial information are compared. They performed an experiment which involved locating auditory and visual objects and determining if they were in the same place. They carefully designed their experiment such that they could separate all sources of error. They found that there was no variance in the data that would be characteristic of any errors in transferring information from one perceptual space to another. Their conclusion was that our internal auditory and visual spatial maps must be in the same place. From a biological standpoint, proximal regions in the superior colliculus have been determined to be associated with auditory and visual mental maps [Sparks, 1988]. Although this may not be conclusive, it is, I believe, sufficient motivation to look for traditionally non-auditory, "vision-like" spatial perceptions in the auditory domain.