An Exploration of Virtual Auditory Shape Perception

11. Conclusions

Hearing is a more fluid sense than vision. This has several implications. One is that the interpretation of any given sound will vary from listener to listener. Thus attempts to produce "new" sensations may have unpredictable results. However, this may also mean that there is enough latitude in the perception of sounds to allow for many potentially important unexplored regions in the space of auditory events.

In my first experiment, I learned that there are strong cognitive factors in hearing shapes. Without preconceptions of what to listen for, subjects' drawings of sound shapes exhibited enormous variety. The only themes that were common across subjects applied to the nature of the stimuli, and the nature of objects that might produce such sounds. White noise evoked images of semi-amorphous objects such as air, steam, and water. The irregular aspect of the sound shapes with interpolation removed was also a common theme in the drawings.

In this exploration, I have established that synthetic spatial cues allow subjects to successfully perform a spatial pattern recognition task with sound. In this way, I have validated the use of virtual sound in further study of auditory shape perception. This is quite fortunate, as much research will likely be necessary before useful auditory shapes will be possible.

The dependence on the number of pixels in the pattern lends credence to my fear that the time-multiplexed shape recognition is a largely cognitive process: it is a "perceptualization" and not a true perception. This is not helpful if the intent in conveying shape information via auditory channels is to reduce the overall cognitive load on people in a complex environment. One investigation which is strongly indicated is a measure of the cognitive effort required to perform the shape recognition task.

My experiments relied heavily on the phenomenon of auditory apparent motion, which is an incomplete representation of sound motion (e.g. it does not include Doppler cues). Further information on the effects of apparent continuity on shape recognition would be quite useful. Similarly, data on the effect of pattern duration and speed would also be useful. Further investigations will be greatly facilitated by the versatility offered by virtual spatialization. The way is also open to examining continuous motion and exploring additional movement cues. For example, the Convolvotron(TM) allows for the possibility of including Doppler shifts.
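As a rough illustration of the Doppler cue mentioned above, the following sketch computes the frequency heard by a stationary listener from a moving source, using the standard textbook relation. The function name, the speed-of-sound constant, and the example values are assumptions for illustration only; this is not a description of how the Convolvotron(TM) implements Doppler shifts.

```python
# Minimal sketch of the classical Doppler shift for a moving source and a
# stationary listener. All names and constants here are illustrative
# assumptions, not taken from the experiments described in this chapter.

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C


def doppler_frequency(source_hz, radial_velocity):
    """Return the frequency (Hz) heard by a stationary listener.

    radial_velocity is the source speed along the line to the listener,
    in m/s: positive when approaching, negative when receding.
    """
    if abs(radial_velocity) >= SPEED_OF_SOUND:
        raise ValueError("source speed must be below the speed of sound")
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - radial_velocity)


if __name__ == "__main__":
    for v in (-10.0, 0.0, 10.0):
        print(f"v = {v:+5.1f} m/s -> {doppler_frequency(1000.0, v):7.1f} Hz")
```

An approaching source is heard above its emitted frequency and a receding one below it, which is the additional movement cue that apparent motion alone does not supply.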

Although the auditory system has difficulty with the spatial processing of simultaneous events (as exemplified by the precedence effect), there are some indications that these limitations are not insurmountable. If the concurrent sources are differentiated in some way, then they can retain some of their spatial properties [Perrott, et al., 1984, 1989] without incurring further cognitive loading.

One possible avenue to the solution of this problem is to push the speed envelope in shape tracing. The Convolvotron(TM) has the fundamental limitation of a 50 Hz update rate (which means a maximum of 20 pixels/second). If one is willing to give up real-time presentation, then faster rates are easily obtainable. It is not clear, however, how useful faster sequences would be, as this brings us right back into the realm of the precedence effect. A comforting fact is that Perrott, Marlborough, & Merril have shown that some concurrent spatial information is retained even in conditions in which the precedence effect should fully apply.
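To make the speed limitation concrete, the sketch below computes the shortest time needed to trace a pattern at a given pixel rate, taking the 20 pixels/second ceiling quoted above as the default. The function name and the example pattern sizes are hypothetical and chosen only for illustration.

```python
# Minimal sketch: shortest presentation time for a time-multiplexed pattern.
# The default rate is the real-time ceiling quoted in the text; the function
# name and the pattern sizes below are illustrative assumptions.

MAX_PIXEL_RATE = 20.0  # pixels per second


def min_trace_duration(n_pixels, pixel_rate=MAX_PIXEL_RATE):
    """Return the minimum time in seconds to present n_pixels one at a time."""
    if n_pixels < 0:
        raise ValueError("n_pixels must be non-negative")
    if pixel_rate <= 0:
        raise ValueError("pixel_rate must be positive")
    return n_pixels / pixel_rate


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(f"{n:2d} pixels -> at least {min_trace_duration(n):.2f} s")
```

Because duration grows linearly with the number of pixels, larger patterns take longer to present at the real-time ceiling, which is why faster, non-real-time presentation is tempting despite the precedence effect.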

I have made a first attempt at avoiding the problem of time dependence by distributing my stimuli over the pitch continuum instead of time. Although this attempt was not successful, I remain optimistic that the system could be improved, perhaps with more carefully engineered stimuli. Perhaps simultaneous temporal and frequency differences will yield better performance.

As evidenced by the difficulty that the subjects experienced in the screening experiment, and the extremely high frequency of vertical confusions, the harmonic complex appeared to be sub-optimal for localization. This stimulus may also have contributed to the poor performance on the ill-fated concurrent acuity experiment. It may be that the spectrally sparse harmonic series does not contain enough frequency information to be unambiguously localized.

Another question that remains untested after this exploration is whether spatial properties of concurrent sources are preserved when the stimuli are not pure sinusoids. My preliminary investigation of extensity addressed this idea, but failed to produce conclusive data. In a related line of reasoning, it would be useful to know if other perceptual continua are also effective in the preservation of concurrent spatial attributes. If frequency differentiation can cause auditory extensity, what about timbral differences? If extensity can occur with two sources, how about three or four? If so, what is the effect on acuity?

With the use of virtual sound systems we can perform systematic psychophysical experiments with novel types of stimuli. An intriguing idea finds its source in the research of Waugh, et al., 1979. They showed that humans are as sensitive to velocity through audition as through vision. Precise auditory velocity judgments are possible over time intervals as small as 30 ms [Perrott, et al., 1979]. To my knowledge there is no research on judgments of concurrent sound-source velocities. If it is determined that humans have facility with such a task, then it may be that we could perceive pictures drawn in auditory phase space. Perhaps velocity, and not position, is the appropriate primitive for extending auditory experience.

My experiments, as the first to examine the perception of virtual auditory shape, were required to validate the use of synthetic spatial cues in such a process. As such they needed to be somewhat grounded in the "rear-view mirror" of previously explored techniques. Now that this has been accomplished, the way is open for more innovative procedures. I have barely made a start at exploiting the flexibility of virtual sound to generate a wide range of spatial auditory patterns, patterns that would be nearly impossible to create with mechanical apparatus.