An Exploration of Virtual Auditory Shape Perception


10. Virtual Auditory Field Experiment

One of my main concerns with the mode of presentation in the previous two experiments is its time dependence. Some of the patterns required over 1.5 seconds to be displayed. This means that the listener may have to remember the twists and turns of a pattern, and then think about what shape it might be. Part of the point of presenting shape via audition is to give people less to think about. A time-independent auditory shape display might allow the use of auditory space as extended memory instead of putting an extra burden on short-term memory.

The virtual auditory field display is a time-independent approach to auditory shape presentation. Lakatos [1993a] and I have taken different interpretations of the work of Perrott [1984a]. Lakatos was of the opinion that "...directional hearing has been shown to be ineffective in the discrimination of several simultaneous sound sources" [Lakatos, 1993a, p. 364]. He cites Perrott's study as evidence. However, I believe that Perrott's study shows how to enable concurrent spatial sensitivity in audition. Perrott found that when the sources are offset by a frequency difference of 43 Hz, the concurrent minimum audible angle was as small as 5°. This may be considerably worse than the <1° for a single source, but would still be quite usable.

Another advantage of this kind of simultaneous presentation is that it should produce the kind of extended images observed by von Békésy [1960] and Perrott [1984b]. This might "fill out" the images presented, giving them more solidity. Unfortunately, to my knowledge, no studies of extended images with more than two sources have been published.

10.1. Apparatus

The same apparatus was employed in this experiment as in the previous three experiments.

10.2. Subjects

The same seven subjects were used in this experiment as in the previous three experiments.

10.3. Stimulus Combinations

In each trial of this experiment, four 12-partial harmonic complexes were presented with onsets staggered by less than 30 milliseconds. The sustained duration was approximately 1 second. The four harmonic complexes had fundamental frequencies of 1000 Hz, 1050 Hz, 1100 Hz, and 1150 Hz.
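As a rough illustration of the stimulus described above, the following Python sketch builds four 12-partial harmonic complexes with staggered onsets. The sample rate, the equal partial amplitudes, the 5 ms stagger step, and the function names are all assumptions made for illustration; the text gives only the number of partials, the fundamentals, the approximate duration, and the 30 ms bound on the stagger. Spatial placement of the sources is not modelled.

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed, not stated in the text


def harmonic_complex(f0, duration, n_partials=12, sample_rate=SAMPLE_RATE):
    """Sum the first n_partials harmonics of f0 (equal amplitudes assumed)."""
    n_samples = int(duration * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        total = sum(math.sin(2 * math.pi * f0 * k * t)
                    for k in range(1, n_partials + 1))
        samples.append(total / n_partials)  # keep every sample within [-1, 1]
    return samples


def field_stimulus(fundamentals=(1000, 1050, 1100, 1150), duration=1.0,
                   stagger=0.005, sample_rate=SAMPLE_RATE):
    """Return one sample list per source, each delayed by a multiple of stagger.

    With four sources and a 5 ms step (hypothetical value), the last onset
    is 15 ms after the first, inside the 30 ms bound given in the text.
    """
    n_samples = int(duration * sample_rate)
    offsets = [int(i * stagger * sample_rate) for i in range(len(fundamentals))]
    total_len = n_samples + max(offsets)
    channels = []
    for f0, offset in zip(fundamentals, offsets):
        tone = harmonic_complex(f0, duration, sample_rate=sample_rate)
        padding_after = total_len - offset - len(tone)
        channels.append([0.0] * offset + tone + [0.0] * padding_after)
    return channels
```

Calling `field_stimulus()` returns four equal-length sample lists, one per source. The highest partial is 12 x 1150 Hz = 13800 Hz, which stays below the Nyquist limit at the assumed sample rate.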

Figure 10.1

Auditory Field Shape

10.4. Procedure

This experiment was much like the previous one. There were 66 trials for each subject: six practice trials, followed by 60 trials in which each shape occurred ten times in random order. For the first 36 trials (including the six practice trials) the subject was given feedback. If the correct picture was selected, the button turned green. If the subject made an incorrect choice, the selected button turned red and the correct button turned green. The Toolbook interface is shown in figure 10.2.

Figure 10.2

Auditory Field Experiment Interface
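The trial structure described in the procedure can be sketched as follows. This is a minimal Python illustration, not the original Toolbook logic. The function name and the dictionary layout are hypothetical, and the choice of one practice trial per shape is an assumption, since the text does not say which shapes were used for practice.

```python
import random

N_SHAPES = 6          # six response alternatives (chance = 1/6)
N_PRACTICE = 6        # practice trials at the start of the session
REPEATS = 10          # each shape occurs ten times in the 60 main trials
N_FEEDBACK = 36       # feedback is given on the first 36 trials


def build_schedule(rng):
    """Return a list of 66 trial descriptions for one subject."""
    # Assumption: the practice block presents each shape once.
    practice = list(range(1, N_SHAPES + 1))
    rng.shuffle(practice)

    # Main block: every shape ten times, in random order.
    main = [shape for shape in range(1, N_SHAPES + 1) for _ in range(REPEATS)]
    rng.shuffle(main)

    schedule = []
    for index, shape in enumerate(practice + main):
        schedule.append({
            "trial": index + 1,
            "shape": shape,
            "practice": index < N_PRACTICE,
            "feedback": index < N_FEEDBACK,
        })
    return schedule
```

Passing a seeded generator, for example `build_schedule(random.Random(1))`, makes a subject's sequence reproducible.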

10.5. Results & Discussion

Overall performance on this experiment was slightly better than random: 21.9% correct where random performance would be 16.7% correct. A t-test comparing the performance on this experiment to random performance showed that, in spite of the slim margin, the difference is significant (T = 2.592, DF = 419, P = 0.010). An analysis of variance revealed significant main effects for subjects and shapes, and a significant interaction between subjects and shapes (see table 10.1).
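The comparison with random performance can be sketched as a one-sample t-test on per-trial scores of 1 (correct) or 0 (incorrect), tested against a chance level of 1/6. The raw counts are not stated in the text. The figure of 92 correct out of 420 trials used in the usage note is an assumption, chosen because it reproduces the reported T = 2.592 with DF = 419. The function name is hypothetical.

```python
import math


def t_vs_chance(n_correct, n_trials, chance=1 / 6):
    """One-sample t statistic for 0/1 trial scores against a chance level.

    Returns (t, degrees_of_freedom).
    """
    p = n_correct / n_trials
    # Unbiased sample variance of a 0/1 variable with mean p.
    variance = p * (1 - p) * n_trials / (n_trials - 1)
    standard_error = math.sqrt(variance / n_trials)
    return (p - chance) / standard_error, n_trials - 1
```

Under the assumed count, `t_vs_chance(92, 420)` gives a t of about 2.59 on 419 degrees of freedom.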

Table 10.1

Analysis of Variance of Auditory Field Experiment

Source               Sum-Of-Squares       DF     Mean-Square      F-Ratio  P       
Subject              2.948                6      0.491            3.180    0.005   
Shape                2.390                5      0.478            3.095    0.009   
Subject x Shape      8.110                30     0.270            1.750    0.010   
Error                58.400               378    0.154                             
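The mean squares and F-ratios in table 10.1 follow from the sums of squares and degrees of freedom by the usual arithmetic: each mean square is SS / DF, and each F-ratio is the effect's mean square divided by the error mean square. The short Python check below uses only values copied from the table. The function name is hypothetical, and differences in the third decimal place are expected because the tabled sums of squares are rounded.

```python
def f_ratios(effects, error_ss, error_df):
    """Return {effect: (mean_square, f_ratio)} from (SS, DF) pairs."""
    error_ms = error_ss / error_df
    result = {}
    for name, (ss, df) in effects.items():
        mean_square = ss / df
        result[name] = (mean_square, mean_square / error_ms)
    return result


# Values copied from table 10.1.
table_10_1 = {
    "Subject": (2.948, 6),
    "Shape": (2.390, 5),
    "Subject x Shape": (8.110, 30),
}
ratios = f_ratios(table_10_1, error_ss=58.400, error_df=378)
```

Here `ratios["Subject"]` holds the mean square and F-ratio for the subject effect, which can be compared with the 0.491 and 3.180 listed in the table.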

However, the performance of one of the subjects (AH) was twice as good as any of the others (see figure 10.3). If this subject is treated as an anomaly (a freak of nature, an abomination on the face of the earth), then the overall average performance is only 18.9% correct, which is not significantly better than random (T = 1.076, DF = 359, P = 0.283). Without this subject, there are no longer significant effects due to subjects or shapes, but there remains a significant interaction between subjects and shapes (see table 10.2).

Table 10.2

Analysis of Variance of Auditory Field Experiment Without Subject AH

Source               Sum-Of-Squares       DF     Mean-Square      F-Ratio  P       
Subject              0.656                5      0.131            0.908    0.476   
Shape                1.056                5      0.211            1.462    0.202   
Subject x Shape      6.644                25     0.266            1.840    0.009   
Error                46.800               324    0.144                             

Figure 10.3

Identification Performance on Auditory Field Experiment

To show the subject-by-shape interaction more clearly, I have collected the t-test significance levels in a matrix (table 10.3).

Table 10.3

T-Test of Better than Random Recognition for Auditory Field Display

 SHAPE    AH       CD       KH       KS       RJ       RM       TM       All but AH
   1      0.203    -        0.038    0.093    *        *        -        0.742
   2      0.038    -        -        0.013    -        0.203    0.203    0.525
   3      -        0.404    -        -        0.404    *        0.203    -
   4      0.001    0.404    0.404    0.203    0.038    0.404    0.093    0.029
   5      0.2027   -        0.2027   -        -        0.404    *        -
   6      0.093    -        0.404    *        0.203    0.203    -        1
  All     0.001    -        0.231    0.145    1        0.525    -        0.283

Values are significance probabilities

Dashes indicate (not significantly) worse than random performance

Asterisks indicate 0 correct identifications
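The text does not say how the individual cell values in table 10.3 were computed. One reading that appears to reproduce entries such as 0.038 and 0.404 is a one-tailed, one-sample t-test against chance (1/6) on the ten trials in each subject-by-shape cell. The Python sketch below implements that assumed reading with the standard library only, integrating the Student t density numerically. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on the interval [0, |t|]."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


def cell_p(n_correct, n_trials=10, chance=1 / 6):
    """One-tailed p value for n_correct out of n_trials against chance.

    Returns None when every score is identical (zero variance), which is
    the case for the cells marked with an asterisk (0 correct).
    """
    if n_correct in (0, n_trials):
        return None
    p = n_correct / n_trials
    standard_error = math.sqrt(p * (1 - p) / (n_trials - 1))
    return upper_tail((p - chance) / standard_error, n_trials - 1)
```

Under this reading, five correct out of ten corresponds to a tabled value of 0.038 and two correct out of ten to 0.404. A cell with one correct yields a p value above 0.5, matching the dashes used for worse than random performance.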

The only shape that was identified with significantly better than chance accuracy was shape 4. This shape is distinguished from the other shapes in two main respects. The first is that it is a closed loop, with two pixels overlapping. The second, and probably more important, distinction is that it is smaller than the other shapes. Two of the subjects mentioned that they noticed this difference.

A cluster analysis did not produce any useful groupings. The confusion matrix is shown in figure 10.4.

Figure 10.4

[Image not available: confusion matrix, with the axis label "Auditory Field Shape Presented".]