Virtual reality's immersive quality can be enhanced greatly through the use of properly cued, realistic sounds. A common method of obtaining realistic sounds is to record or sample a real sound for later playback on demand under computer control. Sound storage formats and playback are covered under separate topics, and sound sampling is described below.
An analog-to-digital (A/D) converter, fed by a microphone and an optional sample-and-hold (S/H) circuit on its input, is used to convert audio into a stream of n-bit digital samples which can be stored in a computer. Other electronic sound sources can be sampled by connecting the audio output of the source to the A/D converter input. [see Figure]
The CPU or control logic periodically triggers the S/H circuit, which holds the input stable long enough for a data sample to be converted by the A/D converter and read by the CPU. Without an S/H circuit, the input to the A/D converter varies during the conversion, and its "digital" outputs will be unstable when read. Some A/D converters have built-in S/H front ends.
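The periodic triggering described above can be illustrated with a minimal Python sketch. It is only a software model, not a description of any particular hardware: the function name sample_and_hold, the callable signal argument, and the 1 kHz test tone are assumptions made for illustration.

```python
import math

def sample_and_hold(signal, f_sample_hz, n_samples):
    """Read signal(t) once per sample period, as a periodic S/H trigger would.

    `signal` is any function of time in seconds (a stand-in for the analog
    input). Each returned value represents the level held stable for one
    conversion.
    """
    period = 1.0 / f_sample_hz
    held = []
    for k in range(n_samples):
        held.append(signal(k * period))  # value frozen at the trigger instant
    return held

# Hypothetical input: a 1 kHz tone sampled at 8 kHz (8 samples per cycle).
tone = lambda t: math.sin(2 * math.pi * 1000 * t)
samples = sample_and_hold(tone, 8000, 8)
print([round(s, 3) for s in samples])
```

Each list entry stands for one held value; in hardware the A/D converter would convert that value while the S/H circuit keeps it constant.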
There are several types of A/D converters. One of the fastest is the flash, or parallel, converter. It is a group of (2^n - 1) (raise 2 to the power of n, and then subtract 1) "voltage comparators" whose outputs go to a priority encoder, which converts the (2^n - 1) comparator outputs into n digital output bits whose binary value is proportional to the "analog" input signal. The analog input goes to one input of every voltage comparator, and the other input of each comparator taps into a series of equal-valued resistors which form a voltage divider with equal voltage increments. [FLOYD84] [see Figure]
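As a rough illustration of the flash converter, the following Python sketch models the comparators and the priority encoder in software. The function name flash_adc, the reference voltage v_ref, and the choice of tap voltages at multiples of v_ref / 2^n are assumptions for this sketch, not details taken from [FLOYD84].

```python
def flash_adc(v_in, v_ref, n_bits):
    """Model an n-bit flash converter: 2**n - 1 comparators and a priority encoder."""
    n_comparators = 2 ** n_bits - 1
    step = v_ref / 2 ** n_bits  # equal voltage increment of the resistor divider
    # Tap k sits k + 1 increments above ground (assumed divider layout).
    taps = [(k + 1) * step for k in range(n_comparators)]
    # Every comparator sees the same analog input and its own tap voltage.
    comparator_outputs = [v_in >= tap for tap in taps]
    # Priority encoder: output the number of the highest comparator that is on.
    code = 0
    for k, is_on in enumerate(comparator_outputs):
        if is_on:
            code = k + 1
    return code

# A 3-bit converter has 7 comparators and produces codes 0 through 7.
for volts in (0.0, 3.5, 8.0):
    print(volts, format(flash_adc(volts, 8.0, 3), "03b"))
```

All comparators evaluate the input at the same time, which is why this converter type is fast; the cost is that the comparator count roughly doubles with each added bit.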
The rate at which sound is sampled is very important, and depending upon the application, quite low rates can be acceptable. Though an audiophile would like a frequency range of at least 20 kHz, most human speech is below 4 kHz, and intelligible speech can be distinguished from a bandwidth as low as 2.2 kHz (300 Hz to 2500 Hz) [ARL91]. The sample rate should be at least twice the highest frequency in the sound being sampled; this minimum is known as the Nyquist rate [ARL91]. Sampling below the Nyquist rate is undersampling, which produces lower-frequency aliases that interfere with the desired sound on playback. [see Figure] To support the full 20 kHz human audio spectrum, sound samplers use a sampling frequency of at least 40 kHz, although most adults cannot hear frequencies as high as 20 kHz. As mentioned earlier, speech is predominantly below 4 kHz, and though sampling at 40 kHz may produce a higher quality sample, a 5 kHz sampling rate is sufficient for intelligible speech (it covers the band up to 2500 Hz) and takes only 1/8th of the storage space that a 40 kHz sample would require.
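The Nyquist rate and the effect of undersampling can be checked with a short Python sketch. The function names are hypothetical, and alias_frequency assumes an ideal sampler and a single pure tone, which is a simplification of real sound.

```python
def nyquist_rate(f_max_hz):
    """Minimum sample rate for a sound whose highest frequency is f_max_hz."""
    return 2 * f_max_hz

def alias_frequency(f_signal_hz, f_sample_hz):
    """Apparent frequency of a pure tone after sampling at f_sample_hz."""
    f = f_signal_hz % f_sample_hz
    return min(f, f_sample_hz - f)

print(nyquist_rate(20_000))          # rate needed for the full audio spectrum
print(alias_frequency(2_000, 5_000)) # below half the sample rate: unchanged
print(alias_frequency(4_000, 5_000)) # undersampled: appears at a lower frequency
print(5_000 / 40_000)                # storage ratio of 5 kHz versus 40 kHz sampling
```

Under these assumptions a 4 kHz tone sampled at 5 kHz plays back as a 1 kHz tone, which is the kind of lower-frequency interference the text describes.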
The number of digital bits per sample coming from the A/D converter determines the granularity or resolution of the sampled sound; more bits per sample allow more accurate samples to be taken of subtle changes in the source sound in addition to increasing the "signal-to-noise ratio" (S/N). Since an A/D converter with n output bits can only represent 2^n discrete amplitudes, the A/D converter's digital output has a "stair-step" characteristic - it is quantized. [see Figure]
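The stair-step behaviour can be sketched in Python as follows. The input range of -1.0 to 1.0 and the helper names quantize and reconstruct are assumptions chosen for illustration; a real converter works on voltages within its own reference range.

```python
def quantize(x, n_bits):
    """Map an amplitude in [-1.0, 1.0) to one of 2**n_bits integer codes."""
    levels = 2 ** n_bits
    code = int((x + 1.0) / 2.0 * levels)
    return max(0, min(levels - 1, code))  # clamp out-of-range inputs

def reconstruct(code, n_bits):
    """Return the amplitude at the middle of the step that `code` represents."""
    levels = 2 ** n_bits
    return (code + 0.5) / levels * 2.0 - 1.0

# Sweep the input range: only 2**4 = 16 distinct output codes ever appear.
sweep = [i / 1000 - 1.0 for i in range(2000)]
codes = {quantize(x, 4) for x in sweep}
print(len(codes))

# The reconstructed value stays within half a step of the input.
worst = max(abs(reconstruct(quantize(x, 4), 4) - x) for x in sweep)
print(worst <= 1 / 16 + 1e-9)
```

Many nearby input amplitudes collapse onto the same code, so changes in the source smaller than one step are lost; adding bits narrows the steps.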
Because analog signals are continuous and digital signals are discrete, digital samples differ from the analog source signal. The difference at any given instant is the quantization error, which introduces noise into the sampled data. The quantization error can be decreased by increasing the number of bits per sample, which increases the digital resolution. For example, 4-bit samples give 16 discrete amplitude levels and an S/N of 24 dB (higher means less noise), whereas 16-bit samples (CD quality) give 65536 discrete amplitudes and an S/N of 96 dB. The disadvantage of increasing the number of bits per sample is that storage requirements increase proportionally and the A/D converter becomes more complex (a flash converter, for instance, needs 2^n - 1 comparators).
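The 24 dB and 96 dB figures are consistent with the common rule of thumb of roughly 6 dB per bit, that is, 20 * log10(2^n). The Python sketch below assumes that rule, which the text itself does not state, and also estimates raw storage for a single uncompressed channel. The function names are hypothetical.

```python
import math

def quantization_snr_db(n_bits):
    """Approximate S/N in dB, assuming the 20 * log10(2**n) rule of thumb."""
    return 20 * math.log10(2 ** n_bits)

def storage_bytes(sample_rate_hz, n_bits, seconds):
    """Raw storage for one uncompressed channel of sampled sound."""
    return sample_rate_hz * n_bits * seconds / 8

for bits in (4, 8, 16):
    print(bits, 2 ** bits, round(quantization_snr_db(bits)),
          storage_bytes(40_000, bits, 1))
```

Doubling the bits per sample doubles the storage per second while adding about 6 dB of S/N for every extra bit under this approximation.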
[ARL91]: The ARRL Handbook for Radio Amateurs, The American Radio Relay League, 1991, 68th edition, pp. 7.1-7.11, 8.20-8.23.