A Virtual Retinal Display For Augmenting Ambient Visual Environments
by Michael Tidwell
Chapter 3: Characteristics of an Ideal Augmented Vision Display
3.1 Introduction
The augmented vision display visually augments physical reality.
It is therefore desirable that the display be able to mimic reality,
or substitute for it, on demand. The ideal display thus
matches the visual acuity of the human eye, and its field of
view corresponds to the
field of vision of the human visual system. Furthermore, the display
should be comfortable when worn by the viewer, making form factor
and other ergonomic considerations important.
3.2 Field of View
The field of vision of a single static human eye is approximately
140 [deg.] from the nasal to the temporal limits of vision [8].
The vertical field of vision is approximately 90 [deg.]. The highest
resolution of the visual system occurs over a region only 2-4
[deg.] in extent. This region, called the fovea (also called the
"focal" visual system), contains mostly cone receptors
which are responsible for discerning detail and color in daylight
vision. The remaining receptor area (sometimes called the "ambient"
visual system) contains mostly rod receptors and detects motion
and other spatial information. The rods cover a region from a
few degrees off the axis of the eye to 70 [deg.] away from the
axis. One might mistakenly assume that a display could be designed
with a high resolution center portion and a lower resolution
periphery.
However, resolution in the peripheral field of view is necessary
for accurate detection of motion [8] and for control of saccadic
eye movement. Some manufacturers offer displays with higher resolution
inset areas and lower resolution backgrounds. The high resolution
inset moves according to eye position. The registration of an
inset with actual eye position can be difficult and the lower
resolution periphery degrades the viewer's motion and visual search
ability. The ideal display, however, allows for the fact that
the eye rotates about its axis, or gimbals, as a person looks
around a scene. In fact, the ideal display matches the resolution
of the eye over the entire field of vision (i.e. one minute of
arc resolution over a 140 [deg.] horizontal monocular field as
shown previously).
3.3 Resolution
Under ideal illumination and contrast conditions, the angular
resolution of the focal (or foveal) vision system is approximately
one arc minute. This means that the human eye can detect objects
separated by approximately one arc minute in bright light of around
$10^4$ [cd/m$^2$]. For example, normal, or "20/20",
vision can discern objects separated by about 1.5 [mm] at a distance of
5 [m] from the viewer. To achieve roughly one minute of arc resolution
over 140 [deg.] horizontal by 90 [deg.] vertical, the ideal display
should contain 8400 horizontal by 5400 vertical equivalent "pixel"
points [8].
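The pixel counts above follow from dividing each field of view by the one arc minute resolution. A minimal Python sketch of that arithmetic, together with the 20/20 example, is given here; the function names are illustrative and are not part of the original work.

```python
import math

ARCMIN_PER_DEG = 60

def pixels_for_field(field_deg, resolution_arcmin=1.0):
    """Number of resolution elements needed to span field_deg degrees."""
    return round(field_deg * ARCMIN_PER_DEG / resolution_arcmin)

def separation_mm(distance_m, angle_arcmin=1.0):
    """Linear separation (mm) subtended by angle_arcmin at distance_m."""
    angle_rad = math.radians(angle_arcmin / ARCMIN_PER_DEG)
    return distance_m * 1000.0 * math.tan(angle_rad)

horizontal = pixels_for_field(140)  # 140 deg x 60 arc min/deg = 8400
vertical = pixels_for_field(90)     # 90 deg x 60 arc min/deg = 5400
gap_mm = separation_mm(5.0)         # roughly 1.5 mm at 5 m
```

The small-angle result of about 1.45 [mm] at 5 [m] agrees with the rounded 1.5 [mm] figure quoted for 20/20 vision.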
3.4 Estimated Retinal Illuminance and Contrast
3.4.1 Overview
The interrelated issues of estimated retinal illuminance and contrast
are treated together in this section. The ideal display luminance
and modulation contrast relative to the outside environment are
derived. Some background information is presented to facilitate
the understanding of the relationship between modulation contrast
and estimated retinal illuminance.
3.4.2 Ambient and Display Estimated Retinal Illuminance
For a see-through display with total estimated retinal illuminance
$I_T$:
$$I_T = I_D + I_A$$
where $I_D$ = estimated retinal
illuminance contributed by the video display and $I_A$
= estimated retinal illuminance contributed by the ambient
light (from outside environment). A graphical representation of
the previous relationship is shown in Figure 3.1.
Figure 3.1: Graphical representation of estimated retinal illuminance
from ambient light and display light vs. arbitrary display coordinates.
In Figure 3.1, $I_A$ is represented as the average
ambient estimated retinal illuminance.
An average is assumed for purposes of analysis in this section
but the most general scene will have non-constant luminance across
its field and will contribute a non-constant estimated retinal
illuminance across the retina. The case of non-constant estimated
retinal illuminance is more difficult in the analysis and should
be a separate exercise. Also,
$$I_A = R \times \text{pupil area}\ [\text{mm}^2] \times \text{scene luminance}\ [\text{cd/m}^2] \times T_C$$
where $R$ is the Stiles-Crawford effectivity ratio as in
Section 2.3.1 and $T_C$ is the transmittance of
the combiner element in the display. Also, in the case that the
eye's pupil is smaller than the exit pupil of the display,
$$I_D = R \times \text{pupil area}\ [\text{mm}^2] \times \text{display luminance}\ [\text{cd/m}^2] \times (R_C)_{\text{eff}}$$
where $(R_C)_{\text{eff}}$ = the effective reflectance
of the combiner element in the display. Extending this concept
to a retinal scanning display, the optical power per unit steradian
can be calculated as,
$$(I_D)_{\text{VRD}} = \text{pupil area}\ [\text{mm}^2] \times \text{display radiance}\ [\text{W/sr-m}^2] \times V_\lambda$$
where $V_\lambda$
= the photopic relative luminosity value for conversion
from radiometric quantities to photometric quantities, and the
display area is the area of the exit pupil.
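As a rough illustration, the ambient, display, and total estimated retinal illuminance relationships above can be sketched in Python. All numeric inputs below (a Stiles-Crawford ratio of 1.0, a 2 [mm] pupil diameter, a combiner that transmits and reflects 50%, and a 1000 [cd/m2] display) are hypothetical values chosen only to exercise the formulas; they are not taken from this work.

```python
import math

def ambient_illuminance(r, pupil_area_mm2, scene_luminance_cd_m2, t_combiner):
    """I_A = R x pupil area x scene luminance x T_C, in trolands."""
    return r * pupil_area_mm2 * scene_luminance_cd_m2 * t_combiner

def display_illuminance(r, pupil_area_mm2, display_luminance_cd_m2, r_combiner_eff):
    """I_D = R x pupil area x display luminance x (R_C)eff, in trolands.

    Applies when the eye's pupil is smaller than the display's exit pupil.
    """
    return r * pupil_area_mm2 * display_luminance_cd_m2 * r_combiner_eff

# Hypothetical inputs, for illustration only.
R = 1.0                                    # assumed Stiles-Crawford ratio
PUPIL_AREA_MM2 = math.pi * (2.0 / 2.0) ** 2  # assumed 2 mm diameter pupil

i_a = ambient_illuminance(R, PUPIL_AREA_MM2, 1.0e4, 0.5)
i_d = display_illuminance(R, PUPIL_AREA_MM2, 1.0e3, 0.5)
i_t = i_d + i_a  # total: I_T = I_D + I_A
```

With these assumed inputs the ambient term dominates the total by a factor of ten, which is the situation the contrast analysis in Section 3.4.3 addresses.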
For example, if the radiometric power measured in the exit pupil
by a photodetector is 200 [nW] at 650 [nm] wavelength and the
horizontal and vertical fields of view are 40 [deg.] and 30 [deg.]
respectively, then,
$$I_D = 200 \times 10^{-3}\,(2\pi)\,(30/180)\,(40/180)\,(0.11) = 5.1 \times 10^{-3}\ [\text{trolands}].$$
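The arithmetic of this example can be checked with a short Python sketch. The factor names are an interpretation of the printed expression, and the last factor, 0.11, is assumed here to be the photopic luminosity value used for 650 [nm]; neither is stated explicitly in the text.

```python
import math

def vrd_example_illuminance(power_nw, h_fov_deg, v_fov_deg, v_lambda):
    """Evaluate power x 1e-3 x 2*pi x (v_fov/180) x (h_fov/180) x V_lambda.

    This reproduces the factors of the worked example as printed; it is
    not a general radiometric-to-photometric conversion.
    """
    return (power_nw * 1e-3 * (2 * math.pi)
            * (v_fov_deg / 180.0) * (h_fov_deg / 180.0) * v_lambda)

# 200 nW at 650 nm, 40 deg x 30 deg field, assumed V_lambda of 0.11
i_d_example = vrd_example_illuminance(200.0, 40.0, 30.0, 0.11)
```

Evaluating the product gives approximately 5.12 x 10^-3, matching the rounded result of the example.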
3.4.3 Contrast, Contrast Ratio, and Estimated Retinal Illuminance
The ideal augmented vision display will have a contrast, C,
between the display and the ambient scene above a certain critical
value, $C_c$. In other terms,
$$(I_D - I_A)/I_D \geq C_c.$$
To see the dimmest portions of the video display satisfactorily,
the previous expression should be:
$$(I_{D\min} - I_A)/I_{D\min} \geq C_c.$$
Rearranging,
$$(1 - C_c)\, I_{D\min} \geq I_A.$$
Usually $C_c \approx 0.2$ for text
and alphanumeric information and $C_c \approx 0.96$
for high information content images [50]. In the latter case,
$$I_{D\min} \geq 25\, I_A$$
and in the former,
$$I_{D\min} \geq 1.25\, I_A.$$
Furthermore, if the display contrast ratio is $CR$, and $C_c = 0.96$, then
$$I_{D\max} = (CR)\, I_{D\min} \geq 25\,(CR)\, I_A$$
and for the case of $CR = 100$,
$$I_{D\max} \geq 2500\, I_A.$$
In a bright daylight scene of $10^4$ [cd/m$^2$],
$$I_{D\max} \geq 2.5 \times 10^7\ [\text{cd/m}^2].$$
$4.9 \times 10^7$ [cd/m$^2$] is near possibly damaging
levels for the eye. It becomes apparent that some ambient light
must be filtered to view high information content images in bright
ambient light with a see-through display. The fundamental reason
is that bright daylight is the upper operating range for the eye
in terms of brightness and any high information content image
must be much brighter than the ambient light. Viewing text and
alphanumeric data satisfactorily is significantly less of a problem,
as is viewing images under controlled
lighting conditions (i.e. indoors).
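The scaling factors in this section can be reproduced with a minimal Python sketch. It assumes the critical contrast condition is a lower bound that is just met, so that the minimum display level equals the ambient level divided by (1 - Cc); the function names are illustrative only.

```python
import math

def min_display_level(ambient, c_critical):
    """Smallest display level meeting (I_Dmin - I_A) / I_Dmin = C_c."""
    return ambient / (1.0 - c_critical)

def max_display_level(ambient, c_critical, contrast_ratio):
    """Peak display level: contrast ratio times the minimum level."""
    return contrast_ratio * min_display_level(ambient, c_critical)

# Multipliers relative to the ambient level (ambient = 1).
text_factor = min_display_level(1.0, 0.2)    # about 1.25
image_factor = min_display_level(1.0, 0.96)  # about 25

# Bright daylight scene of 1e4 cd/m2, contrast ratio 100, C_c = 0.96.
daylight_peak = max_display_level(1.0e4, 0.96, 100.0)  # about 2.5e7
```

The twentyfold gap between the text and image multipliers is the reason text remains viewable in daylight while high information content images call for filtering the ambient light.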
3.5 Color
The color characteristics of the ideal see-through display correspond
to the CIE chromaticity primary wavelengths of 650 [nm] for red,
530 [nm] for green, and 460 [nm] for blue. These wavelengths for
the red, green, and blue channels respectively allow for the greatest
color saturation and most on-balance white.
3.6 Binocular Stereo Overlap
Ideally, a see-through display has binocular overlap matching
the 120 [deg.] overlap between the human eyes [25]. The display
built for this thesis is, however, monocular, and binocular-related
issues are not discussed in depth.
3.7 Ergonomics
Any head mounted display such as an augmented vision VRD must
be worn by the user. Weight is an important consideration as fatigue
may become an issue after extended use for a system that is too
heavy. Also, the moment, or torque, in all directions, about the
center of inertia of the head-spine system is important. For example,
a one pound display produces half the moment if the distance from
the display's center of inertia to the head-spine system's center
of inertia is six inches versus twelve inches. Some applications,
medical applications for example, will have more stringent weight
requirements than others. More scientific analysis by qualified
engineers and scientists is required to fully understand ergonomics
of augmented vision head mounted displays as they relate to individual
applications.
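The moment comparison above can be written as a short Python sketch. It treats the display as a point load acting at its center of inertia and considers only the static moment (weight times lever arm), which is a simplifying assumption rather than a full ergonomic model.

```python
def static_moment(weight_lb, lever_arm_in):
    """Static moment (lb-in) of a point load at a given lever arm."""
    return weight_lb * lever_arm_in

near = static_moment(1.0, 6.0)   # one pound display at six inches
far = static_moment(1.0, 12.0)   # same display at twelve inches
ratio = near / far               # 0.5: half the moment at half the distance
```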
3.8 Degree of Augmentation
A unique characteristic of the augmented vision display is that
it could be switched from completely transmissive (all real environment
and no virtual environment) to completely opaque (no real environment
and all virtual environment). The ideal display has the capability
of switching fully from one state to the other and all points
in between.
3.9 Variable Accommodation
The ideal see-through display focuses each resolution element
independently.
The display then has variable accommodation. People have both
convergence and accommodation cues which dictate depth perception.
Discrepancy between the two in a display can cause disorientation
and even illness as demonstrated in flight simulators [47]. A
display with variable accommodation removes the discrepancy by
matching the accommodation to the convergence cue at each resolution
element location.
3.10 Summary of Performance Requirements
A summary of system requirements for the ideal augmented vision
display would be as follows (Table III.1):
Table III.1. Performance characteristics for an ideal augmented
vision display.
| Performance Characteristic | Value |
| --- | --- |
| Horizontal Monocular Field of View | 140 [deg.] |
| Vertical Monocular Field of View | 90 [deg.] |
| Horizontal Binocular Field of View | 180 [deg.] |
| Vertical Binocular Field of View | 90 [deg.] |
| Binocular Stereo Overlap | 120 [deg.] |
| Angular Resolution | 1 [arc min.] |
| Horizontal Pixel Elements (Monocular) | 8400 |
| Vertical Pixel Elements (Monocular) | 5400 |
| Estimated Retinal Illuminance | 0 - $10^5$ [trolands] |
| Color | Red = 650 [nm], Green = 530 [nm], Blue = 460 [nm] |
| Augmentation | Variable from 0-100% |
The display designer must decide what is useful versus what is
possible. A treatment of potential applications for an augmented
reality VRD in Chapter 5 sheds light on what may be useful. Fortunately,
acceptable performance in many applications is far less demanding
than the ideal. In fact, in many applications the performance
requirements for an augmented vision system are less demanding
than those for an inclusive, or non-see-through, system where
the entire scene is computer generated.
Human Interface Technology Laboratory