From: dstamp@watserv1.waterloo.edu (Dave Stampe-Psy+Eng)
Subject: 386 renderer progress
Date: Mon, 6 Jan 1992 23:50:56 GMT
Message-ID: <1992Jan6.235056.21572@watserv1.waterloo.edu>
Organization: University of Waterloo



As you know, I'm working on a high-speed 3D graphics renderer for the
386 and 486 PC compatibles.  As lots of progress was made in the last
few weeks, it's time to report on the present state of the project and
gather some feedback.

The design goals of this project (in addition to those stated above)
are:

 10 fps or higher (300-500 polys onscreen)
 1000+ polygon "world" database 
 wireframe, filled polygon and (perhaps) lighting
 should require no extra hardware (run on stock 386 or 486 with VGA)
 adaptable for both desktop and eyephone projects
 simple interface for software
 available as source and libraries for non-commercial use

As you can see, the goals look rather ambitious.  But I can now state
that they are achievable.  Several possible modes and their predicted
speeds are listed here.  All speeds are computed for a 486/25 PC with
an average-speed VGA card:

Wireframe, 320x200, 16 color: 20 fps (10 fps for stereoscopic viewing).
This is the "lowend" mode, useful for fast user interfaces and 
development.

Filled polygons, 320x200, 16 colors: 12 fps with 400 polys or so 
onscreen.  The 16-color mode increases the speed, but the few colors
mean that object colors must be picked carefully for visibility.
Objects will tend to look patchy.

Filled polygons, 320x200, 256 colors: 7-10 fps achievable.  256 colors
lets you preselect object colors (as in the SuperScape demo) or use a
simple lighting model.  Gouraud shading is also an option, but will
slow the renderer to 5-7 fps or less.

These speeds are achieved by the use of 32-bit integer math throughout,
and use of a scanline renderer to offset the access speed bottleneck
caused by the VGA card.  The scanline renderer also removes the need
for expensive depth-ordering processing, but adds its own overheads.

For those interested, I'll describe the renderer specifics later.  I
will now discuss the world database design.

The world consists of objects, each consisting of polygons which
join sets of points.  Each object can be rotated or moved as a 
unit by renderer support routines.  To have objects with joints
or hinges, use several objects or redefine the object itself.
Each object is defined in object-centered coordinates, which
can then be placed in world-space by a function call.  This 
minimizes the cost of object motion on the renderer, and puts
it on the user's code where it belongs.
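
As a rough illustration of the object-centered scheme (not the actual
library interface, which is not settled yet), an object might keep its
points in local coordinates and let user code call a placement routine.
The names Object, Point3, MAX_POINTS and object_place are made up for
this sketch, and only translation is shown; rotation would follow the
same pattern.

```c
#include <stdint.h>

#define MAX_POINTS 64              /* arbitrary limit for this sketch */

typedef struct { int32_t x, y, z; } Point3;

typedef struct {
    int    npoints;
    Point3 local[MAX_POINTS];      /* object-centered coordinates */
    Point3 world[MAX_POINTS];      /* filled in by object_place() */
} Object;

/* Place the object in world space by offsetting every local point.
   The renderer would read only world[], so the cost of moving the
   object is paid by the caller, once per move. */
void object_place(Object *obj, Point3 pos)
{
    int i;
    for (i = 0; i < obj->npoints; i++) {
        obj->world[i].x = obj->local[i].x + pos.x;
        obj->world[i].y = obj->local[i].y + pos.y;
        obj->world[i].z = obj->local[i].z + pos.z;
    }
}
```

An object that does not move is never touched again, so a static
scene costs nothing in this step.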

The renderer utilizes 2 parts of the object data structure: the
definition and the bounding volume.  The bounding volume can be
eliminated, or be as simple as a sphere enclosing the object.  
This method has a very low cost as only a center point and a radius
need to be processed.  Optionally, more complex bounding volumes
may be specified.
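
To show why the sphere is so cheap, here is a minimal sketch of a
conservative sphere test against a view pyramid.  It assumes the
sphere center is already in viewer coordinates (Z is depth) and that
the pyramid's side slopes are given as integer ratios.  BoundSphere,
ViewVolume and sphere_may_be_visible are hypothetical names, and the
test may keep a few spheres that are just outside the view; it never
rejects a visible one.

```c
#include <stdint.h>

typedef struct { int32_t x, y, z; } Vec3;

typedef struct {
    Vec3    center;                /* in viewer coordinates */
    int32_t radius;
} BoundSphere;

typedef struct {
    int32_t hither;                /* near clipping depth */
    int32_t sx_num, sx_den;        /* horizontal slope = num/den, den > 0 */
    int32_t sy_num, sy_den;        /* vertical slope   = num/den, den > 0 */
} ViewVolume;

/* Returns 0 only when the sphere is certainly outside the view. */
int sphere_may_be_visible(const BoundSphere *s, const ViewVolume *v)
{
    int64_t far_z = (int64_t)s->center.z + s->radius;
    int64_t ax = s->center.x < 0 ? -(int64_t)s->center.x : s->center.x;
    int64_t ay = s->center.y < 0 ? -(int64_t)s->center.y : s->center.y;

    if (far_z < v->hither)
        return 0;                  /* entirely in front of the hither plane */
    if ((ax - s->radius) * v->sx_den > far_z * v->sx_num)
        return 0;                  /* entirely off to one side */
    if ((ay - s->radius) * v->sy_den > far_z * v->sy_num)
        return 0;                  /* entirely above or below */
    return 1;
}
```

The whole test is a handful of adds, compares and multiplies per
object, no matter how many polygons the object contains.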

The object definition is only processed if the bounding volume
falls within the view area (viewing volume, or viewport).  This can
be used to eliminate many of the objects early (I assume 40% as an
average case, though this will vary with your position in the "world").
Several object definitions may be automatically selected based on 
the distance of the object from the viewpoint, which can be used
to set an absolute maximum on the number of polys to be drawn.
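
Selecting a definition by distance could be as simple as a table
lookup.  This is only a sketch under assumed names (DetailLevel,
select_detail); the table is assumed to be sorted by increasing
distance, and the poly_count field is there purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int32_t max_dist;              /* use this definition up to this distance */
    int     poly_count;            /* polygons in this definition */
} DetailLevel;

/* Returns the index of the first level whose range covers dist,
   or -1 if the object is beyond every level and can be skipped. */
int select_detail(const DetailLevel *levels, size_t n, int32_t dist)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (dist <= levels[i].max_dist)
            return (int)i;
    }
    return -1;
}
```

With levels of, say, 200, 50 and 8 polygons, a distant object costs
8 polygons instead of 200, which is what bounds the total drawn.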

The next step is to eliminate polygons that face away from the
viewpoint and will therefore be hidden.  This can be done with 3 multiplies
using the precomputed polygon normal, and the viewpoint-to-poly
vector.  Another 40% or more of the polygons are eliminated.
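
The backface test is a single dot product.  A minimal sketch,
assuming normals point outward from the object and that any vertex
of the polygon can stand in for "the polygon" (faces_viewer and Vec3
are names invented here):

```c
#include <stdint.h>

typedef struct { int32_t x, y, z; } Vec3;

/* Returns 1 if the polygon faces the viewpoint, 0 if it faces away.
   normal need not be unit length; only the sign of the dot product
   matters, so no divide or square root is needed. */
int faces_viewer(Vec3 normal, Vec3 point_on_poly, Vec3 viewpoint)
{
    /* viewpoint-to-poly vector */
    int64_t dx = (int64_t)point_on_poly.x - viewpoint.x;
    int64_t dy = (int64_t)point_on_poly.y - viewpoint.y;
    int64_t dz = (int64_t)point_on_poly.z - viewpoint.z;

    /* the 3 multiplies */
    int64_t dot = normal.x * dx + normal.y * dy + normal.z * dz;

    return dot < 0;
}
```

Because the test uses world-space data, rejected polygons never have
their points transformed at all.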

If lighting is desired, another 3 multiplies and a divide provide
the cosine function for the point light source, and an addition
supplies the diffuse light.  One more multiply, and a clip gives 
a 5-bit brightness which is combined with a 3-bit color code to
specify 1 of the 256 screen colors.
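
One way the operation count above could work out is sketched below.
Several things here are my assumptions for the example and not a
fixed design: the normal is stored unnormalized with a precomputed
length (hence the divide), the light direction is a unit vector
scaled by FIX_ONE, the added term is a constant light level, and the
color code sits in the top 3 bits of the pixel.  shade_color and its
parameters are hypothetical names.

```c
#include <stdint.h>

#define FIX_ONE 4096               /* 1.0 in this sketch's fixed-point scale */

typedef struct { int32_t x, y, z; } Vec3;

uint8_t shade_color(Vec3 normal, int32_t normal_len, Vec3 light_dir,
                    int32_t ambient, int32_t gain, unsigned color3)
{
    int64_t dot;
    int32_t cosine, bright;

    /* 3 multiplies: normal dotted with the light direction */
    dot = (int64_t)normal.x * light_dir.x
        + (int64_t)normal.y * light_dir.y
        + (int64_t)normal.z * light_dir.z;

    /* 1 divide: remove the normal's length, leaving cos * FIX_ONE */
    cosine = (int32_t)(dot / normal_len);
    if (cosine < 0)
        cosine = 0;                /* surface faces away from the light */

    /* 1 add for the constant term, 1 multiply to scale to brightness
       (the division by FIX_ONE would be a shift in practice) */
    bright = (int32_t)(((int64_t)(cosine + ambient) * gain) / FIX_ONE);

    /* clip to 5 bits */
    if (bright > 31) bright = 31;
    if (bright < 0)  bright = 0;

    /* 3-bit color code in the high bits, 5-bit brightness in the low */
    return (uint8_t)(((color3 & 7u) << 5) | (unsigned)bright);
}
```

With this packing the palette would hold 8 color ramps of 32 shades
each, so the returned byte can be written to the screen directly.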

Points are now converted from world to viewport coordinates.  This 
is done using 3 3-element vectors rather than a 4x4 homogeneous
transformation matrix, resulting in much greater speed.  Each point
is flagged as converted, so it is only converted once.
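
A minimal sketch of this step, assuming the three vectors are the
viewer's right, up and forward axes (unit length, scaled by FIX_ONE)
and that the "converted" flag is a frame number stored with each
point.  Camera, Point and point_to_view are names made up for the
example.

```c
#include <stdint.h>

#define FIX_ONE 4096               /* 1.0 in this sketch's fixed-point scale */

typedef struct { int32_t x, y, z; } Vec3;

typedef struct {
    Vec3 eye;                      /* viewer position in world space */
    Vec3 right, up, forward;       /* viewer axes, scaled by FIX_ONE */
} Camera;

typedef struct {
    Vec3     world;
    Vec3     view;
    uint32_t stamp;                /* frame of last conversion; 0 = never */
} Point;

static int32_t axis_dot(Vec3 axis, int64_t dx, int64_t dy, int64_t dz)
{
    return (int32_t)((axis.x * dx + axis.y * dy + axis.z * dz) / FIX_ONE);
}

/* Converts p to viewer coordinates unless that was already done in
   this frame.  Returns 1 if work was done, 0 if the result was reused. */
int point_to_view(Point *p, const Camera *cam, uint32_t frame)
{
    int64_t dx, dy, dz;

    if (p->stamp == frame)
        return 0;

    dx = (int64_t)p->world.x - cam->eye.x;
    dy = (int64_t)p->world.y - cam->eye.y;
    dz = (int64_t)p->world.z - cam->eye.z;

    p->view.x = axis_dot(cam->right,   dx, dy, dz);
    p->view.y = axis_dot(cam->up,      dx, dy, dz);
    p->view.z = axis_dot(cam->forward, dx, dy, dz);
    p->stamp  = frame;
    return 1;
}
```

That is 9 multiplies per point against 16 for a full 4x4 matrix, and
a point shared by several polygons pays only once per frame.  Frame
numbers are assumed to start at 1 so that a zeroed point counts as
unconverted.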

Depth clipping is only done for closeness (hither) as yon clipping
adds some extra costs.  By clipping to a Z (depth) greater than 1,
the X and Y viewport coordinates may be converted to screen 
perspective coordinates without overflow. 
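
The two operations can be sketched as follows.  hither_clip finds
where an edge crosses the hither plane, and project does the
perspective divide; both names, the Screen type and the scale and
center parameters are assumptions for this example.  Since Z is at
least 1 after clipping, the quotient in project can never be larger
in magnitude than the product x * scale.

```c
#include <stdint.h>

typedef struct { int32_t x, y, z; } Vec3;
typedef struct { int32_t x, y; } Screen;

/* in is on the visible side (in.z >= hither), out is not
   (out.z < hither).  Returns the point where the edge crosses
   the plane z = hither. */
Vec3 hither_clip(Vec3 in, Vec3 out, int32_t hither)
{
    Vec3 r;
    int64_t num = (int64_t)hither - in.z;
    int64_t den = (int64_t)out.z - in.z;   /* nonzero, see above */

    r.x = (int32_t)(in.x + ((int64_t)out.x - in.x) * num / den);
    r.y = (int32_t)(in.y + ((int64_t)out.y - in.y) * num / den);
    r.z = hither;
    return r;
}

/* Perspective projection of a viewer-space point with v.z >= 1.
   The result is still unclipped in X and Y. */
Screen project(Vec3 v, int32_t scale, int32_t cx, int32_t cy)
{
    Screen s;
    s.x = cx + (int32_t)(((int64_t)v.x * scale) / v.z);
    s.y = cy - (int32_t)(((int64_t)v.y * scale) / v.z);   /* screen Y grows down */
    return s;
}
```

For a 320x200 screen, cx and cy would be 160 and 100; the projected
coordinates may still lie far off-screen, which is what the next
step deals with.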

Now the 32-bit X and Y coordinates must be clipped to screen coordinates.
A semi-recursive Sutherland-Hodgman clipper seems ideal for this, as
it is simple and fast.  Special point lists are used to prevent repeated
clipping of the same edges, and also prevent duplicate points from
being generated.  This extra processing allows the scanline renderer
to reach its peak speed as well.
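
For reference, here is the plain textbook form of Sutherland-Hodgman
clipping against a screen rectangle.  It is not the semi-recursive
version with shared point lists described above, just the basic
algorithm that version builds on.  Pt2, Rect, MAX_VERTS and
clip_polygon are names chosen for this sketch; the caller is assumed
to pass a convex polygon with at most MAX_VERTS - 4 vertices and an
output array of MAX_VERTS entries.

```c
#include <stdint.h>

#define MAX_VERTS 16

typedef struct { int32_t x, y; } Pt2;
typedef struct { int32_t left, right, top, bottom; } Rect;

enum { EDGE_LEFT, EDGE_RIGHT, EDGE_TOP, EDGE_BOTTOM };

static int inside(Pt2 p, int edge, const Rect *r)
{
    switch (edge) {
    case EDGE_LEFT:  return p.x >= r->left;
    case EDGE_RIGHT: return p.x <= r->right;
    case EDGE_TOP:   return p.y >= r->top;
    default:         return p.y <= r->bottom;
    }
}

/* a and b lie on opposite sides of the edge, so the divisor is nonzero */
static Pt2 intersect(Pt2 a, Pt2 b, int edge, const Rect *r)
{
    Pt2 p;
    if (edge == EDGE_LEFT || edge == EDGE_RIGHT) {
        p.x = (edge == EDGE_LEFT) ? r->left : r->right;
        p.y = (int32_t)(a.y + ((int64_t)b.y - a.y) * ((int64_t)p.x - a.x)
                              / ((int64_t)b.x - a.x));
    } else {
        p.y = (edge == EDGE_TOP) ? r->top : r->bottom;
        p.x = (int32_t)(a.x + ((int64_t)b.x - a.x) * ((int64_t)p.y - a.y)
                              / ((int64_t)b.y - a.y));
    }
    return p;
}

/* Clips the polygon in[0..n-1] to r.  Returns the new vertex count. */
int clip_polygon(const Pt2 *in, int n, const Rect *r, Pt2 *out)
{
    Pt2 buf_a[MAX_VERTS], buf_b[MAX_VERTS];
    Pt2 *src = buf_a, *dst = buf_b, *tmp;
    int i, edge;

    for (i = 0; i < n; i++)
        src[i] = in[i];

    for (edge = EDGE_LEFT; edge <= EDGE_BOTTOM && n > 0; edge++) {
        int m = 0;
        Pt2 prev = src[n - 1];
        int prev_in = inside(prev, edge, r);

        for (i = 0; i < n; i++) {
            Pt2 cur = src[i];
            int cur_in = inside(cur, edge, r);
            if (cur_in != prev_in)          /* edge crossed: emit crossing */
                dst[m++] = intersect(prev, cur, edge, r);
            if (cur_in)                     /* keep visible vertices */
                dst[m++] = cur;
            prev = cur;
            prev_in = cur_in;
        }
        tmp = src; src = dst; dst = tmp;    /* output feeds the next edge */
        n = m;
    }

    for (i = 0; i < n; i++)
        out[i] = src[i];
    return n;
}
```

Note that this basic form clips every polygon independently, so an
edge shared by two polygons is clipped twice; avoiding that repeated
work is the purpose of the point lists.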

Once a set of clipped polys is produced, the special data structures
required by the scanline renderer can be generated.  Then the scanline
renderer processes the data, line by line, from the top to bottom of
the screen.  No pixel on the screen is written to twice, and drawing
speed is very high.  The screen does not even have to be precleared.

Progress so far:  All the needed areas have been researched and
skeleton code analyzed for speed.  C versions of most code are being
tested, and will be integerized.  Then an assembly version will be
developed.

The scanline algorithm is by far the most complex part of the 
development.  Due to the donation of some Pascal source code
by Keith Harp, I have decoded the Hamlin and Gear paper and
written a C version, which is nearly debugged (1001 special cases
need to be checked).  Of course, it must still be integerized.
Due to the way a scanline algorithm works, I think that there will
be occasional visibility errors due to roundoff, but they
should be very rare and scene-dependent.  

Bernie Roehl is currently developing the 3D transformations, 
object level conversions and clipping, and a simple file format
which is to be used for debugging and development.  Once this
is done, clipping will be added, and everything converted to assembler.

I'd like to hear any comments on the renderer, esp. the object formats
and manipulations.  Are they flexible enough?  Anything to be added
in terms of manipulations?

--------------------------------------------------------------------------
| My life is Hardware,                    |                              | 
| my destiny is Software,                 |         Dave Stampe          |
| my CPU is Wetware...                    |                              | 
| Anybody got a SDB I can borrow?         | dstamp@watserv1.uwaterloo.ca |
__________________________________________________________________________
