From: dstamp@watserv1.waterloo.edu (Dave Stampe-Psy+Eng)
Subject: Re: VGA: 6400 polys/sec
Date: Thu, 31 Oct 1991 17:25:44 GMT
Message-ID: <1991Oct31.172544.21211@watserv1.waterloo.edu>
Organization: University of Waterloo



frank@cavebbs.gen.nz (Frank van der Hulst) writes:

>I suggest you'd probably be better off in 320*200*256 colour mode (mode 13h),
>since that allows you to write one byte per pixel, rather than needing to do
>a read, AND, OR, write on 4 different bit-planes.
>
>Incidentally, just today I figured out how to get 4 pages of screen memory
>in 320*200*256 colour mode. The bad news is that you lose the flat linear
>addressing system.

You CAN write several bit planes at once with a VGA card: that's why the 
16-color mode is fast.  The disadvantage is that when you want to write
a part of a byte of pixels, you have to read the byte first to get it
into the VGA card latches.

Which gives me an excuse to post a summary of 3 video modes, and their
write time characteristics on a VGA system:

All the following modes are 320x200 resolution.  The times are for a 486/25
and a Paradise VGA card (about average speed; some cards run 2 or 3 times
as fast, but don't expect more than a 60% improvement over the speeds
listed here).

320x200x16 color: 8 pages possible, allowing stereo with double-buffering,
as well as background screens.  4 planes per buffer, each using 8K of address
space.  All 4 planes may be read or written at once on-card, allowing fast 
clears and copies.  During drawing, color is specified by a VGA register,
and the mask is the contents of the CPU byte being written.  You must read
a byte before "writing" it, as this updates the 4-plane latches on the card.
You can skip the read if you're writing all the pixels in the byte.
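To make the masking rule concrete, here is a minimal sketch in C of how a
horizontal run from x0 to x1 breaks down into bytes and edge masks in this
mode.  It only does the arithmetic and touches no hardware.  The struct and
function names are made up for illustration, and it assumes the usual VGA
bit order (leftmost pixel in bit 7).  A byte whose mask is not 0xFF is one
that needs the read first.

```c
#include <assert.h>

/* Hypothetical plan for one horizontal run, 8 pixels per byte,
   leftmost pixel in bit 7 (assumed VGA planar bit order). */
struct span16 {
    unsigned first_byte;       /* byte offset of the first byte in the row */
    unsigned byte_count;       /* number of bytes the run touches          */
    unsigned char left_mask;   /* pixels covered in the first byte         */
    unsigned char right_mask;  /* pixels covered in the last byte          */
    unsigned latch_reads;      /* bytes that need a read before the write  */
};

struct span16 span16_plan(unsigned x0, unsigned x1)
{
    struct span16 s;

    assert(x0 <= x1);
    s.first_byte = x0 >> 3;
    s.byte_count = (x1 >> 3) - (x0 >> 3) + 1;
    s.left_mask  = (unsigned char)(0xFF >> (x0 & 7));
    s.right_mask = (unsigned char)(0xFF << (7 - (x1 & 7)));

    if (s.byte_count == 1) {
        /* Both ends fall in the same byte: combine the two masks. */
        s.left_mask &= s.right_mask;
        s.right_mask = s.left_mask;
        s.latch_reads = (s.left_mask != 0xFF);
    } else {
        /* Only partially covered end bytes need the latch read. */
        s.latch_reads = (s.left_mask != 0xFF) + (s.right_mask != 0xFF);
    }
    return s;
}
```

For example, span16_plan(5, 20) touches 3 bytes with masks 0x07 and 0xF8 and
needs 2 reads, while span16_plan(8, 23) covers 2 whole bytes and needs none.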

320x200x256 color (linear addressing): only 1 page makes this mode less
useful for VR applications, as flicker during drawing is bad.  But the 64K
pixels map directly to the address space, so it's fast for single-pixel
writes.  The color is the contents of the CPU byte.

320x200x256 color (nonlinear addressing): allows 4 pages, just right for 
double-buffered stereo.  The pixel addressing is such that each page
takes 16K of address space, but adjacent pixels end up on different bit
planes.  This can be made into an advantage for fast clears, copies and
long horizontal lines (useful for filled polygons), as 4 pixels can
be processed at once.  The disadvantage is that the plane mask register
must be used to mask pixels, and it's relatively expensive to access it.
Switching from linear to nonlinear mode takes about 30 uS, and only lets
you access 1 page.
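As a sketch of the two 256-color addressing schemes, the C functions below
compute where a pixel lands.  This is arithmetic only, with no port I/O.
The names are hypothetical, and the nonlinear layout shown (pixel x in
plane x mod 4, 80 bytes per row, pages packed at multiples of 16000 bytes)
is the common arrangement and is assumed here rather than taken from the
description above.

```c
#include <assert.h>

#define NL_BYTES_PER_ROW   80u     /* 320 pixels / 4 planes             */
#define NL_BYTES_PER_PAGE  16000u  /* 80 * 200; page packing is assumed */

/* Hypothetical result type: where one pixel lives in nonlinear mode. */
struct nl_addr {
    unsigned offset;         /* byte offset in the 64K window         */
    unsigned plane;          /* plane number 0..3                     */
    unsigned char map_mask;  /* one-bit plane mask selecting the plane */
};

struct nl_addr nl_pixel(unsigned page, unsigned x, unsigned y)
{
    struct nl_addr a;

    assert(page < 4 && x < 320 && y < 200);
    a.offset   = page * NL_BYTES_PER_PAGE + y * NL_BYTES_PER_ROW + (x >> 2);
    a.plane    = x & 3;  /* adjacent pixels fall on different planes */
    a.map_mask = (unsigned char)(1u << a.plane);
    return a;
}

/* Linear mode: one byte per pixel, mapped straight into the window. */
unsigned linear_offset(unsigned x, unsigned y)
{
    assert(x < 320 && y < 200);
    return y * 320u + x;
}
```

With this layout nl_pixel(0, 5, 1) gives offset 81 in plane 1, so pixels
4..7 of that row share one address and can be written together by setting
a plane mask of 0x0F.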

The times I used are:
CPU write to buffer: 1.3 to 1.5 uS (depends on whether REP MOVSB usable)
CPU read, then write: 2.5 uS
CPU VGA register access: 2.5 uS

These are probably average for a standard VGA card and fast CPU.

I'm going to present the results in tabular form, for different length
horizontal lines (1,5,16,48) and for full-screen REP MOVSB clears and
copies.  The 1-pixel writes are typical of Bresenham lines, the longer
horizontal lines are typical of poly fills.  Line times are averages over
cases: the different cases were weighted by likelihood and summed.  Cases
differed by whether the ends had to be masked or not.

I will also present times in pixels/uS and mS/screen (if lines of that
length were used to fill the whole screen).
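The two units are related by simple arithmetic: a screen is 320 x 200 =
64000 pixels, so mS/screen is 64 divided by the pix/uS figure.  A small C
sketch of the conversion follows (the function names are just for
illustration):

```c
#include <assert.h>

/* Time to fill a whole 320x200 screen at a given fill rate,
   in milliseconds.  rate is in pixels per microsecond. */
double ms_per_screen(double rate)
{
    assert(rate > 0.0);
    return (320.0 * 200.0) / rate / 1000.0;
}

/* Average cost of one pixel in microseconds at the same rate. */
double us_per_pixel(double rate)
{
    assert(rate > 0.0);
    return 1.0 / rate;
}
```

For instance, 0.4 pix/uS is 2.5 uS per pixel, which works out to 160 mS
for the full screen.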

16 color mode:
full-screen clear: 10.4 mS
full-screen copy:  20.8 mS
1 pixel: 0.4 pix/uS, 160 mS/screen
5 pixel: 1.32 pix/uS, 48.5 mS/screen
16 pixel: 2.46 pix/uS, 26 mS/screen
48 pixel: 3.97 pix/uS, 16 mS/screen
limiting case: 8x speed of 256 color linear

256 color, linear mode:
full-screen clear: 83 mS
full-screen copy: N/A
1 pixel:   0.67 pix/uS,  96 mS/screen
5 pixel:   0.77 pix/uS,  83 mS/screen
16 pixel:   0.77 pix/uS,  83 mS/screen
48 pixel:   0.77 pix/uS,  83 mS/screen

256 color, nonlinear mode:
full-screen  clear: 20.8 mS
full-screen copy:   41.6 mS
1 pixel:  0.2 pix/uS, 320 mS/screen
5 pixel:  0.81 pix/uS, 79 mS/screen
16 pixel: 0.95 pix/uS, 67 mS/screen
48 pixel: 1.72 pix/uS, 37 mS/screen
limiting case: 4x speed of 256 color linear
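
The full-screen clear figures in the three tables are consistent with one
1.3 uS write per byte of address space, which is also where the 8x and 4x
limiting cases come from.  A sketch in C that reproduces them is below.
The function names are hypothetical, and the 1300 nS per write is the low
end of the write time quoted earlier, assumed to apply to every byte.

```c
#include <assert.h>

/* Bytes of CPU address space per 320x200 screen, given how many
   pixels one CPU byte write covers (8 for 16 color, 1 for 256
   color linear, 4 for 256 color nonlinear). */
unsigned long screen_bytes(unsigned pixels_per_byte)
{
    assert(pixels_per_byte > 0);
    return (320UL * 200UL) / pixels_per_byte;
}

/* Clear time in microseconds, assuming every byte costs the same
   ns_per_write (1300 nS is the assumed figure for these tables). */
unsigned long clear_time_us(unsigned long bytes, unsigned long ns_per_write)
{
    return bytes * ns_per_write / 1000UL;
}
```

clear_time_us(screen_bytes(8), 1300) gives 10400 uS, and the same call
with 1 and 4 pixels per byte gives 83200 uS and 20800 uS.  The copy times
listed are exactly twice the corresponding clear times.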

Analysis:  256 color linear is faster only for 1-pixel line graphics,
and has the significant disadvantage of only 1 page.

256 color nonlinear is faster (on average) than 256 color linear for every
case of more than 4 pixels (e.g. a 6x6 triangle fill), but always slower
than 16 color mode.

16 color mode is fastest when there are more than 2 horizontal pixels to be
filled (e.g. a 3x3 triangle fill).  If you're doing filled poly rendering,
this is the mode to use.

IF you had a very fast VGA card then 256 color linear mode might be OK.
The 256 color nonlinear is borderline but nice.  It works marginally for
poly fills and is inappropriate for wireframe, unless a more complex
algorithm than Bresenham is used.

16-color mode has about 60% of the speed of 256 color linear mode for
line drawing, and beats it for poly fills by about 3 to 4 times.

Comments? Questions? Conclusions?


--------------------------------------------------------------------------
| My life is Hardware,                    |                              | 
| my destiny is Software,                 |         Dave Stampe          |
| my CPU is Wetware...                    |                              | 
| Anybody got a SDB I can borrow?         | dstamp@watserv1.uwaterloo.ca |
__________________________________________________________________________
