Chapter 3 - Multi-User Systems
In this chapter we will explore many of the existing networked multi-user
systems, which enable real-time interpersonal communication, in a roughly
increasing order in terms of system complexity. This will provide us with a history
of multi-user systems, as well as some insight into useful approaches for
incorporation into the GreenSpace system. Before delving into the systems
themselves, we will briefly examine common network communication protocols
and the types of network communication models used for virtual environments.
3.1.1 Common Network Protocols
For the various users in a distributed multi-user application to share the
same virtual space and interact, their host machines must communicate with each
other via a network. While there are many different protocols available, two of the
most commonly used are the Transmission Control Protocol (TCP) and the User
Datagram Protocol (UDP) [Goss94]. TCP guarantees reliable delivery of
messages sent, while UDP makes no guarantees; TCP is much slower, and
therefore less suitable for real-time communication, than UDP. Real-time
applications, like multi-user virtual environments, which require the high speed of
UDP must be careful not to rely on all message transmissions being successful,
or use a hybrid TCP/UDP approach to send slow but reliable messages when
3.1.2 Centralized Network Model
A client/server, or centralized, network model burdens a single host with
the task of communicating with each of the clients to determine and report the
current state of the system. The server simply maintains the database, while the
clients handle computation and rendering. This is typically the easiest approach to
implement, but is not scalable [Goss94]. As the number of users increases, the
performance between the server and each of the clients decreases.
One way researchers have found to overcome this scaling problem is to
create multiple communicating servers. Each client communicates directly with
the closest (in terms of network distance) server, which takes care of
communicating updates with the other servers, which in turn communicate with
each of their clients. This increases the complexity of maintaining a coherent
database, but decreases the impact of adding new clients (as long as there are
3.1.3 Distributed Network Model
A serverless, peer-to-peer, point-to-point, or distributed network model
makes no distinction between clients and servers. Each peer in the simulation
maintains a local copy of the database and also handles computation and
rendering. When changes are made to the database, the peer must communicate
that change to all other peers in the system. This approach also has a scaling
problem because the number of messages being sent by each peer steadily
increases with the number of peers. If all n peers in a virtual environment
simulation update their position and report the change to all other peers each
rendering frame, then a total of n(n - 1) messages are sent over the network
during each frame [Goss94].
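
The growth described above can be checked with a few lines of arithmetic. The
following is a minimal sketch, not taken from any of the systems surveyed; the
function names and the frame rate used in the example are assumptions made
purely for illustration.

```python
def messages_per_peer(n: int) -> int:
    """Messages one peer sends per frame when it updates every other peer."""
    return max(n - 1, 0)


def total_messages_per_frame(n: int) -> int:
    """Total messages on the network per frame: n peers, n - 1 messages each."""
    return n * messages_per_peer(n)


def total_messages_per_second(n: int, frames_per_second: int) -> int:
    """Total messages per second for an assumed, fixed frame rate."""
    return total_messages_per_frame(n) * frames_per_second


if __name__ == "__main__":
    # Doubling the number of peers roughly quadruples the total traffic.
    for peers in (2, 4, 8, 16, 32):
        print(peers, total_messages_per_frame(peers))
```

Each peer's own load grows only linearly, but the network as a whole carries
n(n - 1) messages per frame, which is why the total grows with the square of
the number of peers.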
Techniques have been employed to help reduce the number of messages
sent by determining which peers will be interested in any given database update
3.1.4 Broadcast Network Model
Rather than sending update messages to each of the other peers in turn,
the broadcast network model allows each peer to send a single message that is
received by all other peers in the system. While this results in fewer total
messages being sent over the network, broadcast communication has the
negative side effect of sending each message to everyone on the network,
including those not participating in the virtual environment simulation. This can
cause an overwhelming burden to processes on the network.
Multicast is a subset of broadcast, whereby only the peers registered to
receive messages on a particular multicast channel do so, rather than each
process having to determine if they are interested in the broadcast message after
receiving it. The MBONE (Multicasting Backbone) is a virtual network layered on
top of portions of the physical Internet that support the routing of IP multicast
packets [Casn94]. While broadcasting messages to the entire Internet for a single
multi-user virtual environment system would be out of the question, the MBONE
can be used to send multicast messages between disparate peers in a virtual
environment system. For example, consider that a machine in the HITLab and a
machine in the UW Computer Science department have both registered to receive
messages on a particular multicast channel that a machine at Fujitsu in Japan is
sending messages to. Each message that Fujitsu sends will move up their natural
network hierarchy to the backbone portion of the Internet which will carry their
messages to the USA and then to the UW. Only when each message reaches the
UW are copies made to branch off to the HITLab receiver and the UW Computer
Science department receiver. The routing hardware and software at participating
MBONE sites keep track of which of their neighbors, if any, are registered to
receive messages on a particular multicast channel.
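
To make the idea of registering for a multicast channel concrete, the following
is a minimal sketch using the standard Python socket module. It is not drawn
from any of the systems discussed in this chapter. The group address, port, and
function names are hypothetical, and whether packets actually travel between
sites depends on the routers in between supporting multicast.

```python
import socket


def make_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value: group address, then local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_multicast_receiver(group: str, port: int) -> socket.socket:
    """Register this host to receive datagrams sent to (group, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group is what tells the local network to deliver
    # copies of the group's traffic to this host.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_membership_request(group))
    return sock


def send_multicast(group: str, port: int, payload: bytes, ttl: int = 1) -> None:
    """Send one datagram to the group; the sender need not be a member."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # The TTL limits how many router hops the datagram may cross.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    sock.sendto(payload, (group, port))
    sock.close()
```

A receiver would call open_multicast_receiver("239.1.2.3", 5007) and then read
with recvfrom, while a sender calls send_multicast with the same (assumed)
group and port. The sender transmits a single datagram regardless of how many
receivers have registered.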
Now that we have a basic background of the network communication
protocols and models, we can begin to examine the multi-user communication
Internet Relay Chat (IRC) is a client/server based multi-user chat system
that provides a solely textual interface for communication [Rose95]. Many
different channels exist, or can be created, for different topics of discussion.
Channels are often thought of as rooms, or virtual spaces, in which interpersonal
communication takes place. However, it is not possible to navigate between those
spaces in any other way than hopping between them. There is no explicit spatial
arrangement to the channels.
When users start their client program and select a nearby server, they can
connect to any of the IRC communication channels. Once connected, the textual
messages they send from their client are received by their server and sent to all of
the other IRC servers. Similarly, all the messages they receive from other users
have filtered through the IRC network of servers from the other users' clients to
MUDs (Multiple User Dimensions, Multiple User Dungeons, or Multiple
User Dialogues) [Smit95a] are client/server based textual virtual environments
publicly accessible on the Internet. Because MUDs employ a simple centralized
network model, the server can only handle a limited number of users at any given
time (usually between 50 and 100).
The MUD server maintains a database of the MUD universe, which users
connecting through clients can navigate within, interact with, and typically build
upon. The first MUD was created at Essex University in 1979 as a multi-user
version of a classic text-adventure game [Bart90]. Since then, many different
types of MUDs have been created, including many that are more oriented towards
discussion than adventuring. There are now well over 250 MUD servers open to
the public via the Internet, each with hundreds or thousands of active users
In theory, everyone in the universe could be affected by the actions of
everyone else in the universe and should therefore be notified of all changes to
the world database; locality will always be a compromise [Ande91]. Not only is this
impractical, but we generally are not directly affected by the actions of entities
which are both physically far away from us and not connected to us by any other
direct means (such as a teleconference).
The environment of a MUD is constructed with a spatial metaphor of
interconnecting rooms with portals between them. While there are sometimes
ways to teleport directly between non-adjacent rooms, a MUD universe is
generally considered to have a spatial layout. There are typically various modes
of interpersonal communication, but the default mode relies on the room
When a user is in a room, they are able to look at objects or other users in
that same room. A user looks at an object or user by typing a command at the
keyboard and receiving a textual response from the server. While standing in a
room users usually receive messages indicating the motion of users through that
room, but not within it. A room is an atomic measure of space, so a user cannot
move without leaving their current room. When one user speaks, by typing a
command such as "say hello", everyone in the same room receives that
message, thereby hearing that user. People outside of that room do not receive
There are usually ways to communicate directly with other users by using a
"tell" or "page" command, or to send messages to large groups of users no
matter what room they are in by using a "shout" command or a communication
line, much like an IRC channel, within the MUD. Nonetheless, the communication
of a user's actions and words typically does not extend past their current room.
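
The room-scoped delivery rules described above can be sketched in a few lines.
This is an illustrative model only, not the code of any actual MUD server; the
class names, message formats, and the choice not to echo messages back to the
speaker are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    room: str
    inbox: list = field(default_factory=list)  # messages this user has received


class Mud:
    """Toy server that decides who receives each kind of message."""

    def __init__(self):
        self.users = {}

    def connect(self, name: str, room: str) -> User:
        user = User(name, room)
        self.users[name] = user
        return user

    def say(self, speaker: User, text: str) -> None:
        # Default mode: only users in the speaker's room hear the message.
        for user in self.users.values():
            if user is not speaker and user.room == speaker.room:
                user.inbox.append(f'{speaker.name} says, "{text}"')

    def tell(self, speaker: User, target: str, text: str) -> None:
        # Direct mode: one named recipient, regardless of room.
        self.users[target].inbox.append(f'{speaker.name} tells you, "{text}"')

    def shout(self, speaker: User, text: str) -> None:
        # Global mode: every other connected user, regardless of room.
        for user in self.users.values():
            if user is not speaker:
                user.inbox.append(f'{speaker.name} shouts, "{text}"')
```

With users alice and bob in one room and carol in another, a "say" from alice
reaches only bob, a "tell" to carol reaches only carol, and a "shout" reaches
both.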
The following subsections examine a few different types of MUDs that have
extended the modes of communication.
Bram Stolk hacked an extension to an LP style MUD server to send
polygon data along with the traditionally text-based descriptions of rooms, objects,
and users [Taka93]. A special BSXMUD client is then used to render the
polygonal data as well as to handle the text input and output as if it were a normal
MUD [Smit95b]. Rooms are no longer atomic measures of distance, as a
BSXMUD user can move around within rooms and have their new position
continuously sent to update the other users within that room.
MOO, which stands for MUD, Object-Oriented, was developed at Xerox PARC;
LambdaMOO is one of the first and few publicly accessible MOO
servers [Smit95b]. Pavel Curtis, the administrator of LambdaMOO, has spent
some time observing the social interactions that occur on this server [Curt92]. He
has observed that the majority of users spend most of their time engaging in
conversation with other users. Chance encounters frequently occur as people
wander through the MUD and these encounters frequently result in conversation.
People also plan meetings and conversations; the environment is clearly
conducive to a variety of communication styles.
3.3.3 MUDs Grow Up
In order to overcome some of the shortcomings of text-based MUDs while
applying MUD technology to non-entertainment applications, the Social Virtual
Reality project at Xerox PARC is adding audio, video, and shared tools on top of
the MOO system [Curt93]. They first defined a window-based interface to offload
as many low-level interactions as possible (such as scrolling a scroll bar on a
text widget) from the server onto the clients. Because there is a central server
which maintains the
MUD database and mediates communication between all clients, shared tools
were natural to add. Audio was added to their MOO system by associating a
multicast audio channel with each room. When users actually speak while they
are virtually within a room, the other users present in that room hear what they say
(the same as they would have if the user had typed a textual message). Video
was added for users in a similar fashion.
Two systems have been built which take advantage of these new
communication features: Astro-VR, for use by serious researchers in the
astronomical community, and Jupiter, for use by researchers at various Xerox
facilities. Both of these systems are suitable for both planned and casual
3.3.4 MUDs as Systems Tools
A system administration group at Northeastern University has
experimented with using a MOO as a tool for improving group communications
[Evar93]. After experimenting with other communication tools, they chose to use a
MUD because MUDs are:
- interactive in real-time,
- a networked service,
- inherently multi-user,
- extensible from within,
- exclusive to the users specifically granted access, and
- capable of maintaining a history, with a proper client.
They took advantage of the spatial metaphor inherent in MUDs to create
public areas where people would feel free to "hang out" to discuss topics as they
arose and private areas where people could go when they stepped away from
their terminal or were too busy working to discuss things on the MUD. They have
found that using the MUD has improved their ability to communicate within the
group and to work more effectively from remote locations.
On corbuMOO, the designers consider an object to be composed of
various attributes for different modalities of perception, such as sound, picture, or
text [Grah95]. When a user wishes to perceive the objects around them, they
must choose which attribute modalities for an object to consider, and then their
client program will display the object according to the queried modalities. When
the client is not capable of displaying a particular modality, then it is simply
ignored. This allows users to use clients of varying capabilities to explore the
same virtual environments.
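
The modality selection described above amounts to a simple filter on the
client side. The sketch below is a hypothetical illustration, not corbuMOO's
actual data model; the attribute names, file names, and function signature are
assumptions.

```python
def perceive(attributes: dict, requested: list, supported: set) -> dict:
    """Return the requested attributes that this client is able to display.

    attributes: modality name -> content for one object
    requested:  modalities the user asked to perceive
    supported:  modalities this particular client can present
    """
    return {
        modality: attributes[modality]
        for modality in requested
        # Unsupported or missing modalities are silently ignored.
        if modality in supported and modality in attributes
    }


# A hypothetical object with one attribute per modality.
fountain = {
    "text": "A stone fountain bubbles quietly.",
    "sound": "fountain.au",
    "picture": "fountain.gif",
}

# A text-only client and a richer client query the same object.
text_only_view = perceive(fountain, ["text", "sound"], {"text"})
rich_view = perceive(fountain, ["text", "sound"], {"text", "sound", "picture"})
```

Both clients explore the same object; the text-only client simply receives a
smaller view of it.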
MultiVerse is a non-immersive, multi-user, client/server, X-Windows based
graphical virtual environment system designed for entertainment purposes
[Gran93]. Several simplistic games were included with the free system source
code, including a take-off of a popular W Industries game called Dactyl
Nightmare. The system was easily extensible, but due to its client/server structure
it was not scalable.
Cyberterm (CT) is a client/server based multi-user 3D, yet non-immersive,
virtual environment system which was developed for use with PCs and modems
[Snos95]. This is an object oriented system in which objects, including users,
communicate through message passing. The system may consist of a single or
many servers, with each object residing on a single server. Interpersonal
communication can occur through internal email, posted bulletin board messages,
or real-time textual conversation. Other users may also be able to see a virtual
representation of you move through the environment as you explore.
DOOM is a multi-user non-immersive virtual environment action game
which was released as shareware for PCs and ported to the SGI platform
[Leuk94]. While this is an impressive system for a PC, the networking architecture
does not scale well. Each networked participant, of which there is a maximum of
four, generates a position update message every frame. This results in 30
messages per second being generated from each peer on an SGI network, even
when none of the users are moving [Mace95a]. Users are able to communicate
with each other by visually moving around, firing their weapons (which results in a
loud sound), or by typing text messages to each other. While the visual and audio
cues drop off with the distance between the players, the textual communication is
accessible no matter what the virtual distance between users.
3.7 LucasFilm's Habitat
LucasFilm's Habitat was one of the first attempts to create a large-scale
multi-user graphical virtual environment [Morn91]. The interface was a
combination of 2D graphics and text which BSXMUD may have been modelled
after. Like BSXMUD, the virtual world was spatially partitioned into regions, like
rooms, that users could move within and between. When users were in the same
region, they could see each other move and communicate with each other
through the textual interface. In contrast to BSXMUD, which uses the Internet for
network communication, LucasFilm's Habitat connected client computers to the
central server by 300 baud modem connections. Even with this impoverished
network bandwidth, virtual communities grew and flourished.
Through this experiment, which was deemed a success by its creators,
several important lessons were learned, including [Morn91]:
- an object-oriented data representation is essential and
- the implementation platform is relatively unimportant
These lessons encourage us to define the models of virtual environments
at a behavioral, rather than presentation, level. Whether we are using a VT100
dumb terminal or a high-end SGI platform, we should be able to mingle together
with other users in the same environments without regard to their system
The Open Inventor object-oriented 3D toolkit, developed by SGI, is based
on this very principle [Wern94]. The Open Inventor toolkit is window system
independent and allows for the creation of 3D objects and interactive applications,
rather than simply the drawings of objects. Rendering and manipulation are then
possible at the client level within the limitations of the client resources.
VPL Research's Reality Built for Two (RB2) was the first commercially
available immersive virtual reality system, which was composed of a Macintosh,
two SGI Iris workstations, an HMD, a Dataglove hand flex sensor, and a magnetic
tracking system [Blan90]. Two such systems could be networked together with
point-to-point message passing [Funk95]. This was clearly not a scalable
solution, but it was the first.
3.9 Networked SPIDAR
SPIDAR is a 3D haptic interface device constructed with strings, pulleys,
and motors [Ishi94]. Networked SPIDAR allows two users, both using the SPIDAR
interface, to collaboratively design 3D objects in the same graphical virtual work
These researchers claim that a collaborative space can be separated into
two component spaces: dialogue and object. The dialogue space is defined to be
the space in which the users discuss the design (as if looking at each other across
a conference table), while the object space is where the users manipulate objects
in the collaborative environment (like on the surface of the conference table).
While this may be true for manipulating objects which are relatively small
compared to the operators manipulating them, it is not at all apparent for
collaborative architectural design, for instance.
In general, the space in which the users are manipulating objects cannot
be separated from the space in which the users are verbally or visually
communicating with one another.
Division, Ltd., produces both hardware and software systems, including the
dVS virtual reality operating environment. dVS provides an immersive visual and
auditory virtual environment software system capable of supporting multiple users
easily, due to its distributed architecture [Grim92]. This is a very general system
suitable for a wide range of applications. Unfortunately, their simple peer-to-peer
communication model does not scale well to arbitrary numbers of users and it is
not clear that database consistency is easily maintained.
3.11 CMU STUDIO
The STUDIO for Creative Inquiry at the Carnegie Mellon University (CMU)
developed a multi-user immersive virtual environment system for educational,
entertainment, and industrial applications [Loef93]. This system enabled a small
number of users to join together in a virtual teleconference, though the
participants of the system were separated by the Atlantic ocean (CMU and
Munich). Though these initial explorations were done using low-bandwidth
point-to-point connections, they plan to switch to a client/server model and use
high-bandwidth network connections such as ATM (Asynchronous Transfer Mode)
[KimB95] networks. Their current system does not scale well and it is not clear
that their future plans will improve that significantly.
IBM Research developed the Virtual Reality Distributed Environment and
Construction Kit (VR-DECK), which promises to be an easy to use system that
allows for the creation of multi-user virtual environments employing a variety of
specialized I/O devices, including HMDs, gloves, speech recognition interfaces
and so on [Code93]. This is an object oriented system utilizing distributed
message passing for communication. Worlds are built using a 2D X-Windows
graphical user interface, then explored immersively.
The Virtual Environment Operating Shell (VEOS) is a software suite which
provides extremely general support for immersive virtual environment
applications, at the expense of performance [Bric93][Coco93]. Entities in VEOS
are actually heavy-weight Unix threads which may be distributed and maintain a
consistent database through point-to-point communications. From the standpoint
of separability this approach was exemplary, but was clearly neither efficient nor
The Mercury participant system was developed to overcome some of the
disadvantages of VEOS, by tightening the loop between the user's sensors and
displays [Mink93]. While this improved visual performance, the communication
model remained impoverished.
Prolix was developed as a MUD-like text-based user interface to VEOS
virtual environments [Taka93]. Users without the benefit of specialized resources
like HMDs or high-resolution graphical displays were able to participate in virtual
environments through a VT100 terminal. The textual actions of a Prolix user still
affected the database so that immersive users could perceive them as being a
part of the simulation. Textual representations of immersive users were presented
to the Prolix user as well. This does show that VEOS partially achieved the
recommendations from LucasFilm's Habitat, by decoupling representation from
As described in the chapter on Applications, SIMNET is a real-time
interactive battlefield simulation system. Objects broadcast changes in their states
to all other objects in the simulation, then the receiving objects decide what to do
with the data [Calv93]. While a vehicle in the simulation may be hundreds of miles
away from all other vehicles in the simulation, they still receive an update when
the trajectory of that vehicle changes. Dead reckoning is used to determine the
current position of objects in the simulation based on their last reported position
and velocity. The broadcast communication wastes CPU time on hosts which are
unable to perceive the event they are being notified of; broadcast communication
is not suitable for internetworks [Mace95a].
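
Dead reckoning, as described above, is a linear extrapolation from the last
reported state. The following is a minimal sketch under that assumption; it is
not the SIMNET implementation. The data layout and names are hypothetical, and
the error-threshold test for deciding when to send a new update is a common
companion technique that we assume here rather than take from the source.

```python
import math
from dataclasses import dataclass


@dataclass
class StateUpdate:
    time: float        # seconds, when the state was reported
    position: tuple    # (x, y, z) at that time
    velocity: tuple    # (vx, vy, vz) at that time


def dead_reckon(update: StateUpdate, now: float) -> tuple:
    """Estimate the current position from the last reported state."""
    dt = now - update.time
    return tuple(p + v * dt for p, v in zip(update.position, update.velocity))


def needs_update(true_position: tuple, last_sent: StateUpdate,
                 now: float, threshold: float) -> bool:
    """Assumed sender-side rule: report again only when the receivers'
    extrapolation would be off by more than the threshold distance."""
    predicted = dead_reckon(last_sent, now)
    return math.dist(true_position, predicted) > threshold
```

A vehicle moving in a straight line at constant speed therefore generates no
further traffic after its first report; receivers keep extrapolating until its
trajectory changes.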
The DIS application protocol has been derived as a standard from the
SIMNET protocols and is based on the transmission of Protocol Data Units
(PDU). All data sent between applications, including audio and video streams,
must be decomposed into PDUs before being transmitted. This requires that
everyone in the simulation examine each PDU to determine if it is relevant to
them, then disregard it or begin to compose it into its original form. Due to these
restrictions, SIMNET and strictly DIS compliant simulation systems are not
expected to be able to scale beyond 1000 users [Mace95a].
NPSNET was previously described in the Applications chapter to be a DIS
compliant, multi-user military simulation system for use over the Internet. Unlike
SIMNET, NPSNET uses a multicast, instead of a broadcast, network
communication paradigm. A software area of interest manager (AOIM) partitions
the virtual environment into a collection of small scale environments and uses
spatial, temporal, and functional classes to determine membership in multicast
groups [Mace95b]. Multicasting removes the burden of determining if each PDU is
relevant to an entity by filtering most irrelevant PDUs at the network level.
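
One way such a spatial partitioning might map positions to multicast groups is
sketched below. This is an illustration only, not the NPSNET area of interest
manager: it uses a simple square grid, which is an assumption made for brevity
rather than the cell shape used by NPSNET, and the cell size, grid dimensions,
and address scheme are hypothetical.

```python
CELL_SIZE = 1000.0   # assumed cell edge length, in metres
GRID_WIDTH = 256     # assumed number of cells along each axis


def cell_of(x: float, y: float) -> tuple:
    """Return the (column, row) of the grid cell containing a position."""
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))


def group_for_cell(cell: tuple) -> str:
    """Map a cell to a unique (hypothetical) multicast group address."""
    col, row = cell
    return f"239.1.{row}.{col}"


def cells_of_interest(x: float, y: float, radius: int = 1) -> set:
    """Cells within `radius` cells of the entity, clipped to the grid."""
    col, row = cell_of(x, y)
    return {
        (c, r)
        for c in range(col - radius, col + radius + 1)
        for r in range(row - radius, row + radius + 1)
        if 0 <= c < GRID_WIDTH and 0 <= r < GRID_WIDTH
    }


def groups_to_join(x: float, y: float, radius: int = 1) -> set:
    """Multicast groups an entity at (x, y) would listen to."""
    return {group_for_cell(cell) for cell in cells_of_interest(x, y, radius)}
```

An entity would send its own updates to group_for_cell(cell_of(x, y)) and
listen on groups_to_join(x, y), so that traffic from distant cells never
reaches its host.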
For instance, the entire terrain database can be decomposed into an
interlocking set of cells with each cell having a unique multicast address
associated with it. As a vehicle moves through the terrain, the vehicle's host
sends state update messages to the multicast channel associated with the cell it
is currently in and listens for PDUs in the cell it is in as well as the adjacent
surrounding cells within some predefined radius of the vehicle. NPSNET uses
hexagonal cells, due to their uniform orientation and close approximation of a
This spatial multicast approach can significantly reduce the CPU
requirements of hosts participating in the simulation. Functional multicast groups
can also be joined to simulate radio broadcasts and other non-spatial
Distributed Interactive Virtual Environment (DIVE) [Carl93], developed in
Sweden, is an immersive multi-user virtual environment system which employs
replicated databases and point-to-point communication to maintain consistency.
Of course, this network communication approach has prevented DIVE from
scaling beyond a small number of users.
DIVE remains of interest because of its participant model and interpersonal
communication facilities. In this system each user has a 3D head icon which
follows the position and orientation of the user's position-tracked physical head.
The viewpoint of each user is displayed as eyes on the virtual head. Awareness of
other users is negotiated in real-time based on the COMIC spatial model of
interaction for large virtual environments [Benf93].
This spatial communication model introduces the following abstractions:
- medium, such as audio, visual, or text.
- aura, the sub-space which bounds your presence within a
given communication medium.
- focus, the more an object is within your focus, the more aware you are of it.
- nimbus, the more an object is within your nimbus, the more aware it is of you.
- adapter, an object which modifies your aura, focus, or nimbus.
For each medium your aura will have a potentially different size and shape.
When your aura collides with the aura of another user, communication in the
medium of that aura is negotiated through a quantifiable awareness level,
computed as a function of your nimbus and focus relative to theirs. Fortunately,
the user need not be explicitly aware of the complex calculations being made; as
the user moves through the environment, various cues will inform them of their
relative awareness of other users.
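
A rough sketch of how such an awareness level might be computed for a single
medium follows. It is not the COMIC or DIVE implementation: the spherical aura,
the linear falloff used for focus and nimbus, and the product used to combine
them are all assumptions chosen to keep the example short.

```python
import math
from dataclasses import dataclass


@dataclass
class Participant:
    name: str
    position: tuple        # (x, y, z)
    aura_radius: float     # bounds presence in this medium
    focus_radius: float    # how far this participant attends to others
    nimbus_radius: float   # how far this participant projects itself


def auras_collide(a: Participant, b: Participant) -> bool:
    """Two spherical auras collide when the spheres overlap."""
    return math.dist(a.position, b.position) <= a.aura_radius + b.aura_radius


def falloff(distance: float, radius: float) -> float:
    """Assumed linear falloff: 1.0 at the centre, 0.0 at the radius and beyond."""
    if radius <= 0:
        return 0.0
    return max(0.0, 1.0 - distance / radius)


def awareness(observer: Participant, observed: Participant) -> float:
    """Observer's awareness of observed: the observer's focus on the other
    combined with the other's nimbus on the observer."""
    if not auras_collide(observer, observed):
        return 0.0   # no aura collision, so nothing is negotiated
    d = math.dist(observer.position, observed.position)
    return falloff(d, observer.focus_radius) * falloff(d, observed.nimbus_radius)
```

Note that awareness is not symmetric: a participant with a large nimbus, such
as a speaker holding a microphone adapter, can be highly noticeable to others
while paying them little attention.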
The spatial model is made more extensible through the inclusion of adapter
objects [Benf93], such as a microphone tool which could increase the size of a
user's aura and nimbus for the audio medium.
Even with this elaborate communication model, DIVE is not scalable to a
large number of users because each user must still attempt to detect aura
collisions with all other users in the system, even if they are virtually much further
apart than the extent of their auras.
MASSIVE (Model, Architecture and System for Spatial Interaction in Virtual
Environments) is an immersive virtual space teleconferencing system that also
implements the COMIC spatial model [Gree95a]. MASSIVE provides access to
visual, auditory, and textual interfaces which can be used alone or in combination.
Like DIVE, MASSIVE uses a peer-to-peer communication model and therefore
does not scale well to a large number of users. Work is currently underway to
increase the number of potential users to greater than 100 with multicast network
3.18 MR Toolkit
The Minimal Reality (MR) Toolkit provides software support for multi-user
immersive virtual environments [Shaw93]. Internally, a client/server model is used
for communication between I/O devices and the application; point-to-point UDP
communication is used to maintain consistency between different users. This is
clearly not a scalable solution as the number of messages being sent over the
network will tend to increase as the square of the number of users. For this
reason, these researchers are considering switching to a multicast
The BrickNet toolkit provides support for multi-user immersive virtual
environments as well as the sharing of objects and dynamic object behaviors
[SinG95]. This allows for the creation of complex cooperative design and learning
applications. BrickNet is based on a client/server model with multiple servers each
serving multiple clients (much like the IRC network architecture).
The Waterloo Virtual Environment System (WAVES) also tries to overcome
the limitations of the client/server architecture by replacing the server with multiple
communicating message managers [Kazm93]. These message managers
determine which message managers or hosts should receive information received
from other sources.
Researchers at AT&T Bell Laboratories present yet another name for this
multiple communicating server approach. Here it is called RING [Funk95]. This
system supports interaction between large numbers of users in virtual
environments where there is "dense occlusion" (such as in buildings and large
cities). RING uses precomputed line-of-sight calculations to determine which
entities can perceive changes in a particular entity's state (such as position), then
only sends updates to the relevant entities.
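
The filtering step can be illustrated with a precomputed visibility table. The
sketch below is hypothetical and is not the RING implementation; the cell
names, the table contents, and the function name are assumptions.

```python
# Hypothetical precomputed table: for each cell, the cells visible from it.
VISIBLE_FROM = {
    "lobby": {"lobby", "hallway"},
    "hallway": {"lobby", "hallway", "office"},
    "office": {"hallway", "office"},
    "vault": {"vault"},
}


def recipients(moving_entity: str, entity_cells: dict) -> set:
    """Entities that can see the cell occupied by the moving entity.

    entity_cells maps each entity name to the cell it currently occupies.
    """
    source_cell = entity_cells[moving_entity]
    return {
        entity
        for entity, cell in entity_cells.items()
        # Forward the update only if the source cell is visible from the
        # cell the potential receiver is standing in.
        if entity != moving_entity
        and source_cell in VISIBLE_FROM.get(cell, set())
    }
```

Because the table is computed ahead of time, a server only performs set
lookups at run time, and an entity in a windowless room receives no updates
about anything outside it.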
While this approach can greatly reduce the number of irrelevant messages
being sent through the network, it only applies to visual, not auditory,
communication and increases the latency of message passing due to the multiple