3D, multi-user collaboration on the internet is currently enabled
through five component technologies: 3D world modeling technology,
communications technology, Web browser technology, server technology, and
client/server connection technology. Each plays its part in providing the
multi-user experience to each collaborator. The world model defines which
visual objects are part of the collaboration and defines each object’s
geometry, appearance, location, orientation, and scale. Communication
technology defines how collaborators communicate: via text, voice, or email
while participating in a multi-user world. Browser technology defines how each
collaborator interacts with the world model and how the other communications
technologies are integrated with the world model interaction. Server technology
coordinates the collaboration to make the collaboration process sensible to
each collaborator. Client/server communications technology enables the browser
to communicate with the server(s). This chapter provides a
component-by-component review of the current state of 3D world-based
collaboration over the internet.
The World Model
An author typically uses a 3D modeling tool to create the world model.
The model is created visually using a computer program with a special graphical
user interface that makes it easy to point and click on points in 3D space to
insert objects and define their geometry, appearance, location, orientation, and
scale. VRML models can be created quite effectively using the 3D Studio Max
modeling package from Kinetix or the
Alias modeling package from Alias|wavefront. Both these packages use four
simultaneous views to model objects: top view, front view, side view, and
perspective view. Figure 3.1 shows an example of a world modeling package. The
state of the art today in virtual world modeling allows world model developers
to attach object behaviors to the objects being modeled. Modeling tools are
used to create each collaborator’s personal representation, called an avatar,
which enables non-verbal communications within the multi-user world.
Figure 3.1 The Alias 3D Modeling Interface
Once created, the world model must be delivered to each participant so
that every collaborator sees the same images on his or her monitor while
collaborating.
The internet has improved world model delivery significantly as a
collaborator today only needs access to the internet in order to gain access to
the latest world model. VRML is a standard that defines a world model and
stores it in a simple ASCII or UTF-8 text-based file [12]. VRML files can be
delivered using HTTP Web server delivery strategies. Because of the World Wide
Web craze of the early 1990s, Web servers have become optimized for delivery of
HyperText Markup Language (HTML) documents over the internet. HTML documents
are simple text based documents which define how a Web page is viewed by a Web
user in a Web browser [13]. HTML documents can obtain components of a Web page
from other Web servers, allowing efficient use and reuse of text and graphical
information. Within the VRML standard, object geometry, appearance, location,
orientation, and scale are defined for each object in the virtual world. VRML
then puts the objects together to make a 3D scene. VRML uses the same Web
servers that HTML documents have used so successfully. No additional enhancements
are needed to deliver 3D models anywhere on the Web. And, VRML has the same
facility for obtaining world components from anywhere on the Web and including
them in the current 3D world that HTML has with text and 2D graphics.
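To make the idea of a text-based world file concrete, the following minimal Python sketch writes a one-object VRML 2 world as plain text. The helper names (vrml_box, vrml_world), the object name Table, and all numeric values are illustrative assumptions, not taken from any particular modeling tool; only the node and field names (Transform, Shape, Appearance, Material, Box) come from the VRML 2 standard.

```python
def fmt(values):
    """Format a tuple of numbers as a space-separated VRML field value."""
    return " ".join(f"{v:g}" for v in values)


def vrml_box(name, translation, rotation, scale, color, size):
    """Return VRML 2 text for one box-shaped object (hypothetical helper).

    rotation is (axis_x, axis_y, axis_z, angle_in_radians).
    """
    return (
        f"DEF {name} Transform {{\n"
        f"  translation {fmt(translation)}\n"   # location
        f"  rotation {fmt(rotation)}\n"         # orientation
        f"  scale {fmt(scale)}\n"               # scale
        f"  children [\n"
        f"    Shape {{\n"
        f"      appearance Appearance {{\n"
        f"        material Material {{ diffuseColor {fmt(color)} }}\n"
        f"      }}\n"
        f"      geometry Box {{ size {fmt(size)} }}\n"
        f"    }}\n"
        f"  ]\n"
        f"}}\n"
    )


def vrml_world(objects):
    """Prefix the objects with the header line every VRML 2 file starts with."""
    return "#VRML V2.0 utf8\n\n" + "\n".join(objects)


# Illustrative values only: a brown table top, twice as wide as it is deep.
world = vrml_world([
    vrml_box("Table", (0, 0.5, 0), (0, 1, 0, 0), (2, 1, 1),
             (0.6, 0.4, 0.2), (1, 1, 1)),
])
print(world)
```

Because the result is ordinary text, it can be saved with a .wrl extension and served by the same Web server that delivers HTML pages.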
VRML objects can also be stored on and loaded from a local hard drive
or CD-ROM drive. Other shared world model providers use other modeling
technologies and deliver the content locally before a collaborator begins to
use the technology. Video game technologies have traditionally required local
storage of the game world before a player connects to a shared experience. The
technology is downloaded using the internet or is purchased in a physical store
in diskette or CD-ROM format. VRML has been designed specifically for Web
server delivery which is more efficient for rapidly changing world content. The
collaborator always accesses the latest world model because she obtains it from
a Web server that only houses the latest world. Web server delivery continues
to evolve and it seems sure that other technologies will provide on-line,
server-based 3D world model delivery similar to VRML.
The world model author develops the model while considering its final
file size and its complexity. The file size is considered relative to expected
storage capacity and available Random Access Memory (RAM) of a user’s computer
as well as estimated download times for internet accessed files. The world
complexity is considered relative to the 3D multi-user client’s capability to
manage the complexity within a reasonable frame loop, which is explained in
the Browser section of this chapter.
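The download-time part of that estimate is simple arithmetic, sketched below in Python. The function name and the example file size are assumptions chosen for illustration, and the calculation ignores protocol overhead, compression, and line noise, so real transfers would likely take longer.

```python
def download_seconds(file_bytes, link_bits_per_second):
    """Idealized transfer time: file size in bits divided by link speed."""
    return file_bytes * 8 / link_bits_per_second


# Hypothetical 36,000-byte world file over the two modem speeds of the day.
for kbps in (14.4, 28.8):
    seconds = download_seconds(36_000, kbps * 1000)
    print(f"{kbps} kbps: about {seconds:.0f} s")
```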
Communications
As seen in Chapter 2, collaborators have been using different
technologies to enable communications during the collaborative process. These
communications technologies are all becoming available in 3D shared multi-user
worlds. As to their coordination with the 3D world itself, many voice, text, or
email implementations are best considered as separate, parallel computing
processes. Telephony, voice-enabled or text-based chat, and email can be used
in separate windows or frames in order for collaborators to discuss the 3D
world they are sharing. Also, collaborators can choose to use separate tools
and share information through a separate communications server that does not
rely on any coordination with the shared world.
There is some benefit to integrating the communications within the 3D
world. If a user is represented by an avatar modeled as a 3D object, others are
aware of his location in 3D space. If the chat environment is aware of the
user’s location, it can take that information into account to modify
communication messages. An obvious example of this is the voice-enabled chat
environment provided in On-Live! Technologies Inc.’s virtual world technology.
In that environment, the volume of a collaborator’s voice is dependent on his
or her distance from another collaborator. This set-up is a natural interface
which provides easy one-on-one communications. Two collaborators need only find
a location away from other users and they will only hear each other. If six
users stand in a circle, they all hear each other at the same time with similar
volume.
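A distance-dependent voice volume of this kind could be sketched as follows. The source does not describe the attenuation curve actually used, so the linear falloff, the function name, and both radius values here are assumptions made purely for illustration.

```python
import math


def voice_gain(listener, speaker, full_volume_radius=2.0, silence_radius=20.0):
    """Return a playback gain between 0.0 and 1.0 based on 3D distance.

    Assumed model: full volume inside full_volume_radius, silence beyond
    silence_radius, and a straight-line falloff in between.
    """
    distance = math.dist(listener, speaker)
    if distance <= full_volume_radius:
        return 1.0
    if distance >= silence_radius:
        return 0.0
    return (silence_radius - distance) / (silence_radius - full_volume_radius)


# Two collaborators standing close together hear each other at full volume,
# while a third who is far away is not heard at all.
print(voice_gain((0, 0, 0), (1, 0, 0)))    # close neighbor
print(voice_gain((0, 0, 0), (11, 0, 0)))   # halfway through the falloff
print(voice_gain((0, 0, 0), (30, 0, 0)))   # out of earshot
```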
Even text-based communications can take advantage of the 3D locations
of the collaborators. Distance from others can dictate which chat channels a
user is privy to. A new visitor on the scene can quickly determine who is
available for a chat session based on their location in 3D space. For example,
blaxxun interactive’s Passport client provides a beam feature that allows one
collaborator to quickly move to a location in the world that is directly in
front of another collaborator identified by name. His or her avatar is also automatically oriented to face toward
the other collaborator. At that point, it is clear that a one-on-one chat
session is being pursued and the collaborators can negotiate to establish it.
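The geometry behind such a beam feature can be sketched in a few lines of Python. This is not Passport's implementation; the function name, the ground-plane coordinate convention (a heading of 0 looks along +x), and the gap distance are all assumptions.

```python
import math


def beam_to(target_position, target_heading, gap=1.5):
    """Return (position, heading) placing an avatar face to face with a target.

    Assumed convention: positions are (x, z) on the ground plane and a heading
    is an angle in radians whose forward direction is (cos h, sin h).
    """
    tx, tz = target_position
    forward_x, forward_z = math.cos(target_heading), math.sin(target_heading)
    # Stand 'gap' units directly in front of the target...
    position = (tx + gap * forward_x, tz + gap * forward_z)
    # ...and turn around to look back at the target.
    heading = (target_heading + math.pi) % (2 * math.pi)
    return position, heading


position, heading = beam_to((0.0, 0.0), 0.0, gap=2.0)
print(position, heading)
```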
For integrated communications scalability, the communication process can run on
a separate server or as another process on the same server.
The collaborator communications decision is a difficult one for a 3D
multi-user developer. The more
communication bits coming over the internet to a collaborator, the more
bandwidth he or she needs to be able to manage them in real time. To develop
for a 14.4 or 28.8 kbps modem connection, trade-off decisions have to be made.
For example, On-Live! Technologies Inc.’s Traveler viewer does not provide a
body for a collaborator’s avatar since the mouth animation bits and voice
bits occupy a significant portion of the bit budget supported on a slow
internet connection with limited bandwidth.
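The kind of bit-budget trade-off described above can be sketched as a simple priority allocation. Every stream name and bit rate below is a made-up illustration, not a measurement of Traveler or any other product; only the 14.4 kbps link speed comes from the text.

```python
def allocate(link_bps, streams):
    """Grant bandwidth to streams in priority order until the budget runs out.

    streams is a list of (name, requested_bps), highest priority first.
    A stream that does not fit in what is left is skipped entirely.
    """
    granted, remaining = {}, link_bps
    for name, requested_bps in streams:
        if requested_bps <= remaining:
            granted[name] = requested_bps
            remaining -= requested_bps
    return granted, remaining


# Hypothetical per-second bit costs on a 14.4 kbps modem.
granted, left = allocate(14_400, [
    ("voice", 8_000),
    ("mouth_animation", 2_000),
    ("body_animation", 6_000),    # does not fit once voice is paid for
    ("position_updates", 1_500),
])
print(granted, left)
```

With these assumed numbers the body animation is the stream that gets cut, which mirrors the design choice of shipping an avatar without a body.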
The Browser
Most clients used by 3D world collaborators use two separate
applications. General Web navigation is provided by a Web browser originally
designed for HTML document presentation. 3D model specific navigation is
provided by another application that communicates with the Web browser through
a defined specification, with Netscape Communications Corporation’s plug-in
Application Programming Interface (API) being the most popular. Almost all
popular VRML viewers, for example, connect to the popular Netscape
Communications Corporation’s Navigator and Microsoft Corporation’s Internet
Explorer Web browsers through the plug-in technology originally provided by
Netscape Communications Corporation. The Web browser communicates with Web
servers to request and send information on the VRML viewer’s behalf. The VRML
viewer then uses the data received to create the 3D world seen by the user.
Yet, stand-alone 3D VRML viewers do exist and provide refreshing, creative
clients. OZ Interactive Inc.’s OZ Virtual browser is an example of a VRML
browser that handles basic Web navigation without the help of another
application.
The client is responsible for parsing the world model as it is
delivered from a Web server, determining a beginning viewpoint, rendering the
scene based on that viewpoint, and then maintaining changes that occur based on
the user’s interaction with the scene or server-based messages sent to the
client. Incoming bits represent incoming text or voice communications, avatar
location changes, behaviors in the shared world initiated by any collaborator’s
actions or timers embedded in the world. The local collaborator’s viewpoint is
managed locally within his or her client. Outgoing bits include changes to the
local avatar position, behaviors activated by the local collaborator, and text
or voice messages sent for communication with other collaborators.
Most of the obvious differences between browsers are a result of different
choices in the look and feel of the user interface. This chapter will take a
look at the
basic capabilities of most 3D multi-user enabled browsers. blaxxun
interactive’s Passport [14] (Figure 4.1), On-Live! Technologies Inc.’s Traveler
[15] (Figure 4.2), Sony Corporation’s CyberPassage [16] (Figure 4.3), and OZ
Interactive Inc.’s OZ Virtual [17] (Figure 4.4) are all popular 3D multi-user
world browsers. New versions appear approximately every month and, as a result,
make any description of their capabilities outdated soon after putting the
words on paper. The following comments are as of February 1997.
Figure 4.1 The Passport Viewer (courtesy of blaxxun interactive)
Figure 4.2 The Traveler Viewer (courtesy of OnLive! Technologies Inc.)
Figure 4.3 The CyberPassage Viewer (courtesy of Sony Corporation)
Figure 4.4 The OZ Virtual Viewer (courtesy of OZ Interactive Inc.)
All four browsers handle voice and/or text chat simultaneously while
providing a shared world model to each connected user. Each provides an avatar
which can be seen by others as a representation of each user. All are working
to incrementally incorporate the VRML 2 standard. The VRML 2 standard provides
object behaviors such as changes in location, orientation, and scale; changes in
color and lighting; and appearance and disappearance. The user has control of a
navigation mode such as walk, fly, or
examine mode, the ability to turn on or off a default headlight attached to his
or her virtual head, the ability to bookmark an exact location and orientation
in a world, and the ability to enforce collision with other objects or disable
it. The user moves around in the world using the arrow keys on the keyboard or
by way of mouse movement within the world itself or relative to a control panel
provided within the interface.
These clients provide each collaborator the ability to choose an
avatar from an avatar collection. Some allow each avatar child object (such as
a hat, shirt, or shoes) to be changed or colored separately, and some allow
the user to provide his or her own avatar following some guidelines and a
VRML 1 or 2 design.
The client is carefully engineered such that a certain minimum frame
rate is maintained if the minimum recommended CPU, RAM, video board, and
internet connectivity technologies are used by a collaborator. The frame rate,
or number of times per second the world is re-rendered to the screen, is a
critical success factor for most users. Internally, the client makes constant
tradeoffs between available changes provided by all incoming bits, user
movements, and mouse clicks. The higher the frame rate, the better the
experience. In an ideal situation, all state changes are easily handled by the
client frame loop with time left over. Then, the client can increase the frame
rate above the minimum rate used internally, perform some other function, or
just wait for the next frame loop to begin. When all state changes can’t be
handled within the minimum frame rate loop, the client can throw out some of
the changes or queue them for later processing. Each client developer creatively
programs these trade-off decisions, which become more important as the
world complexity, the number of collaborators, and the number of active
behaviors increase.
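One possible shape for such a frame loop is sketched below in Python. It is a simplification under stated assumptions: each pending change carries a hypothetical cost in milliseconds and a flag saying whether it may be thrown out, and a change that fits is applied even if an earlier, more expensive one had to wait. Real clients would measure elapsed time rather than use fixed costs.

```python
from collections import deque


def run_frame(pending, frame_budget_ms, render_cost_ms):
    """Process queued state changes for one frame.

    pending holds (name, cost_ms, droppable) tuples. Changes that fit in the
    time left after rendering are applied; the rest are either thrown out
    (if droppable) or left in pending for a later frame.
    """
    time_left = frame_budget_ms - render_cost_ms
    applied, dropped, deferred = [], [], deque()
    while pending:
        name, cost_ms, droppable = pending.popleft()
        if cost_ms <= time_left:
            time_left -= cost_ms
            applied.append(name)
        elif droppable:
            dropped.append(name)
        else:
            deferred.append((name, cost_ms, droppable))
    pending.extend(deferred)  # queued for later processing
    return applied, dropped, time_left


# Hypothetical frame: a 100 ms budget (10 frames per second), 60 ms to render.
pending = deque([
    ("avatar_move", 15, True),
    ("chat_text", 10, False),
    ("door_behavior", 20, False),
    ("stale_avatar_move", 20, True),
])
applied, dropped, time_left = run_frame(pending, 100, 60)
print(applied, dropped, list(pending), time_left)
```

When time_left is positive after all changes are handled, the client has the slack described above and could raise its frame rate or idle until the next loop begins.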
The Server
In its most basic form, the server simply connects users together in
order to send changes from one collaborator’s world model to the other
collaborators’ world models. A collaborator connects to a server over the
internet, is delivered a current model of the world from the server or another
collaborator, and then interacts with the world by sending updates of his or
her actions and receiving updates from others’ actions.
In its most complicated form, the server can be transforming the world
itself and communicating its changes along with changes from other
collaborators. The server can contain logic that monitors each collaborator’s
actions and regulates them in any way. For example, blaxxun interactive’s
CyberHub server can mute a collaborator at another collaborator’s
request. Today, most of the server
capabilities are tied to a specific client technology. Connecting to a server
without a specific client technology makes little sense.
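The two ends of that range can be illustrated with a small in-memory Python sketch: a relay that forwards every update to the other collaborators, plus one piece of regulating logic in the form of a mute list. The class and method names are hypothetical and do not correspond to CyberHub or any other real server interface; network transport is left out entirely.

```python
class RelayServer:
    """In-memory stand-in for a multi-user server (illustrative only)."""

    def __init__(self):
        self.inboxes = {}  # collaborator name -> list of received updates
        self.muted = {}    # listener name -> set of senders the listener muted

    def connect(self, name):
        self.inboxes[name] = []
        self.muted[name] = set()

    def mute(self, listener, sender):
        """Stop delivering the sender's chat messages to the listener."""
        self.muted[listener].add(sender)

    def send(self, sender, kind, payload):
        """Forward an update from one collaborator to all the others."""
        for name, inbox in self.inboxes.items():
            if name == sender:
                continue
            if kind == "chat" and sender in self.muted[name]:
                continue  # regulating logic applied at the server
            inbox.append((sender, kind, payload))


server = RelayServer()
for name in ("ann", "bob", "cy"):
    server.connect(name)

server.mute("bob", "ann")
server.send("ann", "chat", "hello")            # reaches cy, not bob
server.send("ann", "move", (1.0, 0.0, 2.0))    # reaches both
print(server.inboxes)
```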
It is possible to architect a 3D shared world without a server if each
client knows how to communicate with all other peers. Such server-less
multi-user worlds, called distributed worlds, usually are enabled with a
broadcast or multicast communications environment. Greenspace is an example of
such a multi-user world delivery environment [18]. Greenspace clients are
connected over a dedicated network such that changes within one collaborator’s
world are communicated only to certain peers. Since the internet IP multicast
communications protocol is still in its infancy and internet broadcasting is a
tremendous waste of messaging, centralized servers have been used extensively
to both deliver 3D virtual worlds and maintain communications between users for
internet implementations.
Client/Server Communications
In a client/server architecture, each function point is strategically
placed at the server or on the client based on the ability and capacity of each
technology. Technologists take advantage of client/server architecture to split
the development effort among mutually-exclusive programming efforts. Client
specialists work to make the client more user friendly and capable. Server
specialists work to make the server more secure, fast, and capable. As long as
the client/server communications piece is defined ahead of time, each can work
on independent timelines because the latest server will work with the latest
client and vice versa. Client/server development strategies are seen wherever
internet-enabled multi-user worlds are being created.
Sony Corporation, Silicon Graphics, Inc., Chaco™, Intel Corporation,
blaxxun interactive, and Netscape Communications Corporation all focus on
either a multi-user client or server or both. The clients continue to improve
to add new features and become more efficient. The servers continue to become
more secure, fast and functional. And, often, the clients are upgraded monthly
while server releases appear every six months. The most troubling constant is
the latency brought on by an internet connection which requires developers to
respect a certain time lag between a server sending bits and a client receiving
them.
The design of the client/server communications piece is critical to the
technology’s success. The VRML community continues to extend VRML to handle
client/server communications. Yet, VRML viewer developers are creating client
application programming interfaces (APIs) that let the browser communicate with
a server written in any programming language. Both approaches show much promise
for dynamic multi-user virtual world development. The next two sections
contrast the two approaches to client/server communications.
Extending the VRML standard
The Living Worlds standard extends the VRML 2 standard through the
PROTO and EXTERNPROTO node keywords in order to add multi-user capabilities to
a VRML 2 scene [20]. The VRML 2 PROTO
node allows an author to encapsulate all characteristics and behaviors of a
VRML 2 object and make that prototype available to all other VRML scenes by way
of the Web. Another author can use the same PROTO node in his or her scene by
referring to it in an EXTERNPROTO node. The EXTERNPROTO node includes a field
that points to the original PROTO node on the Web. The Living Worlds standard
prototypes new nodes necessary for multi-user communication and object shared
behaviors and makes them a standard interface for multi-user server developers
to develop server software that is able to communicate with the multi-user
worlds loaded in each visitor’s Web browser.
Within the Living Worlds architecture, VRML 2 authors can use a
prototype node called Zone to include VRML objects in a multi-user area. The
Zone node is a grouping node which tells the world server where multi-user
behavior is to be enabled. VRML objects can be added and removed from the Zone
nodes on the fly using ROUTE statements which are an integral part of the VRML
2 standard. The most interesting nodes to add to the Zone group are
SharedObject nodes because a SharedObject demonstrates its behaviors to
everyone within view of the object. A good example of a SharedObject is a pair
of dice that can be rolled in a multi-user game. Those dice would be added to a
Zone that contained all the game pieces, game board, and game table. The game
table would be a simple VRML object with no shared behaviors. Any object that
is a child of a SharedObject node can at best only demonstrate behaviors to the
local collaborator that initiates them unless a local timer is provided to each
collaborator during initial world acquisition. These local timers can enact
behaviors in each collaborator’s world without the server.
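The difference between shared and non-shared objects in a Zone can be modeled in a few lines of Python. This is only a conceptual sketch, not Living Worlds syntax: the class names reuse the Zone and shared-object vocabulary from the text, but every method is hypothetical, and "everyone within view" is simplified to everyone connected to the zone.

```python
import random


class Client:
    def __init__(self, name):
        self.name = name
        self.seen = []  # (object, behavior) pairs this collaborator observed


class Zone:
    """Conceptual model of a multi-user area (illustrative only)."""

    def __init__(self, clients):
        self.clients = clients
        self.shared = set()  # names of objects acting as shared objects

    def add_shared(self, object_name):
        self.shared.add(object_name)

    def remove_shared(self, object_name):
        self.shared.discard(object_name)

    def trigger(self, initiator, object_name, behavior):
        # A shared object shows its behavior to everyone in the zone;
        # any other object shows it only to the collaborator who started it.
        audience = self.clients if object_name in self.shared else [initiator]
        for client in audience:
            client.seen.append((object_name, behavior))


ann, bob = Client("ann"), Client("bob")
zone = Zone([ann, bob])
zone.add_shared("dice")

rng = random.Random(7)
roll = (rng.randint(1, 6), rng.randint(1, 6))  # rolled once, shown to all
zone.trigger(ann, "dice", roll)
zone.trigger(ann, "table", "highlight")        # not shared: only ann sees it
print(ann.seen, bob.seen)
```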
A SharedObject can demonstrate the standard behaviors defined in VRML
2. If a multi-user server developer wants to create new technologies that can
be enabled in a Zone, the Living Worlds standard provides a PrivateZone node
which can contain a MuTechZone node and many PrivateSharedObjects nodes, each
of which can contain a MuTechSharedObject node. The word MuTech is short for
multi-user technician. These four nodes are all made into prototypes using VRML
2 syntax and as such only require a standard VRML 2 .wrl file to enable
multi-user world interaction on each client, unless, of course, the multi-user
technician requires additional executable files to reside at the client in
order to participate in their unique technology. Those files are downloaded
once over the internet from the server provider and then accessed by all
subsequent VRML 2 scenes.
The Living Worlds standard-setters realize that some of the features of
multi-user technology are better provided by the browser. Still, until the
browser developers make those features available, an alternative way of
providing rich multi-user experiences is provided through the Living Worlds
standard. A Living Worlds-like methodology could be provided to any internet
multi-user world syntax where the world model itself includes the logic to initiate
the server routines referenced. Such a methodology requires the server to be on
the more sophisticated end of possible server types (versus a simple message
pass-through server) until the world generation syntax matures. VRML is just
one standard world syntax, albeit the one getting the most publicity today.
Client based APIs
An alternative approach to providing multi-user capabilities on the
internet is to open up a world to other processes that call basic functions
available within the client [21]. Considering this approach, the VRML 2
standard becomes important only for its ability to define the world objects’
appearance, geometry, location, orientation, and scale and the ability to
change those parameters on the fly. Outside programs can determine the changes
to be made and then call the world functions available in the world viewer
client that then make the changes, including adding new objects and removing
existing objects.
The processes that request changes can be written in any programming
language and reside on the client or a server to which the client is connected.
This approach allows much flexibility for the software developer. The VRML 2
external interfaces usually work as follows:
1. The programmer creates variables in his or her program(s) that point to
nodes in the VRML 2 scene graph.
2. The programmer uses the variables in his or her program(s) to change variable
states based on programming logic or event processing (available events are
triggered by timers, mouse clicks, and proximity to objects).
3. Periodically, the programmer requests the VRML 2 scene graph to update its
state (and render the scene) based on the changes taking place within his or
her program(s).
In this case, each event is communicated to a server and sent to all
connected clients that can see or are otherwise affected by the event
processing. The server is written in an appropriate language and receives event
notifications from a client when a client validly changes its copy of the
world. With this architecture, the server can be simple or complex as behavior
generating processes can be placed at the server or the client. The server need
not even keep its own version of the world if it is designed using a simple
message pass-through architecture.
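The three steps of the external interface, together with the event notification sent to the server, can be sketched in Python as follows. The SceneGraph class is a stand-in written for this illustration and is not the interface of any actual VRML viewer; the node name Door, the rotation values, and the outbox list standing in for the server connection are likewise assumptions.

```python
class SceneGraph:
    """Hypothetical stand-in for the scene held by a viewer client."""

    def __init__(self, nodes):
        self.nodes = nodes   # node name -> dict of field values
        self.rendered = {}   # state most recently pushed to the screen

    def get_node(self, name):
        return self.nodes[name]

    def update(self):
        """Bring the rendered scene in line with the current node fields."""
        self.rendered = {name: dict(fields)
                         for name, fields in self.nodes.items()}


scene = SceneGraph({"Door": {"rotation": (0, 1, 0, 0.0)}})
outbox = []  # event notifications waiting to go to the server

# Step 1: a program variable that points to a node in the scene graph.
door = scene.get_node("Door")


# Step 2: event processing changes state through that variable.
def on_mouse_click():
    door["rotation"] = (0, 1, 0, 1.57)  # swing the door open
    outbox.append(("Door", "rotation", door["rotation"]))


on_mouse_click()
before_update = dict(scene.rendered)  # nothing rendered yet

# Step 3: periodically ask the scene graph to update and render.
scene.update()
print(before_update, scene.rendered, outbox)
```

A pass-through server would simply forward each entry in outbox to the other connected clients, which would apply it to their own copies of the scene.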
Considerations
The benefit of having a standard such as VRML 2 is that content authors
can create content which is viewable by an audience using all kinds of
different browsers. If the browser is standard compliant and the author’s work
is standard compliant, the content should work on the browser without ever
being specifically tested by the author on that browser. Many authors and
browser developers have subscribed to the VRML 2 standard in order to reap
these expected benefits.
The Living Worlds standard is an extension of the mentality of the VRML
2 standard creators. The Living Worlds task force is developing a standard
that encompasses how an author can identify shared behaviors and multi-user
ability within the VRML scene graph file itself. Then, using the standard, the
multi-user server developers can create a standard compliant server and
multi-user functionality will be assured by the world author without
specifically testing the world on each server. In the long run, if the standard
is written well, Living Worlds could easily obtain the success that VRML 2 is
currently enjoying. In fact, Living Worlds may become the cornerstone for VRML
3. Such a standard is useful if a united, connected, cyberspace is to be
provided by many different interests.
In the short run, though, using an external interface from the VRML 2
browser to other programs appears to be gaining a groundswell of interest. The
Java programming language is meeting the needs of other Web-based technologies
and is being used for creating object behaviors in a VRML scene [19]. The Web
is such a dynamic and ever-changing medium that developers are bound to keep
pushing the technology through their own creative client programs and server
technologies. An external interface in the VRML viewer client lets this rapid
development process happen. Multi-user 3D world technology can improve rapidly
because the five component technologies can be worked on separately by
specialists and an improvement in any one of the world model builder,
text/voice communications, Web browser, server, or client/server communications
technologies improves the whole process.