4.1 Client / Server Architecture
To overcome the problem of using different input devices in a VRML world, a client / server architecture is used. To explain how this solves the problem, I must first describe how a VRML application works. When you visit a Web site on the Internet, you begin by downloading an HTML page to your browser (such as Netscape). If you choose a VRML link, the VRML file is downloaded to your browser and a VRML plug-in (such as Cosmo Player) is started to display it. If the VRML file is interactive, an associated Java program is downloaded as well. The downloaded Java program runs as an applet on the local machine. The VRML file (displayed with the plug-in) and the Java applet run in tandem to provide an interactive experience. The VRML displays the virtual world's objects, and the Java applet controls the objects' properties and provides a mechanism for interaction. Once the files have been downloaded, the VRML and Java run together on the user's local machine (see Figure 1 below).

Figure 1: Running a VRML and Java Application
A VRML world can also be assembled from two or more separate worlds that are combined to form the final world. In this case each individual world is downloaded, and the worlds then run together on the user's local machine (see Figure 2 below).

Figure 2: Running Multiple VRML and Java Applications
In addition to downloading a VRML and Java application and running it on the local machine, the application can also continue to receive data from the server. For example, in the case of multiple users sharing a virtual world, there needs to be a central point (or server) which transmits world changes to all of the clients. In this case all clients would download a copy of the VRML and Java application and run it locally, but would continue to receive update information from the server (see Figure 3 below).

Figure 3: Running Application while Receiving Update Information
The VRML and Java application receives these server updates via a socket. A socket is a software mechanism for connecting processes over a network. It establishes a communications path such that distributed systems can exchange data. Java has extensive support for sockets. What is important about sockets is that the point of origin of the socket data does not matter. There is no mechanism that protects an applet from receiving socket data from any machine. This includes the same machine that the applet is running on. Because of this, one can overcome the problem of using input devices with VRML and Java applications.
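The receiving side of such a socket connection can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name `UpdateClient`, its methods, and the newline-terminated ASCII message format are hypothetical and are not taken from the implementation described in this thesis.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client that reads newline-terminated update messages. */
class UpdateClient {
    /** Reads up to maxLines messages from the stream (fewer if the stream ends first). */
    static List<String> readUpdates(InputStream in, int maxLines) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII));
        List<String> updates = new ArrayList<>();
        String line;
        while (updates.size() < maxLines && (line = reader.readLine()) != null) {
            updates.add(line);
        }
        return updates;
    }

    /** Connects to host:port and reads messages; the code is the same for any host. */
    static List<String> readFromServer(String host, int port, int maxLines) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            return readUpdates(socket.getInputStream(), maxLines);
        }
    }
}
```

Note that `readFromServer` is identical whether `host` names a remote machine or the local one (for example `"127.0.0.1"`); the reading code has no knowledge of where the data originated.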
In order to provide a device interface to a VRML application, one needs to overcome the security restrictions of the applet. The solution is to "fake out" the applet. The user still downloads a copy of the VRML world as usual, together with the required Java application. Once the application (or applet) starts, it connects to the device server. This server sends device information over a socket to the client so that the user can use new devices to interact in the VRML world. But in order to use devices that are connected to the same machine on which the applet is running, the device server must run on the same machine as the client. Thus, the solution is to build a client / server architecture that runs on a single machine (see Figure 4 below). (Note: Running a server on the local machine does have an impact on the system performance. See Section 6.4 for a discussion of system tests and test data.)

Figure 4: Running Application Connected to Device Server
Although it may seem that this is a design flaw of Java applets, it is not. The applet continues to provide the same protection to the system. What is important to remember is that the reason for secure applets is to stop programs that are downloaded over the Internet from accessing the local machine. However, when you receive data over a socket, the data is also isolated from the machine's components (since the program is isolated from the machine). In the case of creating a server and sending data to an applet on the same machine, there are two separate applications that need to be run: the browser and the server. The browser runs the applet, and the server reads the devices. The fact that the server transmits data to the applet on the same machine is not a security violation as the server cannot start without the user’s intervention (i.e. the applet cannot spawn the server).
In addition to the above application (see Figure 4) one can also run the same configuration while connected in a multiple user world. In this case, the VRML application would be connected to both the World Server and the Input Device Server (IDS). In the case of a multiple user world, each user that would want to use non-traditional input devices would need to run the IDS on their local machine.
This solution provides a clean implementation that could have uses beyond just the VRML community (see section on Future Work).
I will now address the design constraints of building a client / server architecture.
4.2 Client / Server Design Constraints
When looking at a client / server architecture there are a number of factors that need to be considered.
4.2.1 Bandwidth
Bandwidth refers to the amount of information that is transmitted from server to client. This will vary greatly from application to application. In the case of HTML pages, the quantity is relatively low as only ASCII characters need to be transmitted. Also, the quantity of text tends to be low on most web pages. But if the HTML pages contain many images then the bandwidth required becomes high, as images contain a lot of data. Thus, in the same application (e.g. web page viewing), the bandwidth can be high or low depending on the use.
It is generally desirable to keep the bandwidth as low as possible. Communication is usually the bottleneck in distributed systems. Since reducing the amount of data transmitted directly improves the responsiveness of the system, it must be kept as a high priority in the design stage. One of the ways to reduce the bandwidth of the IDS (Input Device Server) is to process as much of the data as possible on the server side. If the data is processed on the client side, then all device data must be sent to the client. Alternatively, if the data is processed on the server side, only the essential information is transmitted to the client.
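A rough calculation illustrates the difference. The figures used here are assumptions chosen only for illustration (11 values per sample, 4 bytes per value, 60 samples per second, and a 16-byte command sent at most 10 times per second); they are not measurements from the IDS.

```java
/** Back-of-the-envelope comparison of client-side vs server-side processing. */
class BandwidthEstimate {
    /** Bytes per second when every raw device sample is forwarded to the client. */
    static long rawBytesPerSecond(int valuesPerSample, int bytesPerValue, int samplesPerSecond) {
        return (long) valuesPerSample * bytesPerValue * samplesPerSecond;
    }

    /** Bytes per second when only interpreted commands are forwarded to the client. */
    static long commandBytesPerSecond(int bytesPerCommand, int commandsPerSecond) {
        return (long) bytesPerCommand * commandsPerSecond;
    }

    public static void main(String[] args) {
        // Assumed: 5 glove sensors + 6 tracker values, 4 bytes each, 60 samples per second.
        long raw = rawBytesPerSecond(11, 4, 60);
        // Assumed: a 16-byte text command, sent at most 10 times per second.
        long processed = commandBytesPerSecond(16, 10);
        System.out.println("raw: " + raw + " B/s, processed: " + processed + " B/s");
    }
}
```

Under these assumptions the raw stream requires 2640 bytes per second, while the processed stream requires 160 bytes per second.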
4.2.2 Client Interface
The client interface refers to the data format that the client expects. In the case of a simple HTML browser, the format expected is Hyper Text Markup Language (HTML). Not only does the browser expect this format, but it also requires that it be sent as ASCII text. This interface completely dictates the design of the client. The server in this case is not greatly affected by the interface, as it is only responsible for sending files to the client. However, the design of the software used to create the HTML files is completely dictated by the HTML interface format. Therefore, much care must be taken to ensure that the interface changes as little as possible, since small changes in the interface could require many design changes on both the client and the server.
The IDS developed in this thesis is an attempt to create a simple interface (for a complete explanation refer to The Application section). As discussed later, the interface is set up such that the exact device data is not required. Instead, only the meaning of the device data is required. For example, instead of saying that "the information is from a glove and here is the data", the information transmitted would say "move forward" (for a complete description of the interface refer to Table 2: Grammar for Messages).
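The idea of transmitting meaning rather than device data can be sketched as a small set of task-level commands. The command names and the text encoding here are hypothetical; the actual message grammar is the one given in Table 2.

```java
import java.util.Locale;

/** Hypothetical task-level commands; not the grammar defined in Table 2. */
enum NavCommand {
    MOVE_FORWARD, MOVE_BACKWARD, TURN_LEFT, TURN_RIGHT;

    /** Text form sent to the client, e.g. MOVE_FORWARD becomes "move forward". */
    String toMessage() {
        return name().toLowerCase(Locale.ROOT).replace('_', ' ');
    }

    /** Parses a received message; throws IllegalArgumentException if it is unknown. */
    static NavCommand fromMessage(String message) {
        return valueOf(message.trim().toUpperCase(Locale.ROOT).replace(' ', '_'));
    }
}
```

Nothing in such a message identifies the device that produced it, so a glove, a tracker, or a keyboard could all generate the same command.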
4.2.3 Robustness
Robustness refers to the ability of the system to adapt easily to new situations or applications. Netscape has provided an open architecture that facilitates the creation of new plug-ins. These plug-ins provide a mechanism for supporting a wide variety of applications and data viewers.
It is important in the design of the IDS to create a robust system capable of handling new applications with low overhead. The architecture that is explained later will address this issue.
4.2.4 Scalability
Scalability refers to the ability of a system to grow. The Internet, for example, is a network that is very scalable. At the moment millions of computers are networked together and can "talk" to one another.
Scalability for the IDS refers to the ease of adding new devices to the system with little additional work. As will be seen, the architecture proposed provides a system in which the addition of devices is a trivial matter.
4.2.5 Client / Server Loading
Client / server loading refers to the amount of work that is allocated to the server or the client. It is a design consideration that encompasses the design considerations discussed so far (bandwidth, client interface, robustness, and scalability). In a client-weighted system, the majority of the "work" is performed on the client side. Typically in a client-weighted system, raw data is sent from the server to the client in a variety of different formats. The client is responsible for identifying the source of the data and reformatting the data into an acceptable internal format. This method allows for a highly flexible system where each client can completely customize its environment. The client is considered to be "smart". The Netscape Navigator web browser is an example of a client-weighted system. In this case the server is simply responsible for sending out data in HTML, VRML, Java, Shockwave, and other formats, and the client (Netscape browser) is responsible for knowing what to do with this data. This has led to the growing number of "plug-ins". These software add-ons are used to interpret the variety of incoming data formats and convert them to a viewable or audible format.
In a server-weighted system, the majority of the work is performed on the server side. The server is responsible for formatting the data in a consistent manner such that the client can understand. The client will support a standard interface that the server must abide by. The client will not be able to process any data that does not come in the required format. The server in this system is considered to be "smart". An example of a server-weighted system is an X terminal window. The Xterm can only process information that is presented to it in the required format. If a user wanted to implement a menuing system using an Xterm, such as with the Emacs text editor, the menu would need to be "faked" by being constructed on the server side and sent to the client as if it was a regular Xterm console.
In the case of the Input Device Server (IDS) there are definite advantages and disadvantages to having either a client weighted or a server weighted system.
Table 1: IDS Characteristic for Client / Server Weighted System
| Client Weighted System | Server Weighted System |
| Increased bandwidth (negative) | Decreased bandwidth (positive) |
| Increased flexibility (positive) | Decreased flexibility (negative) |
| Increased dependence (negative) | Decreased dependence (positive) |
| Scaling is difficult (negative) | Scaling is easy (positive) |
| Less robust (negative) | More robust (positive) |
From the table above, one can see that the main advantage of a client-weighted system is that it is very flexible. This may be attractive for an application such as viewing HTML pages, but it is not of great importance to the IDS. HTML pages can contain a wide variety of data, and the browser has been designed to accept plug-ins that allow the client to view data that is not in the standard HTML format. As mentioned previously, VRML and Shockwave are non-standard formats that require a plug-in for viewing. This "open standard" was provided because different Web hosts may want to present data to their clients in unique formats. The IDS, by contrast, runs locally on the client machine. There is no reason to provide a client-weighted system with client plug-ins, as the server is on the client's machine. Instead one can develop a more robust system by providing a scalable server that offers an equivalent plug-in mechanism at the server (see Future Work for server plug-ins). This isolates the application from the details of the input devices. Although this places a larger amount of work on the server, it is of no consequence: both run on the same machine, so in a client-weighted system the same work would simply have gone to the client instead.
Overall, the server-weighted system provides the best solution. It isolates the application from the input devices and provides a robust and scalable architecture. With a server-weighted system, the addition of new input devices becomes trivial.
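One way to picture why adding a device is trivial in a server-weighted design is a common device abstraction on the server. The following is a minimal sketch, not the architecture of the IDS itself; the names `InputDevice` and `DeviceRegistry` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical server-side abstraction: a device turns its own raw data into a message. */
interface InputDevice {
    /** Returns the next message for the client, or empty if there is nothing to report. */
    Optional<String> poll();
}

/** Collects messages from every registered device; the client never sees device details. */
class DeviceRegistry {
    private final List<InputDevice> devices = new ArrayList<>();

    /** Adding a new device only requires registering another implementation. */
    void register(InputDevice device) {
        devices.add(device);
    }

    /** Polls each device once and returns the messages in registration order. */
    List<String> pollAll() {
        List<String> messages = new ArrayList<>();
        for (InputDevice device : devices) {
            device.poll().ifPresent(messages::add);
        }
        return messages;
    }
}
```

In this sketch, supporting a new device means writing one more `InputDevice` implementation and registering it; neither the client nor the application changes.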
There are several key components to the overall system: the physical devices, the device interface, the mapping of the device data, the server, the client, the connection between the server and the client, and the application. Each of these components is described in further detail in the following sections. Figure 5 below shows the overall system architecture.

Figure 5: System Architecture Main Blocks
4.3.1 Physical Devices
The physical devices include the actual physical device up to the connector. That is, if the device, such as a magnetic tracker, contains its own internal software that runs outside of the computer, then we include that as part of the physical device. In essence it is anything up to the communications port of the computer. [Note: In this thesis I interfaced a glove (5DT Glove; see list of manufacturers in Appendix D), a magnetic tracking device (Polhemus Fastrak), and a keyboard. Voice recognition was emulated using a dialog box, so there was no physical device. In an actual implementation the physical device would be the microphone.]
4.3.2 Device Interface and Data Mapping
There are several steps to go from raw data at the communications port to useful data that can be sent to the client in a standardized format. It may be more helpful to visualize these steps as a set of layers; similar to the OSI (Open Systems Interconnect) model used in network theory. These layers are processed from bottom to top. As shown in Figure 6, the lowest layer is the physical device, while the highest layer is formatted data for the client.

The first step is to get the data from the serial ports. This is what I refer to as the Device Driver Layer in Figure 6. The device driver software is responsible for sending the correct set of commands to reset, connect, and read data from the physical device. A device driver is typically included with the purchase of any hardware device.
The next step is the device interface layer. At this level the raw data generated by the device driver is converted to something more meaningful. "Meaningful" is a term that depends greatly on the task. In the case of this thesis a number of different operations were performed on the data. In all cases I was using the device data either to navigate in a virtual world or to control the attributes of different objects within the virtual world. In the case of navigation, the data from the device driver was converted to movement information. Here, I generated speed and direction information from the sensors. In the case of object positioning with the tracker, I did nothing. The data from the device driver exactly matched what was needed to place an object at a specific location in the world. In the case of the glove, gesture information was extracted from the glove data. This involved software that took the raw data from the device driver and matched it against previously recorded hand gestures. In this case the useful information is the recognized hand gesture, one of a set of gestures, which in turn controlled operations within the world.
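Matching glove data against previously recorded gestures could, for instance, be done with a nearest-template comparison. This sketch assumes Euclidean distance over finger-flexion values and a rejection threshold; it is one possible technique, not necessarily the matching method used in this thesis, and the class name `GestureMatcher` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of template-based gesture recognition using Euclidean distance. */
class GestureMatcher {
    private final Map<String, double[]> templates = new LinkedHashMap<>();
    private final double maxDistance;

    /** maxDistance is the largest distance at which a sample still counts as a match. */
    GestureMatcher(double maxDistance) {
        this.maxDistance = maxDistance;
    }

    /** Stores a previously recorded gesture as a vector of finger-flexion values. */
    void record(String name, double... flexion) {
        templates.put(name, flexion.clone());
    }

    /** Returns the closest recorded gesture, or null if none is within maxDistance. */
    String match(double... sample) {
        String best = null;
        double bestDistance = maxDistance;
        for (Map.Entry<String, double[]> entry : templates.entrySet()) {
            double d = distance(entry.getValue(), sample);
            if (d <= bestDistance) {
                best = entry.getKey();
                bestDistance = d;
            }
        }
        return best;
    }

    private static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("sample and template must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

The threshold keeps ambiguous hand shapes from triggering any operation, which matters when a gesture controls something in the world.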
The next step is to map the new device data to the format that is expected by the client. The client expects data in a particular format, and the data from the devices must be mapped to this format. As will be seen in the next section, all device data is mapped to either navigational data or object data. The advantage of mapping device data to a standardized format is that it allows different devices to control the application, while the client itself does not need to know how to map the device data. That is handled by the server. The cost is that the server, being responsible for mapping device data to the format required by the client application, must know how to map the data. As we will see in more detail in the Input Device Toolkit chapter, we are now working at a higher level of abstraction. We are not dealing with an interface that relies heavily on the specifics of the input device; instead, the interface relies heavily on the task.
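The two target categories, navigational data and object data, can be illustrated with a small mapping sketch. The message layouts (`NAV speed heading` and `OBJ name x y z`) and the two-axis input are assumptions made for this example; they are not the format defined by the thesis.

```java
import java.util.Locale;

/** Hypothetical mapping layer: each reading becomes navigation data or object data. */
class DataMapper {
    /** Maps a two-axis reading (x, y in [-1, 1]) to a speed and a heading in degrees. */
    static String toNavigation(double x, double y) {
        double speed = Math.min(1.0, Math.hypot(x, y));
        // Heading convention assumed here: 0 is straight ahead, positive is to the right.
        double heading = Math.toDegrees(Math.atan2(x, y));
        return String.format(Locale.ROOT, "NAV %.2f %.1f", speed, heading);
    }

    /** Passes a tracker position straight through as the position of a named object. */
    static String toObject(String objectName, double x, double y, double z) {
        return String.format(Locale.ROOT, "OBJ %s %.3f %.3f %.3f", objectName, x, y, z);
    }
}
```

For example, `DataMapper.toNavigation(0.0, 1.0)` yields `NAV 1.00 0.0`, and the client can act on that message without knowing which device produced the reading.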
4.3.3 The Server
The server is a software object that allows clients to connect to it. It typically manages a list of connected clients and sends the appropriate information to them. The server can do the following: open a connection to accept client connections, wait for a new client to connect, and send and receive client data. (Note: Refer to Appendix A for a detailed explanation of the server implementation.)
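These responsibilities can be sketched with the standard `java.net` classes. This is a simplified illustration, not the implementation described in Appendix A: the class name `DeviceServer` is hypothetical, only the sending direction is shown, and a real server would also remove disconnected clients and support a clean shutdown.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Minimal sketch of a server that accepts clients and sends each message to all of them. */
class DeviceServer {
    // Thread-safe list, since clients are added while messages are being sent.
    private final List<PrintWriter> clients = new CopyOnWriteArrayList<>();

    /** Registers the output side of a client connection. */
    void addClient(OutputStream out) {
        clients.add(new PrintWriter(out, true)); // true = flush on every println
    }

    /** Sends one line-terminated message to every client; returns the client count. */
    int broadcast(String message) {
        for (PrintWriter client : clients) {
            client.println(message);
        }
        return clients.size();
    }

    /** Listens on the loopback interface and registers each client as it connects. */
    void acceptLoop(int port) throws IOException {
        try (ServerSocket listener = new ServerSocket(port, 50, InetAddress.getLoopbackAddress())) {
            while (!Thread.currentThread().isInterrupted()) {
                Socket socket = listener.accept(); // blocks until a client connects
                addClient(socket.getOutputStream());
            }
        }
    }
}
```

Binding to the loopback address reflects the case where the server and the client run on the same machine; a server intended for remote clients would bind to a network interface instead.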
4.3.4 The Client
The client is a software object that connects to a server to receive data. The client can do the following: connect to a server, and receive and send server data. The client is what is connected to the end application. (Note: Refer to Appendix A for a detailed explanation of the client implementation.)
4.3.5 Connection between Server and Client
The connection between the server and client is the medium by which the information propagates. On a physical level, the connection can be an Ethernet wire, or via wireless methods (e.g. radio frequencies, infrared LANs, etc.). On a connection level, the connection can be via sockets.
4.3.6 Application
The application is the end user of the data. It is connected to the client and uses the data which it receives from the server, via the client.
4.4 Summary of Design Issues
In summary, my proposition is that a server-weighted architecture will provide a scalable and robust system that will facilitate the addition of new input devices and the creation of new interaction techniques.