Prediction of Simulator Sickness
in a Virtual Environment


Complete Analyses for Phase I Regression Analysis

Recall the following abbreviations for the independent and dependent variables used in these analyses:

AGE: age of the individual

GENDER: gender of the individual; coded -1 for male and 1 for female

MRA: Mental Rotation Ability; assessed by score on Cube Comparison Test

PREPRO: mean of the two pre-exposure Prototype values
GENAGE: product of GENDER and AGE
GENMRA: product of GENDER and MRA
GENPRO: product of GENDER and PREPRO
AGEMRA: product of AGE and MRA
AGEPRO: product of AGE and PREPRO
MRAPRO: product of MRA and PREPRO
TOTAL: Total Severity score on SSQ
NAUS: score on Nausea subscale of SSQ
VIS: score on Oculomotor Discomfort subscale of SSQ
DIS: score on Disorientation subscale of SSQ
LNTOTAL: natural logarithm of (TOTAL+1)
LNNAUS: natural logarithm of (NAUS+1)
LNVIS: natural logarithm of (VIS+1)
LNDIS: natural logarithm of (DIS+1)
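
The derived regressors and transformed scores listed above are simple functions of the raw measurements. The following is a minimal Python sketch of that bookkeeping; the function names and the dictionary layout are illustrative assumptions and are not part of the original Minitab analysis.

```python
import math

def interaction_terms(age, gender, mra, prepro):
    """Return the six two-way products defined in the abbreviation list.

    gender is coded -1 for male and 1 for female, as in the text.
    """
    return {
        "GENAGE": gender * age,
        "GENMRA": gender * mra,
        "GENPRO": gender * prepro,
        "AGEMRA": age * mra,
        "AGEPRO": age * prepro,
        "MRAPRO": mra * prepro,
    }

def log_transform(score):
    """Natural logarithm of (score + 1), e.g. LNTOTAL from TOTAL."""
    return math.log(score + 1)

# Example: a 32-year-old male with MRA = 13, PREPRO = 2.0 and TOTAL = 0.00
terms = interaction_terms(age=32, gender=-1, mra=13, prepro=2.0)
lntotal = log_transform(0.00)
```

Adding 1 before taking the logarithm keeps the transform defined for a sickness score of zero, which then maps to zero.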

A significance level of α = .05 was used throughout except where noted. Only the four first-order variables and their six two-way interactions were considered in the analyses. Thus, there were ten possible regressors.

In order to meet the assumptions underlying the use of linear regression, the regression analyses were conducted using transformed sickness scores (e.g., LNTOTAL = ln[TOTAL+1]) rather than the actual SSQ scores (e.g., TOTAL). Furthermore, because of additional violations of the assumptions due to the categorical-like nature of the subscale scores, a formal regression analysis was conducted only for the transformed Total Severity scores and not for any of the transformed subscale scores. Finally, one observation was eliminated from the regression analysis because of the extreme AGE value of that individual. Thus, the regression analysis for LNTOTAL was conducted using only 39 observations.

The first step in the analysis was to produce scatter plots of LNTOTAL versus each of the ten regressors. Examination of these scatter plots revealed no clear relationships between LNTOTAL and any of the regressors although some relationships were suggested. Specifically, there appeared to be a slight negative relationship between LNTOTAL and MRAPRO and slight positive relationships between LNTOTAL and GENDER, PREPRO, GENAGE, GENMRA, GENPRO, and AGEPRO.

Pearson correlations among regressors were examined. Aside from correlations between variables which were functions of the same variable (e.g., GENAGE and GENMRA) and correlations in which one of the variables was a function of the other (e.g., GENDER and GENAGE), there were several significant correlations between regressors. GENDER was significantly correlated with both PREPRO (r = .3437, p = .032) and AGEPRO (r = .3416, p = .033) and PREPRO was significantly correlated with GENAGE (r = .3658, p = .022). In addition, two other correlations were significant at the α = .10 level: the correlation between PREPRO and GENMRA (r = .2711, p = .095) and the correlation between GENMRA and AGEPRO (r = .2781, p = .086). Because none of these correlations were especially strong, they were not expected to cause problems with multicollinearity in a regression model.
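
As an illustration of how such correlations are tested, the sketch below computes a Pearson r and the usual t statistic with n - 2 degrees of freedom. It is a sketch under stated assumptions: the function names are hypothetical, the sample size of 39 is assumed from the regression analysis, and the text does not confirm that Minitab's p-values were obtained from exactly this test.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for H0: rho = 0, referred to a t distribution with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Reported correlation between GENDER and PREPRO; n = 39 is an assumption
t_gender_prepro = t_statistic(0.3437, 39)
```

Under that assumption the GENDER and PREPRO correlation gives a t statistic of roughly 2.23 on 37 degrees of freedom.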

Correlations between LNTOTAL and each regressor were also examined. Significant correlations (at or near the α = .10 level) and their corresponding p-values are given in Table 5.

Table 5:
Significant Correlations Between LNTOTAL and the Regressors

LNTOTAL .2835 (.080) .2947 (.069) .2621 (.107) .3426 (.033)

Note that all of these slight positive relationships had been suggested by the scatter plots. Not all of the relationships suggested by the scatter plots, however, represented significant correlations. The four significant correlations suggested a direct relationship between GENDER and LNTOTAL. Relationships between LNTOTAL and AGE, MRA, and PREPRO, however, appeared to occur only through their interaction with GENDER.

The next step was to let Minitab try to select the best model using sequential variable selection procedures. Although the results of such model selection techniques should not be accepted as final, the results can be examined in conjunction with other more detailed techniques. Used in this capacity, they can be helpful in providing an additional view on the total picture and this is the way they were used in the analyses presented here. Stepwise, backward elimination, and forward selection methods were tried. The stepwise and forward selection procedures both stopped having selected the model of LNTOTAL on GENPRO. The backward elimination procedure stopped having selected the model of LNTOTAL on AGEMRA, AGEPRO, and MRAPRO.

The next step was to perform a best subsets regression. In this procedure, all possible regressions were computed for each subset of models containing one to ten variables. Note that the ten-variable model is the full model containing all of the regressors. For each model, the R², Adjusted R², Mallows' Cp, and s values were reported.
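
To give a sense of the size of this search, the sketch below enumerates every non-empty subset of the ten regressors. It only generates the candidate models and does not fit them, and the names REGRESSORS and all_subsets are illustrative, not Minitab's.

```python
from itertools import combinations

REGRESSORS = [
    "AGE", "GENDER", "MRA", "PREPRO",
    "GENAGE", "GENMRA", "GENPRO", "AGEMRA", "AGEPRO", "MRAPRO",
]

def all_subsets(regressors):
    """Yield every candidate model as a tuple of regressor names, smallest first."""
    for size in range(1, len(regressors) + 1):
        for subset in combinations(regressors, size):
            yield subset

subsets = list(all_subsets(REGRESSORS))
n_models = len(subsets)  # 2**10 - 1 candidate models
```

With ten regressors there are 1023 candidate models, of which 252 contain exactly five variables.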

R² is a measure of the proportion of variance explained by the model. As such, high values of R² are desirable. However, R² can be increased merely by adding more variables to the model. The Adjusted R² is R² adjusted for the degrees of freedom (Minitab, 1994) and does not necessarily increase as the number of variables in the model increases. The usual pattern is that, as variables are added to the model, Adjusted R² increases to some point. After this point, the Adjusted R² decreases, signaling, in effect, diminishing returns in explained variance associated with the addition of more variables to the model.

It is desired that Cp be small and close to p, where p is the number of coefficients (i.e., number of variables + 1 for models including an intercept). A value of Cp equal to p suggests that the model contains no estimated bias; whereas a value of Cp much larger than p suggests that the model is heavily biased (Myers, 1986).
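
For reference, the usual textbook form of Mallows' Cp for a candidate model with p coefficients is given below. The symbols SSE_p (error sum of squares of the candidate model), \hat{\sigma}^2 (mean square error of the full model), and n (number of observations) are introduced here for illustration; that Minitab's output follows exactly this convention is an assumption.

```latex
C_p = \frac{SSE_p}{\hat{\sigma}^2} - (n - 2p)
```

If the candidate model is unbiased, SSE_p is expected to be about (n - p) times \hat{\sigma}^2, which gives a Cp of about p.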

Finally, s is the square root of the mean square error (MSE); thus, small values of s are desirable. The Minitab manual (Minitab, 1994) notes that when comparing models with different numbers of regressors, choosing the model with the highest Adjusted R² is the same as choosing the model with the smallest MSE.

Based on these criteria, one model stood out from the rest. This was the model of LNTOTAL on AGE, GENMRA, AGEMRA, AGEPRO, and MRAPRO. This model contains the three variables found in the model selected by the backward elimination procedure. It also contains one of the variables which was found to correlate significantly with LNTOTAL. The Adjusted R² value peaked at 24.3% with this model. The R², Cp, and s values for the model were 34.3%, 2.5, and 1.1332, respectively.
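
The reported values are mutually consistent, which can be checked with the standard degrees-of-freedom adjustment. The sketch below assumes n = 39 observations and p = 6 coefficients (five regressors plus the intercept); the function name is illustrative.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared from R-squared (as a proportion).

    n is the number of observations and p the number of coefficients,
    including the intercept.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p)

adj = adjusted_r2(0.343, n=39, p=6)  # about 0.243, i.e. 24.3%
mse = 1.1332 ** 2                    # s squared, about 1.284
```

Squaring s recovers the mean square error of the model, since s is defined as the square root of the MSE.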

This model was then investigated further. The equation for this model is as follows:

LNTOTAL = 3.27 - 0.162 AGE + 0.0191 GENMRA + 0.00656 AGEMRA + 0.0277 AGEPRO - 0.0323 MRAPRO
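
The fitted equation can be evaluated directly for a given individual. The following sketch uses the rounded coefficients printed above, so its predictions are approximate; the function names are hypothetical, and the back-transform exp(LNTOTAL) - 1 is a naive inversion of the log transform rather than something the original analysis reports.

```python
import math

def predict_lntotal(age, gender, mra, prepro):
    """Fitted LNTOTAL from the reported equation (gender: -1 male, 1 female)."""
    return (3.27
            - 0.162 * age
            + 0.0191 * gender * mra      # GENMRA
            + 0.00656 * age * mra        # AGEMRA
            + 0.0277 * age * prepro      # AGEPRO
            - 0.0323 * mra * prepro)     # MRAPRO

def predict_total(age, gender, mra, prepro):
    """Naive back-transform of the fitted value to the TOTAL scale."""
    return math.exp(predict_lntotal(age, gender, mra, prepro)) - 1

# Example: a 22-year-old male with MRA = 38 and PREPRO = 5.0
fitted = predict_lntotal(age=22, gender=-1, mra=38, prepro=5.0)
```

For this example the fitted LNTOTAL is about 1.37, which corresponds to a TOTAL of roughly 3 on the original scale.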

Some statistics associated with this model are given in Table 6.

Table 6:
Statistics for Model of LNTOTAL

F Value   p-values for the coefficients   R²      MSE
          b_AGE: .069                     34.3%   1.284
          b_GENMRA: .048
          b_AGEMRA: .002
          b_AGEPRO: .003
          b_MRAPRO: .001

As can be seen, this model is significant. It explains 34.3% of the variance in LNTOTAL. All coefficients are significant or approach significance.

Variance Inflation Factor (VIF) values were computed for all regressors in the model. These values provide an indication of linear associations among regressors which might lead to multicollinearity problems. If any VIF value exceeds 10, Myers (1986) suggests that there may be cause for concern. The VIF values for the model ranged from 1.1 to 7.5, which does not suggest a problem with multicollinearity in the model.
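
A VIF is conventionally defined as 1/(1 - R_j²), where R_j² is the R² obtained by regressing the jth regressor on the remaining regressors. The sketch below converts between the two quantities to show what the reported range implies; the function names are illustrative and R_j² is a symbol introduced here.

```python
def vif_from_r2(r2_j):
    """Variance inflation factor for a regressor with auxiliary R-squared r2_j."""
    return 1.0 / (1.0 - r2_j)

def r2_from_vif(vif):
    """Auxiliary R-squared implied by a given variance inflation factor."""
    return 1.0 - 1.0 / vif

largest = r2_from_vif(7.5)   # about 0.87 for the largest reported VIF
cutoff = r2_from_vif(10.0)   # about 0.90 at the guideline value of 10
```

The largest VIF of 7.5 therefore means that about 87% of the variance of that regressor is explained by the other regressors, just under the 90% implied by the guideline of 10.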

As a check on the underlying normality and equal variance assumptions of the model, four diagnostic residual plots were produced: a normal probability plot of the residuals, an I chart of residuals, a histogram of residuals, and a scatter plot of the residuals versus the fitted values. These plots suggested no problems with the underlying assumptions of the linear regression model.

Finally, an analysis of residuals was performed to identify high-influence points. The residual diagnostics used were standardized residuals, h_ii values, Cook's D, and DFITS values. Standardized residuals are helpful in identifying data points which are extreme in their y value. The h_ii values - or HAT values - are the diagonal elements of the matrix X(X'X)^(-1)X' and are used to identify data points which are extreme in their x value(s). Cook's D is used to identify data points which have high influence on the estimated regression coefficients (the b's). Finally, the DFITS values represent the change in the fitted value, in standard deviation units, if the ith point is removed. As such, they represent a combination of diagnostics which forms a measure of how unusual an observation is (Minitab, 1994).
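
These diagnostics are related by standard textbook identities: both Cook's D and DFITS can be written in terms of the standardized residual and the HAT value. The sketch below implements those identities; the function names are hypothetical, and that Minitab computes the diagnostics in exactly this way is an assumption.

```python
import math

def cooks_d(std_resid, leverage, p):
    """Cook's D from the standardized residual and the HAT value h_ii.

    p is the number of coefficients in the model, including the intercept.
    """
    return (std_resid ** 2 / p) * leverage / (1 - leverage)

def dfits(std_resid, leverage, n, p):
    """DFITS from the standardized residual and the HAT value h_ii."""
    # Convert to the deleted (externally studentized) residual first
    t = std_resid * math.sqrt((n - p - 1) / (n - p - std_resid ** 2))
    return t * math.sqrt(leverage / (1 - leverage))

# Observation 28 in Table 7: standardized residual 1.96622, HAT value 0.254048
d_28 = cooks_d(1.96622, 0.254048, p=6)
dfits_28 = dfits(1.96622, 0.254048, n=39, p=6)
```

With n = 39 and p = 6, these identities give approximately 0.219 and 1.203 for observation 28, in agreement with the Cook's D and DFITS values listed for it in Table 7.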

There are no definitive criteria for declaring that a residual diagnostic value implies a high-influence point. Several guidelines, however, are suggested. Because standardized residuals have variance 1, observations with absolute values exceeding 2 may be unusual (Minitab, 1994). HAT values exceeding (2*p)/n to (3*p)/n; Cook's D values exceeding the .50 percentage point of an F distribution having p numerator and n-p denominator degrees of freedom; and absolute values of DFITS exceeding 2*sqrt(p/n) may all suggest unusual observations (Myers, 1994). Note that for these criteria, p refers to the number of coefficients in the model (i.e., for the model presented here, p = 6) and n refers to the sample size (n = 39 for this model).
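
For p = 6 and n = 39 these guidelines reduce to a few numbers, computed in the sketch below. The function name and dictionary keys are illustrative. The Cook's D cutoff is omitted because it needs an F quantile, which the Python standard library does not provide.

```python
import math

def influence_cutoffs(n, p):
    """Guideline cutoffs for n observations and p coefficients."""
    return {
        "std_resid": 2.0,                 # absolute standardized residual
        "hat_low": 2 * p / n,             # conservative HAT cutoff
        "hat_high": 3 * p / n,            # liberal HAT cutoff
        "dfits": 2 * math.sqrt(p / n),    # absolute DFITS
    }

cutoffs = influence_cutoffs(n=39, p=6)
```

The HAT cutoffs come out at about .308 and .462, and the DFITS cutoff at about .784.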

Comparison of the diagnostics obtained for the model to these criteria yielded six points which exceeded the recommended criteria. Those points and their diagnostic values appear in Table 7.

Table 7:

Observation   Standardized residual   HAT value (criterion: .308 - .462)   Cook's D   DFITS
 3             1.94252                0.133815                             0.097157    0.79891
10            -0.66757                0.342586                             0.038705   -0.47778
11            -1.75498                0.432961                             0.391949   -1.58593
18             0.52572                0.310910                             0.020783    0.34920
26            -2.08841                0.073395                             0.057577   -0.62130
28             1.96622                0.254048                             0.219442    1.20257

None of the six points had extreme Cook's D values, so it did not appear that any of the points were exerting undue influence on the regression coefficients. Observation 26 stood out for the value of its standardized residual. At -2.08841, however, it only barely exceeded the criterion and was not given any further attention. Observations 10, 11, and 18 stood out for their HAT values, which exceeded the conservative criterion. Because none of these values exceeded the more liberal criterion, none were deemed problematic. Observation 11 also had an unusual DFITS value. This individual was a male who, at 32 years old, was slightly older than the rest of the sample. His MRA and PREPRO values - 13 and 2.0, respectively - were both within one standard deviation of their respective means. His TOTAL score, however, was 0.00. The combination of these values for the independent variables and TOTAL was likely somewhat unusual given the model but was not considered problematic.

Observations 3 and 28 also stood out for their DFITS values. At 0.79891, the DFITS value for observation 3 only barely exceeded the criterion and was not given any further attention. The DFITS value for observation 28 was 1.20257. This individual was a 22-year-old male. His MRA value - 38 - was the highest obtained in the sample. His PREPRO value - 5.0 - was within one standard deviation of the mean. His TOTAL score, however, was 26.18. As with observation 11, the combination of his values for the independent variables and his sickness score was likely somewhat unusual for the model but did not appear to be problematic.
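
The screening described above can be reproduced from the Table 7 values. The sketch below applies the standardized residual, conservative HAT, and DFITS guidelines with n = 39 and p = 6. The names TABLE_7 and flag are illustrative, and the Cook's D criterion is left out because it requires an F quantile.

```python
import math

N, P = 39, 6

# observation: (standardized residual, HAT value, Cook's D, DFITS)
TABLE_7 = {
    3: (1.94252, 0.133815, 0.097157, 0.79891),
    10: (-0.66757, 0.342586, 0.038705, -0.47778),
    11: (-1.75498, 0.432961, 0.391949, -1.58593),
    18: (0.52572, 0.310910, 0.020783, 0.34920),
    26: (-2.08841, 0.073395, 0.057577, -0.62130),
    28: (1.96622, 0.254048, 0.219442, 1.20257),
}

def flag(std_resid, hat, dfits_value, n=N, p=P):
    """Return the names of the guidelines that an observation exceeds."""
    flags = []
    if abs(std_resid) > 2:
        flags.append("standardized residual")
    if hat > 2 * p / n:                            # conservative HAT cutoff
        flags.append("HAT")
    if abs(dfits_value) > 2 * math.sqrt(p / n):
        flags.append("DFITS")
    return flags

flags = {obs: flag(r, h, d) for obs, (r, h, _cooks, d) in TABLE_7.items()}
largest_hat = max(h for _r, h, _c, _d in TABLE_7.values())
```

Every observation is flagged on at least one guideline, and the largest HAT value, 0.432961, stays below the liberal cutoff of 3*p/n.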

The final conclusion was that the model of LNTOTAL on AGE, GENMRA, AGEMRA, AGEPRO, and MRAPRO performs and fits the data very well.