What is the variance inflation factor?


The researchers could also consider removing the predictor Pulse from the model. Let's see how they would fare. The remaining variance inflation factors are quite satisfactory: hardly any variance inflation remains.

A separate example uses cement data, in which the response y measures the heat evolved, in calories per gram of cement, during hardening.

The remaining pairwise correlations are all quite low. Minitab will calculate the variance inflation factors for you. It should seem logical that multicollinearity is present here, given that the predictors measure the percentages of ingredients in the cement. The individual t-tests indicate that none of the predictors is significant given the presence of all the others, but the overall F-test indicates that at least one of the predictors is useful. Why does this happen?

This is a result of the high degree of multicollinearity between all the predictors. We learned that one way of reducing data-based multicollinearity is to remove some of the violating predictors from the model.

Are the variance inflation factors for this model acceptable? The VIF for the predictor Weight, for example, tells us that the variance of the estimated coefficient of Weight is inflated by a factor of 8.
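The sense in which a variance is "inflated" can be made explicit. As a sketch, assuming the usual linear model with an intercept and constant error variance $\sigma^2$ (the symbols $b_k$, $x_{ik}$, $\bar{x}_k$ and $R_k^2$ are notation introduced here for illustration, not taken from the text above), the variance of the estimated coefficient of the k-th predictor factors as:

```latex
\operatorname{Var}(b_k)
  = \frac{\sigma^2}{\sum_{i=1}^{n} (x_{ik} - \bar{x}_k)^2}
    \times \frac{1}{1 - R_k^2}
```

Here $R_k^2$ is the R-squared obtained by regressing the k-th predictor on all the other predictors. The first factor is the variance the coefficient would have if the k-th predictor were uncorrelated with the others ($R_k^2 = 0$). The second factor is the multiplier by which correlation with the other predictors inflates that variance.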

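For the sake of understanding, let's verify the calculation of the VIF for the predictor Weight. Regress Weight on the remaining predictors and record the R-squared of that regression, written $R^2_{\text{Weight}}$. The variance inflation factor for the estimated coefficient of Weight is then, by definition:

$$\mathrm{VIF}_{\text{Weight}} = \frac{1}{1 - R^2_{\text{Weight}}}$$

Again, this variance inflation factor tells us that the variance of the Weight coefficient is inflated by a factor of 8.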
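The verification described above can be sketched in code. The following is a minimal illustration, not Minitab's procedure: the helper names `solve`, `r_squared` and `vif` are hypothetical, and the data are synthetic values generated for the example, not the measurements from the blood pressure study. It regresses one predictor on the others, takes the R-squared of that regression, and returns 1 / (1 - R-squared).

```python
import random

def solve(A, b):
    """Solve the square linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for c in range(n):
        # Partial pivoting: bring the largest entry in column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * pivot for a, pivot in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def r_squared(y, columns):
    """R-squared from regressing y on the given columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    # Normal equations: (A^T A) beta = A^T y
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(AtA, Aty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif(predictors, name):
    """VIF of one predictor: 1 / (1 - R^2) from regressing it on all the others."""
    others = [col for key, col in predictors.items() if key != name]
    return 1.0 / (1.0 - r_squared(predictors[name], others))

# Synthetic data (an assumption for illustration): BSA is built to track
# Weight closely, while Pulse is generated independently of both.
random.seed(0)
n = 200
weight = [random.gauss(0, 1) for _ in range(n)]
predictors = {
    "Weight": weight,
    "BSA": [0.9 * w + 0.3 * random.gauss(0, 1) for w in weight],
    "Pulse": [random.gauss(0, 1) for _ in range(n)],
}

vifs = {name: vif(predictors, name) for name in predictors}
for name, value in vifs.items():
    print(f"{name}: VIF = {value:.2f}")
```

In this synthetic setup the two linked predictors should show large VIFs and the independent one a VIF close to 1, which mirrors the pattern the text describes for Weight.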

So, what to do? One solution to dealing with multicollinearity is to remove some of the violating predictors from the model. Reviewing the pairwise correlations again points to body surface area (BSA) and Weight as a correlated pair, and we can choose to remove either of the two from the model. The decision of which one to remove is often a scientific or practical one. For example, if the researchers here are interested in using their final model to predict the blood pressure of future individuals, their choice should be clear.

Which of the two measurements, body surface area or weight, do you think would be easier to obtain? If weight is indeed easier to obtain than body surface area, then the researchers would be well advised to remove BSA from the model and leave Weight in it. The researchers could also consider removing the predictor Pulse from the model.
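The screening step described above (review the pairwise correlations, drop one predictor of the offending pair, then recheck) can be sketched as follows. This is an illustration under stated assumptions: the helper `pearson` is hypothetical, the data are synthetic rather than the study's measurements, and the choice to drop BSA simply follows the text's reasoning that weight is easier to obtain. With exactly two predictors left, both share the same VIF, 1 / (1 - r²), where r is their correlation.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Synthetic data (an assumption for illustration): BSA is built to track
# Weight closely, while Pulse is generated independently of both.
random.seed(1)
n = 200
weight = [random.gauss(0, 1) for _ in range(n)]
data = {
    "Weight": weight,
    "BSA": [0.9 * w + 0.3 * random.gauss(0, 1) for w in weight],
    "Pulse": [random.gauss(0, 1) for _ in range(n)],
}

# Step 1: review the pairwise correlations and find the strongest pair.
pairs = {(a, b): pearson(data[a], data[b]) for a, b in combinations(data, 2)}
for (a, b), r in pairs.items():
    print(f"corr({a}, {b}) = {r:.3f}")
worst_pair = max(pairs, key=lambda pair: abs(pairs[pair]))

# Step 2: drop one member of that pair (BSA, assumed harder to measure).
reduced = {name: col for name, col in data.items() if name != "BSA"}

# Step 3: two predictors remain, so each has VIF = 1 / (1 - r^2).
r_remaining = pearson(reduced["Weight"], reduced["Pulse"])
vif_after = 1.0 / (1.0 - r_remaining ** 2)
print(f"VIF of Weight and Pulse after dropping BSA: {vif_after:.2f}")
```

Which member of a correlated pair to drop is not something the code can decide; as the text notes, that is a scientific or practical judgment.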

A multiple regression model is used to examine the effect of several variables on an outcome. The dependent variable is the outcome, and the independent variables are the predictors used to explain it.

The independent variables form the inputs to the model. High intercorrelation among them makes them less independent, which creates problems when testing them: it becomes difficult to determine how much each independent variable, separately from the others, affects the dependent variable.

Even small changes in the data or in the structure of the regression model can lead to large and sometimes erratic changes in the estimated coefficients. VIF is a diagnostic for this problem: it measures how much the variance of an independent variable's estimated coefficient is inflated by that variable's correlation with the other independent variables.
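The sensitivity to small data changes can be seen in a toy example. The sketch below is an illustration with made-up numbers, and the helper names `fit_slopes` and `slopes_before_after` are hypothetical. It fits the same response twice, once with a second predictor that is almost a copy of the first and once with a second predictor that is uncorrelated with the first, then changes a single response value by one unit and refits.

```python
def fit_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 (intercept included), via the
    centered 2x2 normal equations."""
    n = len(y)

    def centered(v):
        m = sum(v) / n
        return [a - m for a in v]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    x1c, x2c, yc = centered(x1), centered(x2), centered(y)
    s11, s12, s22 = dot(x1c, x1c), dot(x1c, x2c), dot(x2c, x2c)
    s1y, s2y = dot(x1c, yc), dot(x2c, yc)
    det = s11 * s22 - s12 * s12
    return ((s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det)

# Made-up predictor values (an assumption for illustration).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2_collinear = [-1.9, -1.1, 0.0, 0.9, 2.1]   # almost a copy of x1
x2_unrelated = [2.0, -1.0, -2.0, -1.0, 2.0]  # uncorrelated with x1

def slopes_before_after(x2):
    """Fit once on y = x1 + x2, then again after changing one response."""
    y = [a + b for a, b in zip(x1, x2)]  # exact relation: both slopes are 1
    y_changed = y[:]
    y_changed[0] += 1.0                  # change a single response value
    return fit_slopes(x1, x2, y), fit_slopes(x1, x2, y_changed)

collinear_before, collinear_after = slopes_before_after(x2_collinear)
unrelated_before, unrelated_after = slopes_before_after(x2_unrelated)

# Collinear design: slopes move from about (1, 1) to about (-1.7, 3.5).
print("collinear:", [round(b, 3) for b in collinear_before],
      "->", [round(b, 3) for b in collinear_after])
# Uncorrelated design: slopes move from about (1, 1) to about (0.8, 1.143).
print("unrelated:", [round(b, 3) for b in unrelated_before],
      "->", [round(b, 3) for b in unrelated_after])
```

In the nearly collinear design the one-unit change flips the sign of the first slope, while in the uncorrelated design both slopes shift only slightly.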

It thus helps gauge how severe the multicollinearity is, so that the model can be adjusted. High intercorrelation between variables may produce coefficient estimates that are not statistically significant. It may also mean that several variables carry essentially the same information, so that this information is counted twice.

VIF can also be used to check economic data variables for multicollinearity.


