MULTIVARIATE ANALYSIS

Multivariate analysis applies statistical methods, typically built around a regression function, to determine how changes in one group of variables affect other variables in the function. As a result, multivariate analysis can suggest, with a degree of predictive capability, what can be expected to happen when those variables change.

While it involves many kinds of tools, multivariate analysis is primarily a mathematical approach to decision making. It has many applications, including problems in engineering, traffic management, biology, economics, marketing, and even ethics and behavioral psychology. It can quantify how changes in one or more areas of a complex problem will affect an outcome over time, and indicate whether those changes will alleviate or exacerbate a problem.

For example, an airline company may use multivariate analysis to determine how revenue from a certain route might be affected by different fare prices, load factors, advertising budgets, aircraft choices, amenities, scheduling choices, fuel prices, and employee salaries. An agricultural engineer would use multivariate analysis to gauge crop yields based on different soil qualities; choices of seed, fertilizer, insecticides and planting schedules; amounts of sunlight and rain; and changes in temperature.

In these examples, route revenue and crop yields are dependent variables determined by sets of independent variables, such as fare prices and seed choices. A change in any one of the independent variables will produce a change in the dependent variable. But in some cases, independent variables may affect other independent variables.
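The relationship between one dependent variable and several independent variables can be sketched with ordinary least squares, one common regression technique. The example below is only an illustration: the fare and advertising figures are invented, the revenue values are generated from an assumed linear rule, and the helper names (`solve_linear_system`, `fit_least_squares`) are hypothetical.

```python
# Hypothetical illustration: route revenue as a linear function of two
# independent variables. All numbers are invented for this sketch.

def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_least_squares(rows, y):
    """Return [intercept, b1, b2, ...] minimizing the squared error."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Each observation: (fare in dollars, advertising in thousands of dollars)
observations = [(100, 10), (120, 10), (100, 20), (140, 30), (160, 15), (130, 25)]
# Assumed rule used to generate the data: 500 - 2 * fare + 8 * advertising
revenue = [500 - 2 * fare + 8 * ads for fare, ads in observations]

coefficients = fit_least_squares(observations, revenue)
print(coefficients)  # close to [500, -2, 8], up to floating-point error
```

Because the invented data contain no noise, the fit recovers the assumed rule almost exactly. Real observations would leave residual variation, and the estimated coefficients would only approximate the underlying effects.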

For example, lowered airplane fares may affect load factors, and these might affect aircraft choices. Similarly, different fertilizers might affect seed choices, and these might affect planting schedules. Attempts to improve the dependent variable by adjusting one independent variable may have a short-term positive impact but may prove deleterious in the long term.
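One simple way to check whether two independent variables move together is to compute a correlation coefficient between them. The sketch below uses the Pearson coefficient on invented fare, load-factor, and advertising figures. It measures only linear association and says nothing about which variable drives the other.

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(spread_x * spread_y)

# Invented observations for one route over six periods.
fares = [100, 110, 120, 130, 140, 150]
load_factors = [0.92, 0.90, 0.85, 0.83, 0.78, 0.75]
advertising = [20, 10, 25, 15, 30, 12]

fare_vs_load = pearson_correlation(fares, load_factors)
fare_vs_ads = pearson_correlation(fares, advertising)

print(fare_vs_load)  # strongly negative: higher fares go with emptier planes
print(fare_vs_ads)   # near zero: little linear association in this sample
```

When two independent variables are strongly correlated, as fares and load factors are in this invented sample, their separate effects on the dependent variable become hard to distinguish in a regression.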

Multivariate analysis is not synonymous with classical simultaneous equations methodology, although this form of modeling is an essential component of the analysis. Multivariate analysis includes prescriptions for simplifying parameters and drawing relationships between data over different time periods.

The application of computer processing power greatly advanced the science of multivariate analysis by freeing analysts from the tedious task of computation. As a result, computer programs have become the necessary basic instrument of multivariate analysis.

One of the greatest problems faced by analysts of multivariate data is overparameterization. An overzealous analyst may be inclined to include in the analysis many types of microphenomena that have little or no bearing on the result.

Part of the art of multivariate analysis is knowing which variables may be excluded. While excluding variables simplifies the computations involved, it may also leave more of the variation in the observed data unexplained, making it difficult to identify reliable regression lines. The only way to work past these dilemmas is to repeatedly test the models.
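The effect of excluding a variable can be tested by fitting the model with and without it and comparing the residual sum of squares (RSS), the variation the model leaves unexplained. The sketch below is a minimal illustration on invented data in which `x1` drives the outcome and `x2` plays no part in generating it. The helper names are hypothetical, and RSS is only one of several criteria an analyst might use.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_and_rss(columns, y):
    """Fit y on an intercept plus the given columns; return the RSS."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    predictions = [sum(b * v for b, v in zip(beta, d)) for d in design]
    return sum((yi - p) ** 2 for yi, p in zip(y, predictions))

# Invented data: y depends on x1 plus a small disturbance; x2 is irrelevant.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 1, 4, 2, 6, 3, 7, 2]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
y = [3 + 2 * a + e for a, e in zip(x1, noise)]

rss_full = fit_and_rss([x1, x2], y)        # both variables included
rss_without_x2 = fit_and_rss([x1], y)      # irrelevant variable excluded
rss_without_x1 = fit_and_rss([x2], y)      # relevant variable excluded

print(rss_full, rss_without_x2, rss_without_x1)
```

In this sample, dropping the irrelevant variable barely changes the unexplained variation, while dropping the relevant one increases it dramatically. Adding a variable can never increase the in-sample RSS, so a small improvement alone is weak evidence that the variable belongs in the model.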

One form of overparameterization has to do specifically with time series analysis. This becomes evident in multivariate analysis of biological and sociological phenomena; current conditions may be primarily dependent on conditions in previous time periods.

For example, the rate of reproduction of trees in a forest depends not only on how many trees there are now, but also on how many there were in the previous period, and on how many there were in the periods before that. Weather conditions 10,000 years ago may be responsible for climatological changes that affect present rates of growth. In some cases, the analyst may be faced with a practically infinite regress. When faced with such chicken-and-egg problems, the analyst must determine where to draw the line, or limit the parameters of the analysis, without corrupting the reliability of the analysis.
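Deciding where to draw the line can be illustrated by fitting a series on a growing number of its own past values and watching how much each extra lag reduces the unexplained variation. The sketch below is an assumption-laden illustration: the series is simulated so that only the two most recent periods matter, the coefficients 0.2 and 0.7 and the random seed are arbitrary, and the helper names are hypothetical.

```python
import random

def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def autoregressive_rss(series, lags, max_lag):
    """Fit series[t] on its previous `lags` values (no intercept); return RSS.

    Targets always start at index max_lag, so every lag order is scored
    on exactly the same observations and the results are comparable.
    """
    targets = series[max_lag:]
    design = [[series[t - k] for k in range(1, lags + 1)]
              for t in range(max_lag, len(series))]
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(lags)]
           for i in range(lags)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(lags)]
    beta = solve_linear_system(xtx, xty)
    return sum((y - sum(b * v for b, v in zip(beta, d))) ** 2
               for y, d in zip(targets, design))

# Simulated series: each value depends on the two preceding periods only.
rng = random.Random(42)
series = [0.0, 0.0]
for _ in range(300):
    series.append(0.2 * series[-1] + 0.7 * series[-2] + rng.gauss(0, 1))

max_lag = 4
rss_by_lag = {lags: autoregressive_rss(series, lags, max_lag)
              for lags in range(1, max_lag + 1)}
for lags, rss in rss_by_lag.items():
    print(lags, round(rss, 1))
```

In a run like this, the second lag should cut the unexplained variation substantially, while the third and fourth add little. That pattern suggests the analysis can stop at two periods of history. With real data the cutoff is rarely so clean and remains a matter of judgment.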

[John Simley, updated by Kevin J. Murphy]
