# ECONOMETRICS

Econometrics applies the methodology of mathematical statistics and tools of statistical inference to the empirical implementation of models of economic activity postulated by economic theory. Econometrics is a major field within the discipline of economics (other major fields are economic theory, macroeconomics, industrial organization, labor economics, public economics, development economics, and international economics).

Though empirical studies within economics can be traced as far back as the 17th century, the field of econometrics is relatively young. The Econometric Society was established in 1930 and the journal of the Econometric Society, Econometrica, was founded in 1933. In Econometrica's first issue, the journal's editor, Ragnar Frisch (1895-1973), noted that econometrics is the unification of statistics, economic theory, and mathematics.

Econometrics has several overarching objectives. The first objective is to make economic theory empirically operational. Economic theory is concerned with relations between variables of economic interest and the factors believed to determine those variables. A simple example of such a relationship is a demand function. Economic theory would postulate, for example, that the quantity of a certain type of automobile demanded depends on, among other things, the price of the vehicle. In particular, theory would stipulate that the higher the price of the vehicle, the fewer units of the automobile will be demanded, other things being equal. Econometrics allows the economist to gather data on automobile demand and automobile price and fit a relationship between these two variables.

A second objective of econometrics is to test economic theory. Returning to the automobile example, once a relationship has been established between demand for the automobile and its price, the economist can then use the methods of statistical inference to determine whether price and demand are related to each other in the hypothesized inverse direction. Moreover, the economist can use tools of statistical inference to estimate the responsiveness of the variable of interest to a change in the determining variable.

A third objective of econometrics is to predict future movements in the economic variable of interest on the basis of the econometric model. In the automobile example, the fitted relationship may suggest that a price increase of \$1,000 will lead to a significant decline in demand and therefore revenue. Based on this prediction, then, the manufacturer presumably would decide to hold the line on future price increases.

The cornerstone model of econometrics is the classical linear regression model. To give a generic example, consider:

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \epsilon$$

In this example, Y is the variable of interest (the dependent variable) and the Xs are variables that explain or determine the level of Y (the independent variables). The βs are impact coefficients, each indicating the amount by which Y changes given a one-unit change in the corresponding X variable, holding the other Xs fixed. The term α is known as the intercept and gives the predicted value of Y if the Xs are all equal to 0. Finally, ϵ is known as the disturbance term. It is appended to the model to allow for the analyst's inability to capture every factor that might conceivably determine Y in the real world. In other words, ϵ incorporates factors such as unforeseen events that affect Y and factors that determine Y but were inadvertently omitted from the deterministic part of the specification, were not measured, or were not measurable.

It is the disturbance term that gives the model its statistical flavor. In the classical case, it is assumed that ϵ has a normal probability distribution with an expected value of 0 and a constant variance of σ². If this assumption is correct, then Y itself is normally distributed for given values of the Xs. The classical linear regression model also assumes that the disturbance term is not related in a systematic fashion across observations of Y and further assumes that the disturbance term is unrelated to the values of the Xs.
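Stated compactly, these assumptions about the disturbance term can be sketched as follows. The subscripts are notation assumed here for illustration: i and h index observations and j indexes the explanatory variables.

$$
\begin{aligned}
\epsilon_i &\sim N(0, \sigma^2) && \text{for every observation } i \\
\operatorname{Cov}(\epsilon_i, \epsilon_h) &= 0 && \text{for } i \neq h \\
\operatorname{Cov}(X_{ji}, \epsilon_i) &= 0 && \text{for every explanatory variable } j
\end{aligned}
$$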

The above equation is a model of how the variable Y behaves. In actually implementing the model, the econometrician possesses individual observations of Y and of the Xs. The goal of empirical implementation is estimation of the parameters of the model, namely α and the βs. This estimation is achieved via the method of least squares, which chooses the set of α and βs that minimizes the sum of squared deviations of the actual Ys from the Ys predicted by the model itself. In other words, the method of least squares generates the model of Y that fits the observed data best in the squared-error sense. The desirable statistical properties of the resulting estimates hold as long as the assumptions mentioned above (known as the "ideal conditions") are met.
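For the simplest case of a single explanatory variable, the least-squares estimates have a well-known closed form. The sketch below applies it to the automobile example. The price and quantity figures, the units, and the function names are all made up for illustration and do not come from any real data set.

```python
def ols_simple(x, y):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx
    # Intercept: forces the fitted line through the point of sample means.
    alpha = y_bar - beta * x_bar
    return alpha, beta


def sum_squared_residuals(x, y, alpha, beta):
    """Sum of squared deviations of actual y from the y predicted by the line."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data: price in thousands of dollars, quantity in thousands of units.
price = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
quantity = [50.5, 47.0, 45.5, 41.5, 40.0, 36.5]

alpha_hat, beta_hat = ols_simple(price, quantity)
print("intercept:", alpha_hat)
print("slope:", beta_hat)
print("sum of squared residuals:",
      sum_squared_residuals(price, quantity, alpha_hat, beta_hat))
```

With these invented figures the estimated slope is negative, the inverse relationship that demand theory predicts. Any other choice of intercept and slope gives a larger sum of squared residuals on the same data.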

If the ideal conditions are satisfied, then the parameter estimates (i.e., the estimated values of α and the βs) will be normally distributed. Moreover, the expected values of the parameter estimates will equal the parameters themselves (this property is known as unbiasedness) and they will have a finite variance. In practical terms, these statistical properties allow the econometrician to make two important types of inference. First, the analyst can conduct a hypothesis test to determine, on the basis of the estimated value of a particular parameter, whether the variable in question has an effect on Y. Such an inference is reliable because it is conventional in econometrics to conduct hypothesis tests such that the chance of wrongly concluding that an effect exists is low (usually 5 percent or less). Second, the analyst can construct a confidence interval for the parameter in question. A confidence interval is a range within which there is a high degree of certainty that the true value of the parameter lies. As long as this range is relatively narrow, the inference will be meaningful.
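Both kinds of inference can be sketched for a one-variable regression. The example below is a minimal illustration under the ideal conditions, using invented price and quantity data. The critical value 2.776 is the standard two-sided 5 percent value of the t distribution with 4 degrees of freedom, hard-coded here because the sample has six observations and two estimated parameters. All names are hypothetical.

```python
import math


def ols_with_inference(x, y, t_crit):
    """Slope estimate, its standard error, t statistic, and confidence interval."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    # Estimate the disturbance variance from the residuals,
    # dividing by n - 2 because two parameters were estimated.
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ssr / (n - 2)
    se_beta = math.sqrt(sigma2_hat / s_xx)
    # t statistic for the null hypothesis that the slope is zero.
    t_stat = beta / se_beta
    ci = (beta - t_crit * se_beta, beta + t_crit * se_beta)
    return beta, se_beta, t_stat, ci


# Hypothetical data: price in thousands of dollars, quantity in thousands of units.
price = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
quantity = [50.5, 47.0, 45.5, 41.5, 40.0, 36.5]

T_CRIT_DF4 = 2.776  # two-sided 5 percent critical value, t distribution, 4 d.f.

beta_hat, se_beta, t_stat, (ci_low, ci_high) = ols_with_inference(
    price, quantity, T_CRIT_DF4
)
# Reject "price has no effect on quantity" if |t| exceeds the critical value.
reject_no_effect = abs(t_stat) > T_CRIT_DF4

print("slope:", beta_hat, "standard error:", se_beta)
print("t statistic:", t_stat, "reject no-effect hypothesis:", reject_no_effect)
print("95 percent confidence interval:", (ci_low, ci_high))
```

In this made-up sample the no-effect hypothesis is rejected and the whole confidence interval lies below zero, so the data are consistent with a downward-sloping demand relationship.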

As the name suggests, the "ideal conditions" of the classical linear regression model apply under ideal circumstances. In many empirical contexts, these assumptions are met, at least approximately. As long as this is the case, then the classical linear regression model is a sensible representation of reality. In many empirical situations, however, at least one of the ideal conditions does not hold. To give some examples: the relationship between the dependent and independent variables may not be linear; the disturbance terms may not have constant variance; the disturbance terms may be correlated with each other to some degree; the disturbance terms may be correlated with some of the independent variables; some important determining variables may have been omitted from the regression equation; the disturbance terms may not be normally distributed with zero mean.

Much work by econometricians over the last 50 years has been devoted to developing methods and techniques to deal with the problems noted above. "Generalized least squares" estimators have been developed to deal with problems that arise when the disturbance terms are correlated with each other (autocorrelation) or when the disturbance terms do not have constant variance (heteroskedasticity).
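One simple special case of generalized least squares is weighted least squares, which handles heteroskedasticity when the relative variances of the disturbances are known or assumed. The sketch below weights each observation by the inverse of its assumed disturbance variance. The income and spending figures, and the assumption that the variance is proportional to income squared, are purely illustrative.

```python
def wls_simple(x, y, weights):
    """Weighted least-squares estimates (alpha, beta) for y = alpha + beta * x + error.

    Each weight should be proportional to 1 / (disturbance variance) of that
    observation, so that noisier observations count for less.
    """
    w_sum = sum(weights)
    # Weighted sample means.
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar)
               for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical data: household income and spending, in thousands of dollars.
income = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
spending = [9.0, 17.5, 27.0, 33.0, 46.0, 49.0]

# Assumption for illustration: disturbance variance is proportional to income squared.
weights = [1.0 / (xi ** 2) for xi in income]

alpha_wls, beta_wls = wls_simple(income, spending, weights)
# Equal weights reproduce ordinary least squares.
alpha_ols, beta_ols = wls_simple(income, spending, [1.0] * len(income))

print("weighted least squares:", alpha_wls, beta_wls)
print("ordinary least squares:", alpha_ols, beta_ols)
```

Only the relative sizes of the weights matter: multiplying every weight by the same constant leaves the estimates unchanged.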

The disturbance terms may be correlated with some of the independent variables because of simultaneity: the independent variables determine the dependent variable, but the dependent variable also determines at least some of the independent variables. Least squares estimates of such a regression equation suffer from simultaneity bias. In order to deal with this problem, econometricians have developed "simultaneous equations" estimators. These techniques include two-stage and three-stage least squares.
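The logic of two-stage least squares can be sketched with one troublesome regressor and one outside variable (an instrument) that moves the regressor but is unrelated to the disturbance. Everything in the example is contrived for illustration: the cost shifter, the shock values, and the "true" demand slope of -3 are assumptions, and the data are built so that the instrument is exactly uncorrelated with the disturbance in the sample.

```python
def ols_simple(x, y):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx
    return y_bar - beta * x_bar, beta


def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    # Stage 1: regress the endogenous regressor on the instrument and
    # keep the fitted values, which are purged of the disturbance.
    pi0, pi1 = ols_simple(z, x)
    x_fitted = [pi0 + pi1 * zi for zi in z]
    # Stage 2: regress the dependent variable on the stage-1 fitted values.
    return ols_simple(x_fitted, y)


# Contrived data. The shock has zero mean and is orthogonal to the cost shifter.
cost_shifter = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical instrument
shock = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]         # unobserved by the analyst

# Price responds to the cost shifter and to the shock.
price = [10.0 + 2.0 * z + u for z, u in zip(cost_shifter, shock)]
# Demand: assumed true slope of -3. The disturbance (4 * shock) is correlated with price.
quantity = [100.0 - 3.0 * p + 4.0 * u for p, u in zip(price, shock)]

alpha_ols, beta_ols = ols_simple(price, quantity)
alpha_2sls, beta_2sls = two_stage_least_squares(quantity, price, cost_shifter)

print("ordinary least squares slope:", beta_ols)
print("two-stage least squares slope:", beta_2sls)
```

Because the disturbance moves price and quantity together, ordinary least squares understates the steepness of the demand curve here, while the two-stage estimate recovers the slope used to build the data. With real data the instrument is only assumed, never known, to be unrelated to the disturbance.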

Though there are many ways within the context of the linear regression model to deal with the problem of a nonlinear relationship between the dependent variable and the independent variables, some cases arise in which the model is inherently nonlinear in at least one of its parameters. In order to deal with this problem, econometricians have developed "nonlinear least squares."

"Panel data techniques" have been developed to take advantage of the rich information contained in data sets that follow cross-sectional units of observation across time.

In some instances the dependent variable is not continuous and quantitative. Rather, it reflects some qualitative aspect of a variable. For example, it may take the value of 1 in the presence of some condition (say, if an individual is a labor force participant) or 0 in the absence of the condition (if the individual does not work). In such a case, we say the dependent variable is qualitative. A vast array of techniques has been developed to deal with the many different manifestations of qualitative variables that can arise in empirical work.
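One widely used technique for a 0/1 dependent variable is the logit model, which is not named above and is chosen here only as a familiar illustration. The sketch below fits a one-variable logit by Newton's method on the log-likelihood. The schooling and participation data are invented, and the function name and iteration count are arbitrary choices.

```python
import math


def fit_logit(x, y, iterations=25):
    """Maximum-likelihood estimates (a, b) for P(y = 1) = 1 / (1 + exp(-(a + b * x)))."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to a and b.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a symmetric 2 x 2 matrix.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^(-1) times the gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: years of schooling measured relative to 12, and a
# 0/1 indicator of labor force participation.
schooling = [-4.0, -2.0, -2.0, 0.0, 0.0, 0.0, 2.0, 2.0, 4.0, 4.0]
participates = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

a_hat, b_hat = fit_logit(schooling, participates)
# Predicted participation probability for someone with 14 years of schooling.
prob_at_14_years = 1.0 / (1.0 + math.exp(-(a_hat + b_hat * 2.0)))

print("intercept:", a_hat, "slope:", b_hat)
print("predicted probability at 14 years of schooling:", prob_at_14_years)
```

Unlike a fitted straight line, the predicted value always lies between 0 and 1, so it can be read as a probability. The sketch assumes the 0s and 1s overlap across values of the explanatory variable; if they were perfectly separated, the maximum-likelihood estimates would not exist.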

Finally, time series econometrics has developed as a very substantial subfield within econometrics to deal with the role that time plays in many economic relationships. In this realm, one finds distributed lag models, ARIMA models, and error correction models. This area has been a particularly active field of study since the mid-1970s and many of the most important recent advances in econometrics have taken place here.
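The simplest member of the ARIMA family is the first-order autoregression, in which this period's value depends on last period's value. The sketch below simulates such a series from assumed parameters and hand-picked shocks, then estimates the parameters by least squares on the lagged series. All numbers and names are illustrative assumptions.

```python
def simulate_ar1(c, phi, y0, shocks):
    """Build a series with y_t = c + phi * y_{t-1} + shock_t, starting from y0."""
    series = [y0]
    for e in shocks:
        series.append(c + phi * series[-1] + e)
    return series


def fit_ar1(series):
    """Least-squares estimates (c, phi) from regressing y_t on y_{t-1}."""
    lagged = series[:-1]    # y_{t-1}
    current = series[1:]    # y_t
    n = len(current)
    lag_bar = sum(lagged) / n
    cur_bar = sum(current) / n
    s_xy = sum((l - lag_bar) * (v - cur_bar) for l, v in zip(lagged, current))
    s_xx = sum((l - lag_bar) ** 2 for l in lagged)
    phi = s_xy / s_xx
    c = cur_bar - phi * lag_bar
    return c, phi


# Hypothetical shocks, chosen by hand rather than drawn at random.
shocks = [0.4, -0.3, 0.1, -0.5, 0.6, -0.2, 0.3, -0.4, 0.2, -0.1, 0.5, -0.6]
series = simulate_ar1(1.0, 0.5, 0.0, shocks)

c_hat, phi_hat = fit_ar1(series)
# One-step-ahead forecast from the last observed value.
forecast = c_hat + phi_hat * series[-1]

print("estimated constant:", c_hat, "estimated persistence:", phi_hat)
print("one-step-ahead forecast:", forecast)
```

With only a dozen observations the estimates can differ noticeably from the values used in the simulation. If the shocks are all set to zero, the fit recovers those values exactly.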

[ Kevin J Murphy ]

Davidson, R., and J. G. MacKinnon. Estimation and Inference in Econometrics. Oxford: Oxford University Press, 1993.

Frisch, R. Editorial. Econometrica 1, no. 1 (1933): 1-4.

Greene, W. H. Econometric Analysis. 3rd ed. Upper Saddle River, NJ: Prentice Hall, 1997.

Kmenta, J. Elements of Econometrics. 2nd ed. New York: Macmillan, 1986.

Maddala, G. S. Introduction to Econometrics. New York: Macmillan, 1988.

Pindyck, R. S., and D. L. Rubinfeld. Econometric Models and Economic Forecasts. 3rd ed. New York: McGraw-Hill, 1991.