Regression analysis employs algebraic formulas to estimate the value of a continuous random variable, called the dependent variable, from the value of another variable, called the independent variable. Statistical methods are used to determine the best estimate of the dependent variable, and to test whether the estimate is valid at all.

Regressions may be used for a wide variety of purposes where estimation is important. For example, a marketer may employ a regression to determine how sales of products might be affected by investments in **advertising.** An employer may perform a similar analysis to estimate an employee's job evaluation scores based on the employee's performance on an aptitude test. A biologist can even use a regression to see how temperature changes might affect the rate of reproduction in frogs.

While closely related, regression differs from correlation analysis in an important way. Where regression is used to estimate the value of a dependent variable, correlation measures the degree of relationship between two variables. In other words, correlation analysis can indicate the strength of a linear relationship between variables, but it is left to regression analysis to provide predictions of the dependent variable based on values of an independent variable.
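
The distinction can be illustrated with a short sketch. The data below are hypothetical (aptitude test scores paired with later job evaluation scores), and the function names are chosen only for this example. The correlation coefficient summarizes how strong the linear relationship is, while the fitted line is what actually produces a prediction.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient: strength of the linear relationship (-1 to 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares intercept and slope, used to predict y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data, invented for illustration only.
test_scores = [52, 60, 68, 75, 81, 90]
job_scores = [61, 64, 70, 74, 80, 85]

r = pearson_r(test_scores, job_scores)
intercept, slope = fit_line(test_scores, job_scores)
predicted = intercept + slope * 70  # regression answers "what score do we expect?"

print(f"correlation r = {r:.3f}")
print(f"predicted job score for a test score of 70: {predicted:.1f}")
```

Correlation returns a single number describing the relationship; only the regression line turns a new test score into an estimated job evaluation score.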

A simple regression analysis is one in which a single independent variable is used to determine a dependent variable. The relationship between the variables is assumed to be consistent, or linear. Figure 1 shows examples of linear, nonlinear, and curvilinear scatter diagrams, as well as one where there is no consistent relationship between *X* and *Y* variables.

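The equation that represents the simple linear regression is

$$Y_i = \alpha + \beta X_i + e_i$$

where

$Y_i$ = the value of the dependent variable in a certain observation, $i$;

$\alpha$ = the intercept of the regression line (the value of $Y$ when $X$ is zero);

$\beta$ = the slope of the regression line;

$X_i$ = the value of the independent variable in observation $i$;

$e_i$ = the sampling error in observation $i$.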

The values of both the independent variable *X* and the dependent variable *Y* are provided by a survey, or set of observed numerical samples. These sets of numbers are maintained as ordered pairs: a range of values of *Y* is indicated for each value of *X*. The value $e_i$ represents the sampling error associated with the dependent random variable.
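
As a minimal sketch of how α and β might be estimated from such ordered pairs, the code below uses ordinary least squares, the usual method for simple linear regression. The article does not name an estimation method, so this choice is an assumption. The observations are hypothetical (advertising spend paired with sales), with two observed values of *Y* for each value of *X*.

```python
def fit_simple_regression(pairs):
    """Return (alpha, beta) that minimize the sum of squared errors."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    beta = sxy / sxx                 # slope of the regression line
    alpha = mean_y - beta * mean_x   # intercept of the regression line
    return alpha, beta

# Hypothetical ordered pairs (X, Y), invented for illustration only.
observations = [(1, 12), (1, 14), (2, 15), (2, 19),
                (3, 20), (3, 22), (4, 25), (4, 27)]

alpha, beta = fit_simple_regression(observations)

# The error for each observation is the gap between observed and fitted Y.
residuals = [y - (alpha + beta * x) for x, y in observations]

print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

For these made-up numbers the fitted line is approximately Y = 8.5 + 4.3X. The residuals sum to zero (up to rounding), which is a property of any least-squares fit that includes an intercept.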

Some assumptions must be satisfied to perform the regression analysis. First, if we plot the values of *X* on a scatter diagram, the sampling error $e_i$, or variance from a mean, must be reasonably consistent for all values of *X*. In other words, for each value of *X*, the variation in values of *Y* must be reasonably consistent. This quality is called homoscedasticity.

Second, observed values of the random variable and amounts of random error must be uncorrelated, a condition usually satisfied by random sampling of the dependent variable.
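
The first assumption can be examined informally by comparing the spread of the errors at low and high values of *X*. The sketch below is a rough diagnostic and not a formal statistical test. The function names, the choice to split the data in half, and both data sets are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a simple regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_spread_ratio(xs, ys):
    """Mean squared residual in the upper half of X divided by the lower half.

    A ratio near 1 is consistent with homoscedasticity; a ratio far from 1
    suggests the spread of Y changes with X.
    """
    intercept, slope = fit_line(xs, ys)
    ordered = sorted(zip(xs, ys))  # order observations by X
    residuals = [y - (intercept + slope * x) for x, y in ordered]
    half = len(residuals) // 2
    low, high = residuals[:half], residuals[half:]
    mean_square = lambda rs: sum(r * r for r in rs) / len(rs)
    return mean_square(high) / mean_square(low)

# Hypothetical data, invented for illustration only.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
steady_ys = [1, 3, 3, 5, 5, 7, 7, 9]            # same spread at every X
fanning_ys = [1.5, 2.5, 3, 5, 4.5, 7.5, 6, 10]  # spread grows with X

steady_ratio = residual_spread_ratio(xs, steady_ys)
fanning_ratio = residual_spread_ratio(xs, fanning_ys)

print(f"steady spread ratio:  {steady_ratio:.1f}")
print(f"fanning spread ratio: {fanning_ratio:.1f}")
```

With these invented numbers the first data set gives a ratio of 1.0 and the second gives 5.0, reflecting errors that widen as *X* increases. In practice a scatter diagram of the residuals is usually inspected as well.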

A simple regression analysis uses only one independent variable. There are many situations, however, where a dependent variable is determined by 2, 3, 5, or even 100 independent variables. As a result, it becomes difficult to represent the relationships between the variables in a visual model.

For example, a simple regression with two variables can be represented on a graph, with one variable measured on the X axis and the other on the Y axis. But add a third variable, and the graph requires a third dimension, X2. As a result, the regression line becomes a regression plane.

Add a fourth variable, and the regression can no longer be represented visually. Conceptually, it has four dimensions, and the fitted surface is called a hyperplane rather than a line or a plane. The same applies for regressions with even more variables; eight variables require eight dimensions.

These relationships can be expressed in complex mathematical formulas. They are no longer simple regressions, but multiple regressions.
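
A multiple regression with two independent variables might be sketched as follows. The code assumes the usual linear form, Y = α + β1·X1 + β2·X2 plus an error term, and estimates the coefficients by least squares through the normal equations. Neither the form nor the method is spelled out in the article, so both are assumptions. The helper names and the data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular, for example when
    one independent variable is an exact linear function of the others.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_multiple_regression(rows, ys):
    """Least-squares coefficients [alpha, beta_1, ..., beta_k]."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated exactly from y = 5 + 2*x1 + 3*x2 (no error term),
# so the fit should recover those coefficients.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
responses = [5 + 2 * x1 + 3 * x2 for x1, x2 in predictors]

coefficients = fit_multiple_regression(predictors, responses)
print([round(c, 6) for c in coefficients])
```

The same function accepts any number of independent variables per row, which is why the algebra scales to cases that can no longer be drawn. Real analyses typically rely on a statistics package rather than hand-written elimination.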

*[John Simley, updated by Kevin J. Murphy]*

