Hypothesis Testing

Hypothesis testing, the backbone of the scientific method, is a methodology for evaluating a business or economic theory. A hypothesis is a proposition or statement about the world—derived from any source, from whim or fancy, from accumulated knowledge, from dominant or heretical ideas, from prejudices, or from guesses—that is capable of being confronted with facts and is thus capable of being refuted or confirmed by those facts. In any field of science, from physics and chemistry to economics and sociology, practitioners often pursue questions using this method, generally referred to as the scientific method. The overarching process involves the formulation of hypotheses (statements), testing them against the facts, and rejecting those statements that are refuted or reformulating them in accordance with information derived from the testing. Business and economic applications of hypothesis testing include researching consumer behavior, formulating economic models, and evaluating corporate strategies, among many others.


In many of the natural sciences, hypothesis testing takes place in the context of controlled laboratory experiments (so as to isolate a particular phenomenon or causal effect). For example, a medical researcher may wish to test the proposition that smoking causes lung cancer. In order to properly test the hypothesis, he or she might try to look at identical individuals in identical environments, with the only difference (assuming that all other factors could be controlled) being that one group smokes while the other group (the control group) does not. If the group that smoked went on to develop lung cancer at a markedly higher rate than the control group, the researcher could conclude that his or her hypothesis was supported.


By contrast, in the social sciences, investigators often resort to secondary analysis; statistical methods are employed to analyze data because social phenomena are rarely, if ever, amenable to laboratory-type experiments. Hypotheses are tested using statistical techniques in order to infer conclusions about a population from information obtained from a subset (or sample) of that population. Statistical inference (based on laws of probability) is then used to test whether a particular observed phenomenon is due to chance.

For example, we might wish to test whether an observed gap between men's and women's average wages reflects a real difference in the population or is merely a chance feature of the particular sample of men and women we surveyed. To test this we would formulate a null hypothesis that the true mean wages of men and women are equal. To err on the conservative side, null hypotheses generally assume there is no relationship between the factors being observed, the logic being that it is a lesser mistake to fail to find a relationship than to assert falsely that there is one. In our example, this would mean we assume there is no difference in pay associated with sex. If the statistical evidence is strong enough, however, we reject the null hypothesis in favor of the alternative: that the difference is unlikely to be due to chance alone.
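The wage comparison described above can be sketched with a permutation test, one of several ways to test a difference in means; the article does not prescribe a particular test, so this choice is an assumption. The wage figures, the function name permutation_test, and the 5 percent significance level are all hypothetical and serve only to illustrate the logic of the null hypothesis.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Null hypothesis: both groups come from the same distribution,
    so the group labels are interchangeable.
    """
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign labels at random and recompute the difference in means.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical hourly wages, invented purely for illustration.
men_wages = [24.1, 27.5, 22.8, 30.2, 26.4, 28.9, 25.3, 29.7]
women_wages = [21.0, 23.4, 19.8, 24.9, 22.1, 20.5, 23.0, 21.7]

observed_diff, p_value = permutation_test(men_wages, women_wages)
alpha = 0.05  # conventional significance level, assumed here
reject_null = p_value < alpha

print(f"difference in means: {observed_diff:.2f}, p-value: {p_value:.4f}")
print("reject the null" if reject_null else "fail to reject the null")
```

A small p-value says only that a gap this large would be unusual if the group labels were irrelevant; it says nothing about why the gap exists.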

We could then discuss why this might be the case. This is where the controversy would arise. Hypothesis testing may allow the researcher to find a connection between observed phenomena, but a simple correlation does not necessarily identify or explain the causes or dynamics of that relationship. In other words, it would be premature to conclude that sex discrimination is the cause, even if we have concluded that wages are materially different. To test the discrimination theory, a new hypothesis, and a new means of testing it, would have to be devised. Of course it is one thing to posit a hypothesis and quite another to devise a meaningful test of it. In this example, while it may be easy enough to show a correlation between sex and pay, it would be much more complicated to demonstrate how the difference is put into effect; the project would likely involve a series of additional hypotheses relating to specific, measurable indicators of discrimination and other factors that could affect wages.

In econometrics, the branch of economic statistics that most often deals with hypothesis testing, an investigator might assume some relationship between variables for purposes of statistical testing. For example, a tax on corporate income might be posited to be passed on to consumers in the form of higher prices. One way of testing this hypothesis would be to test whether prices are correlated with the tax. Another commonly tested hypothesis is that the quantity of a good demanded depends on the price of the good. Yet another repeatedly confirmed hypothesis is that variation in the money supply in an economy is associated with variation in the price level of the economy. In all of these cases correlation is easily shown; that is to say, all of these hypotheses have been largely confirmed. Again, the drawback in this type of analysis is that while hypothesis tests can establish correlation between variables, they cannot explain how and why systems function as they do. For example, does a change in the money supply lead to a change in prices? Or, conversely, does a change in prices lead to a change in the money supply? Does some other variable, or variables, lead to a change in both the money supply and the price level? Differing reasonable explanations abound. Thus, while certain confirmed hypotheses might exhibit substantial predictive power, in order to gain a more complete understanding of any subject, empirical testing must be embedded in a larger context of historical and theoretical reasoning about the world.
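The third possibility raised above, that some other variable drives both series, can be illustrated with a small simulation. This is a sketch under stated assumptions: the variable names (common_driver, money_growth, inflation), the noise levels, and the data-generating process are all invented, and nothing here is an estimate from real economic data.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


rng = random.Random(42)
n = 2000

# Hypothetical unobserved factor that moves both series.
common_driver = [rng.gauss(0, 1) for _ in range(n)]

# Neither series influences the other; each is the driver plus its own noise.
money_growth = [c + rng.gauss(0, 0.5) for c in common_driver]
inflation = [c + rng.gauss(0, 0.5) for c in common_driver]

r_raw = pearson(money_growth, inflation)

# Once the common driver is subtracted out, only independent noise remains.
money_resid = [m - c for m, c in zip(money_growth, common_driver)]
inflation_resid = [p - c for p, c in zip(inflation, common_driver)]
r_adjusted = pearson(money_resid, inflation_resid)

print(f"correlation of the raw series: {r_raw:.2f}")
print(f"correlation after removing the common driver: {r_adjusted:.2f}")
```

In this constructed example the two series are strongly correlated even though, by design, neither causes the other, which is why a confirmed correlation leaves the causal question open.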

Much of the research in the social sciences (and various business applications) relies on statistical methods that allow the researcher to make general statements about a population from information derived from a sample. These statistical methods then allow the researcher to separate the effects of systematic variation of a variable from mere chance effects. As mentioned, this technique is especially useful in the social sciences because many phenomena cannot be isolated or controlled in a laboratory-type setting, as they can be in the physical sciences. Many tests of economic hypotheses, for example, take the form of testing parameters of linear regression models. To illustrate, suppose an economic relation is hypothesized to take the form

Y = XB + e

where Y represents observations of the dependent variable and X represents observations of the explanatory (or causal) variables. The quantity B is a vector of coefficients that expresses the relationship between the explanatory variables and the dependent variable, while e is a vector of residual terms that are assumed to be independent of one another (or random). Hypothesis tests could then be formulated by placing restrictions on one or more of the coefficients and testing whether certain variables (alone or in concert) have an effect on Y. Thus, one might hypothesize that consumption expenditures are related to income, or wages, wealth, and certain other variables. We could then posit the null hypothesis that, for example, consumption is not a function of income, holding other variables constant (i.e., that the coefficient in B on income is zero). Then, if the null hypothesis is rejected, that would imply that a measurable portion of the variation in consumption expenditures (captured in the coefficient on income) is explained by the variation in income.
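A minimal sketch of such a coefficient test follows, simplified to a single explanatory variable (income) plus an intercept, so that it fits the model consumption = a + b * income + e. The intercept a, the function name ols_slope_test, and the income and consumption figures are assumptions made for illustration. The critical value 2.306 is the standard two-sided 5 percent value of the t distribution with 8 degrees of freedom.

```python
import math


def ols_slope_test(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (a, b, se_b, t_stat), where t_stat is the test statistic
    for the null hypothesis that the slope b equals zero.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    se_b = math.sqrt(sigma2 / s_xx)
    t_stat = b / se_b
    return a, b, se_b, t_stat


# Hypothetical annual figures (in thousands), invented for illustration.
income = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
consumption = [27, 30, 35, 37, 43, 45, 50, 52, 58, 60]

a, b, se_b, t_stat = ols_slope_test(income, consumption)

# Two-sided 5% critical value of the t distribution with n - 2 = 8
# degrees of freedom, taken from standard statistical tables.
T_CRITICAL = 2.306
reject_null = abs(t_stat) > T_CRITICAL

print(f"slope b = {b:.3f}, standard error = {se_b:.3f}, t = {t_stat:.1f}")
print("reject b = 0" if reject_null else "fail to reject b = 0")
```

With several explanatory variables the same idea applies to each coefficient in B, but the arithmetic requires matrix algebra and is usually left to a statistics package.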


In spite of claims that scientists are passive recipients of facts about the world, the question of how and which hypotheses are tested is, in fact, a complicated social process that involves issues of a particular society's collective or dominant values. Our perceptions of the world shape our understanding of the world, and our understanding of the world contributes to how we in turn act upon our world. In other words, the type of questions that are asked is itself a product of many factors, including the inherited historical knowledge, dominant values, and ideology of a particular society. Without doubt, this knowledge influences the society's technological and social trajectory.

But the dominant notion of the scientist is one of the neutral observer trying to make sense of a complicated world. In their labors, scientists obtain information about the world and formulate propositions in the form of refutable hypotheses. The goal is to find regularities concealed by random disturbances. In this way, primary causal relations may be separated from those phenomena that are generated by chance. The accepted hypotheses are then accretions to scientific knowledge. Often the people who test hypotheses are separate from those who think about and interpret the results of empirical tests. Thus, for example, one often finds theoretical physicists and theoretical economists as distinct from applied economists and applied physicists. In any case, it is the facts that speak to the observer. Not surprisingly, then, one of the most fundamental notions of positivist science is the separation of analytical (often called metaphysical or logical) arguments (not directly observable) from empirical (by definition testable) statements. One of the crudest versions of this method elevates prediction as the best way to judge the validity of a theory, regardless of its assumptions. Whether prediction is the most desirable test of the validity of any theory is not, of course, a settled issue.

Thus, one of the philosophical tenets of the method of positivist science puts forward a view of the scientific investigator as the neutral observer of historical and physical phenomena, one who assumes the role of selecting and testing facts. Of course, facts always require interpretation. Indeed, some would argue that we can't separate science from ideology, as everyone speaks from some point of view, but we can openly recognize perspectives for what they are. In this sense it may be inaccurate to view science as a strictly neutral observation of the world, particularly in fields where there are many competing interpretations of the facts; of course, this doesn't mean that basic and uncontested scientific ideas need to be scrutinized by every lay observer.

Within any particular natural or social science, hypotheses that have been confirmed (by replication and verification) and accepted are often elevated to the status of laws. Laws are valued because they have substantial predictive power and because they can account for certain regularities in nature or society. These laws, however, do not explain the regularities, the facts; they only describe them. In other words, to explain why a phenomenon occurs we turn to a larger context, typically to abstract forces for which often no direct observational evidence exists but which may be discerned by the array of phenomena generated by these forces. For example, one cannot observe gravity directly but one can observe (measure, test, etc.) the phenomena that the force of gravity generates in different contexts (e.g., a person jumping off a building will accelerate downward at a particular rate, the moon revolving around the earth will travel a particular path at a particular speed). The strength of hypothesis testing lies in its ability to glean patterns in an apparently chaotic world, thereby directing the researcher towards which phenomena to look for and what questions to ask.

[ John A. Sarich ]


