Why Look under the Hood

Originally published in The Philosophy of Economics: An Anthology Second edition. Edited by Daniel M. Hausman. Cambridge: Cambridge University Press, pp. 17-21. Original pagination shown in square brackets.

Methodologists have had few kind words for Milton Friedman's "The Methodology of Positive Economics" (1953c), yet its influence persists. Why? One answer is that methodologists have missed an important argument, which economists have found persuasive. Unlike Hirsch and de Marchi (1990), I am concerned in this note with this argument, not with "what Friedman really meant."

Friedman declares, "The ultimate goal of a positive science is the development of a 'theory' or 'hypothesis' that yields valid and meaningful (i.e., not truistic) predictions about phenomena not yet observed." (p. 7) This is the central thesis of instrumentalism. But from a standard instrumentalist perspective, in which all the observable consequences of a theory are significant, it is impossible to defend Friedman's central claim that the realism of assumptions is irrelevant to the assessment of a scientific theory. For the assumptions of economics are testable, and a standard instrumentalist would not dismiss apparent disconfirmations. Indeed the distinction between assumptions and implications is superficial. The survey results reported by Richard Lester and others, which Friedman finds irrelevant and wrong-headed (pp. 15, 31f), are as much predictions of neoclassical theory as are claims about market phenomena.

But, like Lawrence Boland (1979), I contend that Friedman is not a standard instrumentalist. Consider the following passages:

Viewed as a body of substantive hypotheses, theory is to be judged by its predictive power for the class of phenomena which it is intended to "explain." (pp. 8-9)

[p. 218]

For this test [of predictions] to be relevant, the deduced facts must be about the class of phenomena the hypothesis is designed to explain;... (pp. 12-13)

The decisive test is whether the hypothesis works for the phenomena it purports to explain. (p. 30)1

Friedman rejects a standard instrumentalist concern with all the predictions of a theory. A good tool need not be an all-purpose tool. Friedman holds that the goal of economics is "narrow predictive success"--correct prediction only for "the class of phenomena the hypothesis is designed to explain." Lester's surveys are irrelevant because their results are not among the phenomena that the theory of the firm was designed to explain. On just these grounds, many economists dismiss any inquiry into whether the claims of the theory of consumer choice are true of individuals.

I suggest that Friedman uses this view that science aims at narrow predictive success as a premise in the following implicit argument:

1. A good hypothesis provides valid and meaningful predictions concerning the class of phenomena it is intended to explain. (premise)

2. The only test of whether an hypothesis is a good hypothesis is whether it provides valid and meaningful predictions concerning the class of phenomena it is intended to explain.2 (invalidly from 1)

3. Any other facts about an hypothesis, including whether its assumptions are realistic, are irrelevant to its scientific assessment. (trivially from 2)

If (1) the criterion of a good theory is narrow predictive success, then surely (2) the test of a good theory is narrow predictive success, and Friedman's claim that the realism of assumptions is irrelevant follows trivially. This is a tempting and persuasive argument.

But it is fallacious. (2) is not true, and it does not follow from (1). To see why, consider the following analogous argument.

1'. A good used car drives safely, economically and comfortably. (oversimplified premise)

2'. The only test of whether a used car is a good used car is to check whether it drives safely, economically and comfortably. (invalidly from 1')

3'. Anything one discovers by opening the hood and checking the separate components of a used car is irrelevant to its assessment. (trivially from 2')

Presumably nobody believes 3'.3 What is wrong with the argument? It assumes that a road test is a conclusive test of a car's future performance.

[p. 219]

If this assumption were true, if it were possible (and cheap) to do a total check of the performance of a used car for the whole of its future, then there would indeed be no point in looking under the hood. For we would know everything about its performance, which is all we care about. But a road test only provides a small sample of this performance. Thus a mechanic who examines the engine can provide relevant and useful information. The mechanic's input is particularly important when one wants to use the car under new circumstances and when the car breaks down. Obviously one wants a sensible mechanic who notes not just that the components are used and imperfect, but who can judge how well the components are likely to serve their separate purposes.

Similarly, given Friedman's view of the goal of science, there would be no point to examining the assumptions of a theory if it were possible to do a "total" assessment of its performance with respect to the phenomena it was designed to explain. But one cannot make such an assessment. Indeed the point of a theory is to guide us in circumstances where we do not already know whether the predictions are correct.4 There is thus much that may be learned by examining the components (assumptions) of a theory and its "irrelevant" predictions. Such consideration of the "realism" of assumptions is particularly important when extending the theory to new circumstances or when revising it in the face of predictive failure.5 Again what is relevant is not whether the assumptions are perfectly true, but whether they are adequate approximations and whether their falsehood is likely to matter for particular purposes. Saying this is not conceding Friedman's case. Wide, not narrow, predictive success constitutes the grounds for judging whether a theory's assumptions are adequate approximations. The fact that a computer program works in a few instances does not render study of its algorithm and code superfluous or irrelevant.

There is a grain of truth in Friedman's defense of theories containing unrealistic assumptions. For some failures of assumptions may be irrelevant. Just as a malfunctioning air-conditioner is insignificant to a car's performance in Alaska, so is the falsity of the assumption of infinite divisibility unimportant in hypotheses concerning markets for basic grains. Given Friedman's narrow view of the goals of science (which I am conceding for the purposes of argument, but would otherwise contest), the realism of assumptions may thus sometimes be irrelevant. But this bit of practical wisdom does not support Friedman's strong conclusion that only narrow predictive success is relevant to the assessment of an hypothesis.

One should note three qualifications. First, we sometimes have a wealth of information concerning the track record of both theories and

[p. 220]

of used cars. I may know that my friend's old Mustang has been running without trouble for the past seven years. The more information we have about performance, the less important is separate examination of components. But it remains sensible to assess assumptions or components, particularly in circumstances of breakdown and when considering a new use. Second, intellectual tools, unlike mechanical tools, do not wear out. But if one has not yet grasped the fundamental laws governing a subject matter and does not fully know the scope of the laws and the boundary conditions on their validity, then generalizations are as likely to break down as are physical implements. Third (as Erkki Koskela reminded me), it is easier to interpret a road test than an econometric study. The difficulties of testing in economics make it all the more mandatory to look under the hood.

When either theories or used cars work, it makes sense to use them--although caution is in order if their parts have not been examined or appear to be faulty. But known performance in some sample of their given tasks is not the only information relevant to the assessment of either. Economists must (and do) look under the hoods of their theoretical vehicles. When they find embarrassing things there, they must not avert their eyes and claim that what they have found cannot matter. Even if all one cares about is predictive success in some limited domain, one should still be concerned about the realism of the assumptions of an hypothesis and the truth of its irrelevant or unimportant predictions.


Boland, Lawrence, 1979. "A Critique of Friedman's Critics," Journal of Economic Literature 17: 503-22.

Friedman, Milton, 1953c. "The Methodology of Positive Economics." pp. 3-43 of Essays in Positive Economics. Chicago: University of Chicago Press.

Hammond, J. Daniel, unpublished. "Early Drafts of Friedman's Methodological Essay," delivered at the History of Economics Society Meetings, June, 1991.

Hirsch, Abraham and Neil de Marchi, 1990. Milton Friedman: Economics in Theory and Practice. Ann Arbor: University of Michigan Press.


* I would like to thank John Dreher, Martin Finkler, Daniel Hammond, Erkki Koskela, Michael McPherson and Herbert Simon for useful criticisms and suggestions.

1. See also pp. 15, 20 and 41.

2. Notice that (2) does not say that the only test of a hypothesis is whether its predictions are valid. It says that the only test is the validity of only some of its predictions, namely those concerning "the class of phenomena the hypothesis is intended to explain." This is overstated, and (to repeat) I am not concerned to provide the best interpretation of Friedman's whole methodology. In his essay Friedman concedes a role for assumptions in facilitating an "indirect" test of a theory: "Yet, in the absence of other evidence, the success of the hypothesis for one purpose--in explaining one class of phenomena--will give us greater confidence than we would otherwise have that it may succeed for another purpose--in explaining another class of phenomena. It is much harder to say how much greater confidence it justifies. For this depends on how closely related we judge the two classes of phenomena to be,..." (p. 28). The last sentence still limits the relevance of the correctness of predictions concerning phenomena that are remote from those which the theory is designed to explain, and Friedman clearly believes that the evidential force of indirect tests is much less than that of tests concerning the range of phenomena that the theory is

[p. 221]

intended to "explain." Daniel Hammond (unpublished) has argued that these qualifications were not part of the original draft of the essay.

3. Those who do should get in touch. I've got some fine old cars for you at bargain prices.

4. Friedman partially recognizes this point when he writes (according to Hammond, here echoing criticisms George Stigler and Arthur Burns offered of an earlier draft), "The decisive test is whether the hypothesis works for the phenomena it purports to explain. But a judgment may be required before any satisfactory test of this kind has been made, and, perhaps, when it cannot be made in the near future, in which case, the judgment will have to be based on the inadequate evidence available." (1953c, p. 30)

5. With what seems to me inconsistent good sense, Friedman again partly recognizes the point: "I do not mean to imply that questionnaire studies of businessmen's or others' motives or beliefs about the forces affecting their behavior are useless for all purposes in economics. They may be extremely valuable in suggesting leads to follow in accounting for divergences between predicted and observed results; that is, in constructing new hypotheses or revising old ones. Whatever their suggestive value in this respect, they seem to me almost entirely useless as a means of testing the validity of economic hypotheses." (1953c, p. 31n)