On The Book of Why

With the semester over, I am finally able to get back to a bit of less-academic writing, in this case about my bedtime reading: a review of The Book of Why by Judea Pearl.

I was put onto this book by the Nature Podcast, which reviewed it in their books segment. Pearl’s ideas had been known on the fringes of the statistics community for years and largely dismissed with “Well, if you call part of a model causal, the inference is rather circular.” But enough else was happening in causality within statistics that it seemed a good juncture to actually look at how badly wrong that statement was.

As it turns out, it’s not far off. But that doesn’t mean that Pearl’s ideas are devoid of content (or of no relevance to statistics). In fact, I find myself fairly violently conflicted about his project, which might make for a more-interesting-than-usual review.

The first thing to say is that this is excellent bedtime reading for a statistician. In fact it’s hard to work out who else might be the intended audience. It requires far too much statistical background for a general audience (even, I expect, for most computer scientists) but has the right level of informal discussion for me to read when too tired for technical work. I fear that may have limited its sales, but it certainly worked well for me.

The second is that it is worth slogging through the opening chapters. These are prototypical examples of CS salesmanship (“the new science of causation”???), both over-promising the remainder of the book and under-rating the contributions of others, particularly in statistics. I spent a good deal of time wanting to throw the book across the room as Pearl airily dismissed Fisher as being completely uninterested in causation (What did he think randomized trials were for?) and dismissed the entire field of statistics as following from that tradition (Had he not heard of Don Rubin or Jamie Robins? How did he think he was going to get inferences out of data, whether he called them causal or not? What about structural equation models?) As it turns out, Pearl does in fact later discuss RCTs as a gold standard (though not as the only standard), and both Rubin and Robins play large roles. Indeed, it’s hard to find a non-historical figure in the book who isn’t either Pearl’s student (and he’s admirably supportive of these) or a statistician! I rather suspect Pearl should have been a statistician, though I don’t know that he’d be prepared to admit that.

But, after plowing through the aggravation, we get down to business. I’ll divide the book into two broad topics. The first of these is, to me, relatively uncontroversial (at least in parts): causality had, surprisingly, not been formalized in terms of probabilistic mathematical models. The do-calculus (I find the name awkward, but at least it’s descriptive) provides this and, along the way, some insight about which relationships you should examine in data if you are looking for downstream causal effects.
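To make that concrete, here is a toy simulation of my own (the graph, numbers, and variable names are my invention, not an example from the book) of the backdoor adjustment that the do-calculus licenses: with a confounder Z affecting both treatment X and outcome Y, the naive contrast between treated and untreated is biased, but averaging the Z-stratified contrasts over the distribution of Z recovers the causal effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Confounded graph: Z -> X, Z -> Y, X -> Y.  True effect of X on Y is 1.0.
z = rng.binomial(1, 0.5, size=n)
x = rng.binomial(1, 0.2 + 0.6 * z)          # Z raises the chance of treatment
y = 1.0 * x + 2.0 * z + rng.normal(size=n)  # Z also raises the outcome

# Naive contrast is biased upward by the confounder (about 2.2 here).
naive = y[x == 1].mean() - y[x == 0].mean()

# Backdoor adjustment: average the Z-stratified contrasts over P(Z).
adjusted = sum(
    (y[(x == 1) & (z == v)].mean() - y[(x == 0) & (z == v)].mean()) * (z == v).mean()
    for v in (0, 1)
)
print(round(naive, 2), round(adjusted, 2))  # adjusted comes back near 1.0
```

Of course, this only works because I wrote down the graph myself and so know that Z closes every backdoor path — which is precisely the sticking point I come back to below.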

“Don’t control for mediators” is, for example, a statistically counter-intuitive statement, though it makes causal sense. (Except the statistician in me is more inclined to condition, and then reconstruct the causal effect — I haven’t yet worked out if that leads to a loss of efficiency or the other way around.) He also shows how to view a set of not-obviously-causal problems through this lens, which is certainly interesting and useful. I’ll leave it to others who have looked at the do-calculus more carefully to assess the philosophical contribution (although things are slipperier for continuous quantities than for the discrete values that Pearl — clearly a computer scientist here — finds more comfortable), but I’ll readily admit to surprise at the realisation that it hadn’t been formalized a century or so ago. Statistical application or not, that is an important contribution within philosophy. Some of this really is slippery, though; I’d like to see the Rubin potential-outcomes framework re-written in do-calculus (I think this is possible) or how this relates to the much weaker notion of Granger causality. I’m less convinced that Pearl’s “ladder of causation” (Maybe step-ladder? It has but three rungs) is really as clear as all that, but I’ll accept it as a means of introducing the lay person to thinking about things.
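As a quick sanity check of the “don’t control for mediators” point (again my own toy simulation, not the book’s): in a chain X → M → Y with an additional direct X → Y path, regressing Y on X alone recovers the total causal effect, while also conditioning on the mediator M leaves only the direct effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Causal chain: X -> M -> Y, with X also acting directly on Y.
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(size=n)            # mediator
y = 0.5 * x + 0.7 * m + rng.normal(size=n)  # total effect: 0.5 + 0.8 * 0.7 = 1.06

# Regressing Y on X alone recovers the total causal effect.
b_total = np.polyfit(x, y, 1)[0]

# "Controlling" for M (regressing Y on X and M) leaves only the direct effect.
X = np.column_stack([x, m, np.ones(n)])
b_direct = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(round(b_total, 2))   # near 1.06 (total effect)
print(round(b_direct, 2))  # near 0.5  (direct effect only)
```

Both coefficients are perfectly good answers to *some* question; the causal diagram is what tells you which question each one answers.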

Along the way, he skewers statisticians’ traditional interpretation of their own linear models. See my complaints on this at Weasel Words, where it’s not only causal relationships that are ignored (contrary to Pearl, I think a comparison of individuals is still interpretable even though not causal), but those comparisons are, too.

The second issue is much more controversial: that scientists are far too afraid of causal language (and in particular, that this is statisticians’ fault). The argument for this claim goes something like:

  1. Scientists do, in fact, think that they are finding causal relationships.
  2. They are hampered in discussing this by depressing party-pooper statisticians forever warning about hidden confounders.

And in fact I agree with both of these, and recall thinking 1. on many occasions myself! Indeed, I also find myself in emphatic agreement with Pearl’s frustration at statisticians’ refusal to go outside their own (fairly narrow) toolbox and incorporate an understanding of the domains they work with. See, for example, my comments in Interpretation and Mechanistic Models, although I will come back to that (and I have recently been guilty myself of having too little time to develop enough depth to really work with a problem).

BUT: these claims, and the benefits of causal understanding, are easy to make with the rather pat models that Pearl produces. It’s much harder in the real world where, as in nutrition or public health, the unlicensed causal interpretation of findings and the constant attenuation of cautionary language (e.g. recently by Andrew Gelman) produce fads and fashions, and much more malicious effects, too. My sister’s reaction to this set of ideas was that encouraging more unsupported causal interpretation would be disastrous in public health.

And in fact, this really is the nub: Pearl’s analysis works beautifully once you have agreed on what the causal relationships are. But almost everywhere, that agreement doesn’t exist (for a nice interaction with the fairness debate, see this paper). Even in The Book of Why, I kept looking at the already-overly-simple models and saying “I’m not sure that’s right.” How on earth are we to agree on causal relations in real scenarios? Every example in Chapter 9 illustrates this for me.

Now I’m pretty sure that Pearl’s response would be that the solution to this isn’t to avoid discussing causation, but to make it explicit. That is to say “A causal claim is being made here, let’s explicitly discuss that and the evidence for it.” He even (although only once, and in passing) countenances establishing tiers of evidence for causal effects: randomized trials, from observational data, etc.

And I’m generally sympathetic to “Let’s be honest about what we think is going on,” even after accounting for “But humans have a horrible tendency to run away with an idea.” And it might be worthwhile to produce a somewhat more formalized framework for discussing causal claims. But the discussion sections of papers do, in fact, often make it clear what the authors think the causal relations are. These are not stated elsewhere precisely because the evidence for them is weak.

I think that what Pearl perceives as hostility to causation on the part of statistics can be reasonably attributed to caution. Statistics suffers from an over-abundance of this, partly due to statisticians’ unwillingness (or lack of time) to really get involved with a subject. This leads us to be rather scared of writing down anything but the weakest models, and it certainly leads us to be highly skeptical of causal statements where the do-calculus (i.e., performing an intervention) hasn’t been physically instantiated in an experiment. That attitude is a hindrance, but it’s also born of a century of experience of poor replicability and bad scientific consequences. Even the more-often accepted Rubin analysis has a “no hidden confounders” assumption that many (myself included) find hard to swallow.

Now I agree with Pearl that statisticians are far too scared of writing down a model. One doesn’t have to be Bayesian to say “Let’s start with writing down what I think I know and see how that agrees with the data” — we do that in sample size calculations already. But I’m really not persuaded that sticking with linear models or categorical relationships is the way to do this. One thing that Pearl leaves out of this book is mechanism: how, physically, does this effect take place? (OK, this comes up as mediation, but in a quite different context.) More importantly, can I transfer this understanding to a different system? Without this sort of understanding — and that’s very difficult — causality has relatively limited uses. But Pearl’s observation-based and rather pat models don’t really get us anything like that.

So do read The Book of Why; it is a set of ideas that you should know about, at least informally, and it is useful to think about more than just phenomenological correlation. But then also go and read some physics, or some mathematical biology as well. Statisticians need a really good dose of both.
