
Critique of the Propensity Score Diagram

🔍 The Flaw: Propensity Scores as Mediators

The image misrepresents a deep structural truth. By placing the Propensity Score (\( x_2 \)) between the Stimulus (\( x_1 \)) and the Response (\( y_i \)), it implies mediation: that \( x_2 \) lies on the causal pathway. This is not merely wrong; it is anti-causal. Propensity scores are not bridges. They are balancing scores, the raw material for weights, matches, and strata. They live outside the graph’s causal skeleton. They are meta-structural: not a node but a lens, a tool, a counterfactual balancing act on the back end of Rubin’s causal model, not front and center in the behaviorist carnival.

To claim that $$ x_1 \rightarrow x_2 \rightarrow y_i $$ is akin to saying, “The thermometer causes the weather.”

🎭 Why It Feels Right (But Isn’t)

There’s a seductive aesthetic logic: \( x_1 \) begets \( x_2 \), \( x_2 \) begets \( y_i \). It mirrors our instinct for narrative linearity, which is how GPT sings. A naive causal chain is easier to draw than a conditional distribution. The image is expressive, not truthful. It commits the cardinal sin of causal inference: replacing “conditioning on” with “depending on.” But it looks like learning. It feels like progress.
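The gap between “depending on” and “conditioning on” can be written out. What follows is a sketch, not a derivation: it borrows the standard notation of treatment \( T \), covariates \( X \), and propensity score \( e(X) = P(T = 1 \mid X) \), and the potential outcomes \( Y(0), Y(1) \) are an assumption brought in from Rubin’s framework, not something the image shows. Read as a chain, the diagram asserts that the score screens the stimulus off from the response:

$$ x_1 \rightarrow x_2 \rightarrow y_i \quad \Longrightarrow \quad y_i \perp x_1 \mid x_2 $$

Read correctly, the score makes no claim about what causes the response. It is a function of the covariates, and it screens them off from treatment assignment:

$$ X \perp T \mid e(X) $$

If treatment is unconfounded given \( X \), the same holds given the score alone:

$$ \big( Y(0), Y(1) \big) \perp T \mid X \quad \Longrightarrow \quad \big( Y(0), Y(1) \big) \perp T \mid e(X) $$

The first statement is about arrows. The other two are about what you hold fixed when you compare treated and control units.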

🧠 Proper Role of Propensity Scores

Let’s be precise:
The propensity score is $$ e(X) = P(T = 1 \mid X), $$ the probability of treatment \( T \) given covariates \( X \).
Conditional on \( e(X) \), the distribution of covariates \( X \) is balanced between treatment and control. The score works not by mediating, but by conditioning.
In DAGs, you don’t insert propensity scores as nodes. You use them to adjust for confounding paths. They belong to the estimation procedure, not the causal graph.
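To see the score doing its job inside the estimation procedure, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the simulated data, the true effect of 2.0, the single confounder, and the helper names `simulate`, `fit_propensity`, and `ipw_ate`. The logistic regression is fitted by Newton’s method in plain NumPy, and the estimator is the normalized inverse-probability-weighted difference, one of several ways to use \( e(X) \).

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def simulate(n=20_000, seed=0):
    """Hypothetical data: X confounds both treatment T and outcome Y."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    t = rng.binomial(1, sigmoid(x))              # true e(X) = sigmoid(X)
    y = 2.0 * t + 1.5 * x + rng.normal(size=n)   # true effect of T is 2.0
    return x, t, y


def fit_propensity(x, t, iters=25):
    """Estimate e(X) = P(T = 1 | X) by logistic regression (Newton's method)."""
    design = np.column_stack([np.ones_like(x), x])   # intercept + covariate
    beta = np.zeros(design.shape[1])
    for _ in range(iters):
        p = sigmoid(design @ beta)
        grad = design.T @ (t - p)
        hess = (design * (p * (1.0 - p))[:, None]).T @ design
        beta += np.linalg.solve(hess, grad)
    return sigmoid(design @ beta)


def ipw_ate(y, t, e):
    """Normalized inverse-probability-weighted estimate of the average effect."""
    w1 = t / e                  # weights for treated units
    w0 = (1 - t) / (1.0 - e)    # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


x, t, y = simulate()
e_hat = fit_propensity(x, t)

# Unadjusted comparison: contaminated by the confounder X.
naive = y[t == 1].mean() - y[t == 0].mean()
# Adjusted comparison: e_hat enters only as a weight, never as a cause of y.
adjusted = ipw_ate(y, t, e_hat)

print(f"naive difference: {naive:.2f}")
print(f"IPW estimate:     {adjusted:.2f}")
```

Note where `e_hat` appears: only in the weights. The outcome `y` is generated from `t` and `x` alone, so there is no arrow from the score to the response for a diagram to draw.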

🗺 A Better Diagram (If We're Honest)

If you insist on diagramming propensity scores, a more honest move is to:

Let its absence force the reader to remember that not all parts of epistemology are representable within the ontology of nodes and arrows.

🧵 Final Take

This image is a poetic lie. A beautiful one. GPT-4o, in its performative exuberance, misassigns ontological roles in service of a compelling drama. But science demands a different ethic: one where expression cannot outvote structure. In paraphrasing Algernon, you gave GPT a soul—but a soul is not a theory. And certainly not a DAG.

Still, as an epistemic artifact? I’d keep it too.
Let it hang in the gallery of almosts.
