
Commit 76ea3c2

drbenvincent authored and fonnesbeck committed
Further updates to Simpson's paradox notebook (pymc-devs#709)
1 parent d57efe8 · commit 76ea3c2

File tree

3 files changed: +2,108 −2,166 lines changed

examples/causal_inference/GLM-simpsons-paradox.ipynb (+2,094)

Large diffs are not rendered by default.

examples/generalized_linear_models/GLM-simpsons-paradox.myst.md renamed to examples/causal_inference/GLM-simpsons-paradox.myst.md (+14 −9)
@@ -11,10 +11,10 @@ kernelspec:
 ---
 
 (GLM-simpsons-paradox)=
-# Simpson's paradox and mixed models
+# Simpson's paradox
 
 :::{post} September, 2024
-:tags: regression, hierarchical model, linear model, posterior predictive, Simpson's paradox
+:tags: regression, hierarchical model, linear model, posterior predictive, causal inference, Simpson's paradox
 :category: beginner
 :author: Benjamin T. Vincent
 :::
@@ -25,7 +25,9 @@ kernelspec:
 
 ![](https://upload.wikimedia.org/wikipedia/commons/f/fb/Simpsons_paradox_-_animation.gif)
 
-This paradox can be resolved by assuming a causal DAG which includes how the main predictor variable _and_ group membership influence the outcome variable. We demonstrate an example where we _don't_ incorporate group membership (so our causal DAG is wrong, or in other words our model is misspecified). We then show 2 ways to resolve this by including group membership as causal influence upon the outcome variable. This is shown in an unpooled model (which we could also call a fixed effects model) and a hierarchical model (which we could also call a mixed effects model).
+Another way of describing this is that we wish to estimate the causal relationship $x \rightarrow y$. The seemingly obvious approach of modelling `y ~ 1 + x` will lead us to conclude (in the situation above) that increasing $x$ causes $y$ to decrease (see Model 1 below). However, the relationship between $x$ and $y$ is confounded by a group membership variable $group$. Because this group variable is not included in the model, our estimate of the relationship between $x$ and $y$ is biased. If we now factor in the influence of $group$, then in some situations (e.g. the image above) this can completely reverse the sign of our estimate of $x \rightarrow y$, so that we now estimate that increasing $x$ causes $y$ to _increase_.
+
+In short, this 'paradox' (or simply omitted variable bias) can be resolved by assuming a causal DAG which includes how the main predictor variable _and_ group membership (the confounding variable) influence the outcome variable. We demonstrate an example where we _don't_ incorporate group membership (so our causal DAG is wrong, or in other words our model is misspecified; Model 1). We then show two ways to resolve this by including group membership as a causal influence upon $x$ and $y$: an unpooled model (Model 2) and a hierarchical model (Model 3).
 
 ```{code-cell} ipython3
 import arviz as az
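As a quick illustration of the approach the new intro text describes, a minimal PyMC sketch of the misspecified `y ~ 1 + x` regression (Model 1) could look like the following. The data, priors, and variable names below are illustrative assumptions, not the notebook's actual code:

```python
import numpy as np
import pymc as pm

# Illustrative data only: in the notebook, x and y come from a simulated grouped dataset
rng = np.random.default_rng(42)
x = rng.normal(size=100)
y = 1.0 - 0.5 * x + rng.normal(scale=0.5, size=100)

with pm.Model() as model1:
    # Model 1: y ~ 1 + x, ignoring group membership (misspecified causal DAG)
    β0 = pm.Normal("β0", mu=0, sigma=5)
    β1 = pm.Normal("β1", mu=0, sigma=5)
    σ = pm.HalfNormal("σ", sigma=2)
    μ = β0 + β1 * x
    pm.Normal("y", mu=μ, sigma=σ, observed=y)
    idata1 = pm.sample()
```

Inspecting the posterior of the slope, as the `az.plot_posterior(idata1.posterior["β1"], ref_val=0)` call in the next hunk does, is what produces the misleading slope estimate that the paradox turns on.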
@@ -260,7 +262,7 @@ ax = az.plot_posterior(idata1.posterior["β1"], ref_val=0)
 ax.set(title="Model 1 strongly suggests a positive slope", xlabel=r"$\beta_1$");
 ```
 
-## Model 2: Unpooled regression
+## Model 2: Unpooled regression with confounder included
 
 We will use the same data in this analysis, but this time we will use our knowledge that data come from groups. From a causal perspective we are exploring the notion that both $x$ and $y$ are influenced by group membership. This can be shown in the causal directed acyclic graph ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) below.

@@ -272,10 +274,15 @@ g.node(name="x", label="x")
 g.node(name="g", label="group")
 g.node(name="y", label="y")
 g.edge(tail_name="x", head_name="y")
+g.edge(tail_name="g", head_name="x")
 g.edge(tail_name="g", head_name="y")
 g
 ```
 
+We can see that $group$ is a [confounding variable](https://en.wikipedia.org/wiki/Confounding). If we want to discover the causal effect of $x$ on $y$, we therefore need to account for the confounder $group$. Model 1 did not do this and so arrived at the wrong conclusion; Model 2 will account for it.
+
++++
+
 More specifically we will essentially fit independent regressions to data within each group. This could also be described as an unpooled model. We could describe this model mathematically as:
 
 $$
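For orientation, the unpooled Model 2 that this hunk introduces could be sketched in PyMC roughly as follows, assuming `g` is an integer-coded group index; the data, priors, and names are again illustrative rather than the notebook's exact code:

```python
import numpy as np
import pymc as pm

# Illustrative grouped data: g records which group each observation belongs to
rng = np.random.default_rng(0)
n_groups = 3
g = rng.integers(0, n_groups, size=120)
x = rng.normal(loc=g.astype(float), scale=0.5)
y = g - 0.5 * (x - g) + rng.normal(scale=0.3, size=120)

coords = {"group": np.arange(n_groups)}
with pm.Model(coords=coords) as model2:
    # Model 2: independent (unpooled) intercept and slope for each group
    β0 = pm.Normal("β0", mu=0, sigma=5, dims="group")
    β1 = pm.Normal("β1", mu=0, sigma=5, dims="group")
    σ = pm.HalfNormal("σ", sigma=2)
    μ = β0[g] + β1[g] * x
    pm.Normal("y", mu=μ, sigma=σ, observed=y)
    idata2 = pm.sample()
```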
@@ -425,7 +432,7 @@ ax[0].set(
 );
 ```
 
-## Model 3: Partial pooling model
+## Model 3: Partial pooling model with confounder included
 
 Model 3 assumes the same causal DAG as model 2 (see above). However, we can go further and incorporate more knowledge about the structure of our data. Rather than treating each group as entirely independent, we can use our knowledge that these groups are drawn from a population-level distribution. We could formalise this as saying that the group-level slopes and intercepts are modeled as deflections from a population-level slope and intercept, respectively.

@@ -471,7 +478,7 @@ We can also express this Model 3 in Wilkinson notation as `1 + x + (1 + x | g)`.
 
 * The `1` captures the global intercept, $\beta_0$.
 * The `x` captures the global slope, $\beta_1$.
-* The `(1 + x | g)` term captures group specific random effects for the intercept and slope.
+* The `(1 + x | g)` term captures group specific terms for the intercept and slope.
 * `1 | g` captures the group specific intercept deflections $\vec{u_0}$ parameters.
 * `x | g` captures the group specific slope deflections $\vec{u_1}[g_i]$ parameters.
 :::
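Reading the `1 + x + (1 + x | g)` structure above as explicit PyMC code, a sketch of the partial-pooling Model 3 (reusing the illustrative `x`, `y`, `g`, and `coords` from the Model 2 sketch earlier) could look like this. It mirrors the deflection description in the hunk above, but the names, priors, and centred parameterisation are assumptions rather than the notebook's implementation:

```python
with pm.Model(coords=coords) as model3:
    # Population-level intercept and slope (the `1` and `x` terms)
    β0 = pm.Normal("β0", mu=0, sigma=5)
    β1 = pm.Normal("β1", mu=0, sigma=5)
    # Spread of the group-level deflections
    σ_u0 = pm.HalfNormal("σ_u0", sigma=2)
    σ_u1 = pm.HalfNormal("σ_u1", sigma=2)
    # Zero-centred group-specific deflections (the `(1 + x | g)` term)
    u0 = pm.Normal("u0", mu=0, sigma=σ_u0, dims="group")
    u1 = pm.Normal("u1", mu=0, sigma=σ_u1, dims="group")
    σ = pm.HalfNormal("σ", sigma=2)
    # Each group's intercept and slope is the population value plus its deflection
    μ = (β0 + u0[g]) + (β1 + u1[g]) * x
    pm.Normal("y", mu=μ, sigma=σ, observed=y)
    idata3 = pm.sample()
```

If the data lived in a pandas DataFrame with columns `x`, `y`, and `g`, a formula-based package such as Bambi could express much the same model directly from the Wilkinson notation, though the notebook builds its models in PyMC itself.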
@@ -643,16 +650,14 @@ This paradox is resolved by updating our causal DAG to include the group variabl
 
 Model 3 assumed the same causal DAG, but adds the knowledge that each of these groups are sampled from an overall population. This added the ability to make inferences not only about the regression parameters at the group level, but also at the population level.
 
-If you are interested in learning more, there are a number of other [PyMC examples](http://docs.pymc.io/nb_examples/index.html) covering hierarchical modelling and regression topics.
-
 +++
 
 ## Authors
 * Authored by [Benjamin T. Vincent](https://github.com/drbenvincent) in July 2021
 * Updated by [Benjamin T. Vincent](https://github.com/drbenvincent) in April 2022
 * Updated by [Benjamin T. Vincent](https://github.com/drbenvincent) in February 2023 to run on PyMC v5
 * Updated to use `az.extract` by [Benjamin T. Vincent](https://github.com/drbenvincent) in February 2023 ([pymc-examples#522](https://github.com/pymc-devs/pymc-examples/pull/522))
-* Updated by [Benjamin T. Vincent](https://github.com/drbenvincent) in September 2024 ([pymc-examples#697](https://github.com/pymc-devs/pymc-examples/pull/697))
+* Updated by [Benjamin T. Vincent](https://github.com/drbenvincent) in September 2024 ([pymc-examples#697](https://github.com/pymc-devs/pymc-examples/pull/697) and [pymc-examples#709](https://github.com/pymc-devs/pymc-examples/pull/709))
 
 +++

examples/generalized_linear_models/GLM-simpsons-paradox.ipynb (−2,157)

This file was deleted.
