DAGs and conditioning scenarios

Introduction

In this practical you will draw a directed acyclic graph (DAG) for each of 3 scenarios and then decide which model to fit.

We will use the same data for each scenario. The data are given in the table below.

First, we perform some estimation so that you know the associations between the 3 variables \(E\), \(E^*\), and \(D\).

Estimating the marginal odds ratio for the association between \(E\) and \(D\) (i.e., using \(D\) as the outcome/dependent variable and \(E\) as the covariate)

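library(magrittr)    # provides the %>% pipe
library(kableExtra)  # provides kbl() and kable_styling()

# Logistic regression of d on e; exponentiate the coefficients and Wald limits
dat %>%
  glm(d ~ e, family = binomial, data = .) %>%
  {cbind(coef(.), confint.default(.))} %>%
  exp() %>%
  round(digits = 2) %>%
  kbl() %>%
  kable_styling(full_width = FALSE)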

Term	exp(coef)	2.5 %	97.5 %
(Intercept)	0.28	0.24	0.32
e	1.73	1.42	2.09
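
As a rough illustration of what the pipeline above computes, the sketch below rebuilds the Wald interval by hand and compares the fitted odds ratio with the cross-product ratio of the 2x2 table. It uses simulated data with made-up parameters, not the practical's `dat`; the names `sim`, `fit`, `wald`, and `or_table` are assumptions for this example.

```r
# Simulated stand-in data (hypothetical parameters); NOT the practical's dat
set.seed(2024)
n <- 2000
sim <- data.frame(e = rbinom(n, 1, 0.4))
sim$d <- rbinom(n, 1, plogis(-1.2 + 0.5 * sim$e))

fit <- glm(d ~ e, family = binomial, data = sim)

# Wald limits on the log-odds scale: estimate +/- z * standard error
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
z   <- qnorm(0.975)
wald <- exp(cbind(OR = est, lower = est - z * se, upper = est + z * se))
round(wald, 2)

# With a single binary covariate, the fitted OR equals the cross-product ratio
tab <- table(e = sim$e, d = sim$d)
or_table <- (tab["1", "1"] * tab["0", "0"]) / (tab["1", "0"] * tab["0", "1"])
or_table
```

The hand-built limits should agree with `exp(confint.default(fit))`, which is the interval used throughout this practical (as opposed to the profile-likelihood interval from `confint()`).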

Estimating the conditional odds ratio for the association between \(E\) and \(D\) adjusting for/conditioning on \(E^*\)

dat %>%
  glm(d ~ e + es, family = binomial, data = .) %>%
  {cbind(coef(.), confint.default(.))} %>%
  exp() %>%
  round(., digits = 2) %>%
  kbl() %>%
  kable_styling(full_width = FALSE)

Term	exp(coef)	2.5 %	97.5 %
(Intercept)	0.33	0.29	0.39
e	3.00	2.40	3.76
es	0.30	0.24	0.38

Estimating the marginal odds ratio for the association between \(E^*\) and \(D\)

dat %>%
  glm(d ~ es, family = binomial, data = .) %>%
  {cbind(coef(.), confint.default(.))} %>%
  exp() %>%
  round(., digits = 2) %>%
  kbl() %>%
  kable_styling(full_width = FALSE)

Term	exp(coef)	2.5 %	97.5 %
(Intercept)	0.5	0.44	0.56
es	0.5	0.41	0.61
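
The estimates from the three models above can be gathered in one place to see which associations are distinguishable from the null. The sketch below copies the reported odds ratios and limits by hand; the data frame `assoc` and its column names are assumptions for this example.

```r
# Odds ratios and 95% Wald limits copied from the three model outputs above
assoc <- data.frame(
  term  = c("e", "e", "es", "es"),
  given = c("(nothing)", "es", "e", "(nothing)"),
  or    = c(1.73, 3.00, 0.30, 0.50),
  lower = c(1.42, 2.40, 0.24, 0.41),
  upper = c(2.09, 3.76, 0.38, 0.61)
)

# An interval that excludes 1 gives no support for the corresponding
# (marginal or conditional) independence with d
assoc$excludes_one <- assoc$lower > 1 | assoc$upper < 1
print(assoc)
```

All four intervals exclude 1.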

Question

You are given 3 scenarios from which the data could have been obtained. For each scenario we wish to estimate the effect of \(E\) on \(D\).

Draw a DAG for each scenario
Once you have drawn your DAG, check that it is consistent with the conditional independencies (or lack of them) estimated above
Use your DAG to write down which model you would fit to estimate the effect of \(E\) on \(D\) in each scenario
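
To illustrate the second step, the sketch below simulates data from a hypothetical three-variable chain, x -> m -> y, which is not one of the scenarios. This DAG implies that x and y are marginally associated but independent given m, so the odds ratio for x should move to about 1 once m enters the model. The variable names, sample size, and effect sizes are assumptions chosen for illustration.

```r
# Hypothetical chain x -> m -> y (NOT one of the three scenarios)
set.seed(42)
n <- 50000
x <- rbinom(n, 1, 0.5)
m <- rbinom(n, 1, plogis(-1 + 2 * x))  # m depends on x only
y <- rbinom(n, 1, plogis(-1 + 2 * m))  # y depends on m only
chain <- data.frame(x, m, y)

# Marginal association between x and y
or_marginal <- exp(coef(glm(y ~ x, family = binomial, data = chain))["x"])

# Association between x and y after conditioning on m
or_conditional <- exp(coef(glm(y ~ x + m, family = binomial, data = chain))["x"])

round(c(marginal = unname(or_marginal), conditional = unname(or_conditional)), 2)
```

A DAG is called into question when it implies an independence that the corresponding fitted odds ratio clearly contradicts, or the reverse.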

Scenario 1

The data come from a case-control study
The aetiological question of interest is whether exposure to a particular nonsteroidal anti-inflammatory drug during the first trimester of pregnancy causes a congenital defect (\(D\)) arising in the second trimester
\(D=1\) for cases, \(D=0\) for controls without the defect
The sampling fraction for controls is unknown
\(E^*\) is use of the drug of interest during the first trimester, as self-reported by the mother 1 month postpartum
\(E\) is use of the drug of interest as recorded in comprehensive, accurate medical records of first-trimester medications
You can ignore any other possible confounders or other drug exposures

Scenario 2

The data come from a prospective cohort study
\(D\) is all-cause mortality in a cohort of healthy male miners, all aged 25 years, all of whom worked underground in a variety of different mine shafts for 6 months in 1967
40-year follow-up is complete. The aetiological question is whether pulmonary exposure to doses of radon above a certain level causes increased mortality
For each miner, the air level of radon in his mine was measured (\(E^*\))
A subject’s actual exposure depends on both the level of radon in the mine and the physical demands of his job; it was measured by lung dosimetry (\(E\): 0 = below the threshold of interest, 1 = above)
It is known that 6 months of physical exertion at age 25 years has no independent effect on subsequent mortality

Scenario 3

The data come from a randomized controlled trial
\(D\) is death over a 15-year period
Study subjects were randomly assigned to an educational intervention to encourage them to eat a low-fat diet (\(E^*=1\) for intervention, \(E^*=0\) for control)
Investigators subsequently measured diet accurately in all trial participants (\(E=1\) for low-fat diet, \(E=0\) for non-low-fat diet)
Assume the intervention has no effect on \(D\) other than through its effect on actual fat consumption \(E\)
