How to include mrrobust Stata code in an R Markdown or Quarto document using the Statamarkdown R package

Introduction

This example shows how to run R and Stata code within the same R Markdown (.Rmd) or Quarto (.qmd) file using the Statamarkdown R package. More information about this package is available here and here. To install this package run the following in R.

install.packages("Statamarkdown")

Note when writing our Stata code chunks we need to be careful when we specify the collectcode=TRUE code chunk option, because each Stata code chunk is run as a separate batch job. For example, we include this chunk option in a chunk which reads in a dataset which we wish to use in subsequent chunks.

Using R and Stata code in the same script means that we can use the functions provided by the TwoSampleMR package to obtain data from the OpenGWAS API, which is the successor to MR-Base (Hemani et al. 2018).

The OpenGWAS API requires authentication. You must place an OPENGWAS_JWT token (by creating a free account on https://api.opengwas.io/profile/) in your .Renviron file before running the code below. See the ieugwasr website for more information.

Next we install the other required packages in R.

install.packages(
  "TwoSampleMR",
  repos = c(
    "https://mrcieu.r-universe.dev",
    "https://cloud.r-project.org"
  ),
  dependencies = TRUE
)
install.packages("foreign")

Example R Markdown or Quarto document

Save the following code chunk as an .Rmd or .qmd file, e.g. myanalysis.Rmd.

---
title: Using mrrobust in an R Markdown or Quarto document
---

## Extracting data from MR-Base

We will be running the script from the MR-Base paper 
([Hemani et al., 2018](https://doi.org/10.7554/eLife.34408)).
The R code we will use is from
[here](https://raw.githubusercontent.com/explodecomputer/mr-base-methods-paper/master/scripts/ldl-chd.R).

We load the packages into our R session.
Note that the **foreign** package provides the `write.dta()` function
which we will use to save the data in Stata format.

```{r}
library(TwoSampleMR)
library(MRInstruments)
library(foreign)
library(Statamarkdown)
```

We can access the data using the **MRInstruments** package.

```{r}
data(gwas_catalog)

# Get published SNPs for LDL cholesterol
ldl_snps <-
  subset(gwas_catalog,
    grepl("LDL choles", Phenotype) & Author == "Willer CJ")$SNP

# Extract from GLGC dataset
exposure <-
  convert_outcome_to_exposure(
    extract_outcome_data(ldl_snps, "ieu-a-300"))

# Get outcome data from Cardiogram 2015
outcome <- extract_outcome_data(exposure$SNP, "ieu-a-7")

# Harmonise exposure and outcome datasets
# Assume alleles are on the forward strand
dat <- harmonise_data(exposure, outcome, action = 1)
```

At this point we have our harmonised genotype-exposure and genotype-outcome
association data saved in an object in our R session called `dat`.

The next two code chunks perform the analysis in R.

```{r}
# Perform MR analysis
mr(dat)
mr_heterogeneity(dat)
dat$exposure <- "LDL cholesterol"
dat$outcome <- "Coronary heart disease"

# Label outliers and create plots
dat$labels <- dat$SNP
dat$labels[! dat$SNP %in% c("rs11065987", "rs1250229", "rs4530754")] <- NA
```

To proceed in Stata we can save our `dat` object as a Stata dataset

```{r}
write.dta(dat, file = "dat.dta")
```

## Analysis in Stata using the mrrobust package

At this point in Stata install the **mrrobust** package and its
dependencies if you have not done so previously.

```stata
net install github, from("https://haghish.github.io/github/")
gitget mrrobust
```

We now read the dataset into Stata and look at the variable names
and the number of observations.

```{stata, collectcode=TRUE}
qui use dat, clear
```

```{stata}
ds, v(28)
```

```{stata}
di _N
```

We can then run the IVW model using `mregger` with fixed effect standard errors.

```{stata}
mregger beta_outcome beta_exposure [aw=1/(se_outcome^2)], ivw fe
```

We then fit the MR-Egger, median, and modal based estimators.

```{stata}
mregger beta_outcome beta_exposure [aw=1/(se_outcome^2)]
```

```{stata}
mrmedian beta_outcome se_outcome beta_exposure se_exposure, weighted
```

```{stata}
mrmodal beta_outcome se_outcome beta_exposure se_exposure, weighted
```

And we could continue with additional Stata code (or indeed R code) as we like.

To render your R Markdown file, either; open it in RStudio and click the Knit button in the toolbar of the Source code window, or run

rmarkdown::render('myanalysis.Rmd')

or to render a Quarto file run the following in your Terminal

quarto render myanalysis.qmd

References

Hemani, Gibran, Jie Zheng, Benjamin Elsworth, Kaitlin H Wade, Valeriia Haberland, Denis Baird, Charles Laurin, et al. 2018. “The MR-Base platform supports systematic causal inference across the human phenome.” eLife 7: e34408. https://doi.org/10.7554/eLife.34408.