using BMI.csv, clear import delimited
Examples from our IJE paper
The paper is available here (Spiller, Davies, and Palmer 2018).
mrrobust set-up
Install the mrrobust
package using the user-written github
package.
"https://haghish.github.io/github/")
net install github, from( gitget mrrobust
If you have Stata 12 or earlier you will need to install some of these manually (see here for instructions).
Summary data description and overview
Accompanying this paper are two sets of data BMI.csv, and Height.csv, containing the set of summary estimates required for performing the BMI-serum glucose and height-serum glucose analyses respectively. Each dataset is organised into 5 columns under the following headings:
SNP
: A set of identifying numbers (rsids) for each genetic variantbeta.exposure
: a set of values representing the coefficient from regressing the exposure upon the genetic variant within a GWASbeta.outcome
: a set of values representing the coefficient from regressing the outcome upon the genetic variant within a GWASse.exposure
: a set of values representing the standard error corresponding to the coefficient in beta.exposurese.outcome
: a set of values representing the standard error corresponding to the coefficient in beta.outcome.
Note Stata removes the .
in the variable names when the data is imported.
In BMI.csv the exposure is standardised body mass index (BMI), and is therefore interpreted on a standard deviation scale. The summary statistics are reported by Locke et al. (2015). In Height.csv the exposure is standardised height in meters and also interpreted on a standard deviation scale. The summary statistics are reported by Wood et al. (2014).
For both analyses log transformed serum glucose was used as an outcome, reported by Shin et al. (2014). All the data was obtained from the MRBase GWAS catalogue available at https://www.mrbase.org/ (Hemani et al. 2018) which has now been superseeded by the OpenGWAS API. Genetic variants were pruned so as to be independent (\(R^2\) = 0.0001), and the effect alleles were aligned between the exposure and outcome datasets using the MRBase web application, prior to implementing mrrobust
.
Stata output for each estimation method using mrrobust: BMI-Serum Glucose
Read in data
IVW
mregger betaoutcome betaexposure [aw=1/(seoutcome^2)], ivw
Number of genotypes = 79
Residual standard error = 1.039
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
betaoutcome |
betaexposure | .0231866 .0079957 2.90 0.004 .0075154 .0388578
------------------------------------------------------------------------------
MR-Egger
mregger betaoutcome betaexposure [aw=1/(seoutcome^2)]
Number of genotypes = 79
Residual standard error = 1.046
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
betaoutcome |
slope | .0218507 .0221852 0.98 0.325 -.0216315 .0653329
_cons | .000038 .0005877 0.06 0.948 -.0011138 .0011897
------------------------------------------------------------------------------
Plot of the MR-Egger model
mreggerplot betaoutcome seoutcome betaexposure seexposurequi gr export mreggerplot-bmi.svg, width(600) replace
Weighted median
seed(300818) mrmedian betaoutcome seoutcome betaexposure seexposure, weighted
Number of genotypes = 79
Replications = 1000
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | .0339256 .0120248 2.82 0.005 .0103576 .0574937
------------------------------------------------------------------------------
Stata output using the mode-based estimator using mrrobust: BMI-Serum Glucose
Using the mrmodalplot
command, modal estimates are calculated using bandwidths of 0.25, 0.5, and 1 respectively. This command also produces three overlaid density plots for each value, as shown in the Figure.
seed(300818)
mrmodalplot betaoutcome seoutcome betaexposure seexposure, lc(gs10 gs5 gs0) qui gr export mrmodalplot-bmi.svg, width(600) replace
Number of genotypes = 79
Replications = 1000
Phi = .25
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | .0374507 .0424036 0.88 0.377 -.0456588 .1205602
------------------------------------------------------------------------------
Number of genotypes = 79
Replications = 1000
Phi = .5
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | .0416424 .0369758 1.13 0.260 -.0308289 .1141137
------------------------------------------------------------------------------
Number of genotypes = 79
Replications = 1000
Phi = 1
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | .0431816 .0281684 1.53 0.125 -.0120274 .0983906
------------------------------------------------------------------------------
Stata output for each estimation method using mrrobust: Height-Serum Glucose
Read in data
using Height.csv, clear import delimited
IVW
mregger betaoutcome betaexposure [aw=1/(seoutcome^2)], ivw
Number of genotypes = 367
Residual standard error = 1.044
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
betaoutcome |
betaexposure | .0015412 .0033017 0.47 0.641 -.00493 .0080124
------------------------------------------------------------------------------
MR-Egger
mregger betaoutcome betaexposure [aw=1/(seoutcome^2)]
Number of genotypes = 367
Residual standard error = 1.045
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
betaoutcome |
slope | -.0025878 .0091178 -0.28 0.777 -.0204584 .0152828
_cons | .0001338 .0002754 0.49 0.627 -.000406 .0006736
------------------------------------------------------------------------------
Plot of the MR-Egger model
mreggerplot betaoutcome seoutcome betaexposure seexposurequi gr export mreggerplot-height.svg, width(600) replace
Weighted median
seed(300818) mrmedian betaoutcome seoutcome betaexposure seexposure, weighted
Number of genotypes = 367
Replications = 1000
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | 0 .0052323 0.00 1.000 -.0102551 .0102551
------------------------------------------------------------------------------
Stata output using the mode-based estimator using mrrobust: Height-Serum Glucose
seed(300818)
mrmodalplot betaoutcome seoutcome betaexposure seexposure, lc(gs10 gs5 gs0) qui gr export mrmodalplot-height.svg, width(600) replace
Number of genotypes = 367
Replications = 1000
Phi = .25
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | .0061368 .0245472 0.25 0.803 -.0419748 .0542484
------------------------------------------------------------------------------
Number of genotypes = 367
Replications = 1000
Phi = .5
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | .0015595 .0212232 0.07 0.941 -.0400372 .0431561
------------------------------------------------------------------------------
Number of genotypes = 367
Replications = 1000
Phi = 1
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
beta | -.0054772 .0149074 -0.37 0.713 -.0346952 .0237408
------------------------------------------------------------------------------