Converting an odds ratio to a risk difference
library(tidyverse)
library(gtsummary)
library(DescTools)
Simulate a genotype g
and an outcome y
set.seed(12345)
<- 1e5
n <- 0.15
minor_allele_freq <- rbinom(n, 2, minor_allele_freq)
genotype <- -4 + log(1.5)*genotype
logitp <- plogis(logitp)
p <- rbinom(n, 1, p)
y <- data.frame(genotype, y) dat
prop.table(table(genotype))
genotype
0 1 2
0.72244 0.25449 0.02307
prop.table(table(y))
y
0 1
0.97901 0.02099
table(genotype, y)
y
genotype 0 1
0 70927 1317
1 24752 697
2 2222 85
prop.table(table(genotype, y))
y
genotype 0 1
0 0.70927 0.01317
1 0.24752 0.00697
2 0.02222 0.00085
prop.table(table(genotype, y), margin = 1)
y
genotype 0 1
0 0.98177011 0.01822989
1 0.97261189 0.02738811
2 0.96315561 0.03684439
prop.table(table(genotype, y), margin = 2)
y
genotype 0 1
0 0.72447677 0.62744164
1 0.25282684 0.33206289
2 0.02269640 0.04049547
Logistic regression - check simulated OR of ~1.5
<- glm(y ~ genotype, family = binomial)
logreg <- exp(coef(logreg)[2])
oddsratio names(oddsratio) <- "Odds ratio"
print(oddsratio)
Odds ratio
1.483173
tbl_regression(logreg, exponentiate = TRUE)
Characteristic | OR1 | 95% CI1 | p-value |
---|---|---|---|
genotype | 1.48 | 1.37, 1.60 | <0.001 |
1 OR = Odds Ratio, CI = Confidence Interval |
Risk difference
<- glm(y ~ genotype, family = binomial(link = "identity"))
rdreg summary(rdreg)
Call:
glm(formula = y ~ genotype, family = binomial(link = "identity"))
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.0182261 0.0004945 36.857 <2e-16 ***
genotype 0.0091936 0.0010087 9.115 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 20373 on 99999 degrees of freedom
Residual deviance: 20276 on 99998 degrees of freedom
AIC: 20280
Number of Fisher Scoring iterations: 2
tbl_regression(rdreg, estimate_fun = function(x) style_sigfig(x, digits = 6))
Characteristic | Beta | 95% CI1 | p-value |
---|---|---|---|
genotype | 0.009194 | 0.007249, 0.011202 | <0.001 |
1 CI = Confidence Interval |
<- coef(rdreg)[1]
p0 <- coef(rdreg)[2]
rdiff <- rdiff + p0
p1 <- c(p0, p1, rdiff)
res0 names(res0) <- c("p0", "p1", "Risk difference")
print(res0)
p0 p1 Risk difference
0.01822615 0.02741977 0.00919362
Formula from Stack Exchange
<- (oddsratio * p0)/(1 + (oddsratio * p0) - p0)
pt <- c(p0, pt, pt - p0, pt / p0)
results names(results) <- c("p_0", "p_t", "Risk difference", "Risk ratio")
print(results)
p_0 p_t Risk difference Risk ratio
0.018226148 0.026796549 0.008570401 1.470225608
Check OR to RR conversion with DescTools
ORToRelRisk(oddsratio, p0)
Odds ratio
1.470226
Comparison using effectsize package
library(effectsize)
riskratio_to_arr(pt / p0, p0)
Odds ratio
0.008570401
Comparison using the risks package
library(risks)
Attaching package: 'risks'
The following object is masked from 'package:effectsize':
riskratio
<- riskdiff(y ~ genotype, data = dat)
fit summary(fit)
Risk difference model, fitted via marginal standardization of a logistic model with delta method (margstd_delta).
Call:
stats::glm(formula = y ~ genotype, family = binomial(link = "logit"),
data = dat, start = "(no starting values)")
Coefficients: (1 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
genotype 0.0080924 0.0008088 10.01 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 20373 on 99999 degrees of freedom
Residual deviance: 20277 on 99998 degrees of freedom
AIC: 20281
Number of Fisher Scoring iterations: 6
Confidence intervals for coefficients: (delta method)
2.5 % 97.5 %
genotype 0.006507192 0.009677526
References
- Zhang J, Yu KF. What’s the Relative Risk? A Method of Correcting the Odds Ratio in Cohort Studies of Common Outcomes. JAMA. 1998;280(19):1690–1691. doi:10.1001/jama.280.19.1690 https://doi.org/10.1001/jama.280.19.1690
- Grant R, BMJ 2014, 348 https://doi.org/10.1136/bmj.f7450