12. IP Weighting and Marginal Structural Models: Stata

library(Statamarkdown)
/***************************************************************
Stata code for Causal Inference: What If by Miguel Hernan & Jamie Robins
Date: 10/10/2019
Author: Eleanor Murray 
For errors contact: ejmurray@bu.edu
***************************************************************/

Program 12.1

  • Descriptive statistics from NHEFS data (Table 12.1)
use ./data/nhefs, clear

/*Provisionally ignore subjects with missing values for follow-up weight*/
/*Sample size after exclusion: N = 1566*/
drop if wt82==.

/* Calculate mean weight change in those with and without smoking cessation*/
label define qsmk 0 "No smoking cessation" 1 "Smoking cessation"
label values qsmk qsmk
by qsmk, sort: egen years = mean(age) if age < . 
label var years "Age, years"
by qsmk, sort: egen male = mean(100 * (sex==0)) if sex < . 
label var male "Men, %"
by qsmk, sort: egen white = mean(100 * (race==0)) if race < . 
label var white "White, %"
by qsmk, sort: egen university = mean(100 * (education == 5)) if education < .
label var university "University, %"
by qsmk, sort: egen kg = mean(wt71) if wt71 < .
label var kg "Weight, kg"
by qsmk, sort: egen cigs = mean(smokeintensity) if smokeintensity < . 
label var cigs "Cigarettes/day"
by qsmk, sort: egen meansmkyrs = mean(smokeyrs) if smokeyrs < .
label var kg "Years smoking"
by qsmk, sort: egen noexer = mean(100 * (exercise == 2)) if exercise < . 
label var noexer "Little/no exercise"
by qsmk, sort: egen inactive = mean(100 * (active==2)) if active < . 
label var inactive "Inactive daily life"
qui save ./data/nhefs-formatted, replace
(63 observations deleted)
use ./data/nhefs-formatted, clear

/*Output table*/
foreach var of varlist years male  white university kg cigs meansmkyrs noexer inactive {
  tabdisp qsmk, cell(`var') format(%3.1f)
}
  2.   tabdisp qsmk, cell(`var') format(%3.1f)
  3. }

---------------------------------
quit smoking between |
baseline and 1982    | Age, years
---------------------+-----------
No smoking cessation |       42.8
   Smoking cessation |       46.2
---------------------------------

---------------------------------
quit smoking between |
baseline and 1982    |     Men, %
---------------------+-----------
No smoking cessation |       46.6
   Smoking cessation |       54.6
---------------------------------

---------------------------------
quit smoking between |
baseline and 1982    |   White, %
---------------------+-----------
No smoking cessation |       85.4
   Smoking cessation |       91.1
---------------------------------

------------------------------------
quit smoking between |
baseline and 1982    | University, %
---------------------+--------------
No smoking cessation |           9.9
   Smoking cessation |          15.4
------------------------------------

------------------------------------
quit smoking between |
baseline and 1982    | Years smoking
---------------------+--------------
No smoking cessation |          70.3
   Smoking cessation |          72.4
------------------------------------

-------------------------------------
quit smoking between |
baseline and 1982    | Cigarettes/day
---------------------+---------------
No smoking cessation |           21.2
   Smoking cessation |           18.6
-------------------------------------

---------------------------------
quit smoking between |
baseline and 1982    | meansmkyrs
---------------------+-----------
No smoking cessation |       24.1
   Smoking cessation |       26.0
---------------------------------

-----------------------------------------
quit smoking between |
baseline and 1982    | Little/no exercise
---------------------+-------------------
No smoking cessation |               37.9
   Smoking cessation |               40.7
-----------------------------------------

------------------------------------------
quit smoking between |
baseline and 1982    | Inactive daily life
---------------------+--------------------
No smoking cessation |                 8.9
   Smoking cessation |                11.2
------------------------------------------

Program 12.2

  • Estimating IP weights for Section 12.2
  • Data from NHEFS
use ./data/nhefs-formatted, clear

/*Fit a logistic model for the IP weights*/ 
logit qsmk sex race c.age##c.age ib(last).education c.smokeintensity##c.smokeintensity ///
c.smokeyrs##c.smokeyrs ib(last).exercise ib(last).active c.wt71##c.wt71 

/*Output predicted conditional probability of quitting smoking for each individual*/
predict p_qsmk, pr

/*Generate nonstabilized weights as P(A=1|covariates) if A = 1 and */
/* 1-P(A=1|covariates) if A = 0*/
gen w=.
replace w=1/p_qsmk if qsmk==1
replace w=1/(1-p_qsmk) if qsmk==0
/*Check the mean of the weights; we expect it to be close to 2.0*/
summarize w

/*Fit marginal structural model in the pseudopopulation*/
/*Weights assigned using pweight = w*/
/*Robust standard errors using cluster() option where 'seqn' is the ID variable*/
regress wt82_71 qsmk [pweight=w], cluster(seqn) 
Iteration 0:  Log likelihood = -893.02712  
Iteration 1:  Log likelihood = -839.70016  
Iteration 2:  Log likelihood = -838.45045  
Iteration 3:  Log likelihood = -838.44842  
Iteration 4:  Log likelihood = -838.44842  

Logistic regression                                     Number of obs =  1,566
                                                        LR chi2(18)   = 109.16
                                                        Prob > chi2   = 0.0000
Log likelihood = -838.44842                             Pseudo R2     = 0.0611

-----------------------------------------------------------------------------------
             qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
              sex |  -.5274782   .1540497    -3.42   0.001      -.82941   -.2255463
             race |  -.8392636   .2100668    -4.00   0.000    -1.250987   -.4275404
              age |   .1212052   .0512663     2.36   0.018     .0207251    .2216853
                  |
      c.age#c.age |  -.0008246   .0005361    -1.54   0.124    -.0018753    .0002262
                  |
        education |
               1  |  -.4759606   .2262238    -2.10   0.035    -.9193511   -.0325701
               2  |  -.5047361    .217597    -2.32   0.020    -.9312184   -.0782538
               3  |  -.3895288   .1914353    -2.03   0.042    -.7647351   -.0143226
               4  |  -.4123596   .2772868    -1.49   0.137    -.9558318    .1311126
                  |
   smokeintensity |  -.0772704   .0152499    -5.07   0.000    -.1071596   -.0473812
                  |
 c.smokeintensity#|
 c.smokeintensity |   .0010451   .0002866     3.65   0.000     .0004835    .0016068
                  |
         smokeyrs |  -.0735966   .0277775    -2.65   0.008    -.1280395   -.0191538
                  |
       c.smokeyrs#|
       c.smokeyrs |   .0008441   .0004632     1.82   0.068    -.0000637    .0017519
                  |
         exercise |
               0  |   -.395704   .1872401    -2.11   0.035    -.7626878   -.0287201
               1  |  -.0408635   .1382674    -0.30   0.768    -.3118627    .2301357
                  |
           active |
               0  |   -.176784   .2149721    -0.82   0.411    -.5981215    .2445535
               1  |  -.1448395   .2111472    -0.69   0.493    -.5586806    .2690015
                  |
             wt71 |  -.0152357   .0263161    -0.58   0.563    -.0668144     .036343
                  |
    c.wt71#c.wt71 |   .0001352   .0001632     0.83   0.407    -.0001846     .000455
                  |
            _cons |   -1.19407   1.398493    -0.85   0.393    -3.935066    1.546925
-----------------------------------------------------------------------------------


(1,566 missing values generated)

(403 real changes made)

(1,163 real changes made)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
           w |      1,566    1.996284    1.474787   1.053742   16.70009

(sum of wgt is 3,126.18084549904)

Linear regression                               Number of obs     =      1,566
                                                F(1, 1565)        =      42.81
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0435
                                                Root MSE          =     8.0713

                               (Std. err. adjusted for 1,566 clusters in seqn)
------------------------------------------------------------------------------
             |               Robust
     wt82_71 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        qsmk |   3.440535   .5258294     6.54   0.000     2.409131     4.47194
       _cons |   1.779978   .2248742     7.92   0.000     1.338892    2.221065
------------------------------------------------------------------------------

Program 12.3

  • Estimating stabilized IP weights for Section 12.3
  • Data from NHEFS
use ./data/nhefs-formatted, clear

/*Fit a logistic model for the denominator of the IP weights and predict the */
/* conditional probability of smoking */
logit qsmk sex race c.age##c.age ib(last).education c.smokeintensity##c.smokeintensity ///
c.smokeyrs##c.smokeyrs ib(last).exercise ib(last).active c.wt71##c.wt71  
predict pd_qsmk, pr

/*Fit a logistic model for the numerator of ip weights and predict Pr(A=1) */ 
logit qsmk 
predict pn_qsmk, pr

/*Generate stabilized weights as f(A)/f(A|L)*/
gen sw_a=.
replace sw_a=pn_qsmk/pd_qsmk if qsmk==1
replace sw_a=(1-pn_qsmk)/(1-pd_qsmk) if qsmk==0

/*Check distribution of the stabilized weights*/
summarize sw_a

/*Fit marginal structural model in the pseudopopulation*/
regress wt82_71 qsmk [pweight=sw_a], cluster(seqn) 

/**********************************************************
FINE POINT 12.2
Checking positivity
**********************************************************/

/*Check for missing values within strata of covariates, for example: */
tab age qsmk if race==0 & sex==1 & wt82!=.
tab age qsmk if race==1 & sex==1 & wt82!=.
Iteration 0:  Log likelihood = -893.02712  
Iteration 1:  Log likelihood = -839.70016  
Iteration 2:  Log likelihood = -838.45045  
Iteration 3:  Log likelihood = -838.44842  
Iteration 4:  Log likelihood = -838.44842  

Logistic regression                                     Number of obs =  1,566
                                                        LR chi2(18)   = 109.16
                                                        Prob > chi2   = 0.0000
Log likelihood = -838.44842                             Pseudo R2     = 0.0611

-----------------------------------------------------------------------------------
             qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
              sex |  -.5274782   .1540497    -3.42   0.001      -.82941   -.2255463
             race |  -.8392636   .2100668    -4.00   0.000    -1.250987   -.4275404
              age |   .1212052   .0512663     2.36   0.018     .0207251    .2216853
                  |
      c.age#c.age |  -.0008246   .0005361    -1.54   0.124    -.0018753    .0002262
                  |
        education |
               1  |  -.4759606   .2262238    -2.10   0.035    -.9193511   -.0325701
               2  |  -.5047361    .217597    -2.32   0.020    -.9312184   -.0782538
               3  |  -.3895288   .1914353    -2.03   0.042    -.7647351   -.0143226
               4  |  -.4123596   .2772868    -1.49   0.137    -.9558318    .1311126
                  |
   smokeintensity |  -.0772704   .0152499    -5.07   0.000    -.1071596   -.0473812
                  |
 c.smokeintensity#|
 c.smokeintensity |   .0010451   .0002866     3.65   0.000     .0004835    .0016068
                  |
         smokeyrs |  -.0735966   .0277775    -2.65   0.008    -.1280395   -.0191538
                  |
       c.smokeyrs#|
       c.smokeyrs |   .0008441   .0004632     1.82   0.068    -.0000637    .0017519
                  |
         exercise |
               0  |   -.395704   .1872401    -2.11   0.035    -.7626878   -.0287201
               1  |  -.0408635   .1382674    -0.30   0.768    -.3118627    .2301357
                  |
           active |
               0  |   -.176784   .2149721    -0.82   0.411    -.5981215    .2445535
               1  |  -.1448395   .2111472    -0.69   0.493    -.5586806    .2690015
                  |
             wt71 |  -.0152357   .0263161    -0.58   0.563    -.0668144     .036343
                  |
    c.wt71#c.wt71 |   .0001352   .0001632     0.83   0.407    -.0001846     .000455
                  |
            _cons |   -1.19407   1.398493    -0.85   0.393    -3.935066    1.546925
-----------------------------------------------------------------------------------



Iteration 0:  Log likelihood = -893.02712  
Iteration 1:  Log likelihood = -893.02712  

Logistic regression                                     Number of obs =  1,566
                                                        LR chi2(0)    =   0.00
                                                        Prob > chi2   =      .
Log likelihood = -893.02712                             Pseudo R2     = 0.0000

------------------------------------------------------------------------------
        qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |  -1.059822   .0578034   -18.33   0.000    -1.173114    -.946529
------------------------------------------------------------------------------


(1,566 missing values generated)

(403 real changes made)

(1,163 real changes made)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        sw_a |      1,566    .9988444    .2882233   .3312489   4.297662

(sum of wgt is 1,564.19025221467)

Linear regression                               Number of obs     =      1,566
                                                F(1, 1565)        =      42.81
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0359
                                                Root MSE          =     7.7972

                               (Std. err. adjusted for 1,566 clusters in seqn)
------------------------------------------------------------------------------
             |               Robust
     wt82_71 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        qsmk |   3.440535   .5258294     6.54   0.000     2.409131     4.47194
       _cons |   1.779978   .2248742     7.92   0.000     1.338892    2.221065
------------------------------------------------------------------------------

           | quit smoking between
           |   baseline and 1982
       age | No smokin  Smoking c |     Total
-----------+----------------------+----------
        25 |        24          3 |        27 
        26 |        14          5 |        19 
        27 |        18          2 |        20 
        28 |        20          5 |        25 
        29 |        15          4 |        19 
        30 |        14          5 |        19 
        31 |        11          5 |        16 
        32 |        14          7 |        21 
        33 |        12          3 |        15 
        34 |        22          5 |        27 
        35 |        16          5 |        21 
        36 |        13          3 |        16 
        37 |        14          1 |        15 
        38 |         6          2 |         8 
        39 |        19          4 |        23 
        40 |        10          4 |        14 
        41 |        13          3 |        16 
        42 |        16          3 |        19 
        43 |        14          3 |        17 
        44 |         9          4 |        13 
        45 |        12          5 |        17 
        46 |        19          4 |        23 
        47 |        19          4 |        23 
        48 |        19          4 |        23 
        49 |        11          3 |        14 
        50 |        18          4 |        22 
        51 |         9          3 |        12 
        52 |        11          3 |        14 
        53 |        11          4 |        15 
        54 |        17          9 |        26 
        55 |         9          4 |        13 
        56 |         8          7 |        15 
        57 |         9          2 |        11 
        58 |         8          4 |        12 
        59 |         5          4 |         9 
        60 |         5          4 |         9 
        61 |         5          2 |         7 
        62 |         6          5 |        11 
        63 |         3          3 |         6 
        64 |         7          1 |         8 
        65 |         3          2 |         5 
        66 |         4          0 |         4 
        67 |         2          0 |         2 
        69 |         6          2 |         8 
        70 |         2          1 |         3 
        71 |         0          1 |         1 
        72 |         2          2 |         4 
        74 |         0          1 |         1 
-----------+----------------------+----------
     Total |       524        164 |       688 

           | quit smoking between
           |   baseline and 1982
       age | No smokin  Smoking c |     Total
-----------+----------------------+----------
        25 |         3          1 |         4 
        26 |         3          0 |         3 
        28 |         3          1 |         4 
        29 |         1          0 |         1 
        30 |         4          0 |         4 
        31 |         3          0 |         3 
        32 |         8          0 |         8 
        33 |         2          0 |         2 
        34 |         2          1 |         3 
        35 |         3          0 |         3 
        36 |         5          0 |         5 
        37 |         3          1 |         4 
        38 |         4          2 |         6 
        39 |         1          1 |         2 
        40 |         2          2 |         4 
        41 |         3          0 |         3 
        42 |         3          0 |         3 
        43 |         4          2 |         6 
        44 |         3          0 |         3 
        45 |         1          3 |         4 
        46 |         5          0 |         5 
        47 |         3          0 |         3 
        48 |         4          0 |         4 
        49 |         1          1 |         2 
        50 |         2          0 |         2 
        51 |         4          0 |         4 
        52 |         1          0 |         1 
        53 |         2          0 |         2 
        54 |         2          0 |         2 
        55 |         3          0 |         3 
        56 |         2          1 |         3 
        57 |         2          1 |         3 
        61 |         1          1 |         2 
        67 |         1          0 |         1 
        68 |         1          0 |         1 
        69 |         2          0 |         2 
        70 |         0          1 |         1 
-----------+----------------------+----------
     Total |        97         19 |       116 

Program 12.4

  • Estimating the parameters of a marginal structural mean model with a continuous treatment Data from NHEFS
  • Section 12.4
use ./data/nhefs-formatted, clear

* drop sw_a

/*Analysis restricted to subjects reporting <=25 cig/day at baseline: N = 1162*/
keep if smokeintensity <=25

/*Fit a linear model for the denominator of the IP weights and calculate the */
/* mean expected smoking intensity*/ 
regress smkintensity82_71 sex race c.age##c.age ib(last).education ///
c.smokeintensity##c.smokeintensity c.smokeyrs##c.smokeyrs ///
ib(last).exercise ib(last).active c.wt71##c.wt71
quietly predict p_den

/*Generate the denisty of the denomiator expectation using the mean expected */
/* smoking intensity and the residuals, assuming a normal distribution*/
/*Note: The regress command in Stata saves the root mean squared error for the */
/* immediate regression as e(rmse), thus there is no need to calculate it again. */
gen dens_den = normalden(smkintensity82_71, p_den, e(rmse))

/*Fit a linear model for the numerator of ip weights, calculate the mean */
/* expected value, and generate the density*/
quietly regress smkintensity82_71
quietly predict p_num
gen dens_num = normalden( smkintensity82_71, p_num, e(rmse))

/*Generate the final stabilized weights from the estimated numerator and */
/* denominator, and check the weights distribution*/
gen sw_a=dens_num/dens_den
summarize sw_a

/*Fit a marginal structural model in the pseudopopulation*/
regress wt82_71  c.smkintensity82_71##c.smkintensity82_71 [pweight=sw_a], cluster(seqn)

/*Output the estimated mean Y value when smoke intensity is unchanged from */
/* baseline to 1982 */
lincom _b[_cons]

/*Output the estimated mean Y value when smoke intensity increases by 20 from */
/* baseline to 1982*/
lincom _b[_cons] + 20*_b[smkintensity82_71 ] + ///
  400*_b[c.smkintensity82_71#c.smkintensity82_71]
(404 observations deleted)

      Source |       SS           df       MS      Number of obs   =     1,162
-------------+----------------------------------   F(18, 1143)     =      5.39
       Model |  9956.95654        18  553.164252   Prob > F        =    0.0000
    Residual |   117260.18     1,143  102.589834   R-squared       =    0.0783
-------------+----------------------------------   Adj R-squared   =    0.0638
       Total |  127217.137     1,161  109.575484   Root MSE        =    10.129

-----------------------------------------------------------------------------------
smkintensity82_71 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------+----------------------------------------------------------------
              sex |   1.087021   .7425694     1.46   0.144    -.3699308    2.543973
             race |   .2319789   .8434739     0.28   0.783    -1.422952     1.88691
              age |  -.8099902   .2555388    -3.17   0.002    -1.311368   -.3086124
                  |
      c.age#c.age |   .0066545   .0026849     2.48   0.013     .0013865    .0119224
                  |
        education |
               1  |   1.508097   1.184063     1.27   0.203    -.8150843    3.831278
               2  |    2.02692   1.133772     1.79   0.074    -.1975876    4.251428
               3  |   2.240314   1.022556     2.19   0.029     .2340167    4.246611
               4  |   2.528767    1.44702     1.75   0.081    -.3103458     5.36788
                  |
   smokeintensity |  -.3589684   .2246653    -1.60   0.110     -.799771    .0818342
                  |
 c.smokeintensity#|
 c.smokeintensity |   .0019582   .0085753     0.23   0.819    -.0148668    .0187832
                  |
         smokeyrs |   .3857088   .1416765     2.72   0.007     .1077336    .6636841
                  |
       c.smokeyrs#|
       c.smokeyrs |  -.0054871   .0023837    -2.30   0.022    -.0101641   -.0008101
                  |
         exercise |
               0  |   1.996904   .9080421     2.20   0.028      .215288    3.778521
               1  |    .988812   .6929239     1.43   0.154    -.3707334    2.348357
                  |
           active |
               0  |   .8451341   1.098573     0.77   0.442    -1.310312    3.000581
               1  |    .800114    1.08438     0.74   0.461    -1.327485    2.927712
                  |
             wt71 |  -.0656882    .136955    -0.48   0.632    -.3343996    .2030232
                  |
    c.wt71#c.wt71 |   .0005711    .000877     0.65   0.515    -.0011496    .0022918
                  |
            _cons |   16.86761   7.109189     2.37   0.018      2.91909    30.81614
-----------------------------------------------------------------------------------

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        sw_a |      1,162    .9968057    .3222937   .1938336   5.102339

(sum of wgt is 1,158.28818286955)

Linear regression                               Number of obs     =      1,162
                                                F(2, 1161)        =      12.75
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0233
                                                Root MSE          =     7.7864

                                      (Std. err. adjusted for 1,162 clusters in seqn)
-------------------------------------------------------------------------------------
                    |               Robust
            wt82_71 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------------+----------------------------------------------------------------
  smkintensity82_71 |  -.1089889   .0315762    -3.45   0.001    -.1709417   -.0470361
                    |
                 c. |
  smkintensity82_71#|
c.smkintensity82_71 |   .0026949   .0024203     1.11   0.266    -.0020537    .0074436
                    |
              _cons |   2.004525    .295502     6.78   0.000     1.424747    2.584302
-------------------------------------------------------------------------------------


 ( 1)  _cons = 0

------------------------------------------------------------------------------
     wt82_71 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   2.004525    .295502     6.78   0.000     1.424747    2.584302
------------------------------------------------------------------------------


 ( 1)  20*smkintensity82_71 + 400*c.smkintensity82_71#c.smkintensity82_71 + _cons = 0

------------------------------------------------------------------------------
     wt82_71 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   .9027234   1.310533     0.69   0.491    -1.668554    3.474001
------------------------------------------------------------------------------

Program 12.5

  • Estimating the parameters of a marginal structural logistic model
  • Data from NHEFS
  • Section 12.4
use ./data/nhefs, clear

/*Provisionally ignore subjects with missing values for follow-up weight*/
/*Sample size after exclusion: N = 1566*/
drop if wt82==.

/*Estimate the stabilized weights for quitting smoking as in PROGRAM 12.3*/
/*Fit a logistic model for the denominator of the IP weights and predict the */
/* conditional probability of smoking*/ 
logit qsmk sex race c.age##c.age ib(last).education c.smokeintensity##c.smokeintensity ///
c.smokeyrs##c.smokeyrs ib(last).exercise ib(last).active c.wt71##c.wt71  
predict pd_qsmk, pr
/*Fit a logistic model for the numerator of ip weights and predict Pr(A=1) */ 
logit qsmk 
predict pn_qsmk, pr
/*Generate stabilized weights as f(A)/f(A|L)*/
gen sw_a=.
replace sw_a=pn_qsmk/pd_qsmk if qsmk==1
replace sw_a=(1-pn_qsmk)/(1-pd_qsmk) if qsmk==0
summarize sw_a

/*Fit marginal structural model in the pseudopopulation*/
/*NOTE: Stata has two commands for logistic regression, logit and logistic*/
/*Using logistic allows us to output the odds ratios directly*/
/*We can also output odds ratios from the logit command using the or option */
/* (default logit output is regression coefficients*/
logistic death qsmk [pweight=sw_a], cluster(seqn) 
(63 observations deleted)


Iteration 0:  Log likelihood = -893.02712  
Iteration 1:  Log likelihood = -839.70016  
Iteration 2:  Log likelihood = -838.45045  
Iteration 3:  Log likelihood = -838.44842  
Iteration 4:  Log likelihood = -838.44842  

Logistic regression                                     Number of obs =  1,566
                                                        LR chi2(18)   = 109.16
                                                        Prob > chi2   = 0.0000
Log likelihood = -838.44842                             Pseudo R2     = 0.0611

-----------------------------------------------------------------------------------
             qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
              sex |  -.5274782   .1540497    -3.42   0.001      -.82941   -.2255463
             race |  -.8392636   .2100668    -4.00   0.000    -1.250987   -.4275404
              age |   .1212052   .0512663     2.36   0.018     .0207251    .2216853
                  |
      c.age#c.age |  -.0008246   .0005361    -1.54   0.124    -.0018753    .0002262
                  |
        education |
               1  |  -.4759606   .2262238    -2.10   0.035    -.9193511   -.0325701
               2  |  -.5047361    .217597    -2.32   0.020    -.9312184   -.0782538
               3  |  -.3895288   .1914353    -2.03   0.042    -.7647351   -.0143226
               4  |  -.4123596   .2772868    -1.49   0.137    -.9558318    .1311126
                  |
   smokeintensity |  -.0772704   .0152499    -5.07   0.000    -.1071596   -.0473812
                  |
 c.smokeintensity#|
 c.smokeintensity |   .0010451   .0002866     3.65   0.000     .0004835    .0016068
                  |
         smokeyrs |  -.0735966   .0277775    -2.65   0.008    -.1280395   -.0191538
                  |
       c.smokeyrs#|
       c.smokeyrs |   .0008441   .0004632     1.82   0.068    -.0000637    .0017519
                  |
         exercise |
               0  |   -.395704   .1872401    -2.11   0.035    -.7626878   -.0287201
               1  |  -.0408635   .1382674    -0.30   0.768    -.3118627    .2301357
                  |
           active |
               0  |   -.176784   .2149721    -0.82   0.411    -.5981215    .2445535
               1  |  -.1448395   .2111472    -0.69   0.493    -.5586806    .2690015
                  |
             wt71 |  -.0152357   .0263161    -0.58   0.563    -.0668144     .036343
                  |
    c.wt71#c.wt71 |   .0001352   .0001632     0.83   0.407    -.0001846     .000455
                  |
            _cons |   -1.19407   1.398493    -0.85   0.393    -3.935066    1.546925
-----------------------------------------------------------------------------------



Iteration 0:  Log likelihood = -893.02712  
Iteration 1:  Log likelihood = -893.02712  

Logistic regression                                    Number of obs =   1,566
                                                       LR chi2(0)    =   -0.00
                                                       Prob > chi2   =       .
Log likelihood = -893.02712                            Pseudo R2     = -0.0000

------------------------------------------------------------------------------
        qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |  -1.059822   .0578034   -18.33   0.000    -1.173114    -.946529
------------------------------------------------------------------------------


(1,566 missing values generated)

(403 real changes made)

(1,163 real changes made)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        sw_a |      1,566    .9988444    .2882233   .3312489   4.297662


Logistic regression                                     Number of obs =  1,566
                                                        Wald chi2(1)  =   0.04
                                                        Prob > chi2   = 0.8482
Log pseudolikelihood = -749.11596                       Pseudo R2     = 0.0000

                               (Std. err. adjusted for 1,566 clusters in seqn)
------------------------------------------------------------------------------
             |               Robust
       death | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        qsmk |   1.030578   .1621842     0.19   0.848     .7570517    1.402931
       _cons |   .2252711   .0177882   -18.88   0.000     .1929707    .2629781
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

Program 12.6

  • Assessing effect modification by sex using a marginal structural mean model
  • Data from NHEFS
  • Section 12.5
use ./data/nhefs, clear

* drop pd_qsmk pn_qsmk sw_a

/*Check distribution of sex*/
tab sex

/*Fit logistc model for the denominator of IP weights, as in PROGRAM 12.3 */
logit qsmk sex race c.age##c.age ib(last).education c.smokeintensity##c.smokeintensity ///
c.smokeyrs##c.smokeyrs ib(last).exercise ib(last).active c.wt71##c.wt71 
predict pd_qsmk, pr

/*Fit logistic model for the numerator of IP weights, no including sex */
logit qsmk sex
predict pn_qsmk, pr

/*Generate IP weights as before*/
gen sw_a=.
replace sw_a=pn_qsmk/pd_qsmk if qsmk==1
replace sw_a=(1-pn_qsmk)/(1-pd_qsmk) if qsmk==0

summarize sw_a

/*Fit marginal structural model in the pseudopopulation, including interaction */
/* term between quitting smoking and sex*/
regress wt82_71 qsmk##sex [pw=sw_a], cluster(seqn)
        sex |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        799       49.05       49.05
          1 |        830       50.95      100.00
------------+-----------------------------------
      Total |      1,629      100.00


Iteration 0:  Log likelihood = -938.14308  
Iteration 1:  Log likelihood = -884.53806  
Iteration 2:  Log likelihood = -883.35064  
Iteration 3:  Log likelihood = -883.34876  
Iteration 4:  Log likelihood = -883.34876  

Logistic regression                                     Number of obs =  1,629
                                                        LR chi2(18)   = 109.59
                                                        Prob > chi2   = 0.0000
Log likelihood = -883.34876                             Pseudo R2     = 0.0584

-----------------------------------------------------------------------------------
             qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
              sex |  -.5075218   .1482316    -3.42   0.001    -.7980505   -.2169932
             race |  -.8502312   .2058722    -4.13   0.000    -1.253733   -.4467292
              age |   .1030132   .0488996     2.11   0.035     .0071718    .1988547
                  |
      c.age#c.age |  -.0006052   .0005074    -1.19   0.233    -.0015998    .0003893
                  |
        education |
               1  |  -.3796632   .2203948    -1.72   0.085     -.811629    .0523026
               2  |  -.4779835   .2141771    -2.23   0.026    -.8977629   -.0582041
               3  |  -.3639645   .1885776    -1.93   0.054    -.7335698    .0056409
               4  |  -.4221892   .2717235    -1.55   0.120    -.9547574     .110379
                  |
   smokeintensity |  -.0651561   .0147589    -4.41   0.000    -.0940831   -.0362292
                  |
 c.smokeintensity#|
 c.smokeintensity |   .0008461   .0002758     3.07   0.002     .0003054    .0013867
                  |
         smokeyrs |  -.0733708   .0269958    -2.72   0.007    -.1262816     -.02046
                  |
       c.smokeyrs#|
       c.smokeyrs |   .0008384   .0004435     1.89   0.059    -.0000307    .0017076
                  |
         exercise |
               0  |  -.3550517   .1799293    -1.97   0.048    -.7077067   -.0023967
               1  |    -.06364   .1351256    -0.47   0.638    -.3284812    .2012013
                  |
           active |
               0  |  -.0683123   .2087269    -0.33   0.743    -.4774095    .3407849
               1  |   -.057437   .2039967    -0.28   0.778    -.4572632    .3423892
                  |
             wt71 |  -.0128478   .0222829    -0.58   0.564    -.0565214    .0308258
                  |
    c.wt71#c.wt71 |   .0001209   .0001352     0.89   0.371     -.000144    .0003859
                  |
            _cons |  -1.185875   1.263142    -0.94   0.348    -3.661588    1.289838
-----------------------------------------------------------------------------------



Iteration 0:  Log likelihood = -938.14308  
Iteration 1:  Log likelihood = -933.49896  
Iteration 2:  Log likelihood = -933.49126  
Iteration 3:  Log likelihood = -933.49126  

Logistic regression                                     Number of obs =  1,629
                                                        LR chi2(1)    =   9.30
                                                        Prob > chi2   = 0.0023
Log likelihood = -933.49126                             Pseudo R2     = 0.0050

------------------------------------------------------------------------------
        qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         sex |  -.3441893   .1131341    -3.04   0.002     -.565928   -.1224506
       _cons |  -.8634417   .0774517   -11.15   0.000    -1.015244   -.7116391
------------------------------------------------------------------------------


(1,629 missing values generated)

(428 real changes made)

(1,201 real changes made)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        sw_a |      1,629    .9991318    .2636164   .2901148   3.683352

(sum of wgt is 1,562.01032829285)

Linear regression                               Number of obs     =      1,566
                                                F(3, 1565)        =      16.31
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0379
                                                Root MSE          =     7.8024

                               (Std. err. adjusted for 1,566 clusters in seqn)
------------------------------------------------------------------------------
             |               Robust
     wt82_71 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      1.qsmk |    3.60623   .6576053     5.48   0.000      2.31635     4.89611
       1.sex |  -.0040025   .4496206    -0.01   0.993    -.8859246    .8779197
             |
    qsmk#sex |
        1 1  |   -.161224   1.036143    -0.16   0.876      -2.1936    1.871152
             |
       _cons |   1.759045   .3102511     5.67   0.000     1.150494    2.367597
------------------------------------------------------------------------------

Program 12.7

  • Estimating IP weights to adjust for selection bias due to censoring
  • Data from NHEFS
  • Section 12.6
use ./data/nhefs, clear

/*Analysis including all individuals regardless of missing wt82 status: N=1629*/
/*Generate censoring indicator: C = 1 if wt82 missing*/
gen byte cens = (wt82 == .)

/*Check distribution of censoring by quitting smoking and baseline weight*/
tab cens qsmk, column
bys cens: summarize wt71

/*Fit logistic regression model for the  denominator of IP weight for A*/
logit qsmk sex race c.age##c.age ib(last).education c.smokeintensity##c.smokeintensity ///
c.smokeyrs##c.smokeyrs ib(last).exercise ib(last).active c.wt71##c.wt71 
predict pd_qsmk, pr

/*Fit logistic regression model for the numerator of IP weights for A*/
logit qsmk
predict pn_qsmk, pr

/*Fit logistic regression model for the denominator of IP weights for C, */
/* including quitting smoking*/
logit cens qsmk sex race c.age##c.age ib(last).education ///
c.smokeintensity##c.smokeintensity c.smokeyrs##c.smokeyrs ib(last).exercise ///
ib(last).active c.wt71##c.wt71 
predict pd_cens, pr

/*Fit logistic regression model for the  numerator of IP weights for C, */
/* including quitting smoking */
logit cens qsmk
predict pn_cens, pr

/*Generate the stabilized weights for A (sw_a)*/
gen sw_a=.
replace sw_a=pn_qsmk/pd_qsmk if qsmk==1
replace sw_a=(1-pn_qsmk)/(1-pd_qsmk) if qsmk==0

/*Generate the stabilized weights for C (sw_c)*/
/*NOTE: the conditional probability estimates generated by our logistic models */
/* for C represent the conditional probability of being censored (C=1)*/
/*We want weights for the conditional probability of bing uncensored, Pr(C=0|A,L)*/
gen sw_c=.
replace sw_c=(1-pn_cens)/(1-pd_cens) if cens==0

/*Generate the final stabilized weights and check distribution*/
gen sw=sw_a*sw_c
summarize sw

/*Fit marginal structural model in the pseudopopulation*/
regress wt82_71 qsmk [pw=sw], cluster(seqn)
| Key               |
|-------------------|
|     frequency     |
| column percentage |
+-------------------+

           | quit smoking between
           |   baseline and 1982
      cens |         0          1 |     Total
-----------+----------------------+----------
         0 |     1,163        403 |     1,566 
           |     96.84      94.16 |     96.13 
-----------+----------------------+----------
         1 |        38         25 |        63 
           |      3.16       5.84 |      3.87 
-----------+----------------------+----------
     Total |     1,201        428 |     1,629 
           |    100.00     100.00 |    100.00 


--------------------------------------------------------------------------------------
-> cens = 0

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        wt71 |      1,566    70.83092     15.3149      39.58     151.73

--------------------------------------------------------------------------------------
-> cens = 1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        wt71 |         63    76.55079     23.3326      36.17     169.19



Iteration 0:  Log likelihood = -938.14308  
Iteration 1:  Log likelihood = -884.53806  
Iteration 2:  Log likelihood = -883.35064  
Iteration 3:  Log likelihood = -883.34876  
Iteration 4:  Log likelihood = -883.34876  

Logistic regression                                     Number of obs =  1,629
                                                        LR chi2(18)   = 109.59
                                                        Prob > chi2   = 0.0000
Log likelihood = -883.34876                             Pseudo R2     = 0.0584

-----------------------------------------------------------------------------------
             qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
              sex |  -.5075218   .1482316    -3.42   0.001    -.7980505   -.2169932
             race |  -.8502312   .2058722    -4.13   0.000    -1.253733   -.4467292
              age |   .1030132   .0488996     2.11   0.035     .0071718    .1988547
                  |
      c.age#c.age |  -.0006052   .0005074    -1.19   0.233    -.0015998    .0003893
                  |
        education |
               1  |  -.3796632   .2203948    -1.72   0.085     -.811629    .0523026
               2  |  -.4779835   .2141771    -2.23   0.026    -.8977629   -.0582041
               3  |  -.3639645   .1885776    -1.93   0.054    -.7335698    .0056409
               4  |  -.4221892   .2717235    -1.55   0.120    -.9547574     .110379
                  |
   smokeintensity |  -.0651561   .0147589    -4.41   0.000    -.0940831   -.0362292
                  |
 c.smokeintensity#|
 c.smokeintensity |   .0008461   .0002758     3.07   0.002     .0003054    .0013867
                  |
         smokeyrs |  -.0733708   .0269958    -2.72   0.007    -.1262816     -.02046
                  |
       c.smokeyrs#|
       c.smokeyrs |   .0008384   .0004435     1.89   0.059    -.0000307    .0017076
                  |
         exercise |
               0  |  -.3550517   .1799293    -1.97   0.048    -.7077067   -.0023967
               1  |    -.06364   .1351256    -0.47   0.638    -.3284812    .2012013
                  |
           active |
               0  |  -.0683123   .2087269    -0.33   0.743    -.4774095    .3407849
               1  |   -.057437   .2039967    -0.28   0.778    -.4572632    .3423892
                  |
             wt71 |  -.0128478   .0222829    -0.58   0.564    -.0565214    .0308258
                  |
    c.wt71#c.wt71 |   .0001209   .0001352     0.89   0.371     -.000144    .0003859
                  |
            _cons |  -1.185875   1.263142    -0.94   0.348    -3.661588    1.289838
-----------------------------------------------------------------------------------



Iteration 0:  Log likelihood = -938.14308  
Iteration 1:  Log likelihood = -938.14308  

Logistic regression                                     Number of obs =  1,629
                                                        LR chi2(0)    =   0.00
                                                        Prob > chi2   =      .
Log likelihood = -938.14308                             Pseudo R2     = 0.0000

------------------------------------------------------------------------------
        qsmk | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |  -1.031787   .0562947   -18.33   0.000    -1.142122   -.9214511
------------------------------------------------------------------------------



Iteration 0:  Log likelihood = -266.67873  
Iteration 1:  Log likelihood = -238.48654  
Iteration 2:  Log likelihood = -232.82848  
Iteration 3:  Log likelihood = -232.68043  
Iteration 4:  Log likelihood = -232.67999  
Iteration 5:  Log likelihood = -232.67999  

Logistic regression                                     Number of obs =  1,629
                                                        LR chi2(19)   =  68.00
                                                        Prob > chi2   = 0.0000
Log likelihood = -232.67999                             Pseudo R2     = 0.1275

-----------------------------------------------------------------------------------
             cens | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
             qsmk |   .5168674   .2877162     1.80   0.072    -.0470459    1.080781
              sex |   .0573131   .3302775     0.17   0.862     -.590019    .7046452
             race |  -.0122715   .4524888    -0.03   0.978    -.8991332    .8745902
              age |  -.2697293   .1174647    -2.30   0.022    -.4999559   -.0395027
                  |
      c.age#c.age |   .0028837   .0011135     2.59   0.010     .0007012    .0050661
                  |
        education |
               1  |   .3823818   .5601808     0.68   0.495    -.7155523    1.480316
               2  |  -.0584066   .5749586    -0.10   0.919    -1.185305    1.068491
               3  |   .2176937   .5225008     0.42   0.677    -.8063891    1.241776
               4  |   .5208288   .6678735     0.78   0.435    -.7881792    1.829837
                  |
   smokeintensity |   .0157119   .0347319     0.45   0.651    -.0523614    .0837851
                  |
 c.smokeintensity#|
 c.smokeintensity |  -.0001133   .0006058    -0.19   0.852    -.0013007    .0010742
                  |
         smokeyrs |   .0785973   .0749576     1.05   0.294    -.0683169    .2255116
                  |
       c.smokeyrs#|
       c.smokeyrs |  -.0005569   .0010318    -0.54   0.589    -.0025791    .0014653
                  |
         exercise |
               0  |    .583989   .3723133     1.57   0.117    -.1457317     1.31371
               1  |  -.3874824   .3439133    -1.13   0.260     -1.06154    .2865754
                  |
           active |
               0  |  -.7065829   .3964577    -1.78   0.075    -1.483626    .0704599
               1  |  -.9540614   .3893181    -2.45   0.014    -1.717111   -.1910119
                  |
             wt71 |  -.0878871   .0400115    -2.20   0.028    -.1663082   -.0094659
                  |
    c.wt71#c.wt71 |   .0006351   .0002257     2.81   0.005     .0001927    .0010775
                  |
            _cons |   3.754678   2.651222     1.42   0.157    -1.441622    8.950978
-----------------------------------------------------------------------------------



Iteration 0:  Log likelihood = -266.67873  
Iteration 1:  Log likelihood = -264.00252  
Iteration 2:  Log likelihood = -263.88028  
Iteration 3:  Log likelihood = -263.88009  
Iteration 4:  Log likelihood = -263.88009  

Logistic regression                                     Number of obs =  1,629
                                                        LR chi2(1)    =   5.60
                                                        Prob > chi2   = 0.0180
Log likelihood = -263.88009                             Pseudo R2     = 0.0105

------------------------------------------------------------------------------
        cens | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        qsmk |   .6411113   .2639262     2.43   0.015     .1238255    1.158397
       _cons |  -3.421172   .1648503   -20.75   0.000    -3.744273   -3.098071
------------------------------------------------------------------------------


(1,629 missing values generated)

(428 real changes made)

(1,201 real changes made)

(1,629 missing values generated)

(1,566 real changes made)

(63 missing values generated)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
          sw |      1,566    .9962351    .2819583   .3546469   4.093113

(sum of wgt is 1,560.10419079661)

Linear regression                               Number of obs     =      1,566
                                                F(1, 1565)        =      44.19
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0363
                                                Root MSE          =     7.8652

                               (Std. err. adjusted for 1,566 clusters in seqn)
------------------------------------------------------------------------------
             |               Robust
     wt82_71 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        qsmk |   3.496493   .5259796     6.65   0.000     2.464794    4.528192
       _cons |    1.66199   .2328986     7.14   0.000     1.205164    2.118816
------------------------------------------------------------------------------