Lecture 21

1. Bootstrap hypothesis tests: 2-sample t-test
2. Types of missing values
3. Making missing-value datasets: MCAR, MAR, and MNAR

Bootstrap tests

Our bootstrap example (confidence interval for a correlation):

1. Draw B samples with replacement from the data.
2. Calculate the statistic $r_b^*$ from each bootstrap sample, $b = 1, \ldots, B$.
3. Use the histogram of the bootstrap statistics $\{r_b^*\}$ to find a confidence interval or standard error.

A hypothesis test differs from confidence intervals and standard errors: a test has a null hypothesis $H_0$, and the p-value for the test is calculated assuming $H_0$.

To bootstrap a test, we need to draw the bootstrap samples from the null hypothesis distribution.

The two-sample t-test

The two-sample t-test compares two population means $\mu_Y$ and $\mu_Z$ by comparing estimates of these means from independent samples $y = \{y_1, y_2, \ldots, y_n\}$ from population Y and $z = \{z_1, z_2, \ldots, z_m\}$ from population Z.

The standard assumptions are:

1. The data are simple random samples.
2. The two populations are Normal.
3. The two populations have the same variance.
4. The observations are correctly labeled with their population: no misclassification.

The null hypothesis is $H_0 : \mu_Y = \mu_Z$.

Brain glucose example

Magnetic resonance imaging gives researchers a non-invasive way to measure chemicals in the brain minute by minute. One study measured blood sugar (glucose) in the brains of 14 people with diabetes and 14 healthy people.

The TTEST Procedure

Variable        diabetic     N   Lower CL Mean     Mean   Upper CL Mean   Std Dev   Std Err
brain_glucose   0           14          4.6832   5.3214          5.9596    1.1054    0.2954
brain_glucose   1           14          4.1943   4.6857          5.1771    0.8511    0.2275
brain_glucose   Diff (1-2)              -0.131   0.6357          1.4021    0.9865    0.3728

Variable        Method          Variances     DF   t Value   Pr > |t|
brain_glucose   Pooled          Equal         26      1.71     0.1001
brain_glucose   Satterthwaite   Unequal     24.4      1.71     0.1009

Conclusion?

These are small samples that looked skewed in opposite directions. Perhaps the t-test is missing something?
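As a check on the Proc Ttest output, the unequal-variance (Satterthwaite) t statistic and its degrees of freedom can be recomputed from the means, standard deviations, and sample sizes in the table. This is a minimal Python sketch and not part of the SAS workflow; the function name `welch_from_summary` is made up for illustration.

```python
import math

def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the unequal-variance two-sample t-test."""
    se1_sq = sd1 ** 2 / n1          # squared standard error of mean 1
    se2_sq = sd2 ** 2 / n2          # squared standard error of mean 2
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)
    # Satterthwaite approximation to the degrees of freedom
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df

# Summary statistics copied from the Proc Ttest table above.
t, df = welch_from_summary(5.3214, 1.1054, 14, 4.6857, 0.8511, 14)
print(t, df)
```

The result should agree with the reported t = 1.71 and DF = 24.4 up to rounding of the inputs.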
Bootstrap 2-sample t-test

We have two independent samples: $y = \{y_1, y_2, \ldots, y_n\}$ from population Y and $z = \{z_1, z_2, \ldots, z_m\}$ from population Z.

1. Create two transformed data sets $y^0$ and $z^0$ with equal means to satisfy the null hypothesis. $y^0$ and $z^0$ should have the same standard deviations as the originals.
2. Bootstrap $y^0$ and $z^0$ and calculate the test statistic from each bootstrap sample $b = 1, \ldots, B$:

$$ t_b^* = \frac{\bar{y} - \bar{z}}{\sqrt{SE(\bar{y})^2 + SE(\bar{z})^2}} $$

where the means and standard errors are computed from the b-th bootstrap sample.

This t-statistic does not assume equal population variances. Use it because we are equalizing the means but not the SDs.

Make the null hypothesis samples

Let $\bar{y}$ be the mean of sample y, and $\bar{z}$ be the mean of sample z.

Subtract $\bar{y}$ from each observation in sample y:

$$ y^0 = \{y_1^0, y_2^0, \ldots, y_n^0\} = \{(y_1 - \bar{y}), (y_2 - \bar{y}), \ldots, (y_n - \bar{y})\} $$

Subtract $\bar{z}$ from each observation in sample z:

$$ z^0 = \{z_1^0, z_2^0, \ldots, z_m^0\} = \{(z_1 - \bar{z}), (z_2 - \bar{z}), \ldots, (z_m - \bar{z})\} $$

It is easy to show that both $y^0$ and $z^0$ have mean zero, so the null hypothesis of equal means holds. Shifting each sample by a constant does not change the SD.

[Figure: original data, samples y and z.]

[Figure: shifted data, samples $y^0$ and $z^0$.]

    Proc Print data = pubh.brain_glucose;

Obs   diabetic   brain_glucose
  1          0             4.5
  2          0             5.3
  3          0             8.2
  .          .               .
 15          1             5.2
 16          1             5.1
 17          1             5.3
  .          .               .

From Proc Ttest, we have $\bar{y}$ = 5.3214 and $\bar{z}$ = 4.6857.

    data nh_data;
      set pubh.brain_glucose;
      if diabetic = 0 then bg = brain_glucose - 5.3214;
      if diabetic = 1 then bg = brain_glucose - 4.6857;
    run;

Draw simple random samples with replacement from the null hypothesis data (nh_data). Here's what we did before:

    Proc Surveyselect seed = 56672119 data = nh_data out=boot_samples
         method = urs samprate = 1 outhits rep = 2000;
    run;

Problems: Is it possible to get a sample entirely from one group? What happens to the t-statistic if the sample sizes for the two groups don't match the originals?
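The shifting step and the problem with unstratified resampling can be illustrated with a small Python sketch. The data values are hypothetical (not the brain glucose measurements), and the pooled resampling here only imitates what an unstratified draw does to the group sizes.

```python
import random
import statistics

# Hypothetical values for illustration only.
y = [4.5, 5.3, 8.2, 5.0, 6.1, 4.9, 5.6]
z = [5.2, 5.1, 5.3, 3.9, 4.4, 4.0, 4.8]

# Shift each sample by its own mean so both null-hypothesis samples have mean 0.
y_bar, z_bar = statistics.mean(y), statistics.mean(z)
y0 = [v - y_bar for v in y]
z0 = [v - z_bar for v in z]

print(statistics.mean(y0), statistics.mean(z0))    # both 0 up to rounding error
print(statistics.stdev(y), statistics.stdev(y0))   # the SD is unchanged by the shift

# Unstratified resampling: pool the labeled observations and draw n + m
# of them with replacement, ignoring the group label.
rng = random.Random(56672119)
pooled = [(0, v) for v in y0] + [(1, v) for v in z0]
group0_sizes = []
for _ in range(2000):
    sample = rng.choices(pooled, k=len(pooled))
    group0_sizes.append(sum(1 for label, value in sample if label == 0))

# The size of group 0 changes from replicate to replicate.
print(min(group0_sizes), max(group0_sizes))
```

Because the group sizes vary across replicates, the standard errors in the t-statistic no longer correspond to the original design, which motivates resampling within each group.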
Solution: Draw bootstrap samples from each group (diabetic/control) separately:

    Proc Surveyselect seed = 56672119 data = nh_data out=boot_samples
         method = urs samprate = 1 outhits
         rep = 20;                         /* start with 20, make sure code works */
      strata diabetic / alloc = proportional;
    run;

The strata statement identifies the grouping variable. alloc = proportional: sample from each group in proportion to its size.

Calculate $t_b^*$ from each bootstrap sample.

    proc sort data=boot_samples;
      by replicate;
    run;

    ODS listing close;                     /* stop writing output */
    proc ttest ci=none data=boot_samples;
      by replicate;
      class diabetic;
      var bg;                              * use transformed variable;
      ODS output ttests = tstars;
    run;
    ODS listing;                           /* resume writing output */

    proc print data=tstars (obs=10);
    run;

Obs   Replicate   Variable   Method          Variances   tValue       DF    Probt
  1           1   bg         Pooled          Equal         1.42       26   0.1666
  2           1   bg         Satterthwaite   Unequal       1.42    25.97   0.1666
  3           2   bg         Pooled          Equal         1.63       26   0.1159
  4           2   bg         Satterthwaite   Unequal       1.63   22.991   0.1175
  5           3   bg         Pooled          Equal        -1.47       26   0.1541
  6           3   bg         Satterthwaite   Unequal      -1.47   25.124   0.1545
  7           4   bg         Pooled          Equal        -0.23       26   0.8189
  8           4   bg         Satterthwaite   Unequal      -0.23    24.27   0.8190
  9           5   bg         Pooled          Equal         1.65       26   0.1114
 10           5   bg         Satterthwaite   Unequal       1.65   23.113   0.1129

Which t-statistic do we want? Use a subsetting IF.
Rename tValue as t_star.

    data boot_t_stars;
      set tstars;
      if method = "Satterthwaite";
      t_star = tValue;
      observed = 1.71;
      t_star_larger = (abs(t_star) GE observed);   * calculate indicator to get p-value;
      keep replicate t_star t_star_larger;
    run;

    proc print data=boot_t_stars (obs=10);
    run;

Obs   Replicate     t_star   t_star_larger
  1           1    1.42297               0
  2           2    1.62647               0
  3           3   -1.46795               0
  4           4   -0.23128               0
  5           5    1.64802               0
  6           6    0.59854               0
  7           7   -0.65278               0
  8           8    1.24904               0
  9           9    1.80207               1
 10          10    0.00004               0

After rerunning with rep = 2000, we have 2000 unequal-variance $t_b^*$ replicates from $H_0$.

Draw a histogram of the unequal-variance $t_b^*$ replicates:

    ODS graphics on;
    Proc Univariate noprint data=boot_t_stars;
      var t_star;
      histogram t_star / cfill=graye03;
      inset q1 median mean q3 / position=NE noframe;
    run;
    ODS graphics off;

What should this histogram look like? Where should the mean be?

[Figure: histogram of t_star.]

The TTEST Procedure

Variable        diabetic     N   Lower CL Mean     Mean   Upper CL Mean   Std Dev   Std Err
brain_glucose   0           14          4.6832   5.3214          5.9596    1.1054    0.2954
brain_glucose   1           14          4.1943   4.6857          5.1771    0.8511    0.2275
brain_glucose   Diff (1-2)              -0.131   0.6357          1.4021    0.9865    0.3728

Variable        Method          Variances     DF   t Value   Pr > |t|
brain_glucose   Pooled          Equal         26      1.71     0.1001
brain_glucose   Satterthwaite   Unequal     24.4      1.71     0.1009

Observed t = 1.71. How do we get a p-value from the histogram?

The p-value is the sum of the tail areas where $|t_b^*| \ge 1.71$.

To get this area, find the proportion of bootstrap $t_b^*$ with $|t_b^*| \ge 1.71$. This is the mean of the indicator variable t_star_larger.

    Proc Freq data=boot_t_stars;
      table t_star_larger;
    run;

                                        Cumulative   Cumulative
t_star_larger   Frequency   Percent      Frequency      Percent
---------------------------------------------------------------
            0        1779     88.95           1779        88.95
            1         221     11.05           2000       100.00

Bootstrap approximate two-sided p-value = 221/2000 = 0.1105

Bootstrap approximate two-sided p-value from B replicates:

$$ \frac{\text{number of } |t_b^*| \ge |t_{obs}|}{B} $$

This is the proportion of test statistics from the $H_0$ distribution that are at least as extreme as the one we observed.
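The whole procedure (shift to the null, resample within each group, compute the unequal-variance t, count the extreme replicates) can be sketched in Python. This is an illustrative sketch under stated assumptions, not a translation of the SAS code: the function names are made up, the data are hypothetical, and redrawing a degenerate resample with zero variance in both groups is a choice made here for convenience.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Unequal-variance t statistic for two samples."""
    se_a_sq = statistics.variance(a) / len(a)
    se_b_sq = statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se_a_sq + se_b_sq)

def bootstrap_t_test(y, z, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for H0: equal population means."""
    rng = random.Random(seed)
    t_obs = welch_t(y, z)
    # Shift each sample to mean zero so that resampling happens under H0.
    y_bar, z_bar = statistics.mean(y), statistics.mean(z)
    y0 = [v - y_bar for v in y]
    z0 = [v - z_bar for v in z]
    t_stars = []
    while len(t_stars) < n_boot:
        # Resample within each group so the group sizes match the originals.
        yb = rng.choices(y0, k=len(y0))
        zb = rng.choices(z0, k=len(z0))
        if statistics.variance(yb) + statistics.variance(zb) == 0:
            continue  # assumption: redraw when t is undefined
        t_stars.append(welch_t(yb, zb))
    n_extreme = sum(1 for t in t_stars if abs(t) >= abs(t_obs))
    return t_obs, n_extreme / n_boot, t_stars

# Hypothetical data for illustration only.
y = [4.5, 5.3, 8.2, 5.0, 6.1, 4.9, 5.6, 5.8]
z = [5.2, 5.1, 5.3, 3.9, 4.4, 4.0, 4.8, 4.6]
t_obs, p_value, t_stars = bootstrap_t_test(y, z)
print(t_obs, p_value)
```

The returned `t_stars` list plays the role of the t_star column, and `n_extreme / n_boot` is the proportion of replicates with the indicator equal to 1.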
The bootstrap t-test gives p-value = 221/2000 = 0.1105. The regular t-test gave p = 0.1009.

In this small sample, with normality in doubt, the bootstrap provides reassurance that the t-test is not missing a real difference.

Missing Data Methods

Multicenter Depression trial (HAMD). This clinical trial randomly assigned 100 patients with major depression to an experimental drug (D) or to placebo (P). (Source: Dmitrienko et al. (2005).)

Participants completed the Hamilton depression rating scale (HAMD) at baseline and again at the end of the 9-week treatment. The study outcome was HAMD at the end of treatment.

Allocation of patients at 5 clinical centers:

drug      center
Frequency|       1|       2|       3|       4|       5|  Total
---------+--------+--------+--------+--------+--------+
Drug     |     11 |      7 |     16 |      9 |      7 |     50
---------+--------+--------+--------+--------+--------+
Placebo  |     13 |      7 |     14 |     10 |      6 |     50
---------+--------+--------+--------+--------+--------+
Total          24       14       30       19       13     100

The first 5 observations from the Depression Study data:

ID   baseline   final   drug   center
 1         27       4   D           1
 2         27       9   D           1
 3         26       8   D           1
 4         27       5   D           1
 5         36       8   D           1

Model(s) to compare final HAMD between treatments, adjusted for baseline and center:

We'll return to these models to analyze these data.

Missing Values

Suppose that some final surveys were missing (not completed).

What happens to these participants' data in fitting the adjusted model?

What if patients with the worst side effects from the experimental drug (D) dropped out and didn't complete the final survey?

Types of Missing Data

Missing completely at random (MCAR): data are missing independently of both observed and unobserved data.
Example: a participant flips a coin to decide whether to complete the depression survey.

Missing at random (MAR): given the observed data, data are missing independently of unobserved data.
Example: male participants are more likely to refuse to fill out the depression survey, but refusal does not depend on the level of their depression.
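In symbols, the two definitions can be written as statements about the probability that a value is missing. The notation is an assumption added here, following standard usage rather than the slides: $R$ is the missingness indicator, $Y_{obs}$ the observed data, and $Y_{mis}$ the unobserved data.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{obs}, Y_{mis}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{obs}, Y_{mis}) = P(R \mid Y_{obs})
\end{align*}
```

In the coin-flip example the probability of missingness depends on nothing in the data; in the gender example it depends on an observed variable but not on the unobserved depression score.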
MCAR implies MAR, but not the other way round. Most methods assume MAR.

We can ignore missing data (= omit missing observations) if we have MAR or MCAR.

Missing not at random (MNAR): missing observations are related to values of unobserved data.
Example: participants with severe depression were less likely to complete the HAMD form.

Informative missingness: the fact that data are missing contains information about the response. The observed data are a biased sample. Missing data cannot be ignored.

We cannot distinguish MAR from MNAR without additional information.

The SAS default is to omit cases with missing data = ignore missing data.

With MNAR, you get a non-representative sample and biased estimates.

References:
Dmitrienko et al. (2005) Analysis of Clinical Trials Using SAS, Chapter 5.
R. Little and D. Rubin (2002) Statistical Analysis with Missing Data, Second Edition.

Plan:

1. Delete observations from HAMD data to make an example of each type of missing data.
2. Discuss approaches to handling missing data.
3. Compare these approaches on our constructed examples from HAMD.

Make missing completely at random (MCAR) example

MCAR: data are missing independently of both observed and unobserved data.
Example: a participant flips a coin to decide whether to complete the final survey.

Randomly select 30% of the observations in HAMD and set them to missing.

    data MCAR;
      set ph6470.hamd2;
      missing = 0;
      if (ranuni(457392) < .3) then do;    /* select 30% random sample */
        final = .;
        missing = 1;                       /* label missing values */
      end;
    run;

MCAR example, first 10 observations:

Obs   missing   baseline   final   drug   center
  1         0         27       4   D           1
  2         0         27       9   D           1
  3         0         26       8   D           1
  4         0         27       5   D           1
  5         0         36       8   D           1
  6         0         39      18   D           1
  7         0         25      14   D           1
  8         0         33       8   D           1
  9         0         38       9   D           1
 10         1         39       .   D           1

    proc freq data=MCAR;
      tables missing;
    run;

                                  Cumulative   Cumulative
missing   Frequency   Percent      Frequency      Percent
---------------------------------------------------------
      0          67     67.00             67        67.00
      1          33     33.00            100       100.00

What percent are actually missing?
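The MCAR step compares one uniform random number per observation with the cutoff, so the realized number of missing values is itself random and need not equal exactly 30. A minimal Python sketch of the same idea, using a hypothetical stand-in for the HAMD data (the scores are made up and `make_mcar` is an illustrative name):

```python
import random

def make_mcar(records, rate, seed):
    """Set 'final' to None independently with probability `rate` (MCAR)."""
    rng = random.Random(seed)
    out = []
    for rec in records:
        rec = dict(rec)              # copy so the complete data stay intact
        rec["missing"] = 0
        if rng.random() < rate:      # same role as ranuni(seed) < .3
            rec["final"] = None
            rec["missing"] = 1
        out.append(rec)
    return out

# Hypothetical stand-in for the HAMD data: 100 patients with made-up scores.
data_rng = random.Random(1)
records = [
    {"id": i + 1, "baseline": data_rng.randint(20, 40), "final": data_rng.randint(1, 35)}
    for i in range(100)
]

mcar = make_mcar(records, rate=0.3, seed=457392)
n_missing = sum(r["missing"] for r in mcar)
print(n_missing, "of", len(mcar), "final values set to missing")
```

Python's generator differs from ranuni, so the same seed will not reproduce the SAS selection; only the mechanism is the same.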
Missing at random (MAR) example

Missing at random (MAR): given the observed data, data are missing independently of unobserved data.
Example: male participants are more likely to refuse to fill out the final survey, independent of their level of depression.

The data do not include gender.

Make the missing values related to observed data: missing only at centers 1, 2, and 3.

We need to get ≈ 33 missing cases. Centers 1, 2, 3 together have 64/100 patients in the study. What proportion p should be missing?

p × 64 = 33 gives p = .516

    data MAR;
      set ph6470.hamd2;
      missing = 0;
      if (ranuni(457392) < .516 and center IN (1, 2, 3)) then do;
        final = .;
        missing = 1;
      end;
    run;

    proc freq data=MAR;
      tables missing;
    run;

                                  Cumulative   Cumulative
missing   Frequency   Percent      Frequency      Percent
---------------------------------------------------------
      0          63     63.00             63        63.00
      1          37     37.00            100       100.00

Adjusting the cutoff for the uniform random number gives:

    data MAR;
      set ph6470.hamd2;
      missing = 0;
      if (ranuni(457392) < .435 and center IN (1, 2, 3)) then do;
        final = .;
        missing = 1;
      end;
    run;

This produces 34 missing values, nearly the same number as the MCAR example.

MAR example, first 10 observations:

Obs   missing   baseline   final   change   drug   center
  1         1         27       .       23   D           1
  2         0         27       9       18   D           1
  3         0         26       8       18   D           1
  4         0         27       5       22   D           1
  5         0         36       8       28   D           1
  6         0         39      18       21   D           1
  7         1         25       .       11   D           1
  8         0         33       8       25   D           1
  9         1         38       .       29   D           1
 10         1         39       .       18   D           1

Missing not at random (MNAR) example

MNAR: missing observations are related to values of unobserved data.
Example: participants with the most severe depression were less likely to complete the final HAMD survey.

Identify "high" final values. Randomly select 33 among these to delete: we want the same amount of missing data as in the other examples.

How do we identify the top 50% of final values?

    Proc univariate data=ph6470.hamd2;
      var final;
    run;

Quantile      Estimate
100% Max          35.0
99%               34.0
95%               28.0
90%               23.5
75% Q3            19.0
50% Median        14.5
25% Q1             8.0
10%                4.0
5%                 2.0
1%                 1.0
0% Min             1.0

What proportion do we remove?
p × 50 = 33 gives p = .66

    data MNAR;
      set ph6470.hamd2;
      missing = 0;
      if (final GE 14.5 and ranuni(884739) < .66) then do;
        final = .;
        missing = 1;
      end;
    run;

    proc freq data=MNAR;
      tables missing;
    run;

                                  Cumulative   Cumulative
missing   Frequency   Percent      Frequency      Percent
---------------------------------------------------------
      0          70     70.00             70        70.00
      1          30     30.00            100       100.00

Trial and error leads to:

    data MNAR;
      set ph6470.hamd2;
      missing = 0;
      if (final GE 14.5 and ranuni(884739) < .69) then do;
        final = .;
        missing = 1;
      end;
    run;

which gives 33 missing values.

MNAR example, first 10 observations:

Obs   missing   baseline   final   change   drug   center
  1         0         27       4       23   D           1
  2         0         27       9       18   D           1
  3         0         26       8       18   D           1
  4         0         27       5       22   D           1
  5         0         36       8       28   D           1
  6         1         39       .       21   D           1
  7         0         25      14       11   D           1
  8         0         33       8       25   D           1
  9         0         38       9       29   D           1
 10         1         39       .       18   D           1

Plan:

1. Delete observations from HAMD data to make an example of each type of missing data: MCAR, MAR, MNAR. All data sets have about 33% missing data (33 or 34 of 100 values).
2. Discuss approaches to handling missing data.
3. Compare these approaches on our constructed examples from HAMD. Results will depend on the type of missingness, not the amount of missing data.
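The MAR and MNAR constructions differ from MCAR only in which observations are eligible for deletion: eligibility depends on an observed variable (center) for MAR and on the value being deleted (final) for MNAR. A minimal Python sketch, again on hypothetical data; `delete_final`, the record layout, and the cutoffs used are assumptions for illustration.

```python
import random
import statistics

def delete_final(records, eligible, rate, seed):
    """Set 'final' to None with probability `rate` where eligible(rec) is true."""
    rng = random.Random(seed)
    out = []
    for rec in records:
        rec = dict(rec)                      # copy so the complete data stay intact
        u = rng.random()                     # one uniform draw per record
        rec["missing"] = int(u < rate and eligible(rec))
        if rec["missing"]:
            rec["final"] = None
        out.append(rec)
    return out

# Hypothetical stand-in for the HAMD data: made-up centers and final scores.
data_rng = random.Random(2)
records = [
    {"id": i + 1, "center": data_rng.randint(1, 5), "final": data_rng.randint(1, 35)}
    for i in range(100)
]

# MAR: missingness depends only on an observed variable (center).
mar = delete_final(records, lambda r: r["center"] in (1, 2, 3), rate=0.5, seed=457392)

# MNAR: missingness depends on the value that ends up missing (final).
cutoff = statistics.median(r["final"] for r in records)
mnar = delete_final(records, lambda r: r["final"] >= cutoff, rate=0.66, seed=884739)

print(sum(r["missing"] for r in mar), sum(r["missing"] for r in mnar))
```

As in the SAS steps, the deletion rate would then be tuned by trial and error until the number of missing values is close to the target.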
© Copyright 2025