Participant attrition and missing data are omnipresent validity threats in longitudinal research. Study attrition is especially concerning in longitudinal studies with vulnerable populations, such as students in public schools located within poor urban communities where residential mobility is often a fact of life.
The current study is a secondary data analysis of the Lead Peace demonstration study. "Lead Peace" is a middle school service learning program of the Minneapolis Public School District. Student outcomes associated with Lead Peace program involvement are being evaluated by the University of Minnesota Prevention Research Center with a cohort of middle school students followed over three years beginning in the 2006-2007 school year. This evaluation included student surveys administered at four points: the beginning of 6th grade (T1), the end of 6th/beginning of 7th grade (T2), the end of 7th grade (T3), and the end of 8th grade (T4). The current study utilized data from T2, T3 and T4 surveys.
The purpose of this study was two-fold: (1) to compare sub-samples of young adolescents completing surveys at one or more of three time points, and (2) to test methods for handling missing data in longitudinal studies. The primary aim was to examine similarities and differences between students who completed surveys at T2, T3 and T4 and those who did not complete surveys at one or two of these data collection points. The secondary aim was to contrast estimates from multivariate models predicting youth violence involvement using three different datasets, one that included all students present for all surveys (complete-cases) and two that included imputed data from those missing at time points T3 and T4.
The primary aim was addressed through a series of comparison tests contrasting a group of students who completed all three surveys with groups who in-migrated and groups who out-migrated during the study period. Groups were compared on variables including gender, ethnicity, number of years living in one's neighborhood, number of schools attended in the current school year, substance use, a variety of pro-social connectedness factors, bullying, and violence involvement. The study's secondary aim was addressed by creating three longitudinal datasets: one that included all students present for all surveys (complete-case analysis) and two that included imputed data for students missing at T3 and T4. Two types of data imputation, regression imputation and multiple imputation, were used to create the second and third datasets. Point estimates, standardized beta values, and standard errors generated by each dataset were compared for a longitudinal regression model of relationships between T2 youth violence involvement, T3 neighborhood connectedness measures, and T4 youth violence involvement.
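The three-dataset comparison can be illustrated with a minimal sketch on synthetic data. It is not the study's analysis code: the variable names, effect sizes, 30% missingness rate, and the bootstrap-based multiple imputation scheme are all assumptions chosen for illustration, and only the T4 outcome is imputed here, whereas the study imputed data for students missing at T3 and T4.

```python
import numpy as np

def ols(X, y):
    """OLS fit; X must already contain an intercept column.
    Returns coefficients, their standard errors, and the residual variance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, sigma2

# Synthetic stand-ins (hypothetical, not Lead Peace data).
rng = np.random.default_rng(0)
n = 400
t2_violence = rng.normal(size=n)
t3_connected = rng.normal(size=n)
t4_violence = 0.5 * t2_violence - 0.2 * t3_connected + rng.normal(scale=0.8, size=n)

missing = rng.random(n) < 0.3            # assumed: T4 missing completely at random
observed = ~missing
y = np.where(missing, np.nan, t4_violence)
X = np.column_stack([np.ones(n), t2_violence, t3_connected])

# Dataset 1: complete cases only.
b_cc, se_cc, _ = ols(X[observed], y[observed])

# Dataset 2: regression imputation (fill each gap with its predicted value).
y_ri = y.copy()
y_ri[missing] = X[missing] @ b_cc
b_ri, se_ri, _ = ols(X, y_ri)

# Dataset 3: multiple imputation. Each round refits the imputation model on a
# bootstrap sample of complete cases and adds residual noise to the predictions,
# then the m analyses are pooled with Rubin's rules.
m = 20
obs_idx = np.flatnonzero(observed)
estimates, variances = [], []
for _ in range(m):
    boot = rng.choice(obs_idx, size=obs_idx.size, replace=True)
    b_boot, _, s2_boot = ols(X[boot], y[boot])
    y_mi = y.copy()
    noise = rng.normal(scale=np.sqrt(s2_boot), size=missing.sum())
    y_mi[missing] = X[missing] @ b_boot + noise
    b, se, _ = ols(X, y_mi)
    estimates.append(b)
    variances.append(se ** 2)

estimates, variances = np.array(estimates), np.array(variances)
b_mi = estimates.mean(axis=0)
within = variances.mean(axis=0)
between = estimates.var(axis=0, ddof=1)
se_mi = np.sqrt(within + (1 + 1 / m) * between)

for name, b, se in [("complete-case", b_cc, se_cc),
                    ("regression imputation", b_ri, se_ri),
                    ("multiple imputation", b_mi, se_mi)]:
    print(f"{name:>22}: coef={np.round(b, 3)}  se={np.round(se, 3)}")
```

In this simplified setup, regression imputation reproduces the complete-case coefficients exactly while reporting smaller standard errors, because the filled-in values lie on the fitted line and are then counted as if they were observed. Multiple imputation carries the imputation uncertainty into the pooled standard errors through the between-imputation variance term.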
Findings related to the primary aim suggested that out-migrating and in-migrating groups of students were similar to those who started and stayed for the duration of the Lead Peace study. Students who entered the study at T3 tended to have increased levels of disruptive behavior in their first year but became more similar to the students present the entire time by the second year of surveys. Students who joined Lead Peace only for the T4 data collection point differed from those present at all time points on the greatest number of characteristics.
Data imputation models performed as hypothesized, with each having merits and drawbacks. In each dataset, T2 violence involvement was a statistically significant predictor of T4 violence involvement (p < 0.01 in each multivariate model). T3 neighborhood civic contribution predicted decreased T4 violence involvement (p = 0.03) only in the multivariate model employing the regression-imputation dataset. No other longitudinal relationship tested was statistically significant in the multivariate models.
The current study offers a framework for understanding attrition in longitudinal research with public school students from low-income urban neighborhoods. Within these settings, students who leave a longitudinal study may be similar to students who stay for the duration of a study. In contrast, students who join a longitudinal study exhibit several differences in psychosocial and behavioral characteristics from those present for the duration of a study. Findings from this study's attrition analysis will inform investigators who are considering study designs and making generalizations about study samples in similar research settings. The current study also adds to the growing evidence of the utility of data imputation methods for handling missing data in longitudinal research. Finally, findings offer mixed evidence of pro-social “neighborhood connectedness” as a protective factor buffering youth from violence involvement.