Paper
Wednesday, July 11, 2007
Comparison of traditional model selection techniques to Bayesian model averaging in adolescent suicide
Thomas Hardie, EdD, RN, CS, NP, Department of Nursing, University of Delaware, Newark, DE, USA and Kevin G. Lynch, PhD, Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, USA.
Learning Objective #1: identify potential pitfalls of traditional model selection methods.
Learning Objective #2: understand the strengths and limitations of Bayesian model averaging and its relevance for use in nursing research.
Background: Primary analyses for an empirical study typically use a small number of independent variables to test explicit hypotheses concerning relationships in the data. Secondary analyses typically use a much larger number of independent variables, often with the goal of revealing relationships that may be tested in later studies. In our previous work, model selection was performed by screening each independent variable and then applying forward-selection logistic regression. This is one of several exploratory approaches that can be used to search for a “best” model. These methods often yield different candidate models, and those models may have very different interpretations. An alternative to selecting and interpreting a single “best” model is to base inference on a combination of several models. Bayesian model averaging (BMA) provides a framework within which multiple models can be combined to provide better overall prediction than is available from any single model.
Purpose: To compare traditional model selection techniques with BMA on a large adolescent database.
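To make the model-averaging idea in the Background concrete, the following minimal sketch shows one common way of combining candidate models: approximate each model's posterior probability from its BIC, then average the models' predictions with those probabilities as weights. The BIC approximation, the function names (`bma_weights`, `bma_predict`), and all numbers are assumptions chosen for illustration; they are not the authors' procedure or results from the study.

```python
import math


def bma_weights(bics, priors=None):
    """Approximate posterior model probabilities from BIC values.

    Uses the approximation weight_k proportional to prior_k * exp(-BIC_k / 2).
    Equal prior model probabilities are assumed when none are given.
    """
    k = len(bics)
    if priors is None:
        priors = [1.0 / k] * k
    best = min(bics)
    # Subtract the smallest BIC before exponentiating for numerical stability;
    # the shift cancels when the weights are normalized.
    raw = [p * math.exp(-0.5 * (b - best)) for b, p in zip(bics, priors)]
    total = sum(raw)
    return [r / total for r in raw]


def bma_predict(weights, model_probs):
    """Average per-model predicted probabilities using the model weights."""
    return sum(w * p for w, p in zip(weights, model_probs))


# Hypothetical BIC values for three candidate logistic regression models.
bics = [1502.3, 1503.1, 1510.8]
weights = bma_weights(bics)

# Hypothetical predicted probabilities of the outcome for one subject,
# one value per candidate model.
preds = [0.12, 0.18, 0.30]
averaged = bma_predict(weights, preds)

print([round(w, 3) for w in weights], round(averaged, 3))
```

In this sketch, a model whose BIC is only slightly worse than the best still receives substantial weight, so its predictions contribute to the average instead of being discarded as they would be under single-model selection.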
Methods: Add Health data were analyzed to develop a model for suicidal ideation. The number of predictors was reduced by constructing latent variables for depression, externalizing behaviors, and alcohol use across the waves of data. Factor scores from these latent variables were used as predictors in several model selection approaches. The resulting models were then compared with the results from Bayesian model averaging.
Results/Conclusions: Our analyses found that traditional logistic regression model selection methods produced different models, differing not only in their parameter estimates but also in the independent variables selected. Findings from Bayesian model averaging will be presented, along with their interpretation. These results offer analytic approaches for others to consider when analyzing data with multiple indicators. The relevance of BMA for use with nursing models, as well as its strengths and limitations, will also be discussed.