In the chapter on “A New Era of Clinical Research Methods in a Data Rich Environment” in Hesse et al., Oncology Informatics, I described the evolution from a data-poor science, in which data were generated to answer specific research questions (and then often discarded), to a data-rich science, in which voluminous and temporally dense data are generated from many sources and can be applied to a range of research questions. Although the behavioral and social sciences have seen an explosion of data sources from smartphones, sensor technologies, electronic health records, social media, and administrative data repositories, we continue to train our doctoral students in the methods of a data-poor science. For example, Aiken and West compared the results of their 1990 study of methods training in psychology programs with those from 2007 and concluded that “the research design curriculum has largely stagnated.” Despite a paradigm shift in data sources and approaches, research methods and statistics training for behavioral and social science graduate students remains largely unchanged. A 2017 workshop of the National Academies of Sciences, Engineering, and Medicine on Graduate Training in the Social and Behavioral Sciences came to a similar conclusion.
To address this problem and facilitate data science training commensurate with the data now available for behavioral and social sciences research, a group of staff from across the NIH, led by Liz Ginexi at the OBSSR, identified the key behavioral and social science training needs in data science, reviewed the NIH's current support for this training, and developed and released a request for applications (RFA) on Predoctoral Training in Advanced Data Analytics for Behavioral and Social Sciences Research (TADA-BSSR). The vision of the TADA-BSSR T32 initiative is to develop a cohort of specialized predoctoral candidates with advanced competencies in data science analytics that they can apply to an increasingly complex landscape of behavioral and social health-related big data. These predoctoral training programs were encouraged to integrate computer science, informatics, mathematics, and statistics into behavioral and social sciences research training.
I’m excited to announce that eight TADA-BSSR T32 grants were recently awarded, and these programs will initiate training for their first cohort of diverse trainees this fall:
- University of Washington (PI: Sara Curran): Data Science Training in Demography and Population Health
- Stanford University (PI: Lorene Nelson): Stanford BSSR Pre-Doctoral Training Program at the Intersection of Data Sciences with Behavioral, Social, and Population Health Research
- University of California, Berkeley (PI: David Harding): Computational Social Science Training Program
- University of California, San Diego (PI: Lucila Ohno-Machado): Advanced Data Analytics Training for Behavioral and Social Sciences Research
- University of California, San Francisco (UCSF) (PI: Madellina Glymour): UCSF Data Science Training to Advance Behavioral and Social Science Expertise for Health Research (DaTABASE) Program
- Johns Hopkins University (PI: Elizabeth Stuart): Data Integration for Causal Inference in Behavioral Health
- University of Arkansas (PI: John Tilford): University of Arkansas for Medical Sciences—Arkansas Center for Health Disparities T32 Pre-doctoral Research Training Program
- Emory University (PI: Hannah Cooper): Training in Advanced Data Analytics to End Drug-Related Harms (TADA)
These programs have already begun curriculum development and will expand and refine their curricula over the next few years. They will also work together, sharing resources, holding cross-program training webinars, and giving their students access to the faculty expertise and training available at each program. Next June, they will hold their first summit on Behavioral and Social Science Research Training in Advanced Data Analytics.
Our hope is that these vanguard programs will serve as a catalyst, prompting more behavioral and social science training programs to make data analytics a central component of graduate training. Our students need to be skilled in the methods of data-rich sciences if we are to use the wealth of data available to us to answer critically important questions of behavior, social systems, and health.