Research Spotlights: May 2020


Evaluating the risk of school violence using natural language processing and machine learning

School violence can be traumatic and devastating for everyone involved. In a recent publication, researchers supported by NCATS, NHGRI, NLM, the Agency for Healthcare Research and Quality, and Cincinnati Children’s Hospital Medical Center developed a risk assessment program to estimate the likelihood that a high school student would participate in an act of school violence. Current practices for these assessments rely primarily on clinicians’ subjective impressions of an individual’s risk level, and the methods are expensive and not scalable. The study aimed to improve on current practice by developing a natural language processing (NLP) and machine learning process that automates the risk assessment.

To inform their model, the researchers recruited 131 students (ages 10-18 years, both sexes equally recruited) with or without behavioral concerns from 89 schools between 05/01/2015 and 04/30/2018. Demographics (sex, race, ethnicity) and socioeconomic status (education, public assistance, household income) were collected from the students’ legal guardians. The students were interviewed using two risk assessment scales and a questionnaire: 1) the Brief Rating of Aggression by Children and Adolescents (BRACHA); 2) the School Safety Scale (SSS), which evaluates risk and protective factors for school violence; and 3) the Psychiatric Intake Response Center (PIRC) questionnaire, which collects student background information (personality, school, social, and family dynamics). Most questions were open-ended to encourage detailed answers rather than a simple “Yes/No” response. Question wording was adapted to the student’s age and cognitive level. After the risk assessment, a forensic psychiatrist assessed the student’s behaviors, attitudes, feelings, and technology use (e.g., social media).

The researchers took the data from the assessment scales and questionnaire and, using NLP techniques, transformed each student interview into an array of conversational, semantic, and contextual features. NLP algorithms and techniques were developed to examine the risk assessment material and to cross-validate the automated risk assessment against clinical judgment. Performance was measured using positive predictive value, sensitivity, negative predictive value, specificity, and area under the receiver operating characteristic (ROC) curve. The researchers determined that using linguistic features extracted by the automated process significantly improved the classifiers’ predictive performance (P < 0.01) relative to using the individual’s sociodemographic information alone.
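The validation measures listed above can be illustrated with a minimal sketch. It assumes binary labels (1 = judged high risk by the clinician, 0 = not) and a model score per student; the function names, the 0.5 decision threshold, and the toy data are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = high risk)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(num, den):
    # Return NaN instead of raising when a class or prediction is absent.
    return num / den if den else float("nan")


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # share of true cases that were flagged
        "specificity": _ratio(tn, tn + fp),  # share of non-cases that were cleared
        "ppv": _ratio(tp, tp + fp),          # share of flags that were true cases
        "npv": _ratio(tn, tn + fn),          # share of clears that were non-cases
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random case outscores a random non-case."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Synthetic data for illustration only (not study data).
clinician_labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
model_calls = [1 if s >= 0.5 else 0 for s in model_scores]  # assumed threshold

metrics = classification_metrics(clinician_labels, model_calls)
auc = roc_auc(clinician_labels, model_scores)
print(metrics, auc)
```

The first four measures depend on the chosen threshold, while the area under the ROC curve summarizes ranking quality across all thresholds, which is one reason the two kinds of measure are usually reported together.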

By analyzing the content of student interviews and calibrating it to clinical judgment, the NLP and machine learning algorithms showed a strong capacity for detecting risk of school violence. The feature selection also uncovered multiple warning markers that may offer clinical insights to help personalize prevention interventions. The study has potential limitations, such as regional variations in linguistic patterns, which may affect risk prediction. Future studies are underway to address some of these limitations.

Ni Y, Barzman D, Bachtel A, Griffey M, Osborn A, Sorter M. 2020. Finding warning markers: Leveraging natural language processing and machine learning technologies to detect risk of school violence. Int J Med Inform. 139:104137

Children can “catch” their parents’ hidden emotions through synchronization of physiological responses

Previous research has established that parents can influence children’s emotional responses through direct and subtle behavior. In a study supported by grants from the NIMH, National Science Foundation, and the Amini Foundation, researchers examined this relationship further to determine whether parents’ experiences of negative emotion affect children’s emotions. Parents have a significant influence on their children’s developing self-regulatory skills, and recent research indicates that parents also influence their children’s affective states by transmitting their own affective states through synchronization of physiological responses. In the current study, the researchers investigated how parents’ acute stress responses are transmitted to their children and how parental emotional suppression affects parents’ and children’s physiological responses and behavior. They also assessed whether these effects differed between mothers and fathers.

The researchers recruited parents and their children (N = 214; dyads = 107; 47% fathers; 61% male children; children ages 7-11). All participants completed a laboratory visit in which baseline measurements were collected from both parents and children using electrocardiography and impedance cardiography to measure sympathetic nervous system (SNS) activation. Parents and children were each asked to list the top five topics that caused conflict between them. Parents and children were then separated, and the parents underwent a standardized laboratory stressor, the Trier Social Stress Test (TSST). The TSST involves giving a 5-minute speech followed by 5 minutes of answering questions from two evaluators and has been shown to reliably activate the body’s primary stress systems. Before reuniting with their children, parents were randomly assigned either to hide their emotions from their child or to behave as normal (control condition) during the next tasks. Once reunited, parents and children performed three tasks together: a conflict conversation about the topic that ranked highest on both of their conflict lists, a cooperation task, and free play. Parents’ and children’s SNS responses and interaction behavior were observed and measured continually during the study.

The researchers found that children had a physical response when parents tried to hide their emotions. Mothers in the control group did not transmit their emotional state to their children, as evidenced by nonsynchronous SNS responses. However, the SNS responses of mothers who hid their emotions influenced their children’s SNS responses, and these children exhibited more signs of stress, both physiologically and behaviorally. The pattern differed for fathers. When fathers hid their emotions, their SNS responses were influenced by their child’s SNS responses. Fathers did not influence the SNS responses of their children in either the control group or the experimental group. In dyads where the parent suppressed their emotions, there was less engagement during the interaction period than in control dyads.

These findings indicate that parents’ emotion regulation efforts affect parent–child stress transmission and may alter the way parents and children interact. The physiological linkage, or “transmission,” was stronger when parents suppressed their emotions, albeit in different ways for mothers and fathers. The direction of linkage in father–child dyads was the opposite of that in mother–child dyads: fathers in the suppression condition were physiologically influenced by their children. These differential effects may be due to societal parenting norms; however, this is yet to be determined. Overall, these findings suggest that, to foster effective emotional self-regulation in their children, parents may benefit from acknowledging their own emotions to their children rather than hiding them. Additionally, this study highlights the need to include fathers as well as mothers in developmental research.
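The idea of a direction of physiological linkage can be made concrete with a simple lagged correlation. This is only an illustrative sketch: the spotlight does not describe the statistical model the authors used, and the function names, the one-step lag, and the toy series below are all assumptions.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_linkage(leader, follower, lag=1):
    """Correlate the leader at time t with the follower at time t + lag (lag >= 1)."""
    return pearson(leader[:-lag], follower[lag:])


# Synthetic SNS-like series (arbitrary units, not study data).
# The child series is built to echo the parent series one step later.
parent_sns = [1, 3, 2, 5, 4, 6, 5, 8]
child_sns = [0, 1, 3, 2, 5, 4, 6, 5]

parent_to_child = lagged_linkage(parent_sns, child_sns)
child_to_parent = lagged_linkage(child_sns, parent_sns)
print(parent_to_child, child_to_parent)
```

In this toy case the parent-to-child value is the larger of the two, which is the pattern one would loosely describe as the parent "leading" the child. A real analysis would also need to account for each person's own prior state and for the nesting of repeated measures within dyads.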

Waters SF, Karnilowicz HR, West TV, Mendes WB. 2020. Keep it to yourself? Parent emotion suppression influences physiological linkage and interaction behavior. J Fam Psychol. doi: 10.1037/fam0000664

A risk-prediction model using patient electronic health records may help predict suicide risk in diverse populations

Suicide is a leading cause of mortality in the U.S., with a 30 percent increase in suicide-related deaths between 2000 and 2016. To employ suicide interventions effectively, early and accurate identification of individuals at high risk for suicide is needed. In the current study, supported by grants from the NIMH, NCATS, Patient-Centered Outcomes Research Institute, Tommy Fuss Fund, Cancer Prevention Research Institute of Texas, Cullen Trust for Health Care, a Tepper Family MGH Research Scholarship, and the Demarest Lloyd Jr Foundation, researchers sought to improve and validate a risk-detection tool for suicide. Advances in automated techniques and the growing availability of longitudinal health data provide the opportunity for improved risk-detection tools. Leveraging these advances, the researchers used a process for training machine-learning algorithms on electronic health records (EHR) to identify individuals who may have an increased risk of suicide across independent healthcare systems.

The researchers analyzed EHR data from patients ages 10 to 90 (n = 3,714,105; female = 2,130,454) across five U.S. health care systems: Partners HealthCare System in Boston; Boston Medical Center; Boston Children's Hospital; Wake Forest Medical Center in North Carolina; and University of Texas Health Science Center at Houston. Longitudinal data (6 to 17 years, depending on healthcare system) were extracted from the EHR, including International Classification of Diseases (ICD) codes, laboratory test results, procedure codes, and medications. Models were trained using naive Bayes classifiers in each of the five systems and were cross-validated in independent data sets at each center. In these data, a total of 39,162 suicide attempts were identified. The model was developed in two steps: first, half of the patient data was used to train a computer model to detect patterns associated with documented suicide attempts; then the trained model was validated using the other half of the patient data.
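The two-step, train-on-half and validate-on-half procedure can be sketched with a small naive Bayes classifier. This is a minimal illustration, not the authors' implementation: it assumes a Bernoulli variant with add-one smoothing, represents each patient as a set of placeholder codes ("A", "B", "C"), and uses invented toy records. The spotlight specifies only that naive Bayes classifiers were used.

```python
import math


def train_naive_bayes(records, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model; each record is a set of codes, labels are 0/1."""
    vocab = sorted(set().union(*records))
    model = {"vocab": vocab, "prior": {}, "p": {}}
    for cls in (0, 1):
        rows = [r for r, y in zip(records, labels) if y == cls]
        # Smoothed class prior and per-code presence probabilities.
        model["prior"][cls] = (len(rows) + alpha) / (len(records) + 2 * alpha)
        model["p"][cls] = {
            code: (sum(code in r for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for code in vocab
        }
    return model


def log_odds(model, record):
    """Log odds of class 1 versus class 0 for one patient record."""
    score = math.log(model["prior"][1]) - math.log(model["prior"][0])
    for code in model["vocab"]:
        p1, p0 = model["p"][1][code], model["p"][0][code]
        if code in record:
            score += math.log(p1) - math.log(p0)
        else:
            score += math.log(1 - p1) - math.log(1 - p0)
    return score


# Toy records (placeholder codes, not real EHR data); label 1 = documented attempt.
records = [{"A", "B"}, {"A"}, {"B", "C"}, {"C"},   # first half: training
           {"A"}, {"A", "B"}, {"C"}, {"B", "C"}]   # second half: validation
labels = [1, 1, 0, 0, 1, 1, 0, 0]

half = len(records) // 2
model = train_naive_bayes(records[:half], labels[:half])

# Step two: score only the held-out half and compare with its labels.
predictions = [1 if log_odds(model, r) > 0 else 0 for r in records[half:]]
accuracy = sum(p == y for p, y in zip(predictions, labels[half:])) / half
print(predictions, accuracy)
```

Keeping the validation half out of training is what makes the reported performance an estimate of how the model behaves on patients it has not seen.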

The researchers found that model performance was similar across the sites, even with variation in geographical location, demographic characteristics, and population health characteristics. Predictive features varied by site; however, the most common predictors were associated with mental health conditions such as borderline personality disorder (odds ratios: 8.1-12.9), bipolar disorder (odds ratios: 0.9-9.1), and substance use disorders (drug withdrawal syndrome, odds ratios: 7.0-12.9). The models detected 33-39 percent of suicide attempts across the five centers at 90 percent specificity. Using this method, suicidal behavior could be detected 1.3 to 3.5 years before the actual suicide attempt.
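Reporting sensitivity "at 90 percent specificity" means fixing the alert threshold so that roughly 90 percent of patients without an attempt are not flagged, then measuring what share of patients with an attempt score above that threshold. A minimal sketch, assuming one risk score per patient; the function names and scores are hypothetical and are not the study's data.

```python
import math


def threshold_at_specificity(neg_scores, target=0.90):
    """Smallest score threshold that leaves at least `target` of non-cases unflagged."""
    ordered = sorted(neg_scores)
    # Number of non-cases that must score at or below the threshold.
    k = math.ceil(target * len(ordered) - 1e-9)
    return ordered[k - 1]


def sensitivity_at_specificity(pos_scores, neg_scores, target=0.90):
    """Return (sensitivity, achieved specificity, threshold); a score above the threshold is a flag."""
    thr = threshold_at_specificity(neg_scores, target)
    sensitivity = sum(s > thr for s in pos_scores) / len(pos_scores)
    specificity = sum(s <= thr for s in neg_scores) / len(neg_scores)
    return sensitivity, specificity, thr


# Synthetic risk scores for illustration only.
no_attempt = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.55, 0.80]
attempt = [0.15, 0.25, 0.33, 0.45, 0.50, 0.60, 0.75, 0.90]

sens, spec, thr = sensitivity_at_specificity(attempt, no_attempt)
print(sens, spec, thr)
```

Fixing specificity first keeps the false-alarm rate comparable across sites, so differences in sensitivity reflect how well each site's model separates the two groups.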

These findings indicate that a machine learning approach that leverages longitudinal patient EHR data may be able to detect increased individual risk of suicidal behavior in patients. Further, this model performed well despite variations in patient demographics and healthcare sites. Because healthcare centers may have unique predictive factors based on different hospital coding practices, local demographics, and health patterns, an approach that allows sites to easily train and validate models on their own data sets is important for optimal performance. Using these types of risk prediction modeling approaches may aid the development of clinical decision support tools that help to inform clinical care and interventions.

Barak-Corren Y, Castro VM, Nock MK, Mandl KD, Madsen EM, Seiger A, Adams WG, Applegate RJ, Bernstam EV, Klann JG, McCarthy EP, Murphy SN, Natter M, Ostasiewski B, Patibandla N, Rosenthal GE, Silva GS, Wei K, Weber GM, Weiler SR, Reis BY, Smoller JW. 2020. Validation of an Electronic Health Record–Based Suicide Risk Prediction Modeling Approach Across Multiple Health Care Systems, JAMA Netw Open. 3(3):e201262. doi:10.1001/jamanetworkopen.2020.1262