Leveraging Machine Learning to Understand How Emotions Influence Equity Related Education: Quasi-Experimental Study

doi:10.2196/33934

Original Paper

¹Institute of Living, Hartford Hospital, Hartford, CT, United States

²Centre for Education Research and Innovation, Western University, London, ON, Canada

*all authors contributed equally

Corresponding Author:

Javeed Sukhera, MD, PhD

Institute of Living

Hartford Hospital

200 Retreat Avenue

Hartford, CT, 06106

United States

Phone: 1 8605457629

Email: javeedsukhera@gmail.com

Background: Teaching and learning about topics such as bias are challenging due to the emotional nature of bias-related discourse. However, emotions can be challenging to study in health professions education for numerous reasons. With the emergence of machine learning and natural language processing, sentiment analysis (SA) has the potential to bridge the gap.

Objective: To improve our understanding of the role of emotions in bias-related discourse, we developed and conducted an SA of bias-related discourse among health professionals.

Methods: We conducted a 2-stage quasi-experimental study. First, we developed an SA algorithm within an existing archive of interviews with health professionals about bias. SA refers to a mechanism of analysis that evaluates the sentiment of textual data by assigning scores to textual components and calculating and assigning a sentiment value to the text. Next, we applied our SA algorithm to an archive of social media discourse on Twitter that contained equity-related hashtags to compare sentiment among health professionals and the general population.

Results: When tested on the initial archive, our SA algorithm was highly accurate compared to human scoring of sentiment. An analysis of bias-related social media discourse demonstrated that health professional tweets (n=555) were less neutral than the general population (n=6680) when discussing social issues on professionally associated accounts (χ² [2, n=555]=35.455; P<.001), suggesting that health professionals attach more sentiment to their posts on Twitter than seen in the general population.

Conclusions: The finding that health professionals are more likely to show and convey emotions regarding equity-related issues on social media has implications for teaching and learning about sensitive topics related to health professions education. Such emotions must therefore be considered in the design, delivery, and evaluation of equity and bias-related education.

JMIR Med Educ 2022;8(1):e33934


Keywords

bias; equity; sentiment analysis; medical education; emotion; education

Research on addressing bias in health professionals found that feedback conversations about topics such as bias provoked defensive reactions [Sukhera J, Milne A, Teunissen PW, Lingard L, Watling C. The Actual Versus Idealized Self: Exploring Responses to Feedback About Implicit Bias in Health Professionals. Acad Med 2018 Apr;93(4):623-629. [CrossRef] [Medline]1,Sukhera J, Wodzinski M, Teunissen PW, Lingard L, Watling C. Striving While Accepting: Exploring the Relationship Between Identity and Implicit Bias Recognition and Management. Acad Med 2018 Nov;93(11S Association of American Medical Colleges Learn Serve Lead: Proceedings of the 57th Annual Research in Medical Education Sessions):S82-S88. [CrossRef] [Medline]2]. However, these emotions did not hijack the learning process as learners still perceived their experience as positive while perceiving feedback about their biases as actionable [Sukhera J, Wodzinski M, Milne A, Teunissen PW, Lingard L, Watling C. Implicit Bias and the Feedback Paradox: Exploring How Health Professionals Engage With Feedback While Questioning Its Credibility. Acad Med 2019 Aug;94(8):1204-1210. [CrossRef] [Medline]3]. This finding was unique in the feedback literature, which generally suggests that feedback should be targeted away from the self to avoid hijacking the feedback process [Kluger A, DeNisi A. The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological bulletin. 1996. URL: https://doi.org/10.1037/0033-2909.119.2.254 [accessed 2021-01-31] 4]. This paradox suggests the need to further explore how emotions may mediate conversations about bias among health professionals.

Understanding the role of emotions when discussing topics related to bias or equity is essential to advance education in the field. We know that emotions play an important role in mediating the relationship between self-concept and learning. If confronted with their biases, learners may perceive a threat and therefore perceive the situation to have a negative attainment value leading to negative emotions. Negative emotions may then impede information recall and promote avoidance in processing its content [Trevors G, Muis K, Pekrun R, Sinatra G, Winne P. Identity and epistemic emotions during knowledge revision: A potential account for the backfire effect. Discourse Processes. 2016. URL: https://doi.org/10.1080/0163853X.2015.1136507 [accessed 2021-01-31] 5]. Not all emotions have a negative influence on learning. For example, emotions are essential for transformative learning and similar methods that require dissonance, critical reflection, facilitated dialogue, action, and behavior change [Sukhera J, Watling CJ, Gonzalez CM. Implicit Bias in Health Professions: From Recognition to Transformation. Acad Med 2020 May;95(5):717-723. [CrossRef] [Medline]6].

The importance of understanding emotions related to bias or equity education is especially salient when defensive or skeptical reactions are provoked. When learners’ erroneous belief that they are not biased is challenged, emotions can lead to the backfire effect, strengthening the belief in such erroneous information even after attempted refutation [Lewandowsky S, Stritzke WGK, Oberauer K, Morales M. Memory for fact, fiction, and misinformation: the Iraq War 2003. Psychol Sci 2005 Mar;16(3):190-195. [CrossRef] [Medline]7,Lewandowsky S, Stritzke WGK, Freund AM, Oberauer K, Krueger JI. Misinformation, disinformation, and violent conflict: from Iraq and the "War on Terror" to future threats to peace. Am Psychol 2013 Oct;68(7):487-501. [CrossRef] [Medline]8]. This could lead learners to expend considerable cognitive resources to counter refutation [Nauroth P, Gollwitzer M. Gamers against science: empirical research on violent video games as social identity threat. In: Vortrag auf dem EASP Medium Size Meeting “Intergroup conflict: The cognitive, emotional, and behavioral consequences of communication”, Soesterberg, Niederlande. 2013. URL: https://doi.org/10.1002/ejsp.1998 [accessed 2022-02-18] 9,Nyhan B, Reifler J. When corrections fail: The persistence of political misperceptions. Political Behavior. 2010. URL: https://doi.org/10.1007/s11109-010-9112-2 [accessed 2021-01-31] 10] and activate more evidence that supports their original erroneous beliefs.

In our previous work, we found that the idea of having bias and therefore being vulnerable to its effects was a threat to the strongly held belief among health professionals that they must operate without bias [Sukhera J. Bias in the Mirror: Exploring implicit bias in health professions education. Datawyze Maastricht 2018 Nov 29:E-170. [CrossRef]11]. Research suggests that strongly held beliefs, such as the idea that health professionals cannot have bias, are integral to health professionals’ sense of self [Kahan DM, Peters E, Dawson EC, Slovic P. Motivated numeracy and enlightened self-government. Behav. Public Policy 2017 May 31;1(1):54-86. [CrossRef]12,Eccles J. Who am I and what am I going to do with my life? Personal and collective identities as motivators of action. Educational psychologist. 2009. URL: https://doi.org/10.1080/00461520902832368 [accessed 2021-01-31] 13]. Bias acceptance, therefore, may be perceived as identity threatening and trigger self-protective responses such as defensiveness and denial [Teal CR, Shada RE, Gill AC, Thompson BM, Frugé E, Villarreal GB, et al. When best intentions aren't enough: helping medical students develop strategies for managing bias about patients. J Gen Intern Med 2010 May;25 Suppl 2:S115-S118 [FREE Full text] [CrossRef] [Medline]14] to restore a sense of self-worth [Nyhan B, Reifler J. Does correcting myths about the flu vaccine work? An experimental evaluation of the effects of corrective information. Vaccine 2015 Jan 09;33(3):459-464. [CrossRef] [Medline]15].

Research regarding emotions in health professions education can also be challenging for numerous reasons. For example, there are tensions in how emotions are conceptualized in health professions education. Some view emotions as a physiological response, others as skills or abilities, and others view emotions as a sociocultural mediator [McNaughton N. Discourse(s) of emotion within medical education: the ever-present absence. Med Educ 2013 Jan;47(1):71-79. [CrossRef] [Medline]16]. There are also ontological tensions and a lack of conceptual and methodological consistency [Cloke P, Cooke P, Cursons J, Milbourne P, Widdowfield R. Ethics, Reflexivity and Research: Encounters with Homeless People. Ethics, Place and Environment 2010 Jul;3(2):133-154 [FREE Full text] [CrossRef]17]. Despite such challenges, a deeper understanding of how emotions influence learning is needed to enhance teaching and learning about emotionally challenging topics such as equity.

Advances in machine learning (ML) technology such as natural language processing (NLP) and sentiment analysis (SA) may provide a novel way of approaching such research [James CA, Wheelock KM, Woolliscroft JO. Machine Learning: The Next Paradigm Shift in Medical Education. Acad Med 2021 Jul 01;96(7):954-957. [CrossRef] [Medline]18]. ML techniques can automate information processing and have been applied to tasks such as competence assessments [Dias RD, Gupta A, Yule SJ. Using Machine Learning to Assess Physician Competence: A Systematic Review. Acad Med 2019 Mar;94(3):427-439. [CrossRef] [Medline]19]. NLP is a form of ML that can structure and extract text-based information, making it available for further analysis [Friedman C, Hripcsak G. Natural language processing and its future in medicine. Acad Med 1999 Aug;74(8):890-895. [CrossRef] [Medline]20]. NLP and advanced text analytics are being used increasingly in a health care context [Hao T, Huang Z, Liang L, Weng H, Tang B. Health Natural Language Processing: Methodology Development and Applications. JMIR Med Inform 2021 Oct 21;9(10):e23898. [CrossRef] [Medline]21,Elbattah M, Arnaud, Gignon M, Dequen G. The Role of Text Analytics in Healthcare: A Review of Recent Developments and Applications. In: HEALTHINF 2021. 2021:825-832. [CrossRef]22]. SA is a mechanism of analysis that evaluates the sentiment of textual data by assigning scores to textual components and calculating and assigning a sentiment value to the text [Prabowo R, Thelwall M. Sentiment analysis: A combined approach. Journal of Informetrics. 2009. URL: https://doi.org/10.1016/j.joi.2009.01.003 [accessed 2021-01-31] 23].

SA is most commonly discussed in business settings as it allows one to determine customers' overall sentiment about products and services through data scraping and analysis from social media [Mäntylä M, Graziotin D, Kuutila M. The evolution of sentiment analysis: A review of research topics, venues, and top cited papers. Computer Science Review. 2018. URL: https://doi.org/10.1016/j.cosrev.2017.10.002 [accessed 2021-01-31] 24]. In health care, SA has been used to analyze online comments regarding hospital services to explore patient experiences [Greaves F, Ramirez-Cano D, Millett C, Darzi A, Donaldson L. Use of sentiment analysis for capturing patient experience from free-text comments posted online. J Med Internet Res 2013;15(11):e239 [FREE Full text] [CrossRef] [Medline]25] and applied to electronic health records to analyze health professional behavior [Gohil S, Vuik S, Darzi A. Sentiment Analysis of Health Care Tweets: Review of the Methods Used. JMIR Public Health Surveill 2018 Apr 23;4(2):e43 [FREE Full text] [CrossRef] [Medline]26]. In another study, SA was applied to Twitter health news to compare whether health news is delivered in a manner more consistent with facts or opinion [Kolajo T, Kolajo JO. Sentiment analysis on twitter health news. Fudma Journal of Sciences 2018 Jul 17;2(2):e4. [CrossRef]27]. In these examples, researchers acknowledged their lack of clinical experience and limitations in the execution of their analysis. For example, Gohil and colleagues acknowledge that their methods had not been tested for accuracy [Gohil S, Vuik S, Darzi A. Sentiment Analysis of Health Care Tweets: Review of the Methods Used. JMIR Public Health Surveill 2018 Apr 23;4(2):e43 [FREE Full text] [CrossRef] [Medline]26]. The potential for SA in health professions education research is therefore limited without further research and evaluation.

Our previous research on emotions and bias-related feedback may provide a window into the application of SA. More recently, a shift from in-class to online discussions on sensitive and emotionally charged topics may provide an opportunity for inquiry. A deeper analysis of the language used by health professionals on social media may therefore provide insight into the emotions associated with teaching and learning about equity and bias.

Overall, our aim for this study was to improve our understanding of the role of emotions in bias-related discourse. We, therefore, conducted an SA of bias-related dialogue among health professionals. First, we tested the accuracy of our SA algorithm, and of the NLP library underlying it, on an existing archive of bias-related discourse among health professionals. Second, we utilized our SA to examine whether the sentiment toward equity-related online discourse differed between health professionals and the general population.

Sentiment Analysis

Sentiment is a thought, opinion, or idea based on the underlying feeling or emotion about a specific topic or item. SA is utilized to analyze text and classify the writer’s attitude as positive, negative, or neutral given the presence of certain keywords. First, the text is split into four basic components: tokens, sentences, phrases, and entities. Next, an algorithm is applied using one of two systems. In a rule-based system, rules are manually crafted to analyze textual components. Specific words are classified as negative, neutral, or positive and associated with a score. These values are then tabulated to provide an estimate of the overall sentiment of the text. In an automatic system, machine learning technology is used to acquire knowledge from the data and allow for terms that are not currently within an existing set of rules. The two systems can also be combined to utilize an initial database as a reference while also allowing for the inclusion of new terms and the alteration of sentiment values [Kundi F, Khan A, Ahmad S, Asghar M. Lexicon-based sentiment analysis in the social web. Journal of basic and Applied Scientific Research- ISSN 2090-4304 Jul 13 2014;4(6):238-248.28].
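The rule-based approach described above can be sketched in a few lines of Python. This is a minimal illustration only: the lexicon, its scores, and the neutral threshold are hypothetical values chosen for the example and are not the rules used in this study.

```python
import re

# Hypothetical mini-lexicon: word -> sentiment score (illustrative values only).
LEXICON = {"great": 1.0, "thank": 0.5, "incredible": 1.0,
           "bad": -1.0, "die": -1.0, "unfair": -0.5}

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def rule_based_sentiment(text, threshold=0.0):
    """Sum the lexicon scores of all tokens and map the total to a label."""
    total = sum(LEXICON.get(token, 0.0) for token in tokenize(text))
    if total > threshold:
        return "positive"
    if total < -threshold:
        return "negative"
    return "neutral"
```

Words that are absent from the lexicon contribute a score of zero, which is why a rule-based system cannot account for new terms unless it is combined with an automatic system.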

Step 1: Developing and Testing Our SA Algorithm

We developed a potential SA algorithm from an NLP library known as TextBlob. This library was built on a toolkit that draws on many different, versatile resources containing millions of training texts ranging from movie reviews to online conversations. TextBlob uses a naïve Bayes classifier from the Natural Language Toolkit (NLTK) that was trained on a movie review corpus. Millions of reviews were split into tokens that were assigned positive or negative values to allow for the sentiment of the entire message to be interpreted.
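To illustrate how a naïve Bayes classifier interprets the sentiment of a whole message from token-level evidence, the following is a toy sketch written in plain Python. It is not TextBlob's or NLTK's implementation; the class name, the four training sentences, and the use of add-one smoothing are assumptions made for the example.

```python
import math
from collections import Counter

class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}          # label -> Counter of token counts
        self.doc_counts = Counter()    # label -> number of training texts
        self.vocabulary = set()

    def train(self, labelled_texts):
        for text, label in labelled_texts:
            tokens = text.lower().split()
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)

    def classify(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior + sum of log likelihoods (independence assumption)
            score = math.log(self.doc_counts[label] / total_docs)
            denominator = sum(counts.values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical toy training set standing in for a movie review corpus.
TRAINING = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a boring and terrible film", "neg"),
    ("terrible acting and a dull story", "neg"),
]

model = TinyNaiveBayes()
model.train(TRAINING)
```

Each token contributes independently to the score of each label, which is the independence assumption that gives the method its name.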

Since naïve Bayes is a generative model while other approaches such as logistic regression (LR) are discriminative, we felt that naïve Bayes was a stronger model to use for a small data set which requires extending beyond the corpus that was originally used for training. This is only true if the assumption of independence holds, which is the case with our data. In addition, naïve Bayes performs well in the presence of categorical input variables, which is also the case in this study. Lastly, TextBlob is well documented and therefore is easy to integrate into our existing algorithm [TextBlob: STP. TextBlob. Internet. URL: https://textblob.readthedocs.io/en/dev/index.html [accessed 2021-01-31] 29].

To determine the accuracy of our newly developed SA algorithm for our purposes, we utilized a pre-existing and de-identified data set of interviews with health professionals about their implicit biases. Ethics approval was not required for secondary analysis of de-identified data. We conducted SA on the transcribed interviews to score their underlying sentiment. We then compared the machine score with a manual human-scored sentiment categorization which had been completed prior to the algorithm execution. This comparison allowed us to determine the accuracy of the algorithm within the context of health professions education and practice. We calculated the accuracy of our algorithm as the proportion of interviews for which the machine score matched the manually scored value.
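The accuracy calculation described above amounts to the share of items on which the machine and human labels agree. A minimal sketch follows; the function name and the five example labels are hypothetical and are not study data.

```python
def accuracy(machine_labels, human_labels):
    """Fraction of items where the machine label equals the human label."""
    if len(machine_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("at least one labelled item is required")
    matches = sum(m == h for m, h in zip(machine_labels, human_labels))
    return matches / len(human_labels)

# Hypothetical labels for five interviews (not study data).
human = ["positive", "negative", "neutral", "positive", "negative"]
machine = ["positive", "negative", "positive", "positive", "negative"]
score = accuracy(machine, human)   # 4 of 5 labels agree
```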

Step 2: Application of SA to Twitter Archive

We collected an archive of publicly available tweets, including metadata such as display name, username, and user biography through the Twitter Application Programming Interface (API). These “tweets” were stored if they included specific hashtags, which are commonly used to discuss bias-related topics. The hashtags included were “#AllLivesMatter/#ALM,” “#BlackLivesMatter/#BLM,” “#HeForShe,” “#ImplicitBias,” “#RepresentationMatters,” and “#UnconsciousBias.”

Our archive was then categorized into two databases, “health professionals” and “general population.” We distinguished between each group by searching for specific markers in the display name, username, or biography that were manually checked to ensure all individuals included in the data set would fit the classification of health professionals. The individuals whose “tweets” belonged to the general population had no additional criteria to be met other than using the hashtag.
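The filtering and grouping steps can be sketched as follows. The hashtag set is taken from the list above; the profile markers in `MARKERS` are hypothetical, since the study's actual marker list is not given, and any automated match would still need the manual check described above.

```python
import re

# Hashtags named in the text (lowercased for matching).
HASHTAGS = {"#alllivesmatter", "#alm", "#blacklivesmatter", "#blm", "#heforshe",
            "#implicitbias", "#representationmatters", "#unconsciousbias"}

# Hypothetical profile markers; the study's actual marker list is not given.
MARKERS = re.compile(r"\b(md|rn|nurse|physician|doctor|surgeon)\b", re.IGNORECASE)

def has_tracked_hashtag(tweet_text):
    """True if the tweet contains at least one of the tracked hashtags."""
    tags = {tag.lower() for tag in re.findall(r"#\w+", tweet_text)}
    return bool(tags & HASHTAGS)

def assign_group(display_name, username, biography):
    """Assign a candidate group from markers in the user's profile fields."""
    profile = " ".join([display_name, username, biography])
    if MARKERS.search(profile):
        return "health professionals"   # candidate only; verify manually
    return "general population"
```

A keyword match of this kind produces false positives (for example, a fan account that mentions a fictional doctor), which is one reason a manual check of each candidate is needed.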

The data collection process was initiated with the first official data pull on 12 January 2020 and continued for approximately three months, concluding on 29 March 2020, when the database was sufficient to analyze. The final archive contained 555 “tweets” from health professionals and 6680 tweets from the general population.

To compare sentiment scores between health professionals and the general population, the total sums in each of the three categories, “positive,” “negative,” and “neutral” were calculated for each of the two databases, “health professionals” and “general population.” The purpose of the general population proportions was to serve as an expected value and to identify if health care professionals vary from this standard. This then allowed us to perform a chi-square goodness of fit test. We selected the chi-square goodness of fit test after methodological consultation with local experts in epidemiology and biostatistics. In general, a chi-square allows researchers to draw inferences and test for relationships between categorical variables. The goodness of fit test is useful to evaluate whether a full population is represented through the sample data. As our research sample sought to compare the sentiment between health professional discourse and the general population, we felt the goodness of fit test would be appropriate.

We noted that the volume of data collected differed vastly between the general population and health professionals. We therefore chose to use proportions, as raw quantities may have been misleading. As there were fewer health professional tweets, we scaled the general population proportions down to the size of the health professional group to obtain comparable numbers for our statistical analysis. For example, on a given data pull, if there were 150/500 negative tweets from the general population versus 12/20 negative tweets from health professionals, the comparison of raw quantities would have skewed analysis and interpretation. Therefore, the observed values comprised the counts of each category in the health professional data set. The expected values were the proportion of each category in the general population data set scaled to the sum of the health professional data set. The standard significance value of .05 was maintained. Considering there were three categories, two degrees of freedom were present, and we concluded that a χ² value of 9.21 was required for the deviation from the general population to be deemed statistically significant. (For two degrees of freedom, 9.21 is the critical value at the .01 level; the critical value at .05 is 5.99, so this threshold is the more conservative of the two.)
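The scaling of expected values and the goodness of fit statistic can be sketched as follows. The function names are hypothetical; the observed and expected counts are those reported in Table 2, and the 150/500 example is the one given in the text.

```python
def expected_counts(reference_counts, sample_total):
    """Scale reference-group proportions to the size of the sample group."""
    reference_total = sum(reference_counts)
    return [count / reference_total * sample_total for count in reference_counts]

def chi_square_statistic(observed, expected):
    """Pearson goodness-of-fit statistic: sum((O - E)^2 / E)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Worked example from the text: 150 of 500 general population tweets are negative,
# so 30% of a 20-tweet health professional pull would be expected to be negative.
expected_negative = expected_counts([150, 350], 20)[0]

# Observed and expected counts as reported in Table 2 (positive, negative, neutral).
observed = [275, 42, 238]
expected = [211.18, 70.70, 273.13]
statistic = chi_square_statistic(observed, expected)
```

With the rounded expected values from Table 2, the statistic agrees with the reported 35.455 to within rounding.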

Programming Specifications

Our SA algorithm was written in Python 3.0. This was an object-oriented program that used a class method to handle Twitter API credentials, authorize access to the database, and apply the NLP library as the “tweets” were retrieved. A class method is a method that is bound to the class itself rather than to an object of the class. In programming, a class is a template (descriptor) for certain objects rather than the objects themselves. Our algorithm was developed into a Python script for each hashtag, and a bash script file was then written to make it easier to run the data collection. A bash file is a text file that contains a series of commands. In this study, the bash file contained the commands to run the Python scripts that collected data and populated the database. Overall, we used the same algorithm for both components of this study, accuracy testing and Twitter analysis. However, there were slight modifications, such as removing authentication from the local accuracy testing script as the data were retrieved locally.
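As an illustration of the class method pattern, the sketch below uses a class method as an alternative constructor that reads credentials before any object exists. The class name, the environment variable names, and the `query` helper are hypothetical; the study's actual code is not shown in this paper, and no Twitter API call is made here.

```python
import os

class TwitterCollector:
    """Holds API credentials and the hashtag that one collection script tracks."""

    def __init__(self, api_key, api_secret, hashtag):
        self.api_key = api_key
        self.api_secret = api_secret
        self.hashtag = hashtag

    @classmethod
    def from_environment(cls, hashtag, environ=os.environ):
        """Build a collector from environment variables (hypothetical names)."""
        try:
            return cls(environ["TWITTER_API_KEY"], environ["TWITTER_API_SECRET"], hashtag)
        except KeyError as missing:
            raise RuntimeError(f"missing credential: {missing}") from None

    def query(self):
        """Search string for this collector's hashtag."""
        return f"#{self.hashtag.lstrip('#')}"
```

Because `from_environment` receives the class (`cls`) rather than an instance, one script per hashtag could call it with a different hashtag while sharing the same credential handling.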

To test the accuracy of the NLP library, we used the archived interviews with 53 health professionals, including registered nurses and medical doctors. When we tested the original algorithm, our tool accurately identified more than the required number of underlying sentiments to be deemed valid.

With 44 out of the 53 interviews (83%) correctly assessed on sentiment, the accuracy calculation (correctly scored interviews divided by total interviews) returned 0.83, which was higher than the required threshold of 0.75. We concluded that the TextBlob library was highly accurate but still subject to minor deviance. Nonetheless, it can be utilized with high confidence when applied to a topic such as health care. Table 1 provides a breakdown of the scores.

When applying the algorithm to the tweets gathered, there was a noticeable difference in sentiment between health care professionals and the general population. Specifically, there was a smaller proportion of neutral tweets from health care professionals’ professional accounts on social media. This difference was statistically significant.

Using the chi-square goodness of fit test, we obtained a χ² value of 35.455 (χ² [2, n=555]=35.455; P<.001). As this value is higher than the 9.21 required for significance, the results can be deemed statistically significant. Thus, health care professionals attach more sentiment to their posts on Twitter than seen in the general population. Table 2 provides a more detailed breakdown of the scores and comparison. Table 3 and Table 4 provide an illustration of the sentiment scores. Table 3 shows that the sentiment of health care professionals was more positive, less neutral, and less negative than expected. Table 4 shows the variance in sentiment between tweets by health care professionals and tweets by the general population with the same specified hashtags. Table 4 suggests that tweets by health professionals were more positive, less negative, and approximately the same level of neutrality when compared to the general population of tweets. Table 5 provides a breakdown of sample tweets.
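For two degrees of freedom, the chi-square distribution has a closed-form upper tail, P(X > x) = exp(−x/2), so the P value and the critical values can be checked without statistical tables. The sketch below relies on that identity, which holds only for two degrees of freedom; the function names are hypothetical.

```python
import math

def p_value_df2(statistic):
    """Upper-tail probability of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-statistic / 2)

def critical_value_df2(alpha):
    """Chi-square value whose upper-tail probability equals alpha (2 df only)."""
    return -2 * math.log(alpha)

p = p_value_df2(35.455)
```

Under this identity, the reported statistic of 35.455 gives a P value well below .001, and the critical values are approximately 5.99 at the .05 level and 9.21 at the .01 level.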

Table 1. Accuracy of the sentiment algorithm (TextBlob) on interviews with health care professionals regarding implicit bias in medicine.

Group	Total interviews	Correctly scored	Accuracy
Pediatric physician	11	10	0.90
Pediatric nurse	10	8	0.80
Psychiatric nurse	11	10	0.90
Psychiatric resident	10	7	0.70
Psychiatric physician	11	9	0.82
Total	53	44	0.83

Table 2. The sentiment scores of the tweets by health professionals and their associated chi-square values with intermediate calculations.

Sentiment Score	Observed	Expected	Difference	Difference Sq.	Difference Sq./Exp Fr.^a
Positive	275.00	211.18	63.82	4073.3773	19.2889
Negative	42.00	70.70	–28.70	823.5057	11.6484
Neutral	238.00	273.13	–35.13	1233.8518	4.5175
Total					35.455^b

^aExpected frequency.

^bChi-square statistic value which is used to determine statistical significance.

Table 3. The observed and expected values for sentiment score of the tweets by health professionals.

	Observed	Expected
Positive	275	211.18
Negative	42	70.7
Neutral	238	273.13

Table 4. The variance in sentiment (percentage of tweets) between health care professionals and the general population with respect to the same hashtags.

	Health Professionals	General Population
Positive	45.66	38.22
Negative	5.2	12.74
Neutral	49.14	49.04

Table 5. Sample tweets from health care professionals and the general population with equity-related hashtags.

	Positive	Neutral	Negative
Health professionals	It’s #WomensHistoryMonth and #AMWA will be spotlighting incredible #womeninmedicine all month-long!	Please join us at the diversity in medicine conference #docswithdisabilities	Maybe if I work hard enough and almost die of COIVD my patients will start calling me “doctor” instead of “mademoiselle” #genderbias #womeninmedicine
General population	Moving from a safe place to a brave place to address issues of #implicitbias #CCM49 Great session thank you!	Check out this in-depth podcast…on educating scientific communities #implicitbias	Gender bias is not a good look #genderbias #checkyourgenderbias #unconsciousbias

Principal Findings

The finding that health professionals are more likely to show and convey emotions regarding equity-related issues on social media has implications for teaching and learning about sensitive topics related to equity and bias for health professionals. Such emotions are likely to influence learning processes and therefore must be considered in the design, delivery, and evaluation of equity and bias-related education.

Emotions and Identity in Health Professions Education

Our aims through this research were to gain further insight into how emotions influence equity and bias-related education through SA. By leveraging advances in ML technology, NLP, and SA, we developed, tested, and applied a novel SA algorithm to social media discourse. Our findings suggest that health professionals are more likely to convey emotions on social media about equity-related topics than the general public. Although previous research has found evidence that there are defensive reactions to discussions about bias among both health professionals and the general public [Howell J, Redford L, Pogge G, Ratliff K. Defensive responding to IAT feedback. Social Cognition. 2017. URL: https://doi.org/10.1521/soco.2017.35.5.520 [accessed 2021-01-31] 30-Vitriol J, Moskowitz G. Reducing defensive responding to implicit bias feedback: On the role of perceived moral threat and efficacy to change. Journal of Experimental Social Psychology. URL: https://doi.org/10.1016/j.jesp.2021.104165 [accessed 2021-10-31] 32], our SA findings suggest that health professionals may be uniquely susceptible to defensiveness and counter-react through positive emotion as a response.

This finding aligns with previous research on defense mechanisms used to grapple with the reality of an individual’s role in perpetuating prejudice or discrimination [Juby H. Racial ambivalence, racial identity and defense mechanisms in white counselor trainees. Columbia University 2005:E [FREE Full text]33]. Our study suggests evidence of reaction formation as a defense among learners. Reaction formation occurs when an individual forms an attitude that is the opposite of their threatening or unacceptable actual thoughts [Baumeister R, Dale K, Sommer K. Freudian Defense Mechanisms and Empirical Findings in Modern Social Psychology: Reaction Formation, Projection, Displacement, Undoing, Isolation, Sublimation, and Denial. Journal of Personality 2002 Jan 04;66(6):1081-1124. [CrossRef]34]. By conveying a higher degree of positive sentiment, health professionals may be attempting to project that they are more neutral or objective when, in reality, they demonstrate the same degree of bias as the general population [FitzGerald C, Hurst S. Implicit bias in healthcare professionals: a systematic review. BMC Med Ethics 2017 Mar 01;18(1):19 [FREE Full text] [CrossRef] [Medline]35].

We also found that variance in sentiment between health professionals and the public suggests that not only do health professionals convey more emotion, but they also demonstrate greater sentiment variance related to positive emotion compared to the general public, who convey greater variance related to negative emotion. Greater positive sentiment among health professionals suggests that health professionals are utilizing Twitter differently than the general public. Therefore, our findings suggest caution for health professions educators who attempt to challenge normative thinking of health professionals as neutral or objective. Skilled facilitators may be necessary to mediate and regulate emotions among both teachers and learners when such challenges arise [Gonzalez CM, Lypson ML, Sukhera J. Twelve tips for teaching implicit bias recognition and management. Med Teach 2021 Dec;43(12):1368-1373. [CrossRef] [Medline]36].

Emotions and Social Media

Social media discourse provides an opportunity to explore how individuals react to social issues and world events. Tweets provide a source of data that can be automatically classified according to sentiment to provide insights into the emotional nature of certain topics. Although SA has been previously used for digital marketing or opinion mining, its use in health professions education research has to date been quite limited.

Global events and social movements related to equity and bias, such as #BlackLivesMatter and #JusticeforGeorgeFloyd, underscore the importance of social media discourse as it relates to teaching and learning about bias. Such reactions among both health professionals and the general public during unexpected events can provide evidence for collective sense-making [Gilles I, Bangerter A, Clémence A, Green EGT, Krings F, Mouton A, et al. Collective symbolic coping with disease threat and othering: a case study of avian influenza. Br J Soc Psychol 2013 Mar;52(1):83-102. [CrossRef] [Medline]37], social sharing of emotions [Bee C, Neubaum D. The role of cognitive appraisal and emotions of family members in the family business system. Journal of Family Business Strategy. 2014. URL: https://doi.org/10.1016/j.jfbs.2013.12.001 [accessed 2021-01-31] 38], and individual strategies of approach/avoidance [Sandover S, Jonas-Dwyer D, Marr T. Graduate entry and undergraduate medical students' study approaches, stress levels and ways of coping: a five year longitudinal study. BMC Med Educ 2015 Jan 24;15:5 [FREE Full text] [CrossRef] [Medline]39]. Emotions also mediate how contact between and among different social groups can effectively address prejudice [Seger CR, Banerji I, Park SH, Smith ER, Mackie DM. Specific emotions as mediators of the effect of intergroup contact on prejudice: findings across multiple participant and target groups. Cogn Emot 2017 Aug;31(5):923-936. [CrossRef] [Medline]40]. As we set out to explore in our study, SA may be an effective tool to analyze such discourse.

Before SA can be effectively applied, however, its limitations require that its accuracy and utility be established in a health professions education context. Our study provides an example of a SA algorithm that was tested for accuracy before being applied. This algorithm can be used in future research to analyze sentiment associated with social media discourse and may also have future applications to other types of archives such as electronic health records.

Sentiment Analysis in Health Professions Education

Advances in NLP applied to textual data for educational purposes are developing at a rapid pace. SA has demonstrated potential in evaluating instruction, designing policy, enhancing learning systems, and educational research [41]. For example, SA has been used to analyze students’ feedback to improve teaching [42-44] and track students’ emotions across longitudinal learning activities through learning diaries [44]. However, there is a paucity of research into how SA can be applied specifically in a health professions education context.

Our study provides an example and template for future researchers to develop and utilize SA for a variety of purposes. We also hope that our work can provide insights into the emotionally charged nature of teaching and learning about bias and inform future work to develop, implement, and evaluate antibias and antiracism curricula for health professions learners.

Key Implications and Future Directions

For health professions educators to effectively consider emotions in the design, delivery, and evaluation of equity or bias-related curricula, educators should anticipate defensive reactions when emotions are provoked and ensure skilled facilitation for sensitive or emotionally charged discussions. Our finding regarding the unique nature of social media discourse among health professionals and the public also suggests that health advocacy curricula must be augmented with information on digital aspects of advocacy. In addition, existing teaching and learning on digital professionalism may benefit from information regarding sentimentality and how digital aspects of communications differ from traditional media.

Limitations

A key limitation of a SA approach is that SA focuses on categorical aspects of sentiment value such as positive, negative, or neutral. This limits our ability to understand nuanced emotional states that reflect an individual’s experience. Past research on how individuals cope with potentially threatening feedback related to their biases highlights that ambivalence may form an important component of how they can respond to identity threats and move forward towards change [45]. Additional research is therefore needed, particularly into how situations are perceived and into the individual and social resources that individuals draw on to cope with emotions that may interfere with learning.
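To illustrate this limitation, the following minimal sketch shows how a lexicon-based scorer can collapse an ambivalent statement into the same category as an emotionally flat one. It is not the algorithm used in this study: the word list, the weights, the threshold, and the function names (`polarity`, `categorize`) are all hypothetical and chosen only for illustration.

```python
# Hypothetical mini-lexicon (assumption for illustration only);
# NOT the lexicon or algorithm used in the study.
LEXICON = {
    "grateful": 1.0,
    "hopeful": 0.8,
    "proud": 0.6,
    "uncomfortable": -0.6,
    "ashamed": -0.8,
    "angry": -1.0,
}


def polarity(text):
    """Average the lexicon scores of the known words in the text."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def categorize(score, threshold=0.1):
    """Map a polarity score onto three categories (threshold is an assumption)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


ambivalent = "I felt ashamed but also hopeful after the feedback."
flat = "The session took place on Tuesday."

# Both statements receive the same label, although only one is emotionally flat.
print(categorize(polarity(ambivalent)))
print(categorize(polarity(flat)))
```

In this toy example, the opposing scores for "ashamed" and "hopeful" cancel out, so the ambivalent statement is labeled the same way as a statement with no emotional content. A categorical output alone cannot distinguish the two.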

Another important limitation of using NLP is that the classification model must be trained. Training requires intensive learning and manual categorization, and even the most accurate models are still improving. However, the most efficient models have not been trained to categorize health care–specific data. Although our algorithm achieved a high accuracy rate (0.83), it is not all-encompassing and remains open to errors. Nonetheless, accuracy is likely to continue to improve, and in turn, these models will become more relevant. It will be important to ensure that health care data are used in these training processes.
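As a rough illustration of what an accuracy rate means in this setting, the sketch below computes accuracy as the proportion of items for which an algorithm's label agrees with a human rater's label. The labels are made-up toy data, not data from this study, and the function name `accuracy` is our own; the study's exact scoring procedure may differ.

```python
def accuracy(predicted, human):
    """Proportion of items where the predicted label matches the human label."""
    if len(predicted) != len(human):
        raise ValueError("label lists must have the same length")
    if not human:
        raise ValueError("label lists must not be empty")
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(human)


# Toy labels for illustration only (assumption; not study data).
human_labels = ["negative", "neutral", "positive", "negative", "neutral"]
model_labels = ["negative", "neutral", "positive", "neutral", "neutral"]

# 4 of the 5 labels agree in this toy example.
print(accuracy(model_labels, human_labels))
```

A single accuracy figure does not show which categories are misclassified, so errors concentrated in one sentiment category can remain hidden behind a high overall rate.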

Our study was conducted in 2019, when BERT (Bidirectional Encoder Representations from Transformers) models were less commonly used. Although such models allow for better sentence processing, leveraging the architecture for the Corpus of Linguistic Acceptability, they require an extremely large corpus of testing data, which would not necessarily align with our criteria and could potentially limit accuracy.

Further, some bias is unavoidable within any NLP algorithm: as a human-designed approach, it may be subject to the biases present in the training data set. This is an area of future work that should be considered as SA in health education evolves with larger data sets.

Lastly, we recognize that our SA was not developed using data from the general public; however, we believed that it was reasonable to apply it to general public tweets because previous research on defensive reactions to bias-related feedback in the general public aligns with our previous studies and other research.

Conclusions

To explore the role of emotions in teaching and learning about bias and equity for health professionals, we developed and tested a SA algorithm of bias-related discourse. The algorithm was highly accurate, and its application suggested that health professionals use a higher degree of emotion when communicating about bias on social media compared to the general population. Our findings support the view that emotions must be considered in the design, delivery, and evaluation of equity and bias-related education.

Acknowledgments

This work was supported by a grant from the Academic Medical Organization of Southwestern Ontario.

Conflicts of Interest

None declared.

References

1. Sukhera J, Milne A, Teunissen PW, Lingard L, Watling C. The Actual Versus Idealized Self: Exploring Responses to Feedback About Implicit Bias in Health Professionals. Acad Med 2018 Apr;93(4):623-629.
2. Sukhera J, Wodzinski M, Teunissen PW, Lingard L, Watling C. Striving While Accepting: Exploring the Relationship Between Identity and Implicit Bias Recognition and Management. Acad Med 2018 Nov;93(11S Association of American Medical Colleges Learn Serve Lead: Proceedings of the 57th Annual Research in Medical Education Sessions):S82-S88.
3. Sukhera J, Wodzinski M, Milne A, Teunissen PW, Lingard L, Watling C. Implicit Bias and the Feedback Paradox: Exploring How Health Professionals Engage With Feedback While Questioning Its Credibility. Acad Med 2019 Aug;94(8):1204-1210.
4. Kluger A, DeNisi A. The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological bulletin. 1996. URL: https://doi.org/10.1037/0033-2909.119.2.254 [accessed 2021-01-31]
5. Trevors G, Muis K, Pekrun R, Sinatra G, Winne P. Identity and epistemic emotions during knowledge revision: A potential account for the backfire effect. Discourse Processes. 2016. URL: https://doi.org/10.1080/0163853X.2015.1136507 [accessed 2021-01-31]
6. Sukhera J, Watling CJ, Gonzalez CM. Implicit Bias in Health Professions: From Recognition to Transformation. Acad Med 2020 May;95(5):717-723.
7. Lewandowsky S, Stritzke WGK, Oberauer K, Morales M. Memory for fact, fiction, and misinformation: the Iraq War 2003. Psychol Sci 2005 Mar;16(3):190-195.
8. Lewandowsky S, Stritzke WGK, Freund AM, Oberauer K, Krueger JI. Misinformation, disinformation, and violent conflict: from Iraq and the "War on Terror" to future threats to peace. Am Psychol 2013 Oct;68(7):487-501.
9. Nauroth P, Gollwitzer M. Gamers against science: empirical research on violent video games as social identity threat. In: Vortrag auf dem EASP Medium Size Meeting "Intergroup conflict: The cognitive, emotional, and behavioral consequences of communication", Soesterberg, Niederlande. 2013. URL: https://doi.org/10.1002/ejsp.1998 [accessed 2022-02-18]
10. Nyhan B, Reifler J. When corrections fail: The persistence of political misperceptions. Political Behavior. 2010. URL: https://doi.org/10.1007/s11109-010-9112-2 [accessed 2021-01-31]
11. Sukhera J. Bias in the Mirror: exploring implicit bias in health professions education. Datawyze Maastricht 2018 Nov 29:E-170.
12. Kahan DM, Peters E, Dawson EC, Slovic P. Motivated numeracy and enlightened self-government. Behav. Public Policy 2017 May 31;1(1):54-86.
13. Eccles J. Who am I and what am I going to do with my life? Personal and collective identities as motivators of action. Educational psychologist. 2009. URL: https://doi.org/10.1080/00461520902832368 [accessed 2021-01-31]
14. Teal CR, Shada RE, Gill AC, Thompson BM, Frugé E, Villarreal GB, et al. When best intentions aren't enough: helping medical students develop strategies for managing bias about patients. J Gen Intern Med 2010 May;25 Suppl 2:S115-S118.
15. Nyhan B, Reifler J. Does correcting myths about the flu vaccine work? An experimental evaluation of the effects of corrective information. Vaccine 2015 Jan 09;33(3):459-464.
16. McNaughton N. Discourse(s) of emotion within medical education: the ever-present absence. Med Educ 2013 Jan;47(1):71-79.
17. Cloke P, Cooke P, Cursons J, Milbourne P, Widdowfield R. Ethics, Reflexivity and Research: Encounters with Homeless People. Ethics, Place and Environment 2010 Jul;3(2):133-154.
18. James CA, Wheelock KM, Woolliscroft JO. Machine Learning: The Next Paradigm Shift in Medical Education. Acad Med 2021 Jul 01;96(7):954-957.
19. Dias RD, Gupta A, Yule SJ. Using Machine Learning to Assess Physician Competence: A Systematic Review. Acad Med 2019 Mar;94(3):427-439.
20. Friedman C, Hripcsak G. Natural language processing and its future in medicine. Acad Med 1999 Aug;74(8):890-895.
21. Hao T, Huang Z, Liang L, Weng H, Tang B. Health Natural Language Processing: Methodology Development and Applications. JMIR Med Inform 2021 Oct 21;9(10):e23898.
22. Elbattah M, Arnaud, Gignon M, Dequen G. The Role of Text Analytics in Healthcare: A Review of Recent Developments and Applications. In: HEALTHINF 2021; 2021:825-832.
23. Prabowo R, Thelwall M. Sentiment analysis: A combined approach. Journal of Informetrics. 2009. URL: https://doi.org/10.1016/j.joi.2009.01.003 [accessed 2021-01-31]
24. Mäntylä M, Graziotin D, Kuutila M. The evolution of sentiment analysis - A review of research topics, venues, and top cited papers. Computer Science Review. 2018. URL: https://doi.org/10.1016/j.cosrev.2017.10.002 [accessed 2021-01-31]
25. Greaves F, Ramirez-Cano D, Millett C, Darzi A, Donaldson L. Use of sentiment analysis for capturing patient experience from free-text comments posted online. J Med Internet Res 2013;15(11):e239.
26. Gohil S, Vuik S, Darzi A. Sentiment Analysis of Health Care Tweets: Review of the Methods Used. JMIR Public Health Surveill 2018 Apr 23;4(2):e43.
27. Kolajo T, Kolajo JO. Sentiment analysis on twitter health news. Fudma Journal of Sciences 2018 Jul 17;2(2):e4.
28. Kundi F, Khan A, Ahmad S, Asghar M. Lexicon-based sentiment analysis in the social web. Journal of Basic and Applied Scientific Research (ISSN 2090-4304) 2014 Jul 13;4(6):238-248.
29. TextBlob: Simplified Text Processing. TextBlob. URL: https://textblob.readthedocs.io/en/dev/index.html [accessed 2021-01-31]
30. Howell J, Redford L, Pogge G, Ratliff K. Defensive responding to IAT feedback. Social Cognition. 2017. URL: https://doi.org/10.1521/soco.2017.35.5.520 [accessed 2021-01-31]
31. Howell JL, Ratliff KA. Not your average bigot: The better-than-average effect and defensive responding to Implicit Association Test feedback. Br J Soc Psychol 2017 Mar;56(1):125-145.
32. Vitriol J, Moskowitz G. Reducing defensive responding to implicit bias feedback: On the role of perceived moral threat and efficacy to change. Journal of Experimental Social Psychology. URL: https://doi.org/10.1016/j.jesp.2021.104165 [accessed 2021-10-31]
33. Juby H. Racial ambivalence, racial identity and defense mechanisms in white counselor trainees. Columbia University 2005:E.
34. Baumeister R, Dale K, Sommer K. Freudian Defense Mechanisms and Empirical Findings in Modern Social Psychology: Reaction Formation, Projection, Displacement, Undoing, Isolation, Sublimation, and Denial. Journal of Personality 2002 Jan 04;66(6):1081-1124.
35. FitzGerald C, Hurst S. Implicit bias in healthcare professionals: a systematic review. BMC Med Ethics 2017 Mar 01;18(1):19.
36. Gonzalez CM, Lypson ML, Sukhera J. Twelve tips for teaching implicit bias recognition and management. Med Teach 2021 Dec;43(12):1368-1373.
37. Gilles I, Bangerter A, Clémence A, Green EGT, Krings F, Mouton A, et al. Collective symbolic coping with disease threat and othering: a case study of avian influenza. Br J Soc Psychol 2013 Mar;52(1):83-102.
38. Bee C, Neubaum D. The role of cognitive appraisal and emotions of family members in the family business system. Journal of Family Business Strategy. 2014. URL: https://doi.org/10.1016/j.jfbs.2013.12.001 [accessed 2021-01-31]
39. Sandover S, Jonas-Dwyer D, Marr T. Graduate entry and undergraduate medical students' study approaches, stress levels and ways of coping: a five year longitudinal study. BMC Med Educ 2015 Jan 24;15:5.
40. Seger CR, Banerji I, Park SH, Smith ER, Mackie DM. Specific emotions as mediators of the effect of intergroup contact on prejudice: findings across multiple participant and target groups. Cogn Emot 2017 Aug;31(5):923-936.
41. Dolianiti F, Iakovakis D, Dias S, Hadjileontiadou S, Diniz J, Hadjileontiadis L. Sentiment analysis techniques and applications in education: A survey. 2019 May 29 Presented at: International Conference on Technology and Innovation in Learning, Teaching and Education. Springer, Cham; 2018 Jun 20; Thessaloniki, Greece p. 412-427 URL: https://doi.org/10.1007/978-3-030-20954-4_31
42. Altrabsheh N, Gaber M, Cocea M. SA-E: Sentiment analysis for education. Frontiers in Artificial Intelligence and Applications 2013 Jun 13:255-262.
43. Rani S, Kumar P. A Sentiment Analysis System to Improve Teaching and Learning. Computer 2017 May;50(5):36-43.
44. Munezero M, Montero C, Mozgovoy M, Sutinen E. Exploiting sentiment analysis to track emotions in students' learning diaries. 2013 Nov 14 Presented at: the 13th Koli Calling International Conference on Computing Education Research; 2013 Nov 14; Koli Finland p. 145-152.
45. Rothman NB, Vitriol J. Conflicted but Aware: Emotional Ambivalence Buffers Defensive Responding to Implicit Bias Feedback. In: Proceedings. 2018 Aug Presented at: Academy of Management Annual Conference; August 11, 2018; Sedona Arizona p. 16762.

Abbreviations

API: application programming interface

ML: machine learning

NLP: natural language processing

SA: sentiment analysis

Edited by T Leung; submitted 29.09.21; peer-reviewed by P Dattathreya, M Elbattah; comments to author 23.11.21; revised version received 24.01.22; accepted 15.02.22; published 30.03.22

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Education, is properly cited. The complete bibliographic information, a link to the original publication on https://mededu.jmir.org/, as well as this copyright and license information must be included.
