Mixed Impressions
Explicitly unbiased large language models still form biased associations
Xuechunzi Bai et al.
Proceedings of the National Academy of Sciences, 25 February 2025
Abstract:
Large language models (LLMs) can pass explicit social bias tests but still harbor implicit biases, similar to humans who endorse egalitarian beliefs yet exhibit subtle biases. Measuring such implicit biases can be a challenge: As LLMs become increasingly proprietary, it may not be possible to access their embeddings and apply existing bias measures; furthermore, implicit biases are primarily a concern if they affect the actual decisions that these systems make. We address both challenges by introducing two measures: LLM Word Association Test, a prompt-based method for revealing implicit bias; and LLM Relative Decision Test, a strategy to detect subtle discrimination in contextual decisions. Both measures are based on psychological research: LLM Word Association Test adapts the Implicit Association Test, widely used to study the automatic associations between concepts held in human minds; and LLM Relative Decision Test operationalizes psychological results indicating that relative evaluations between two candidates, not absolute evaluations assessing each independently, are more diagnostic of implicit biases. Using these measures, we found pervasive stereotype biases mirroring those in society in 8 value-aligned models across 4 social categories (race, gender, religion, health) in 21 stereotypes (such as race and criminality, race and weapons, gender and science, age and negativity). These prompt-based measures draw from psychology's long history of research into measuring stereotypes based on purely observable behavior; they expose nuanced biases in proprietary value-aligned LLMs that appear unbiased according to standard benchmarks.
The Manifestation of the Karen Trope in the Workplace: A Reconsideration of Stereotypes of White Women at Work
Kristina Tirol-Carmody et al.
Personnel Psychology, forthcoming
Abstract:
As research largely considers White women a race-neutral gender group associated with generic feminine attributes (e.g., communality), there is a limited theoretical understanding of how their intersectional racial and gender identities might combine to shape their work experiences. Yet the recently emerged Karen trope -- popularized on social media as White women who complain incessantly -- deviates from generic feminine attributes and may have seeped into the workplace to potentially create unique work experiences for White women based on their race and gender. Using a mixed-method design across three studies, we explore the implications of how the Karen trope has manifested within organizations. Our qualitative study revealed that employees have adopted this trope to label White female colleagues who engage in prohibitive voice as workplace Karens, leading to various social penalties for these individuals. Integrating these preliminary findings with a stereotype activation and application lens, we develop a conceptual model in which, when White women (vs. men and non-White women) engage in prohibitive voice, they activate a broader stereotype that White women are workplace Karens, causing observers to perceive them as having less organizational concern, which leads to lower promotability evaluations and decreased intent to rely on their voice. Using a series of experimental vignettes and a field study combining a critical incident technique with random assignment to experimental conditions, we find empirical support for our model. Our results enhance the theoretical and practical understanding of a rare context wherein Whiteness and gender interact to create negative work experiences for an otherwise advantaged social group.
Do female experts face an authority gap? Evidence from economics
Hans Sievertsen & Sarah Smith
Journal of Economic Behavior & Organization, March 2025
Abstract:
This paper reports results from a survey experiment comparing the effect of (the same) opinions expressed by visibly senior, female versus male experts. Members of the public were asked for their opinion on topical issues and shown the opinion of either a named male or a named female economist, all professors at leading US universities. There are three findings. First, experts can persuade members of the public -- the opinions of individual expert economists affect the opinions expressed by the public. Second, the opinions expressed by visibly senior female economists are more persuasive than the same opinions expressed by male economists. Third, removing credentials (university and professor title) eliminates the gender difference in persuasiveness, suggesting that credentials act as a differential information signal about the credibility of female experts.
Cultural Capital Signaling and Class-Related Selection Biases in Employment and Education
Laura Guzman et al.
Basic and Applied Social Psychology, forthcoming
Abstract:
Class-related norms, tastes, and social expectations -- termed cultural capital -- signal class identity. We provide evidence that cultural capital signals of a higher class help to produce socioeconomic hierarchies. Study 1 demonstrates that college students perceive target individuals signaling highbrow cultural capital as wealthier, more competent, and more deserving of a prestigious occupational role compared to targets signaling popular cultural capital. In Study 2, we reveal that while only college admissions counselors at more expensive selective institutions are more likely to respond to student applicants signaling highbrow (vs. popular) extracurricular activities, all admissions counselors expend greater effort in responding to such students. These findings reveal that signals of cultural capital can be potent sources of inequality maintenance, legitimization, and expansion.
Learning too much from too little: False face stereotypes emerge from a few exemplars and persist via insufficient sampling
Xuechunzi Bai et al.
Journal of Personality and Social Psychology, January 2025, Pages 61-81
Abstract:
Face stereotypes are prevalent, consequential, yet oftentimes inaccurate. How do false first impressions arise and persist despite counter-evidence? Building on the overgeneralization hypothesis, we propose a domain-general cognitive mechanism: insufficient statistical learning, or Insta-learn. This mechanism posits that humans are quick statistical learners but insufficient samplers. Humans extract statistical regularities from very few exemplars in their immediate context and prematurely decide to stop sampling, creating and perpetuating locally accurate -- but globally inaccurate -- impressions. Six experiments (N = 1,565) tested this hypothesis using novel pairs of computer-generated faces and social behaviors by fixing the population-level statistics of face-behavior associations to zero (i.e., no relationship). The initial sample contained either 11, five, or three examples with either a positive, zero, or negative linear relationship between facial features and social behaviors. The sampling procedure contained a free-sampling condition in which participants were free to decide when to stop viewing more examples and a fixed-sampling condition in which participants were forced to view all stimuli before making decisions. Consistent with the Insta-learn mechanism, participants learned novel face stereotypes quickly, with as few as three examples, and did not sample enough when they were given the freedom to do so. This domain-general cognitive mechanism provides one plausible origin of false face stereotypes, demonstrating negative consequences when people learn too much from too little.
Social Category Modulation of the Happy Face Advantage
Douglas Martin et al.
Personality and Social Psychology Bulletin, forthcoming
Abstract:
The size of the happy face advantage -- faster categorization of happy faces -- is modulated by interactions between perceiver and target social categories, with reliable happy face advantages for ingroups but not necessarily outgroups. The current understanding of this phenomenon is constrained by the limited social categories typically used in experiments. To better understand the mechanism(s) underpinning social category modulation of the happy face advantage, we used racially more diverse samples of perceivers and target faces and manipulated the intergroup context in which they appeared. We found evidence of ingroup bias, with perceivers often showing a larger happy face advantage for ingroups than outgroups (Experiments 1-2). We also found evidence of majority/minority group bias, with perceivers showing a larger happy face advantage for majority outgroups than minority outgroups (Experiments 2-3c). These findings suggest social category modulation of the happy face advantage is a dynamic context-dependent process.
Supernatural enemies affect outgroup attitudes through strength of human identity? A conceptual replication of Ellithorpe et al. (2018)
Shay Xuejing Yao et al.
Journal of Media Psychology, forthcoming
Abstract:
Previous research investigated how a shared human identity may promote interracial inclusion in the context of mass media (Ellithorpe et al., 2018). The original study found promising results of reduced intergroup prejudice through watching the supernatural genre. However, the original results offered mixed support for the underlying theory, that is, the Common Ingroup Identity Model (CIIM). To test whether the expected and unexpected results of Ellithorpe et al. (2018) would emerge again with a different sample, we replicated and extended this research. The overall pattern of our findings is consistent with the original study, in that we again found a cross-over interaction between human identity cues (human villains vs. nonhuman villains) and diversity cues (all-White heroes vs. racially diverse heroes) on the strength of human identity. Participants reported stronger human identity when watching racially diverse heroes fighting against nonhuman villains compared to human villains. Participants also reported stronger human identity when watching White heroes fighting against human villains compared to nonhuman villains. However, we are hesitant to claim that Ellithorpe et al.'s results were fully replicated in our research, as most simple main effects of group comparisons were statistically nonsignificant. Overall, our findings offered limited support for Ellithorpe et al.'s original study and minimal support for CIIM. Our experiment also extended the original research. We found that the impact of watching the supernatural genre on ingroup relationship may be more likely to work through reduced negative attitudes rather than increased positive attitudes.
Stuck in the middle: Men experience countervailing reactions to discussions about misogyny and violence against women
Morgana Lizzio-Wilson et al.
Psychology of Men & Masculinities, forthcoming
Abstract:
Across three preregistered studies (total N = 1,344), we sought to understand how men react to discussions about violence against women. Initially, we expected that highly identified men would react defensively. That is, exposure to antiviolence advocacy would lead highly identified men to engage in outgroup derogation (i.e., minimize the prevalence of violence against women, exaggerate women's gender-based privilege), ingroup favoritism (i.e., subtype perpetrators of violence, support men's rights activism), and reduce their willingness to engage in collective action to end violence against women. We further expected that these reactions would be explained by social identity threat over concerns that men were being unfairly derided and negatively stereotyped. However, the findings revealed a more complex pattern of responding. On the one hand, exposure to these discussions (vs. a control message) elicited social identity threat, which, in turn, predicted higher outgroup derogation and ingroup-favoring responses (Studies 1-3) and lower action intentions (Studies 2 and 3). But exposure also elicited collective guilt (Studies 2 and 3) and perceived injustice regarding women's disadvantage (Study 3), which predicted lower outgroup derogation and ingroup favoritism and higher action intentions. These opposing reactions fully offset each other and were not moderated by ingroup identification. These findings uncover a paradox in the fight for gender parity by showing that, in the face of messages that highlight inequality, men exhibit countervailing motivations to both protect their group's interests and better women's treatment. We discuss the implications of these findings for involving men in gender equality efforts.