What is already known on this topic
Artificial intelligence (AI)-supported diagnosis of mental illness is a viable prospect for future clinical practice but raises many ethical challenges.
Minimal empirical evidence exists on how such concerns are viewed by the general public.
Data elucidating lay ethical concerns about AI diagnosis are crucial to ensuring the socially responsible development and application of new technology.
What this study adds
This paper reports a large-scale representative survey (n=2060) of the US and UK populations, which explored lay perspectives on the ethical issues raised by AI diagnosis in psychiatry, compared with standard Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnostic approaches.
Results identify the specific ethical issues that cause the greatest public concern and suggest that the lay public are less concerned overall about AI than about DSM diagnosis.
How this study might affect research, practice or policy
These findings alert researchers, practitioners and policymakers to the specific ethical concerns that should be prioritised in developing and implementing new approaches to psychiatric diagnosis.
Understanding the opinions and preferences of the lay public, who represent the users and potential users of mental health services, will help ensure AI diagnostic technologies can be steered towards maximal benefit and minimal harm.
To the editor:
Psychiatric theory, policy and practice are currently grappling with the risks and opportunities presented by artificial intelligence (AI) applications in mental healthcare. Synthesising data to generate diagnoses is the aspect of mental healthcare where AI is anticipated to have its greatest and earliest impact.1–4 While such technologies remain some distance from clinical application, preliminary evidence suggests AI-derived classifications may predict certain treatment outcomes and clinical trajectories, and could soon become available to supplement or replace traditional manual-based diagnostic assessment.5
The use of AI algorithms to diagnose mental illness raises many ethical challenges. These include the potential for security breaches or misuse of private mental health data, the risk that AI trained on biased data sets will reinforce societal inequalities, the risk of false-positive diagnoses that expose patients to stress and discrimination, and issues with the interpretability of ‘black box’ AI decisions.6–13 For any emerging technology, evidence on how the lay public views its ethical challenges (eg, the risks that most concern end users) is vital to ensuring socially responsible application. Moreover, to optimise its value for policy and practice, this analysis should occur prospectively rather than retrospectively, while technological development can still be adjusted in line with societal values and priorities. Yet although AI-informed diagnosis is likely approaching implementation in clinical settings, minimal data exist on societal perspectives on this technology or the ethical issues it raises.14
To address these questions, an online survey study was recently conducted, with ethical approval from the Research Ethics Committee of University College Dublin. A research company was contracted to recruit samples in the USA (n = 1060) and UK (n = 1000) that were nationally representative on gender, age and region. Using Qualtrics software, participants were randomly assigned to read one of four vignettes (online supplemental material). All vignettes described a person (‘Morgan’; gender unspecified) undergoing clinical assessment for the same mental health difficulties (eg, flat mood, sleep difficulties, paranoia), but differed in whether the diagnosis was reached using an AI or a standard Diagnostic and Statistical Manual of Mental Disorders (DSM) approach. Furthermore, to ensure the generalisability of results given that different diagnostic labels trigger different associations regarding severity and stigma,15 half of the participants (split evenly across the AI/DSM groups) read that ‘Morgan’ had been diagnosed with major depressive disorder (MDD) and half with schizophrenia spectrum disorder (SSD). After reading the vignettes and completing a brief attention check, participants were asked to imagine they were in Morgan’s position themselves and to rate their degree of concern (on a 7-point Likert scale) about 16 issues arising after receiving the diagnosis. Given the study’s interest in mapping contemporary responses to emerging diagnostic technologies, these 16 issues were derived from a prestudy review of the literature on the ethical challenges of AI diagnosis in mental health. The ethical issues were carefully phrased so that they could, in principle, apply to both AI and DSM diagnosis (for instance, the question on ‘bias’ could equally be interpreted as connoting algorithmic or human bias, while ‘intrusion’ could be interpreted with reference to an individual clinician asking personal questions or to technology that tracks one’s daily activity and speech). Table 1 displays the range of ethical issues queried, in order of their average levels of concern within the total sample. Self-reported demographics indicated that participants were 51.6% female and aged 18–89 years (mean = 48.41); 25.4% identified as an ethnic minority; 71.5% had tertiary education; and 30.7% had previously received a psychiatric diagnosis.
Data were analysed using SPSS V.27. Participants who failed attention checks (n = 84), completed the survey implausibly quickly (n = 25), or had suspicious response patterns (eg, selecting the same button for every question; n = 10) were removed from the final data set. A two-way multivariate analysis of variance (MANOVA) using Pillai’s trace assessed the impact of vignette condition (Diagnostic Method: DSM vs AI; Diagnostic Category: MDD vs SSD) on ethical concerns, controlling for country, gender, age, ethnicity, education and personal diagnosis experience. The MANOVA showed no significant interaction between Diagnostic Method and Diagnostic Category. A main effect of Diagnostic Category indicated that people had significantly greater concern about the implications of a diagnosis of SSD than of MDD, F(16, 2017) = 7.32, p < 0.001, ηp² = 0.06. Most interestingly for present purposes, a main effect of Diagnostic Method indicated that the DSM vignettes elicited significantly more concern than the AI vignettes, F(16, 2017) = 2.94, p < 0.001, ηp² = 0.02. Tests of between-subjects effects indicated that the DSM vignettes prompted significantly greater concern on the dimensions of communicability, F(1, 2032) = 10.84, p = 0.001, ηp² = 0.005, stress, F(1, 2032) = 5.90, p = 0.015, ηp² = 0.003, and medicalisation, F(1, 2032) = 7.76, p = 0.005, ηp² = 0.004. Figure 1 displays mean levels of concern across the Diagnostic Category and Diagnostic Method conditions.
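For readers wishing to reproduce this type of analysis outside SPSS, the sketch below illustrates an analogous two-way MANOVA with covariates in Python using statsmodels. It is a minimal illustration only, under assumed variable names: the data file and column names (concern_1 to concern_16, method, category and so on) are hypothetical placeholders, not the study’s actual data set or syntax.

```python
# Illustrative sketch only: a 2 x 2 MANOVA on 16 concern ratings with covariates,
# analogous to the SPSS analysis described above. File and column names are
# hypothetical placeholders.
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("survey_responses.csv")  # cleaned data after exclusions

# The 16 concern ratings form the multivariate outcome.
dvs = " + ".join(f"concern_{i}" for i in range(1, 17))

# Diagnostic Method x Diagnostic Category, plus demographic covariates.
formula = (
    f"{dvs} ~ C(method) * C(category)"
    " + C(country) + C(gender) + age"
    " + C(ethnicity) + C(education) + C(prior_diagnosis)"
)

manova = MANOVA.from_formula(formula, data=df)
print(manova.mv_test())  # reports Pillai's trace (among other statistics) for each term
```

Pillai’s trace is commonly preferred in this context because it is relatively robust to violations of multivariate normality and homogeneity of covariance; follow-up univariate tests of between-subjects effects can then be examined for each concern dimension separately.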
These results raise several important points. First, regarding diagnosis overall, the ethical issues that most concern the lay public relate to the personal and social impacts of a diagnosis (eg, its implications for discrimination, communicability, stress and self-concept). These data suggest the public is relatively unconcerned about the possibility of clinical assessment being biased, intrusive or conducted by an incompetent clinician. The hierarchy of ethical concerns illustrated in Table 1 can inform the development of diagnostic technologies that align with lay priorities. For example, addressing the potential for diagnoses to trigger discrimination and stress, and the difficulty of explaining a diagnosis to others, is imperative to ensuring the acceptability of new diagnostic approaches. While the lower-ranked concerns may reflect genuine indifference, they could also indicate a need to raise public awareness of certain risks; for example, the possibility of bias in both clinician judgements and AI algorithms.
Second, results suggest that AI-based assessments do not heighten lay concern relative to traditional DSM diagnosis. On the contrary, accounts of DSM diagnosis elicited more concern about issues such as the diagnosis’ communicability to others, stress to self and medicalisation. This unanticipated result suggests that prevailing manual-based diagnostic methods may not retain strong acceptability among the general population. In considering the implementation of new diagnostic technology, it is equally important to critically appraise the diagnostic methods it proposes to replace or supplement; while AI diagnosis may raise specific ethical challenges, the public may deem these less risky than the known limitations of traditional diagnostic methods. However, it remains unclear whether the public’s relative comfort with AI diagnosis authentically reflects lay priorities, or results from unfamiliarity with a still-hypothetical clinical technique. Moreover, the fictional clinical cases described in the vignettes all resulted in classification into a traditional diagnostic category (MDD or SSD). Since one anticipated outcome of AI diagnosis is the replacement or subdivision of traditional categories by algorithmically derived diagnoses reflecting intricate biological and behavioural profiles,6 10 public and service-user responses to such unfamiliar precision diagnoses represent a further unknown that requires clarification.
This preliminary study is subject to numerous limitations, particularly its reliance on hypothetical vignettes, the superficial nature of the online survey method and the unavailability of previously validated measures of ethical concern. Nevertheless, it represents the first data internationally on how lay publics evaluate the ethical challenges of AI-enabled diagnostic technologies. As public opinion will likely evolve in parallel with technological developments, continuing to track lay perspectives as AI diagnosis comes onstream is crucial to ensuring it can be steered towards maximal benefit and minimal harm.