Original Research

Validity and reliability of the Thai version of the Utrecht Gender Dysphoria Scale-Gender Spectrum (UGDS-GS) in Thai youths and young adults with gender dysphoria

Abstract

Background Many people who are gender variant have undiagnosed gender dysphoria, resulting in delayed receipt of gender-affirming support and prolonged distress in living with their gender-non-conforming sex. The Utrecht Gender Dysphoria Scale-Gender Spectrum (UGDS-GS) is a newly developed tool that measures dissatisfaction with gender identity and expression. However, there is no translated version of this tool in Thai. Moreover, the sensitivity, specificity and cut-off point of the UGDS-GS to detect gender dysphoria in people who are transgender remain unknown.

Aims This study translated the UGDS-GS into Thai and then examined the validity and reliability of the Thai UGDS-GS.

Methods 185 participants with and without gender dysphoria were selected from the Gender Variation Clinic in Ramathibodi Hospital and from social media platforms. The UGDS-GS was translated into Thai according to the World Health Organization (WHO) guidelines on translation. The medical records of patients with gender dysphoria and semi-structured interviews were used to confirm the diagnosis of gender dysphoria. Subsequently, the validity and reliability of the instrument were analysed.

Results The mean age of participants was 30.43 (7.98) years among the 51 assigned males (27.6%) and 134 assigned females (72.4%) at birth. The Thai UGDS-GS average score was 77.82 (9.71) for those with gender dysphoria (n=95) and 46.03 (10.71) for those without gender dysphoria (n=90). Cronbach’s alpha coefficient was 0.962, showing excellent internal consistency. In addition, exploratory factor analysis showed compatibility with the original version’s metrics. The value of the area under the curve was 0.976 (95% confidence interval: 0.954 to 0.998), indicating outstanding concordance. At the cut-off point of ‘60’, sensitivity and specificity were good (96.84% and 91.11%, respectively).

Conclusions The Thai UGDS-GS is an excellent, psychometrically reliable and valid tool for screening gender dysphoria in clinical and community settings in Thailand. A cut-off score of 60 marks a positive screen, suggesting a high chance of gender dysphoria.

What is already known on this topic

  • Gender dysphoria is a psychological problem that remains underdiagnosed, resulting in delayed gender-affirming therapy.

  • The Utrecht Gender Dysphoria Scale-Gender Spectrum (UGDS-GS) is a newly developed and reliable tool for gender dysphoria assessment in the gender spectrum population; however, screening tools in Thai for identifying gender dysphoria in non-binary transgender individuals are lacking.

What this study adds

  • This study translated the UGDS-GS into the Thai language and evaluated the validity and reliability of this new version.

  • It also suggested a cut-off point for screening gender dysphoria to benefit gender-variant individuals in clinical and community settings.

  • This extension of the original study facilitates cross-cultural comparisons and research collaborations, contributing to a more comprehensive understanding of gender dysphoria on a global scale; however, it is essential to note that further research and validation of the translated tool in various international settings are necessary.

  • This process would involve assessing its psychometric properties, cultural appropriateness and applicability in different populations.

How this study might affect research, practice or policy

  • Further research may use the cut-off point to screen gender dysphoria and adjust cut-off points for the specific purpose of the studies.

  • Also, this instrument could assist in detecting individuals with gender dysphoria in clinical settings and be used as a basic screening tool.

Introduction

Gender-variant individuals in Thailand presently experience more rights and freedom regarding sexual expression and coming out (revealing their sexual identity) than in the past because today Thai people are generally more open and accepting of these gender variations. The estimated prevalence of people who are gender diverse or transgender varies due to different populations studied and measurement methods. The recent prevalence of self-reported transgender identity in children, adolescents and adults, as reported in various references with data from multiple countries such as the USA, Europe, Asia and Australia, ranges from 0.5% to 1.3%.1 Furthermore, many who are transgender experience gender dysphoria (GD).2 In Thailand, studies examining the proportions of transgender and gender-diverse individuals are lacking. However, one study in Thailand which aimed to assess the content and linguistic validity of a translated version of a sexual orientation and gender identity measure among an online population of 282 individuals found that 9.9% reported being transgender, 18.8% identified as homosexual and 6.0% identified as bisexual.3 Among Thai adolescents, a study conducted in three schools in Bangkok with a sample size of 600 students found that 16.3% identified as non-cisgender and 35.2% identified as non-heterosexual.4

Diagnostic criteria for GD were developed when the American Psychiatric Association’s Diagnostic and Statistical Manual of Mental Disorders, Third Edition initially included ‘gender identity disorder of childhood and transsexualism’ (for adolescents and adults) in 1980.5 Afterwards, in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV)6 and the DSM-IV, Text Revision,7 the term ‘gender identity disorder’ was used. Next, in 2013, the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) removed the word ‘disorder’ and added the term ‘gender dysphoria’. This change was to destigmatise the diagnosis—as it was not a ‘disorder’—and to more accurately refer to the psychological distresses related to the marked incongruence between one’s experienced/expressed gender and assigned gender.8–10

The diagnostic criteria will likely continue to change as the body of knowledge increases and gender diversity is depathologised.10 However, diagnosing GD is difficult for non-medical professionals and medical professionals in non-related fields, leading to misdiagnosis and delayed intervention. In addition, mental health stigma may also play a role in delayed diagnosis and proper management of psychiatric symptoms/disorders commonly seen in transgender individuals with GD, such as anxiety, depression and behavioural problems (eg, self-harm, suicide).9 11

Various assessment tools have been developed for use during clinical interviews to assist clinicians with the complicated evaluation of the condition. To our knowledge, the tools that have been validated and found to be reliable and that are widely used for GD assessment in both adolescents and adults are the Gender Identity/Gender Dysphoria Questionnaire for Adolescents and Adults (GIDYQ-AA) and the Utrecht Gender Dysphoria Scale (UGDS).12 While the GIDYQ-AA has 27 items, the UGDS has only 12 items, making it less complicated and easier to use. However, the dimorphic standardisation of the UGDS and GIDYQ-AA (characterising gender identity into only male-to-female or female-to-male categories and designing separate questionnaires for these two groups) does not allow for the clinical assessment of those who are non-binary transgender. In addition, after a gender role change, the questionnaire used for evaluation differs from the one used before the transition, making comparisons before and after a gender role change infeasible. A gender-neutral, single-version revision of the original UGDS was therefore developed as the Utrecht Gender Dysphoria Scale-Gender Spectrum (UGDS-GS).13

The UGDS-GS is a self-report questionnaire comprising 18 items. It allows individuals to express their unique experiences and perceptions of GD. This subjective perspective is valuable in understanding personal aspects of gender identity and dysphoria. It also provides privacy and confidentiality, enabling participants to respond honestly without external judgement. The questionnaire uses a 5-point Likert scale, with higher scores indicating greater dysphoria. It has been revised from previous UGDS versions to include all gender identities and expressions. The scale is designed to be appropriate for individuals across different age groups, from adolescence to adulthood. It can be administered at any stage of the social or medical transition process. The questionnaire is designed to be time-efficient, taking no more than 10 min to complete. It has undergone validation for use with both binary and non-binary transgender individuals, enhancing its applicability to diverse populations.14 15

A Thai version of screening tools for GD among all gender identities and expressions is still lacking, so this study’s objectives are to develop a Thai UGDS-GS and then evaluate its validity and reliability for GD diagnosis in Thailand.

Methods

Study design

This study was a cross-sectional study. Data were gathered from October 2021 to June 2022.

Participants

Participants were eligible for inclusion if they were adolescents or adults aged 13 years or older with the ability to understand the Thai language and answer questionnaires independently. They were excluded if they could not understand the Thai language or did not give informed consent. Participants were recruited through public relations outreach via social media platforms (eg, Facebook and other websites) and from the Gender Variation (GEN-V) Clinic at Ramathibodi Hospital in Bangkok. The GEN-V Clinic group comprised participants who visited the clinic for hormonal therapy, sex reassignment surgery or consultation during the study period. Most of them had been diagnosed with GD by psychiatrists, paediatricians or endocrinologists experienced in transgender care during clinical interviews using DSM-5 criteria. The medical records of these participants were reviewed by the researchers to confirm the diagnosis. Some participants in the social media platforms group who had no history of GEN-V Clinic visits and who identified themselves as cisgender were selected for semi-structured interviews, which screened for potential cases of GD using the DSM-5 criteria as the diagnostic guideline. We aimed to ensure the anonymity of the questionnaire responses and gave participants information intended to build their confidence in sharing sensitive issues. The required number of participants was calculated to be 180, based on the sample size appropriate for factor analysis, which is 10 subjects per questionnaire item (18 items).16 17

Measurements

Demographic data

Personal data that included age, hometown, religion, educational level, monthly income and gender-related data (assigned sex, affirmed gender, sexual orientation, history of breast/genital surgery and history of sex hormone use) were collected by online self-report questionnaire.

Development of the Thai version of the UGDS-GS

The Thai version of the UGDS-GS was developed following the World Health Organization (WHO) Guidelines on Translation and Adaptation of Instruments,18 after permission was granted by the original author. The translation process consisted of a forward translation from English to Thai by two of our authors and a back translation into English by another coauthor; all are experts in English and Thai and have experience in providing care to those who are transgender. Cognitive interviews were then conducted with 10 randomly selected participants, including those with and without GD. The content validity was evaluated using item-objective congruence by two of the researchers. Final modifications and adjustments were made accordingly.

The Thai UGDS-GS consisted of 18 self-report questions about gender affirmations (items 1, 3–5) and GD (items 2, 6–18). Each item has a 5-point Likert rating scale option as follows: (1) disagree completely, (2) disagree, (3) neither agree nor disagree, (4) agree and (5) agree completely. The greater the total sum scores of the questionnaire, the greater the dysphoria about their gender.
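The scoring rule described above can be sketched in Python. This is a minimal illustration under stated assumptions rather than the authors' scoring procedure: the function name `score_ugds_gs` is hypothetical, and raw responses are summed as they are because the text does not mention reverse scoring of any item. The item groupings are those given in the paragraph above.

```python
from typing import Dict, Sequence

N_ITEMS = 18
# Item groupings as described in the text (1-based item numbers).
AFFIRMATION_ITEMS = (1, 3, 4, 5)
DYSPHORIA_ITEMS = (2,) + tuple(range(6, 19))  # items 2 and 6-18


def score_ugds_gs(responses: Sequence[int]) -> Dict[str, int]:
    """Sum 18 Likert responses (1-5) into total and subscale scores.

    Assumption: responses are summed as-is; no reverse scoring is applied,
    because the text does not describe any.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    by_item = dict(enumerate(responses, start=1))
    return {
        "total": sum(responses),
        "affirmation": sum(by_item[i] for i in AFFIRMATION_ITEMS),
        "dysphoria": sum(by_item[i] for i in DYSPHORIA_ITEMS),
    }


if __name__ == "__main__":
    # Hypothetical respondent who answers "agree" (4) to every item.
    print(score_ugds_gs([4] * N_ITEMS))
    # {'total': 72, 'affirmation': 16, 'dysphoria': 56}
```

Under this scheme the total score ranges from 18 (all items answered 1) to 90 (all items answered 5).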

Semi-structured interview

Some participants from the social media group were randomly selected for semi-structured interviews based on DSM-5 criteria for GD in children and adolescents/adults. The interviews were conducted by one of the two researchers who specialise in child and adolescent psychiatry and have extensive experience providing mental healthcare to transgender individuals. This expertise was essential in ensuring an accurate diagnosis of GD. Cohen’s kappa analysis of the inter-rater reliability of seven semi-structured interviews in this group conducted by the two researchers showed excellent agreement (kappa=p<0.05).

Statistical analysis

Descriptive statistics were used to calculate the frequency, percentage, means and SD of demographic and gender-related variables and the scores of the Thai UGDS-GS. The χ2 test was used for analysing categorical data. The reliability of the Thai UGDS-GS was analysed using Cronbach’s alpha coefficient. The validity of the Thai UGDS-GS was analysed using an exploratory factor analysis (EFA) (Varimax method) to evaluate construct validity and sensitivity/specificity to evaluate criterion validity. The best cut-off score was determined from the receiver operating characteristic (ROC) curve and its area under the curve (AUC) by identifying the point with the most suitable sensitivity and specificity. Cohen’s kappa analysis was used to find inter-rater reliability. A subgroup analysis was performed on youth participants to assess the reliability and validity of the questionnaire among this group.

All statistical data were analysed using SPSS V.18.0. A p value of <0.05 was considered statistically significant.
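For readers unfamiliar with the reliability statistic named above, the following is a minimal sketch of Cronbach’s alpha using only the Python standard library. The study itself used SPSS; the function name `cronbach_alpha` and the small data set are hypothetical and serve only to illustrate the standard formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores).

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; rows[r][i] is respondent r's answer to item i."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    k = len(rows[0])
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer the same number of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical responses: 4 respondents, 2 items.
    data = [[1, 2], [2, 1], [3, 3], [4, 4]]
    print(round(cronbach_alpha(data), 3))
```

Alpha approaches 1 when items vary together across respondents, which is why a value such as 0.962 is read as high internal consistency.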

Results

Demographic and gender-related data

Three hundred and thirty-eight participants completed online questionnaires. Two hundred and forty-four of them (72.2%) were from social media platforms, and 94 (27.8%) were from the GEN-V Clinic. Of the participants from social media, 135 were randomly selected for semi-structured interviews. Forty-four of those were excluded as they could not be contacted or declined to be interviewed. Finally, data from 185 participants were obtained for statistical evaluation, as shown in figure 1.

Figure 1

Flowchart of the study. GEN-V, Gender Variation.

The mean age of participants was 30.43 years (standard deviation (SD)=7.98, min–max=15–60 years). Most of the participants were assigned female at birth (n=134, 72.4%). There was no statistically significant difference in the demographic data between the GD-positive and GD-negative participants. However, most of the gender-related data were significantly different. GD-positive participants tended to identify themselves as binary transgender (male-to-female or female-to-male transgender) or non-binary transgender (genderqueer or gender other than the above-mentioned); they more often reported a homosexual orientation and experience of gender-affirming surgery or hormone use than the GD-negative participants. The Thai UGDS-GS average score was 77.82 (9.71) in the GD group and 46.03 (10.71) in the non-GD group, a statistically significant difference as shown in table 1. (The mean score and SD of each Thai UGDS-GS item are shown in online supplemental table 1.)

Table 1
Demographic and gender-related data and differences between GD-positive and GD-negative participants

Reliability and item analysis

For internal consistency reliability, the overall Cronbach’s alpha coefficient of the Thai UGDS-GS was 0.962. Correlations of all items ranged from 0.596 to 0.931, indicating very good discrimination, except for items 1, 3 and 4, which were 0.263, 0.228 and 0.227 (good discrimination), respectively. The mean scores for all items and Cronbach’s alpha values (if the item was deleted) are also shown in table 2.

Table 2
Thai version of the Utrecht Gender Dysphoria Scale-Gender Spectrum content validation and reliability test

Validity analysis

Construct validity: exploratory factor analysis

The EFA was done using the Kaiser-Meyer-Olkin (KMO) test and Bartlett’s test. The KMO measure of sampling adequacy and the χ2 of Bartlett’s test of sphericity were 0.850 and 1017.727 (p<0.001), respectively. As shown in table 3, the Thai UGDS-GS questions were divided into four factors: the first factor consisted of items 7–11, 14–17; the second factor consisted of items 12, 13, 18; the third factor consisted of items 1, 3–5; the fourth factor consisted of items 2, 6.

Table 3
Exploratory factor analysis of the Thai version of the Utrecht Gender Dysphoria Scale-Gender Spectrum

The EFA was then again conducted by categorising all of the factors into two-factor groups (table 3), which correlated with the original paper. The first group consisted of items 2, 6–18, and the second group consisted of items 1, 3–5.

Criterion validity: sensitivity/specificity

The criterion validity was assessed by calculating the sensitivity and specificity from the semi-structured interview results and from the GEN-V Clinic medical records to compare the GD and non-GD groups.

The appropriate cut-off point was estimated by plotting the ROC curve. The AUC of the ROC curve from the Thai UGDS-GS was 0.976 (95% confidence interval (CI): 0.954 to 0.998), as shown in online supplemental table 2.
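As an illustration of how an AUC and a candidate cut-off can be derived from two groups of scores, the sketch below uses only the Python standard library. Several points are assumptions: the text does not state which criterion was used to pick the cut-off, so Youden's J (sensitivity + specificity - 1) is swapped in here as one common choice; a score at or above the threshold is treated as a positive screen; and the function names and scores are hypothetical, not study data.

```python
from typing import Sequence, Tuple


def auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(pos: Sequence[float], neg: Sequence[float]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    Assumption: score >= cutoff is a positive screen. When several cut-offs
    tie on J, the lowest one is kept, which favours sensitivity.
    """
    best = None
    for c in sorted(set(pos) | set(neg)):
        sens = sum(p >= c for p in pos) / len(pos)
        spec = sum(n < c for n in neg) / len(neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Hypothetical total scores, not taken from the study.
    gd_scores = [62, 70, 75, 80, 58]
    non_gd_scores = [40, 45, 50, 61, 55]
    print(auc(gd_scores, non_gd_scores))
    print(best_cutoff(gd_scores, non_gd_scores))
```

A screening tool may reasonably prefer a cut-off that favours sensitivity over specificity, so the criterion used here is only one of several defensible options.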

Sensitivity and specificity for the appropriate cut-off point of the Thai UGDS-GS were 96.84% and 91.11%, respectively. Positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR) and negative likelihood ratio (NLR) for the appropriate cut-off point were 92.00%, 96.47%, 10.89 and 0.03, respectively. Sensitivity, specificity, PPV, NPV, PLR and NLR for other cut-off points of the Thai UGDS-GS are shown in table 4.
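The reported figures are mutually consistent with a single 2×2 table, which the sketch below makes explicit. The counts used (92 true positives, 3 false negatives, 82 true negatives, 8 false positives) are back-calculated here from the reported percentages and the group sizes of 95 and 90; they are inferred for illustration and are not stated in the text. The function name is hypothetical.

```python
import math
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard screening metrics from a 2x2 table of counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Likelihood ratios; PLR is infinite when there are no false positives.
        "plr": sens / (1 - spec) if spec < 1 else math.inf,
        "nlr": (1 - sens) / spec,
    }


if __name__ == "__main__":
    # Inferred counts (assumption): 95 GD and 90 non-GD participants.
    m = diagnostic_metrics(tp=92, fn=3, tn=82, fp=8)
    for name in ("sensitivity", "specificity", "ppv", "npv"):
        print(f"{name}: {m[name] * 100:.2f}%")
    print(f"plr: {m['plr']:.2f}")
    print(f"nlr: {m['nlr']:.2f}")
```

With these inferred counts the function reproduces the values reported above: 96.84%, 91.11%, 92.00%, 96.47%, 10.89 and 0.03.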

Table 4
Sensitivity, specificity, predictive and likelihood ratios of the Thai version of the Utrecht Gender Dysphoria Scale-Gender Spectrum

Subgroup analysis

A total of 12 youth participants were included in this study, with a mean age of 16.08 (1.08) years. The majority of these participants were identified as male (n=9, 75.0%). The Thai UGDS-GS questionnaire demonstrated excellent internal consistency reliability with a Cronbach’s alpha coefficient of 0.911 among youth participants. However, due to the small sample size, factor analysis was not conducted as recommended guidelines suggested a minimum of 180 participants (10 subjects per one questionnaire variable) for robust and reliable results.16 The sensitivity and specificity of the Thai UGDS-GS at the appropriate cut-off point (66 points) were both 100.00%. It is important to note that only two youth participants in our study were GD-negative; therefore, these results should be interpreted cautiously. Future studies with larger sample sizes are necessary to confirm these findings (online supplemental tables 3–5).

Discussion

Main findings

This study’s objectives were to develop, validate and find appropriate cut-off scores for the Thai UGDS-GS screening questionnaire to diagnose GD (the Thai UGDS-GS questionnaire is shown in online supplemental table 6).

For internal consistency reliability using Cronbach’s alpha coefficient, some authors recommended that the alpha be at least 0.90 or, ideally, 0.95 for instruments used in clinical settings.17 19 The Thai UGDS-GS had a Cronbach’s alpha of 0.962, which exceeded the recommended alpha, representing the measure’s high internal consistency reliability. Correlations of all items showed very good discrimination, except for items 1, 3 and 4, which showed good discrimination. These items were originally categorised under the gender affirmation subscale,15 and, interestingly, their mean scores indicated high levels of satisfaction with the affirmed gender across both the GD and non-GD groups, which may have influenced their ability to discriminate between individuals effectively. Nevertheless, it is important to note that significant differences in the mean scores of these specific items were observed between the two groups (see online supplemental table 1). Retaining all items without exclusion could be beneficial for interpretation purposes, as it allows for a comprehensive understanding of the valuable insights these items provide.

In the EFA, the factor loading of every question was high (>0.4), indicating high construct validity for the Thai UGDS-GS. To assess the sampling adequacy for our study, we conducted the Kaiser-Meyer-Olkin (KMO) test. The KMO measure of sampling adequacy yielded a value of 0.85, which is considered meritorious according to guidelines.20 This result indicates that the dataset is well suited to exploratory factor analysis, providing confidence in the validity of our findings. The χ2 of Bartlett’s test of sphericity (p<0.001) showed the homogeneity of the measure.21 The questions were divided into four factors, as shown in table 3. The first factor (items 7–11, 14–17) was about dislikes and dissatisfaction regarding incongruent physical appearance (anatomical sex or sex characteristics). These items relate to the medical dimension of gender affirmation. Transgender individuals, especially those with GD, sought gender affirmation by acquiring secondary sexual characteristics congruent with their affirmed gender through surgical or hormonal interventions.22 Moreover, medical procedure engagement was inversely associated with depression, anxiety, stress symptoms, psychological distress and suicidal ideation among transgender adults.23 24 The second factor (items 12, 13, 18) specified feelings of hopelessness and despair related to dysphoria, in agreement with a previous study which found that transgender youths tended to have gender minority stress and more feelings of hopelessness than cisgender youths.25 26 Moreover, this feeling led to depression and anxiety and was associated with a higher risk for suicidal ideation and attempts among transgender individuals.26 27 The third factor (items 1, 3–5), representing the gender affirmation subscale, corresponded to the original paper’s items. These items represent social affirmation, one aspect of the gender affirmation process.22 In contrast, the fourth factor (items 2, 6) involved questions regarding coming out and passive social interaction, which led to feelings of distress at not being accepted or not being able to express themselves as their affirmed gender. Moreover, all of the items in the fourth factor are related to gender minority stressors, both proximal (eg, internalised stigma, concealment and fear of identity disclosure) and distal (eg, gender-based victimisation/rejection/non-affirmation)28; these might lead to the clinical manifestation of GD. When the factor analysis was repeated with two factors specified (the gender affirmation and dysphoria subscales of the original version), the items grouped in the same way as in the original version (table 3).

In the original paper, the EFA showed only two factors: the gender affirmation and dysphoria subscales. In this study, we repeated the factor analysis with the number of factors fixed at two, as shown in table 3; each subscale contained exactly the same items as in the original paper, indicating that the EFA in this study was in concordance with the original study.

The AUC of the Thai UGDS-GS was 0.976 (95% CI: 0.954 to 0.998), showing outstanding concordance.29 Regarding the criterion validity of the questionnaire, the Thai UGDS-GS cut-off point score of 60 showed good sensitivity at 96.84% and good specificity at 91.11%. Moreover, the PPV and NPV at the same cut-off point were quite high (92.00% and 96.47%, respectively). This indicates a high probability that individuals with positive screening results will have GD and, similarly, that those with negative screening results will not have GD. Regarding the PLR and the NLR at the same cut-off score, the results showed that individuals with GD are 10.89 times more likely to have positive screening results than individuals without GD and 0.03 times as likely to have negative screening results as individuals without GD. These results, which showed a PLR of >10 and an NLR of <0.1, indicated that the Thai UGDS-GS had good discrimination ability and was effective in establishing or excluding GD.30–32
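Likelihood ratios can be combined with a pre-test probability to estimate the probability of GD after a screening result, using the standard relation post-test odds = pre-test odds × likelihood ratio. The sketch below is illustrative only: the function name is hypothetical, and the 10% pre-test probability is an arbitrary value chosen for the example, not a prevalence estimate from the study.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability into a post-test probability.

    Uses post-test odds = pre-test odds * likelihood ratio.
    """
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


if __name__ == "__main__":
    plr, nlr = 10.89, 0.03  # likelihood ratios reported in the text

    # Proportion of GD in the study sample (95 of 185 participants).
    sample_share = 95 / 185
    print(round(post_test_probability(sample_share, plr), 2))  # 0.92

    # Hypothetical setting where 10% of those screened have GD (assumption).
    print(round(post_test_probability(0.10, plr), 3))  # after a positive screen
    print(round(post_test_probability(0.10, nlr), 4))  # after a negative screen
```

Applied to the proportion of GD in the study sample, the positive likelihood ratio returns approximately the reported PPV of 92%. At a lower pre-test probability the same positive result corresponds to a lower post-test probability, so predictive values from this sample may not carry over unchanged to settings where GD is less common.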

Strengths

The diagnosis of GD in this study was determined by clinical interview using DSM-5 criteria, a proven reliable and effective assessment method. Semi-structured interviews, conducted independently of the self-report scale, were used to diagnose or exclude GD in the participants from social media. In addition, this study benefited from being conducted at the GEN-V Clinic, a multidisciplinary care model for gender-diverse populations. We had access to a well-established clinical population, allowing us to recruit a significant number of participants with GD. This population provided valuable insights into the experiences and perspectives of individuals seeking care at a specialised centre.

Limitations and future directions

This study had some limitations. First, the majority (72.4%) of the sample were assigned female at birth; only three participants (1.6%) were non-binary/genderqueer, and most of the participants were from Bangkok and central Thailand, so the sample may not represent the whole country’s population or all non-binary/genderqueer populations. Future studies should aim to include a more diverse sample in terms of gender identity and geographical representation to capture a broader range of perspectives and experiences. In addition, it is also essential to consider the similarities and differences between Thai individuals with GD and their international counterparts. While our study focused specifically on the Thai context, acknowledging these broader perspectives can provide a more comprehensive understanding of GD. Second, the study was done in only one gender variation clinic. To enhance the diversity of samples and increase the external validity of the findings, future research should consider including participants from multiple clinical settings and exploring community-based samples. Third, the diagnosis of GD was made by various experts, and the diagnostic reliability was not calculated. In further studies, diagnostic reliability may be needed. Fourth, although the age range of participants in our study was 15–60 years, the mean age of the participants was 30.43 years, which may not reflect the adolescent population. Therefore, it is recommended to focus on teenage samples to obtain more comprehensive information, particularly in light of a recent finding that indicates a decreasing mean age of GD diagnosis.33 Additionally, the subgroup analysis of youth participants was based on a small sample size, which may limit the interpretability of the findings. Future research should include a larger sample size of youth participants to improve the robustness of the results. Fifth, the self-rated nature of the UGDS-GS allowed participants to express their unique experiences of GD, but individual variations in self-awareness and introspection should be considered. Supplementing self-rated data with other sources, such as clinical interviews, can provide a more comprehensive evaluation. Finally, because the study was cross-sectional and GD is fluid, the dysphoria of some participants may be alleviated after sex reassignment therapy or over time, resulting in different outcomes.24 Evaluating dysphoria after sex reassignment therapy with a time-course-related (longitudinal) study design would address this limitation and is suggested for further studies.

Implications

The Thai version of the UGDS-GS is an excellent screening tool for GD. The instrument can be used in gender variation clinics or community settings. Positive screening results from the Thai UGDS-GS can potentially lead to earlier and increased gender-affirming care and alleviate psychiatric comorbidity in transgender individuals with GD. However, the clinical interview remains essential and should be performed to confirm the diagnosis.

Public significance statement

This study found that the Thai UGDS-GS is a reliable and valid non-binary measure of GD in gender-spectrum populations. We translated the scale’s original version into Thai as an extension of the original study. Then we determined an appropriate cut-off point for using the scale as a screening instrument for Thai individuals.

Tanawis Jamneankal obtained a Doctor of Medicine (M.D.) degree with second-class honours from Srinakharinwirot University, Thailand, in 2018. While conducting this research, Dr Tanawis was a child and adolescent psychiatry resident at the Department of Psychiatry, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Thailand. He is currently a lecturer at the Department of Psychiatry, Faculty of Medicine, Srinakharinwirot University, Thailand. His main research interests include transgender mental health, child and adolescent mental health and mental health problems.
