Data on the Margins: Identifying LGBTIQ+ data in European Data Archives

Many datasets capture opinions about LGBTIQ+ people, for example the European Values Study or the Eurobarometer. But which data tell us more about these populations from their own perspectives? To answer this question, the authors searched all data archives of CESSDA ERIC, the Consortium of European Social Science Data Archives, for data collected from LGBTIQ+ populations and found 66 LGBTIQ+ datasets in 9 archives. By analyzing the characteristics, coverage, and topics of these datasets, they attempt to assess an important data gap and identify where data are still lacking.
DOI: 10.34879/gesisblog.2024.82
LGBTIQ+ Populations and Data Gaps
A data gap is a significant lack of data about a population group. If data about certain groups are faulty or missing, this can lead to biases in research results and in the design of policies and systems. Consequently, this affects (parts of) the population about whom data are not available in considerable, sometimes life-threatening ways. The gender data gap is a very prominent example and refers to the lack of data collected from and about women. For example, crash test dummies modelled on the female body were only introduced in the EU in 2023, a fact that led to higher injury rates among women in car accidents.1 Generally, data gaps most often affect marginalized groups and exist due to unequal power relations.2
The data gap concerning LGBTIQ+ people is even larger than the gender data gap. Lesbian, gay, bisexual, trans, intersex, and queer people, as well as those with non-normative (a)sexual or (a)gender identities not falling under one of these labels, are often considered a “hidden population”.3 (Population-)level data about these groups are often not collected, e.g., because of oversight, because it is inconvenient, or because of political resistance or the stigmatization and criminalization of homosexuality and non-cisgender persons.
The COVID-19 pandemic revealed drastic implications of this data gap. For the US it was reported that LGBTIQ+ persons not only have a higher prevalence of many underlying conditions associated with severe COVID-related illnesses,4 but also suffered from worse mental health, increased layoffs, decreased ability to meet basic needs, and worsening social connectedness compared to the majority group during the pandemic.5 Better knowledge about LGBTIQ+ persons’ physical and mental health, job situations, and social networks could have improved the response of the government and aid organizations to these needs during the pandemic. As similar outcomes were likely in other countries, the European Commission points out in its LGBTIQ Equality Strategy 2020-2025 that “[r]eliable and comparable equality data will be crucial for assessing the situation of LGBTIQ people and to effectively tackling inequalities”, vowing to support data collection efforts as part of implementing the strategy.6
To learn more about which LGBTIQ+ datasets are available from European social science data archives, we searched for such data in the catalogs of all 23 CESSDA member and former member archives, and 12 associated partners. A dataset was considered relevant if an LGBTIQ+ sample was the main focus of the data collection. This includes oversampling for LGBTIQ+ populations or studies where they are a significant part of the sample, for example when an LGBTIQ+ population is contrasted with a heterosexual and/or cisgender and/or endosex (i.e., non-intersex) control group.
Fig.2: Share of LGB+ persons worldwide & Only 0.145% of all datasets are LGBTIQ+ data

Results
Overall, we found that comparatively little LGBTIQ+ data are available in European Social Science Data Archives: Only 66 datasets out of the approximately 45,600 datasets stored in CESSDA archives have LGBTIQ+ samples or oversamples.
Data and sample characteristics
- General coverage: Only 0.145% of all datasets are LGBTIQ+ data, whereas the worldwide share of LGB+ persons is 11% and that of trans persons 2%.7 8
- Method: Thirty datasets are quantitative datasets, 29 are qualitative, and 7 are mixed-method.
- Publication: All datasets were published between 1990 and 2023.
- Temporal coverage: The data go back to the 1920s and 1930s, but there is a large gap between the 1950s and the 1970s, when homosexuality was still illegal in most countries.
- Spatial coverage: A number of countries are covered only by the two waves of the EU LGBT(I)+ Surveys9 10. Besides this large-scale EU survey, only very few Northern and Western European countries are covered by further datasets identified in our search.
- Sample size: Sample sizes range from 1 to 139,799 respondents.
- Sample composition: Sixty-five datasets contain data on sexual orientation and 20 datasets contain data on gender identity. Only six datasets collected information about the presence of intersex variations in respondents, making the data gap for intersex persons especially large.
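The coverage figure in the list above follows from simple arithmetic on the counts reported in the text. The short Python sketch below reproduces it and checks that the method breakdown adds up to the 66 datasets found. The variable names are illustrative, and the total of 45,600 is the approximate figure given above, so the resulting share is itself an approximation.

```python
# Counts as reported in the text; the total is approximate.
lgbtiq_datasets = 66
total_datasets = 45_600

# Share of LGBTIQ+ datasets among all CESSDA datasets, in percent.
share_percent = lgbtiq_datasets / total_datasets * 100
print(f"Share of LGBTIQ+ datasets: {share_percent:.3f}%")

# Method breakdown as reported in the text.
methods = {"quantitative": 30, "qualitative": 29, "mixed-method": 7}

# Consistency check: the three method groups should cover all datasets found.
methods_total = sum(methods.values())
print(f"Datasets by method add up to: {methods_total}")
```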
Fig.1: European countries covered by LGBTIQ+ datasets found vs. European countries covered by LGBTIQ+ datasets when the two waves of the EU LGBT(I)Q Survey are not considered
(Map created with https://www.mapchart.net)
Data Topics
We also used the keywords assigned to the studies to learn more about the topics they cover. We mapped all assigned keywords to the CESSDA Topic Classification11 to identify the most strongly covered topics and potential topic gaps. Based on this mapping, the most strongly covered topics, with more than 100 keywords each, fall into the following categories:
- Health: covers health policy aspects as well as mental and physical health, specific diseases or medical conditions, and medical treatment.
- Social stratification and groupings: contains sub-topics for different population groups based on characteristics such as age, gender, sexuality, ethnicity, or minority status.
- Society and culture: includes sub-topics pertaining to community, values and attitudes, social behavior, activities, and quality of life.
Yet, upon closer look, these categories are also associated with gaps regarding the coverage of sub-topics into which they are divided:
- No datasets were assigned keywords relating to the elderly, and only one dataset considers disability alongside LGBTIQ+ identity.
- The analysis suggests a lack of data on specific ethnic groups and youth, meaning that data on some of the most vulnerable groups in the wider LGBTIQ+ population is not readily findable.
- Terms in the category “Society and culture” are commonly paired with negatively connoted terms such as “discrimination”, suggesting a potential prevalence of so-called deficit- or damage-centered research.
- In addition, few datasets covered in our study have keywords from the topic categories “Economics” and “Social welfare policy and systems”, suggesting that European social science data archives overall hold little data for studying the economic situation of the LGBTIQ+ population.
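The keyword analysis described in this section (map each study keyword to a topic category, then count keywords per category) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual procedure: the keyword-to-topic mapping, the study names, and the keywords below are made-up placeholders, and the threshold is scaled down from the "more than 100 keywords" used in the text so that it fits the toy data.

```python
from collections import Counter

# Hypothetical keyword -> topic mapping (placeholder, not the real mapping
# to the CESSDA Topic Classification). Topic names are taken from the text.
KEYWORD_TO_TOPIC = {
    "mental health": "Health",
    "hiv": "Health",
    "discrimination": "Society and culture",
    "sexual orientation": "Social stratification and groupings",
    "income": "Economics",
}

def count_topics(datasets, mapping):
    """Count keywords per topic; collect keywords that have no mapping."""
    counts = Counter()
    unmapped = set()
    for keywords in datasets.values():
        for keyword in keywords:
            topic = mapping.get(keyword.strip().lower())
            if topic is None:
                unmapped.add(keyword)
            else:
                counts[topic] += 1
    return counts, unmapped

def strongly_covered(counts, threshold):
    """Return topics with more than `threshold` keywords, sorted by name."""
    return sorted(topic for topic, n in counts.items() if n > threshold)

# Toy input: invented studies and keywords, for illustration only.
toy_datasets = {
    "study_a": ["Mental health", "Discrimination"],
    "study_b": ["HIV", "sexual orientation", "housing"],
    "study_c": ["mental health", "income"],
}

counts, unmapped = count_topics(toy_datasets, KEYWORD_TO_TOPIC)
top_topics = strongly_covered(counts, threshold=2)
print(counts.most_common())
print("Strongly covered:", top_topics)
print("Unmapped keywords:", sorted(unmapped))
```

Keeping track of unmapped keywords is a deliberate choice in this sketch: keywords without a matching category would otherwise silently drop out of the counts and could make a topic gap look larger than it is.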
While two large datasets, the two waves of the EU LGBT(I)+ Surveys from 2012 and 2019, play an important role in filling LGBTIQ+ data gaps for more recent years, the gap remains large, both in the amount of data collected about LGBTIQ+ persons and in the topics these data cover.
A dataset listing all LGBTIQ+ datasets found, their characteristics and keywords is available: Perry, Anja, & Recker, Jonas (2024). Data on the margins – Data from LGBTIQ+ Populations in European Social Science Data Archives. GESIS, Cologne. Data File Version 1.0.0, https://doi.org/10.7802/2650.
References
1. Euronews. 2023. This is the first crash test dummy modelled on the female body. Will it make cars safer for women? https://www.euronews.com/next/2023/09/21/this-is-the-first-crash-test-dummy-modelled-on-the-female-body-will-it-make-cars-safer-for.
2. D’Ignazio, Catherine, and Lauren F. Klein. 2020. Data Feminism. Cambridge, Massachusetts: The MIT Press. https://data-feminism.mitpress.mit.edu/.
3. Compton, D’Lane R. 2018. “How Many (Queer) Cases Do I Need? Thinking through Research Design.” In Other, Please Specify: Queer Methods in Sociology, 185–200.
4. Heslin, Kevin C., and Jeffrey E. Hall. 2021. “Sexual Orientation Disparities in Risk Factors for Adverse COVID-19–Related Outcomes, by Race/Ethnicity — Behavioral Risk Factor Surveillance System, United States, 2017–2019.” MMWR. Morbidity and Mortality Weekly Report 70(5): 149–54. doi:10.15585/mmwr.mm7005a1.
5. Nowaskie, Dustin Z., and Anna C. Roesler. 2022. “The Impact of COVID-19 on the LGBTQ+ Community: Comparisons between Cisgender, Heterosexual People, Cisgender Sexual Minority People, and Gender Minority People.” Psychiatry Research 309: 114391. doi:10.1016/j.psychres.2022.114391.
6. European Commission. 2020. LGBTIQ Equality Strategy 2020-2025 (p. 22). https://commission.europa.eu/system/files/2020-11/lgbtiq_strategy_2020-2025_en.pdf
7. Ipsos. 2021. LGBT+ Pride 2021 Global Survey points to a generation gap around gender identity and sexual attraction. https://www.ipsos.com/en/lgbt-pride-2021-global-survey-points-generation-gap-around-gender-identity-and-sexual-attraction
8. In this context it is important to note that LGBTIQ+ data are held not only in research data infrastructure organizations such as the CESSDA archives, but also in community-related organizations. Regarding survey data, LGBTIQ+ advocacy groups and queer or feminist research institutes are important actors in this landscape. However, these data tend not to be available for reuse by researchers who are not affiliated with these organizations.
9. European Union Agency for Fundamental Rights (FRA). 2023. European Union Lesbian, Gay, Bisexual and Transgender Survey, 2012: Special Licence Access. [data collection]. UK Data Service. SN: 7956, DOI: http://doi.org/10.5255/UKDA-SN-7956-1.
10. European Union Agency for Fundamental Rights (FRA), Vienna, Austria. 2021. The EU LGBTI II Survey, 2019. GESIS Data Archive, Cologne. ZA7604 Data file Version 1.1.0, https://doi.org/10.4232/1.13733.
11. CESSDA. 2022. CESSDA Controlled Vocabulary for CESSDA Topic Classification. https://vocabularies.cessda.eu/vocabulary/TopicClassification.