Scaling Up Qualitative Research Methods with Natural Language Processing Tools: The Case-Study of Colombia’s 2018 Presidential Election on Twitter

Juan Luis Suárez, Erin Huner
2022

Abstract

Traditional qualitative methods can create rich and nuanced understandings of participants’ lived experiences, but this richness has often come at the expense of scale. For instance, Mason (2010) conducted an empirical analysis of the number of interviews contained in qualitative PhD theses, using data collected from theses.com. From an analysis of 560 PhD theses that Mason coded as using qualitative methods, the mean number of qualitative interviews was 31 (Mason, 2010: 13); among defined methods, theses using a grounded theory approach had a mode of 25 and a median of 32 interviews (Mason, 2010: 9). Other empirical work reports that the average number of interviews needed to reach saturation in qualitative research is 16, and that meta-themes can begin to be established reliably at around six interviews (Guest et al., 2006: 83). Are there tools that we can begin to deploy, in addition to qualitative methods, that might allow researchers to expand sample sizes while maintaining the richness and nuance of understanding that are the hallmarks of qualitative analysis? Here we argue that there are, and that Natural Language Processing is a tool that qualitative researchers need to explore and deploy when working with large-scale textual datasets, such as those produced through social media.