Qualitative Research After the Interviewer Bot
I have a new hot take. The first half of it is perhaps not so hot, and it would probably pass for consensus on Bluesky: AI cannot replace qualitative research. As the letter from prominent qualitative researchers argues, meaning-making is a human activity, and even if LLMs could do it, it is unclear how useful it would be. More broadly, I’d argue that qualitative research is what qualitative researchers do, and at least for now, they are not buying the AI pill. The second part is where things may get controversial: people will use Generative AI to answer questions previously reserved for qualitative research, and they will do so within the quantitative paradigm that institutions prefer. And before you throw the first stone, this is a descriptive claim, not a normative one.
But let me take a step back to define what I mean by “qualitative research,” because my use of the term is inevitably shaped by my work in HCI and other CS-y fields. The tradition I have in mind includes ethnography, participant observation, interviewing, interpretive content analysis, and grounded theory. These methods are sometimes called “interpretive,” in contrast to “positivist” ones. Their acceptance in tech-adjacent fields was (and still is) uneven and bumpy, but it happened because of their great utility for making sense of phenomena that do not yield easily to logs or math alone. And these phenomena feel perpetually relevant, because the important questions always seem to sit at the edge of what our institutions can quantify: stigma, domestic violence, emotional labor.
But the mainstream institutional acceptance of qualitative research often rests on the limits of quantitative approaches. Institutions have not truly bought the “thick-description” pill. They seek it when quantification is not yet available, credible, or cheap, but they are quick to pivot once that is no longer true. We see this pattern across the most diverse areas: from human rights monitoring to child protection, stakeholders’ preference for numbers has displaced much interpretive work in favor of a couple of metrics and decision trees. And it is worth understanding their viewpoint: they have to navigate complicated trade-offs, and that is arguably easier once you have simplified your world into a handful of indicators.
My point here is not to turn this into a discussion of the epistemic validity of different methods, but to give context on what advances in GenAI mean for qualitative research (maybe beyond my modest talents, but hey, this is a blog post). Even if it is never called “qualitative AI research,” GenAI enables techniques such as open-ended interviewing at scale and semi-automated inductive coding, which let quantitative approaches examine questions that were previously too much horse for quant research’s cowboy. Consider, for example, the recent hype around the “Anthropic Interviewer.” The system is no ethnographer in any deep sense, but it can collect reasonably good interviews cheaply and at scale. And that is probably preferable for many institutions: they get something that fills the same gap as qualitative inquiry, but in a form more amenable to scale and quantification.
One potential criticism of my point is that no questions are inherently “qualitative.” Fair enough. But in practice, institutions and firms usually behave as if the distinction exists. They distinguish between questions they see as problems of measurement and optimization, and questions they see as requiring interpretation, context, and meaning-making. A health agency, for instance, would not approach estimating vaccine effectiveness in the same way it approaches understanding why some communities resist vaccination. But what happens when that second task begins to look operationalizable at scale, say, through a Claude-backed interviewer?
What is most interesting is how hard it is to criticize this approach while staying within an interpretivist frame. There are genuine concerns about AI-conducted interviews. But the critiques most likely to persuade institutional stakeholders appeal to concepts like reliability, validity, and generalizability, the very terms that interpretivist traditions have long resisted. The threat model, then, is not merely that AI produces worse interviews; it is that AI may reorganize how institutions understand the value of qualitative inquiry itself.
As a quantitative researcher myself (who got chastised for being a “positivist” only once! I was a master’s student, give me a break), I find these new developments exciting but at the same time slightly terrifying. These technological advances may help address important questions at scale and with the kinds of guarantees that institutions care about: consistency, comparability, auditability, and speed. But they force harder questions: who gets to decide when an AI-informed account of a phenomenon is “good enough”? And whose interests are served when that threshold is set? As Messeri and Crockett remind us, “increasing productivity does not guarantee an improved understanding of the world.”

