(This is a big rant on why research on content moderation, algorithms, and monetization strategies is hard and why we desperately need it. It is an interpolation between some of the materials I prepared for my job talk and my PhD thesis)
Online platforms like Facebook, Wikipedia, YouTube, Amazon, Uber, DoorDash, Airbnb, and Tinder have changed the world and become woven into the social fabric. It is hard to imagine our lives without them: our economies, our relationships, and how we acquire knowledge have become deeply connected to these online platforms. The United Nations Conference on Trade and Development estimated that global e-commerce sales reached almost 26 trillion dollars in 2018. Pew Research estimated that around 10% of partnered US adults had met their partner through online dating as of 2023. Wikipedia received over 7 billion monthly visits in 2023, serving users with the most diverse information needs.
Thus, it is perhaps not surprising that online platforms are also strongly connected to some of the most significant societal challenges of the 21st century. E-commerce platforms are responsible for a sizeable chunk of greenhouse gas emissions. Gig work platforms like Uber and DoorDash have ignited discussions about workers' rights and precarious employment. Radicalization and terrorism have become online-first phenomena: mainstream social media platforms like YouTube saw an influx of radical content that snowballed in popularity in the late 2010s, and fringe platforms like Gab and Parler were tightly associated with terrorist attacks and anti-democratic protests.
However, online platforms are not “immovable rocks” that we should accept as they are; they are sociotechnical systems where design choices, policies, and algorithms steer human behavior. And in the spirit of Campbell's experimenting society, we should propose and assess ways of improving online platforms, maximizing their benefits, and minimizing their harms.
The critical enterprise of online platforms is content curation. Users upload images, list products to sell, and create profiles; the platforms, in turn, curate this content and serve it to other users. They do so in a few critical ways: they decide what to recommend to users, how to monetize content, and what is allowed on the platform.
Content curation practices have captured the imagination of journalists, politicians, and the general public. For example, tech CEOs testified in Senate hearings in 2018, 2020, 2021, and 2024, where they were often asked about content curation practices like recommender systems or content moderation. In a 2018 poll, 65% of self-described US conservatives thought social media platforms were censoring conservative ideas. In that context, research informing content curation practices is actionable: it can propose concrete ways of changing online platforms and inform stakeholders. For example, deplatforming, i.e., banning individuals or collectives from online platforms, has been widely debated, as the intervention treads a thin line between preventing harm and censoring speech. How should we weigh the benefits and harms of these practices? How can we assess whether they are effective? Enter research on content curation.
Despite this promise, however, research on content curation practices has arguably failed to drive their development and adjustment. Compared to other (polarizing) topics like healthcare or labor, content curation practices in social media are disproportionately driven by anecdotal or observational evidence that describes problems, but not solutions. This is (at least partly) because researching content curation practices is damn hard:
Content curation practices are opaque and carried out by private companies at their discretion;
Researchers often lack access to data or the necessary experimentation infrastructure;
Online platforms are highly dynamic, raising concerns about the temporal validity of findings;
And, in some cases, disentangling the effect of content curation practices is methodologically challenging; see, e.g., the literature on the effects of the YouTube algorithm and the toy sketch below.
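To make this last challenge concrete, here is a minimal toy simulation (my own sketch, with made-up numbers, not drawn from any specific study) of why observational data alone cannot settle debates like the one around the YouTube algorithm. If a hypothetical recommender surfaces radical content more often to users who already prefer it, a naive comparison of exposed and unexposed users conflates the algorithm's effect with self-selection:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Latent user preference for radical content (unobserved confounder).
preference = rng.normal(size=n)

# Assumption: the recommender surfaces radical content more often
# to users with a higher latent preference for it.
p_recommend = 1 / (1 + np.exp(-2 * preference))
recommended = rng.random(n) < p_recommend

# Made-up ground truth: recommendation nudges consumption by +0.3,
# but preference drives it far more strongly (+1.0).
consumption = 1.0 * preference + 0.3 * recommended + rng.normal(size=n)

# Naive observational estimate: difference in means by exposure.
naive = consumption[recommended].mean() - consumption[~recommended].mean()

# Experimental estimate: exposure randomized independently of preference,
# the kind of experiment only the platform itself can easily run.
randomized = rng.random(n) < 0.5
consumption_rct = 1.0 * preference + 0.3 * randomized + rng.normal(size=n)
rct = consumption_rct[randomized].mean() - consumption_rct[~randomized].mean()

print(f"True effect: 0.30 | Naive: {naive:.2f} | Randomized: {rct:.2f}")
```

The naive estimate comes out several times larger than the true effect, while the randomized comparison recovers it, which is precisely why access to experimentation infrastructure (or clever quasi-experimental designs) matters so much for this line of research.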
Yet, we should not give up! Causal evidence on content curation practices is hard to obtain but can yield significant payoffs: given how widely used online platforms are, even marginal improvements to online environments are meaningful; and given the wide appetite for regulating online platforms, research can guide policy away from guesswork.