Manoel, I greatly appreciate your feedback, since it makes me realize that I should write more about our Science eLetter and its implications. Once I find time, I'll put together a proper overall response at UncommonGood.substack.com. Meanwhile, let me quickly respond to the "three key arguments against Bagchi et al. (2024)" that you've taken from the response eLetter by Guess et al.
First, I don't think the paper by Guess et al. is internally valid. Consider the following. If a paper computes a causal estimate based on an experiment, but the control condition is meaningfully changed during the experiment specifically to affect the target causal estimand, and the paper neither reveals anything about that specific change nor accounts for it, then the change could produce any desired value of the causal estimate, with no way for readers to know how it happened. In other words, a causal claim must define exactly what the control and treatment conditions are. If it doesn't, then the causal claim may be invalid whenever the description misses something meaningful that's related to the estimand. If no change was described, then the default assumption should be that there was no meaningful change during the experiment. However, during the experiment of Guess et al. a meaningful change was introduced...
Second, you write that:
> Guess et al. (2023) data, they found little change in the number of unreliable sources pre- vs. post-study period. In other words, in their control group (where the recommender algorithm is enabled), they don’t observe this drop in the fraction of untrustworthy content.
OK, so let's see what exactly Guess et al. write in their response eLetter. I quote:
> Over the 90 days prior to the treatment, untrustworthy sources represented 2.9% of all content seen by participants in the Algorithmic Feed (control) group – during the study period, this dropped only modestly to 2.6%.
So, according to their own measure, there was a drop in the fraction of misinformation from 2.9% to 2.6%, which is roughly a 10% relative drop (0.3/2.9 ≈ 10.3%), whereas we reported a 24% drop. Note, however, that only about half of their treatment period overlaps with the period of Facebook's emergency interventions. If it had overlapped entirely, and if the drop scales roughly with that overlap, then instead of a ~10% drop we would probably observe a ~21% drop. That starts to be quite close to the 24% drop we measured using a different dataset and a different notion of misinformation.
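To make the back-of-the-envelope arithmetic above easy to retrace, here is a minimal Python sketch. The 50% overlap figure and the assumption that the drop scales linearly with the overlap are simplifications for illustration, not measured quantities from either paper.

```python
# Back-of-the-envelope sketch; the 50% overlap and the linear scaling with
# overlap are simplifying assumptions, not measured values.

pre, post = 2.9, 2.6                    # % of control-group feed from untrustworthy sources
relative_drop = (pre - post) / pre      # ~0.103, i.e. roughly a 10% relative drop

overlap = 0.5                           # assumed: ~half the study period overlaps the emergency measures
full_overlap_drop = relative_drop / overlap   # ~0.21 if the whole period had overlapped

print(f"observed relative drop: {relative_drop:.1%}")              # 10.3%
print(f"extrapolated full-overlap drop: {full_overlap_drop:.1%}")  # 20.7%
# For comparison, Bagchi et al. (2024) report a ~24% drop on a different dataset.
```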
Third, you're right that the evidence used by Bagchi et al. (2024) is not causal. However, in our eLetter we haven't made any causal statements. Instead, we're pointing out that Guess et al. made causal statements without properly describing the control condition of their experiment. That said, we also provided a potential explanation for the drop in the fraction of misinformation in users' news feeds. This explanation aligns with the reasons why the emergency measures were introduced. These reasons were given both officially by Facebook representatives [1] and unofficially by Facebook employees and a whistleblower, Frances Haugen [2, 3].
[1] https://www.nytimes.com/2020/12/16/technology/facebook-reverses-postelection-algorithm-changes-that-boosted-news-from-authoritative-sources.html
[2] https://www.wsj.com/articles/the-facebook-files-11631713039
[3] https://www.washingtonpost.com/documents/5bfed332-d350-47c0-8562-0137a4435c68.pdf
Thanks for the reply; I attached it to the end of the post!
I disagree with point #1: the control group was "Facebook as it was during the election," and that's fine. It is like saying that an experiment studying mobility in a city is invalid because there were changes due to Christmas; more likely than not, Facebook will always have changes for US elections...
I am not super convinced by point #2 either way. You make a good point about the "entire treatment period". But still, this is such a convoluted thing because there are exogenous shocks to the demand for and supply of news here.
I agree with you on point #3. You folks didn't make any causal claims in the letter, but note that "Our results show that social media companies can mitigate the spread of misinformation by modifying their algorithms but may not have financial incentives to do so" is a causal statement, which is what bothered me.
Thanks for your replies. I appreciate the discussion.
It helps to consider point 1 outside of this specific context, in isolation from the study by Guess et al., as a thought experiment. If a paper computes a causal estimate based on an experiment, but the control condition is meaningfully changed during the experiment specifically to affect the target causal estimand, and the paper neither reveals anything about that specific change nor accounts for it, then the change could produce any desired value of the causal estimate, with no way for readers to know how it happened. In other words, a causal claim must define exactly what the control and treatment conditions are. If it doesn't, then the causal claim may be invalid whenever the description misses something meaningful that's related to the estimand. Does this make sense?
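To make the thought experiment concrete, here is a minimal toy simulation of my own (all numbers are invented; it is not a model of the actual Feed experiment). An unreported adjustment applied to the control condition halfway through the study shifts the estimated treatment-minus-control difference, so by choosing that adjustment one could obtain essentially any estimate, even though the treatment itself never changes.

```python
import random

random.seed(0)

def estimate_effect(control_adjustment, n=5000):
    """Toy two-arm experiment; the outcome is the share of 'untrustworthy' items a
    user sees (all numbers invented). `control_adjustment` models an unreported
    change applied to the control feed for the second half of the study."""
    true_effect = -0.010  # what the treatment would do on its own
    treatment = [random.gauss(0.030, 0.005) + true_effect for _ in range(n)]
    # Control users see the usual feed for half the study and the (unreported)
    # adjusted feed for the other half, so the adjustment enters with weight 0.5.
    control = [random.gauss(0.030, 0.005) + 0.5 * control_adjustment for _ in range(n)]
    return sum(treatment) / n - sum(control) / n

for adj in (0.000, -0.010, -0.030):
    print(f"unreported control adjustment {adj:+.3f} -> estimated 'effect' {estimate_effect(adj):+.4f}")
# With no adjustment the estimate recovers the true effect (~ -0.010); as the
# unreported adjustment grows, the estimate shrinks and eventually flips sign,
# even though the treatment condition itself never changed.
```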
Regarding point 2, thanks for recognizing the point about the similar drop estimates after correcting for the difference in time periods, whether we estimate the drop from the Guess et al. data or from the Facebook URLs data. Regarding your second sentence, I agree that the situation is convoluted, particularly because of point 1, which wasn't addressed by the original study, but do we want to try to understand it to the best of our ability, or not? I think it's important that we consider each component in isolation and then put them together. We will be looking more closely into different datasets to understand how the supply of misinformation was changing back then.
Regarding point 3 and the sentence you quote, I guess you're suggesting changing the word "show" to "suggest". I'd be OK with that, but note that our statement is already relatively weak, since it contains the modal verb "can". That said, don't you think that we now have more evidence for this statement than against it, including reports from Facebook employees? If you disagree, then how would you explain this drop? The issue is, we don't have any alternative hypothesis with an equivalent level of support. The only alternative that Guess et al., and you, provide is the generic statement about changes in supply and demand, but that statement doesn't clarify the mechanism that could produce the drop, not to mention that it lacks backing in external reports from Facebook employees.
Thank you, Manoel, for your interest in our Science eLetter and for your comments. I appreciate them greatly. I attach my responses to your Substack post here, since these exchanges may lead to a broader discussion about our eLetter and the original paper by Guess et al.
First, I'd like to clarify that the "debunk" framing originates from the University of Massachusetts Amherst's press release, not from Dublin, and it doesn't originate from me, but it does appear in both releases. It may sound like an exaggeration, since Science hasn't issued a correction. For this reason, I've already requested that it be removed from the title of University College Dublin's press release.
Second, I also don't believe that the entire paper by Guess et al. is debunked, but it did miss crucial information. Without revealing *any* information about the 63 break-glass measures, the paper could arrive at any desired conclusion, since the result depends on the unrevealed emergency interventions, and nobody would know what the conclusion really means.
Third, yes we've read the "fairly precise description of algorithmic changes enacted by Facebook". However, when I talked with co-authors from Meta, they said that these dates are based on unofficial leaks and may be incorrect. That's why we are careful about the wording in our Science eLetter.
Finally, you write "This is false!". Would you mind clarifying what exactly is false in that statement “Our results show that social media companies can mitigate the spread of misinformation by modifying their algorithms but may not have financial incentives to do so.”?
Ciao Przemyslaw,
I'm glad re: the "debunking" framing, and I totally expected it not to come from you... This is clearly a journalist's thing :). I also agree with you that the paper should have mentioned this; the letter does a great service in that sense. I also thank you for the clarification re: the exact dates; this is very helpful, and I should update the blog post.
I am totally on board with you that the original paper should have mentioned these break-glass measures. But I believe this is an incremental threat to validity: it is not like we could conclude that algorithms ALWAYS do or do not induce polarization from this study alone. The biggest thing we can learn from the study is that Facebook's algorithm, as it was during the election, did not increase polarization.
Regarding the false statement: your results do not show whether social media companies can mitigate the spread of misinformation. First, I don't know where Guess et al. (2023) talk about misinformation (it is not one of their outcomes). Second, your analysis simply shows that links from untrustworthy domains dropped after Joe Biden got elected: whether this was caused by Facebook's changes or by changes in the news ecosystem is inconclusive!
Thank you so much for engaging!
Thanks for your prompt response, Manoel. To answer your last paragraph:
1. Guess et al. use the term "proportion of feed from untrustworthy sources". That's the outcome I mean when writing "misinformation".
2. Facebook introduced the emergency measures in response to election fraud misinformation spreading on its platform, so the fraction of content from untrustworthy sources started to increase, rather than decrease, after election day. In the eLetter, we write that "there may have been a surge in election-related misinformation (Vosoughi, Roy, and Aral 2018), which might increase our estimated drop".
I hate to be nitpicky regarding your second point because it is such an asymmetrical situation: they have access to data that you do not (and even the dataset you have is hard to get). But I just don't think it is possible to make this claim given this data. I don't mean it is false, as in this could not have happened, but it is unwise to conclude it from the time series.
1. The figure shared by the other letter suggests that the supply of news (in general) decreased after the election (until January 6th); see [1].
2. I don't think we have much of a clue about how the information ecosystem changed. Parler grew enormously around this same time (see [2]). Does this decrease dodgy activity on Facebook? Is this related at all to their changes? How does the scenario look when considering content other than URLs (their letter says it is not very different)?
Notably, we know some of these points just because of your letter, so we have you to thank for it!
[1] https://figshare.com/articles/figure/Figure_1_from_Response_to_Social_media_algorithms_can_curb_misinformation_but_do_they_/27058459?file=49279903
[2] https://academic.oup.com/view-large/figure/412131320/pgad035f1.tif
Thanks for recognizing that we've brought new relevant information to light. Note that so far we haven't claimed, nor have we provided any evidence, that "the fraction of content from untrustworthy (sources) started to increase", so "this claim" and the "unwise conclusion" seem to target something we didn't actually claim. Above, I wrote that sentence as an extension of the explanation of why Facebook introduced the emergency interventions back then, but of course we don't have the data that Facebook has, as you're correctly pointing out.
1. Note that my statement wasn't about the "supply of news", but rather about the fraction of misinformation under a hypothetical situation in which the emergency interventions weren't introduced, so I don't see any inconsistency here.
2. I imagine it's related. These are all great questions, definitely worth further research!