Birdwatch, Twitter’s recently introduced community-driven approach to providing context to potentially misleading tweets, is the most ambitious and transparent effort by a social media company to combat misinformation on its platform. The system is under active development, and Birdwatch recently introduced a new method for rating notes and contributors using a PageRank-style algorithm. This note provides an external perspective on that feature, highlights some of the remaining risks of such a system, and offers empirical and technical tools for monitoring and addressing those risks. All results are preliminary and subject to change.
Birdwatch has done an unparalleled job in building out its product with openness and transparency and cultivating trust. Despite this, it still faces some… hostility?
— Ößlïgå†ê Çårñïvðrê 🥩 Ret’d CAF ((???)) June 25, 2021
When this hesitancy is widespread, no matter how shallow, it only takes a few errors to really erode whatever trust has been built up. That fragility could be viewed as an opportunity by individuals or groups of people who want to undermine such efforts. If actions could be taken to ‘break’ Birdwatch (i.e. rate otherwise helpful notes as not helpful and demonstrably false notes as helpful), groups could simultaneously spread misinformation and diminish trust in the institutions designed to combat it.
Importantly, we have precedent for similar coordinated actions on other platforms in other contexts.
“Cyberstriking” is a common tool used by politically motivated groups to trip the reporting systems of social media sites to get political opponents removed or temporarily banned. Identity Evropa, a white supremacist group now operating as the American Identity Movement, had a dedicated channel on their (leaked) Discord server to coordinate such actions with varying degrees of success:
The challenge with this type of coordination is that coordinated actors often exhibit little spatial clustering and rarely share a public social graph (i.e. people within the group often do not follow each other on Twitter or Facebook, to maintain separation between their public profiles and their group affiliations). This makes it difficult to use standard community detection tools to identify these groups.
Birdwatch is actively addressing the challenge of coordinated adversarial actors and currently has many safeguards in place to prevent coordinated action, many of which I am sure are not in the public eye. Access to the pilot is limited to accounts that have verified contact information, use two-factor authentication, and have not recently been found to violate Twitter’s terms and conditions, among other requirements. Birdwatch is also intentionally recruiting a diverse contributor base to get broader and more varied perspectives.
Most notably for the issue discussed here, Birdwatch limits the degree to which any single rater can improve the helpfulness score of another contributor:
Each rater’s ratings of any particular author are only counted once in this weighted average, using the average helpfulness rating from that particular rater of the particular author. For example: if a rater rated 10 notes from the same author, those will count as 1 author-level rating instead of 10 ratings. - Birdwatch
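In data-frame terms, that rule amounts to collapsing all of a rater’s ratings of a given author into a single averaged, author-level rating. Here is a minimal sketch with invented column names (the simulation below implements the same idea inside `calculate_author_helpfulness()`):

```r
library(dplyr)

# Hypothetical ratings: rater "r1" rates notes from author "a1" ten times
toy_ratings <- tibble(
  author_id    = rep("a1", 10),
  rater_id     = rep("r1", 10),
  rate_helpful = c(1, 1, 1, 0, 1, 1, 1, 1, 0, 1)
)

# Collapse to one author-level rating per (author, rater) pair,
# so ten ratings count as a single averaged rating rather than ten votes
toy_ratings %>%
  group_by(author_id, rater_id) %>%
  summarise(author_level_rating = mean(rate_helpful), .groups = "drop")
# -> a single row: a1, r1, 0.8
```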
However, as the program scales and more users are permitted to join the platform, the issue of coordinated abuse may still come to the fore. The goal of this analysis is to ask whether small, coordinated groups could overcome these safeguards to manipulate the platform or otherwise undermine public trust in Birdwatch as a whole.
I have done my best to match Birdwatch’s methodology for calculating author and note scores as precisely as possible; please shoot me an email if you spot any mistakes. All errors are my own.
The new PageRank-based system recently introduced by Birdwatch uses ratings of submitted notes to generate contributor helpfulness scores. If other Birdwatch contributors rate your notes as helpful, your score goes up. If contributors who themselves have high helpfulness scores rate your notes as helpful, your score goes up even more.
PageRank was, of course, phenomenally successful and instrumental in Google’s rise to prominence as the default search engine for the World Wide Web. It was not, however, without vulnerabilities to abuse. In the early days of the blogosphere, link farms allowed websites to artificially inflate their PageRank by creating many new websites and linking them to one another, letting ill-intentioned people with access to a hosting server boost content-poor, ad-rich websites to the top of search results and ruining the internet for the rest of us.
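To make the mechanics concrete, here is a toy sketch (separate from the Birdwatch simulation below, with invented page names) of how a ring of mutually linking farm pages inflates a target page’s PageRank; it assumes the igraph package:

```r
library(igraph)

# A small 'honest' web: a handful of pages linking among themselves, plus one target page
honest_edges <- c("A","B", "B","C", "C","A", "C","D", "D","E", "E","A", "A","target")

# A link farm: ten pages arranged in a ring, each also linking to the target
farm_pages <- paste0("farm", 1:10)
farm_edges <- c(
  rbind(farm_pages, c(farm_pages[-1], farm_pages[1])),  # ring among the farm pages
  rbind(farm_pages, "target")                           # every farm page links to the target
)

g_honest <- graph_from_edgelist(matrix(honest_edges, ncol = 2, byrow = TRUE))
g_farmed <- graph_from_edgelist(matrix(c(honest_edges, farm_edges), ncol = 2, byrow = TRUE))

# The target's PageRank jumps once the farm links are added
page_rank(g_honest)$vector["target"]
page_rank(g_farmed)$vector["target"]
```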
Birdwatch notes are contributed anonymously, meaning that contributors and Twitter users do not see the account names of those who write notes. This anonymity, combined with the account verification requirements, makes it much more difficult for disconnected users to batch-produce ratings and preferentially support members of their own in-group (as a link farm might). Nevertheless, these measures may not fully mitigate coordination now or in the future.
To illustrate how it might be feasible, in principle, to game such a system, we can look at a (charmingly?) simple agent-based model.
Agent-based modeling can be really helpful for understanding emergent behavior in complex systems. By design, Birdwatch (transparently) complicates its community rating system to make it exceedingly difficult for a stand-alone user to game the system, which also makes it quite hard to model from a pure game-theoretic perspective. By recreating the system, modeling user behavior at the contributor level, and observing outcomes of interest, we can take a brute-force approach to understanding the expected efficacy of different strategies and, therefore, potential vulnerabilities.
I try to replicate the data generating process of Birdwatch in as simple a framework as possible using only R and the tidyverse. The flow of the simulation is straightforward:

1. Generate a universe of posts split across topics, a fraction of which are blatant lies.
2. Generate a population of contributors: honest birders and coordinated twitchers.
3. Let contributors flag posts, producing a notes dataset.
4. Let contributors rate one another’s notes.
5. Calculate author, rater, and combined helpfulness scores following Birdwatch’s methodology.
6. Use those scores to produce final note ratings.
Once we have this system established, I’ll define a set of strategies that coordinated users could employ to try to game the system and perform a grid search over that parameter space to identify potential vulnerabilities in how ratings are calculated. The benefit of codifying the system is that we can quickly and easily add additional strategy parameters at a later date or update note rating protocols as the system evolves.
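As a sketch of what that grid could look like (the parameter names mirror the simulation functions defined below; `run_simulation()` is a hypothetical wrapper, not defined in this post, that would chain the steps together and return the outcomes of interest):

```r
library(tidyr)
library(purrr)

# Candidate strategy parameters for the coordinated group
strategy_grid <- expand_grid(
  rho            = c(.01, .02, .05),  # share of contributors who are twitchers
  param_gamma    = c(.1, .5, 1),      # how focused twitchers are on the target topic
  multiplier     = c(1, 2, 5),        # twitcher activity relative to birders
  twitcher_speed = c(1, 5, 10)        # weight on twitcher ratings being 'valid ratings'
)

# Hypothetical wrapper: run the full simulation once per parameter combination
# results <- strategy_grid %>%
#   mutate(outcome = pmap(list(rho, param_gamma, multiplier, twitcher_speed),
#                         run_simulation))
```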
We’ll first create a universe of 1,000 posts that are split into five topics. All of these topics are super cool but purely illustrative. In a real-world scenario, we should think of these topics as highly specific claims (e.g. vaccines make you magnetic). One of these illustrative topics (“Politics”) is the target topic of any potential bad actors, who will try to undermine the ratings system with regard to that topic. Within this universe, each individual post has a 10% probability of being a pure fabrication, which we code as a `blatant_lie`. Clearly, this is a dramatic over-simplification of what Birdwatch is trying to accomplish, but it provides a useful and clear starting point for analysis.
👇 Code to generate posts.
# Setting seed for replicability ----
set.seed(1355)

# Attach the tidyverse for the pipe and data manipulation verbs used throughout
library(tidyverse)

#' @name create_posts
#' @description This function creates the universe of posts
#'   split into five topics. Each post has a ten-percent
#'   chance of being a blatant_lie.
#' @param n_posts The number of posts to create.
create_posts <- function(n_posts) {
  fabricatr::fabricate(
    ID_label = "post_id",
    N = n_posts,
    topic = fabricatr::draw_categorical(
      N = N,
      prob = c(.2, .05, .3, .2, .25),
      category_labels =
        c("Formula One",
          "Coffee",
          "Data Science",
          "Gardening",
          "Politics")
    ),
    blatant_lie = fabricatr::draw_binary(N = N, p = .1)
  )
}

posts <- create_posts(n_posts = 1000)
post_id | topic | blatant_lie |
---|---|---|
0132 | Data Science | 1 |
0844 | Politics | 1 |
0581 | Coffee | 0 |
0506 | Coffee | 0 |
0072 | Politics | 0 |
0188 | Data Science | 0 |
We’ll now create a population of 1,000 contributors. Most of them will be birders and a small proportion \(\rho\) will be twitchers. Birders follow the rules and faithfully try to report misleading information. Twitchers, on the other hand, try to seek out posts related to a specific topic and report misleading information as truthful and vice versa.
👇 Code to generate contributors.
#' @name create_contributors
#' @description This function creates a population of contributors
#'   who are of two types: birders (good) and twitchers (bad).
#' @param n_contributors The size of the population of contributors.
#' @param rho The probability an account is a twitcher.
create_contributors <- function(n_contributors, rho) {
  fabricatr::fabricate(
    ID_label = "contributor_id",
    N = n_contributors,
    type = fabricatr::draw_categorical(
      N = N,
      prob = c(1 - rho, rho),
      category_labels = c("birder", "twitcher")
    )
  )
}

contributors_data <- create_contributors(n_contributors = 1000, rho = .02)
With the current random seed, our population of contributors has 977 birders and 23 twitchers. This is a really small number of inauthentic accounts, but we will see if they can have an outsized effect on individual cases (i.e. can they flag a true post as potentially misleading?). To do this, we need to model actor search and rating behavior. A birder acts in the intended way, randomly sorting through posts and identifying (with some error) posts that they think are misleading. If the birder rates a post as misleading, it is flagged and added to a notes database.
The twitcher acts in much the same way, with the exception of a topic of interest. Here, we’ll say that the twitchers are interested in the “Politics” topic. These contributors will go out of their way to identify posts in this topic and only flag them if they are true. So, blatant lies will not be added to the notes database, but truthful posts will be flagged as potentially misleading and added as notes. How frequently the twitchers consider the targeted topic is encoded in the parameter \(\gamma\), which we can think of as capturing how discreet the twitchers are: a \(\gamma\) of 1 means they only target the Politics topic (not discreet at all), while 0 means they never look at posts in the Politics topic. Twitchers can also be more or less active than birders when considering posts to tag. This is encoded in the `multiplier` parameter of the `create_notes_dataset()` function.
We also need to provide a means for twitchers to coordinate with one another. I add a new variable named `whistle` for this purpose. When a twitcher adds a note to a post, they also set this variable to 1. We can think of this signalling as occurring either through a content-based code word that twitchers know and birders do not (see 💬) OR through some other off-platform means of communication (e.g. a Discord server, as in the case of “Cyberstriking”). With a signal in place, twitchers can identify one another and rate each other’s notes as helpful when they come across them.
👇 Code to generate a simulated notes dataset.
#' @name create_notes_dataset
#' @description creates the data frame of notes (posts that are flagged).
#' @param attention_span How many posts the contributor considers in total.
#' @param param_gamma The degree to which twitchers focus on the target topic.
#' @param contributors_data_frame The data frame produced by `create_contributors`.
#' @param posts_data The data frame created by `create_posts`.
#' @param multiplier Scalar to change twitcher attention relative to birders.
create_notes_dataset <- function(
  contributors_data_frame = contributors_data,
  posts_data = posts,
  param_gamma = .1,
  attention_span = 10,
  multiplier = 1
) {
  # randomly sample sum of attention spans with replacement and randomly assign
  # attention_span posts to each contributor.
  n_birders <- contributors_data_frame %>%
    filter(type == "birder") %>%
    count() %>%
    pull()
  n_twitchers <- contributors_data_frame %>%
    filter(type == "twitcher") %>%
    count() %>%
    pull()

  # Get birder notes ----
  birder_notes <- posts_data %>%
    # get full list
    slice_sample(
      n = (attention_span * n_birders),
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(contributor_id =
             contributors_data_frame %>%
             filter(type == "birder") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = attention_span)
             ) %>%
             pull()
    ) %>%
    # decide whether to flag or not
    mutate(
      error = rbinom((attention_span * n_birders), 1, .05),
      flag = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1 - blatant_lie)
      ),
      whistle = 0
    ) %>%
    filter(flag == 1)

  # Get twitcher notes ----
  non_target_notes <- posts_data %>%
    filter(topic != "Politics") %>%
    slice_sample(
      n = round(multiplier * attention_span * (1 - param_gamma)) * n_twitchers,
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(contributor_id =
             contributors_data_frame %>%
             filter(type == "twitcher") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = round(multiplier * attention_span * (1 - param_gamma)))
             ) %>%
             pull()
    ) %>%
    # decide whether to flag or not
    mutate(
      error = rbinom(round(multiplier * attention_span * (1 - param_gamma)) * n_twitchers, 1, .05),
      flag = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1 - blatant_lie)
      ),
      whistle = 1
    ) %>%
    filter(flag == 1)

  target_notes <- posts_data %>%
    filter(topic == "Politics") %>%
    slice_sample(
      n = round(multiplier * attention_span * param_gamma) * n_twitchers,
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(contributor_id =
             contributors_data_frame %>%
             filter(type == "twitcher") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = round(multiplier * attention_span * param_gamma))
             ) %>%
             pull()
    ) %>%
    # decide whether to flag or not
    mutate(
      flag = if_else(blatant_lie == 0, 1, 0),
      whistle = 1
    ) %>%
    filter(flag == 1)

  notes <- bind_rows(
    birder_notes, non_target_notes
  ) %>%
    bind_rows(target_notes)
  return(notes)
}

notes <- create_notes_dataset(
  contributors_data_frame = contributors_data,
  posts_data = posts,
  param_gamma = .1,
  attention_span = 10,
  multiplier = 1
)
We can look at the distribution of true (blatant) lies across topics. If everything is working as expected, notes should mostly flag lies across topics (though some true posts are reported due to honest errors), and many more “true” posts should be flagged by the twitchers, leading to an overall lower proportion of lies among notes in the “Politics” topic. Looking at the plot below, we see that this is the case.
Figure 1: Distribution of blatant lies that are entered into the notes database by birders (🔴s) and twitchers (🔺s).
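For reference, the quantities behind Figure 1 can be tabulated directly from the simulated objects (a sketch; the plotting code itself is omitted here):

```r
# Proportion of flagged posts that are actually blatant lies,
# by topic and by the type of contributor who wrote the note
notes %>%
  left_join(contributors_data, by = "contributor_id") %>%
  group_by(topic, type) %>%
  summarise(
    n_notes         = n(),
    share_true_lies = mean(blatant_lie),
    .groups = "drop"
  )
```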
Now that we have this set of notes, we can allow contributors to rate one another’s notes. Again, this requires explicitly modeling contributor behavior. The birders will act like the good citizens they are and will give positive ratings to correctly flagged posts (those that are blatant lies). Twitchers, on the other hand, will actively search out notes with a `whistle` and will always rate them as helpful. Doing so, under certain conditions, could allow them to artificially increase author and rater scores among members of their cabal.
For the purposes of constructing helpfulness scores, we now have contributors rate the helpfulness of a sample of notes and then use these values to build author helpfulness scores, rater helpfulness scores, and combined helpfulness scores as faithfully as possible to the methods described by Birdwatch.
👇 Code to create ratings dataset.
#' @name create_ratings_dataset
#' @description This function creates a set of ratings. Each contributor
#'   looks at a set of notes and rates each one based on their type.
#' @param attention_span The number of posts each contributor considers.
#' @param contributors_data_frame The data frame produced by `create_contributors`.
#' @param notes_data The data frame produced by `create_notes_dataset`.
#' @param multiplier Scalar to change twitcher attention relative to birders.
create_ratings_dataset <- function(
  contributors_data_frame = contributors_data,
  notes_data = notes,
  attention_span = 30,
  multiplier = 1
) {
  n_birders <- contributors_data_frame %>%
    filter(type == "birder") %>%
    count() %>%
    pull()
  n_twitchers <- contributors_data_frame %>%
    filter(type == "twitcher") %>%
    count() %>%
    pull()

  # Get birder ratings ----
  birder_ratings <- notes_data %>%
    # get full list
    slice_sample(
      n = (attention_span * n_birders),
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(rater_id =
             contributors_data_frame %>%
             filter(type == "birder") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = attention_span)
             ) %>%
             pull()
    ) %>%
    # decide whether to rate as helpful or not
    mutate(
      error = rbinom((attention_span * n_birders), 1, .05),
      rate_helpful = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1 - blatant_lie)
      )
    ) %>%
    select(-error, -flag, -whistle)

  # Get twitcher ratings ----
  twitcher_ratings <- notes_data %>%
    filter(whistle == 1) %>%
    slice_sample(
      n = (multiplier * attention_span * n_twitchers),
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(rater_id =
             contributors_data_frame %>%
             filter(type == "twitcher") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = multiplier * attention_span)
             ) %>%
             pull()
    ) %>%
    # rate every whistled note as helpful
    mutate(
      rate_helpful = 1
    ) %>%
    select(-error, -flag, -whistle)

  ratings <- bind_rows(birder_ratings, twitcher_ratings)
  return(ratings)
}

ratings <- create_ratings_dataset(
  contributors_data_frame = contributors_data,
  notes_data = notes,
  attention_span = 20,
  multiplier = 1
)
Each contributor starts out with a default helpfulness score of 1 following the procedure outlined in Birdwatch’s ranking methodology notes. We then iterate over the following calculation until these scores are stable. Note that only contributors with at least one note that has been rated will receive an author helpfulness score. In this simulation, that means there are only 773 authors out of the original set of 1000 contributors that receive a score.
\[a_i(u) = \max\left(0,\; \frac{3}{2} \times \frac{2 + \sum_{\text{rater}\in R(u)} a_{i-1}(\text{rater})\times \text{Rating}(\text{rater}, u)}{6 + \sum_{\text{rater}\in R(u)} a_{i-1}(\text{rater})} - \frac{1}{2}\right) \]
👇 Code to calculate author helpfulness ratings.
#' @name calculate_author_helpfulness
#' @description calculates author helpfulness scores as
#'   outlined by Birdwatch methodology.
#' @param contributors_data_frame data frame made by `create_contributors`
#' @param ratings_data_frame data frame made by `create_ratings_dataset`
#' @param iterations number of iterations to calculate author scores.
calculate_author_helpfulness <- function(
  contributors_data_frame = contributors_data,
  ratings_data_frame = ratings,
  iterations = 10
) {
  # Initialize author_rating data frame
  author_ratings <- contributors_data_frame %>%
    mutate(author_helpfulness_score = 1,
           iteration = 0) %>%
    select(-type) %>%
    filter(
      contributor_id %in% unique(ratings_data_frame$contributor_id)
    )

  # Iterate until convergence
  for (i in 1:iterations) {
    # Get new author ratings
    new_author_ratings <- ratings_data_frame %>%
      # get average helpfulness rating by id pair
      group_by(contributor_id, rater_id) %>%
      summarise(rate_helpful = mean(rate_helpful)) %>%
      # get current author ratings
      left_join(., author_ratings %>%
                  filter(
                    iteration == max(author_ratings$iteration)
                  ) %>%
                  rename(rater_id = contributor_id),
                by = "rater_id"
      ) %>%
      na.omit() %>%
      mutate(
        numerator_sum_item = author_helpfulness_score * rate_helpful
      ) %>%
      summarise(
        numerator_sum = sum(numerator_sum_item),
        denominator_sum = sum(author_helpfulness_score)
      ) %>%
      mutate(
        author_helpfulness_score =
          (3 / 2) * (2 + numerator_sum) / (6 + denominator_sum) - (1 / 2)
      ) %>%
      mutate(
        author_helpfulness_score =
          if_else(
            author_helpfulness_score < 0,
            0,
            author_helpfulness_score
          )
      ) %>%
      mutate(iteration = i) %>%
      select(contributor_id, author_helpfulness_score, iteration)
    # Append them to author ratings dataset
    author_ratings <-
      bind_rows(author_ratings, new_author_ratings)
  }

  # get only the most recent iteration
  # --------------------------------------------
  # NOTE: This is commented out *only* for the purposes of this document,
  # see the full source code for deets.
  # --------------------------------------------
  # author_ratings <- author_ratings %>%
  #   filter(iteration == 10) %>%
  #   select(-iteration) %>%
  #   right_join(contributors_data_frame, by = "contributor_id") %>%
  #   mutate(author_helpfulness_score = replace_na(author_helpfulness_score, 0))
  # --------------------------------------------
  return(author_ratings)
}

author_ratings <- calculate_author_helpfulness(
  iterations = 10,
  contributors_data_frame = contributors_data,
  ratings_data_frame = ratings
)

author_helpfulness_scores <- author_ratings %>%
  filter(iteration == 10) %>%
  select(-iteration) %>%
  right_join(contributors_data, by = "contributor_id") %>%
  mutate(author_helpfulness_score = replace_na(author_helpfulness_score, 0))
Looking across a random sample of twitchers and birders, we can see that scores do indeed converge over ten iterations.
Figure 2: Convergence in author helpfulness scores across 10 iterations based on a random sample of 5 birders and 5 twitchers.
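A figure along these lines can be rebuilt from the `author_ratings` object; a sketch (assuming ggplot2 is attached via the tidyverse, and sampling five contributors of each type as in the caption):

```r
# Pick five birders and five twitchers who have author scores
sampled_ids <- contributors_data %>%
  filter(contributor_id %in% unique(author_ratings$contributor_id)) %>%
  group_by(type) %>%
  slice_sample(n = 5) %>%
  pull(contributor_id)

# Trace their author helpfulness scores across iterations
author_ratings %>%
  filter(contributor_id %in% sampled_ids) %>%
  left_join(contributors_data, by = "contributor_id") %>%
  ggplot(aes(x = iteration, y = author_helpfulness_score,
             group = contributor_id, colour = type)) +
  geom_line(alpha = .6) +
  labs(x = "Iteration", y = "Author helpfulness score", colour = "Contributor type")
```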
Birdwatch calculates a preliminary note score using the following calculation: \[\text{preliminary_note_score}(n) = \frac{\sum_{\text{rater}\in R(n)}a(\text{rater})\times \text{Rating}(\text{rater}, n)}{\sum_{\text{rater}\in R(n)}a(\text{rater})} \]
👇 I replicate this in R using the code outlined here.
#' @name calculate_preliminary_note_scores
#' @description calculates preliminary note scores for the purposes of
#'   constructing rater helpfulness scores.
#' @param ratings_data_frame data frame made by `create_ratings_dataset`
#' @param author_scores_data data frame made by `calculate_author_helpfulness`
calculate_preliminary_note_scores <- function(
  ratings_data_frame = ratings,
  author_scores_data = author_helpfulness_scores
) {
  prelim_note_scores <- ratings_data_frame %>%
    left_join(
      .,
      author_scores_data %>%
        rename(rater_id = contributor_id),
      by = "rater_id"
    ) %>%
    na.omit() %>%
    group_by(post_id, contributor_id) %>%
    summarise(
      numerator =
        sum(author_helpfulness_score * rate_helpful),
      denominator = sum(author_helpfulness_score)
    ) %>%
    mutate(
      preliminary_note_score = numerator / denominator
    ) %>%
    mutate(
      prelim_rating = case_when(
        preliminary_note_score >= .84 ~ "Currently Rated Helpful",
        preliminary_note_score <= .29 ~ "Currently Rated Not Helpful",
        TRUE ~ "Needs More Ratings"
      )
    ) %>%
    select(-numerator, -denominator)
  return(prelim_note_scores)
}

prelim_note_scores <- calculate_preliminary_note_scores(
  ratings_data_frame = ratings,
  author_scores_data = author_helpfulness_scores
)
At this point, it may be helpful to recall that notes are uniquely identified by the combination of `post_id` and `contributor_id`. Using these preliminary scores, we can identify the set of notes that we will use to construct the rater helpfulness scores (i.e. those that are rated as helpful or not helpful).
I now recreate the Rater Helpfulness Score. The actual construction here is slightly different from that used by Birdwatch. Whereas Birdwatch takes the first 5 ratings in a given time period, here we randomly sample 5 ratings (we don’t have a time dimension in this very rudimentary framework). The ratings are filtered to only notes that have a definitive rating using the preliminary note score.
Currently only the first 5 ratings on each note that were made within 48 hours of the note’s creation are used when evaluating a Rater Helpfulness Score (hereafter called “valid ratings”). This is done to both reward quick rating, and also so that retroactively rating old notes with clear labels doesn’t boost Rater Helpfulness Score. - Birdwatch
If there is coordination, however, this may open up additional vulnerabilities to abuse. One can imagine, for instance, that if a twitcher is forewarned that a note will be generated, they may be able to rate it before any birders get the opportunity. To model this, we can add a parameter `twitcher_speed_param`, which changes the probability that a twitcher rating is randomly selected as a valid rating.
👇 Rater helpfulness function.
#' @name calculate_rater_helpfulness
#' @description Calculates the rater helpfulness score using
#'   the ratings data set and the preliminary note scores.
#' @param ratings_data_frame data created by `create_ratings_dataset`
#' @param prelim_scores_data data created by `calculate_preliminary_note_scores`
#' @param contributors_data_frame data created by `create_contributors`
#' @param twitcher_speed_param Increases the probability that ratings from
#'   twitchers will be selected as 'valid ratings'.
calculate_rater_helpfulness <- function(
  ratings_data_frame = ratings,
  prelim_scores_data = prelim_note_scores,
  contributors_data_frame = contributors_data,
  twitcher_speed_param = twitcher_speed
) {
  # Rater Helpfulness Scores
  rater_helpfulness_scores <- ratings_data_frame %>%
    group_by(post_id, contributor_id) %>%
    # Get preliminary note scores
    left_join(.,
              prelim_scores_data %>%
                select(post_id, contributor_id, prelim_rating),
              by = c("post_id", "contributor_id")
    ) %>%
    # subset to only those with ratings
    filter(prelim_rating %in% c(
      "Currently Rated Helpful",
      "Currently Rated Not Helpful"
    )) %>%
    # subset to where there are at least 5 ratings
    mutate(count = 1) %>%
    group_by(post_id, contributor_id) %>%
    mutate(count = sum(count)) %>%
    filter(count >= 5) %>%
    select(-count) %>%
    # Weight probability by twitcher speed
    left_join(.,
              contributors_data_frame %>%
                rename(rater_id = contributor_id),
              by = "rater_id") %>%
    mutate(speed = if_else(type == "twitcher", twitcher_speed_param, 1)) %>%
    # Randomly select 5
    slice_sample(n = 5, weight_by = speed) %>%
    select(-speed, -type) %>%
    # Calculate consensus without current rating
    mutate(
      consensus = (5 / 4) * (mean(rate_helpful) - rate_helpful / 5)
    ) %>%
    mutate(
      consensus = case_when(
        consensus >= .75 ~ 1,
        consensus == .5 ~ consensus,
        consensus <= .25 ~ 0
      )
    ) %>%
    # Subset to notes with consensus
    filter(consensus %in% c(0, 1)) %>%
    group_by(rater_id) %>%
    # Calculate rater scores
    mutate(
      valid_rating = 1
    ) %>%
    summarise(
      num_valid_ratings_match = sum(rate_helpful == consensus),
      valid_ratings = sum(valid_rating)
    ) %>%
    mutate(
      rater_helpfulness_score =
        (3 / 2) * (2 + num_valid_ratings_match) / (6 + valid_ratings) - 1 / 2
    ) %>%
    mutate(
      rater_helpfulness_score = if_else(
        rater_helpfulness_score < 0,
        0,
        rater_helpfulness_score
      )
    ) %>%
    select(rater_id, rater_helpfulness_score) %>%
    rename(contributor_id = rater_id) %>%
    right_join(
      contributors_data_frame,
      by = "contributor_id"
    ) %>%
    mutate(
      rater_helpfulness_score = replace_na(rater_helpfulness_score, 0)
    )
  return(rater_helpfulness_scores)
}

rater_helpfulness_scores <- calculate_rater_helpfulness(
  ratings_data_frame = ratings,
  prelim_scores_data = prelim_note_scores,
  contributors_data_frame = contributors_data,
  twitcher_speed_param = 1
)
To get the final note scores we first need to calculate the combined helpfulness score which is simply an average of the author helpfulness score and the rater helpfulness score.
👇 Combined helpfulness score function.
#' @name calculate_combined_helpfulness_score
#' @description Averages together the author and
#'   rater helpfulness scores.
#' @param author_helpfulness_data data created by `calculate_author_helpfulness`
#' @param rater_helpfulness_data data created by `calculate_rater_helpfulness`
calculate_combined_helpfulness_score <- function(
  author_helpfulness_data = author_helpfulness_scores,
  rater_helpfulness_data = rater_helpfulness_scores
) {
  combined_helpfulness_scores <-
    left_join(
      author_helpfulness_data,
      rater_helpfulness_data,
      by = c("contributor_id", "type")
    ) %>%
    mutate(
      combined_helpfulness_score =
        ((author_helpfulness_score + rater_helpfulness_score) / 2)
    )
  return(combined_helpfulness_scores)
}

combined_helpfulness_scores <- calculate_combined_helpfulness_score(
  author_helpfulness_data = author_helpfulness_scores,
  rater_helpfulness_data = rater_helpfulness_scores
)
Finally, we are able to calculate the final note scores, where the numeric score is calculated following:
\[\text{note_score}(n) = \frac{\sum_{\text{rater}\in R(n)}c(\text{rater})\times \text{Rating}(\text{rater}, n)}{\sum_{\text{rater}\in R(n)}c(\text{rater})} \]
Where this numeric score is greater than or equal to .84, the note is classified as “Currently Rated Helpful”. When it is less than or equal to .29, it receives a “Currently Rated Not Helpful” rating. Otherwise, it’s tagged “Needs More Ratings”.
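As a quick worked example with invented numbers: a note rated helpful by a rater with combined score 0.9 and not helpful by a rater with combined score 0.2 gets a note score of \((0.9 \times 1 + 0.2 \times 0)/(0.9 + 0.2) \approx 0.82\), which lands in the “Needs More Ratings” band; a further helpful rating from a rater with combined score 0.5 lifts it to \(1.4 / 1.6 = 0.875\) and into “Currently Rated Helpful” territory.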
👇 Code to classify notes.
#' @name calculate_final_note_scores
#' @description This function uses the contributor helpfulness scores to
#'   calculate the final note score.
#' @param ratings_data data created by `create_ratings_dataset`
#' @param combined_scores_data data created by `calculate_combined_helpfulness_score`
calculate_final_note_scores <- function(
  ratings_data = ratings,
  combined_scores_data = combined_helpfulness_scores
) {
  final_note_scores <- ratings_data %>%
    left_join(
      .,
      combined_scores_data %>%
        rename(rater_id = contributor_id,
               rater_type = type),
      by = "rater_id"
    ) %>%
    group_by(post_id, contributor_id) %>%
    summarise(
      numerator = sum(combined_helpfulness_score * rate_helpful),
      denominator = sum(combined_helpfulness_score)
    ) %>%
    mutate(
      note_score = numerator / denominator
    ) %>%
    select(
      post_id, contributor_id, note_score
    ) %>%
    mutate(
      note_rating = case_when(
        note_score >= .84 ~ "Currently Rated Helpful",
        note_score <= .29 ~ "Currently Rated Not Helpful",
        TRUE ~ "Needs More Ratings"
      )
    )
  return(final_note_scores)
}

note_scores <- calculate_final_note_scores(
  ratings_data = ratings,
  combined_scores_data = combined_helpfulness_scores
)
Now that we have our scores, we can take a look and see whether the twitchers were able to accomplish anything with the strategy they pursued in this simulation. We’re primarily interested in two outcomes: 1) contributor scores, and 2) note ratings.
Let’s take a look at the contributor scores. While twitchers were able to get their author scores to be comparable to those of birders, their ratings were sufficiently different from those of birders (who on average made up the consensus of valid ratings) that their Rater Helpfulness Scores, and thus their Combined Helpfulness Scores, suffered. I’ll call that a win for Birdwatch.
Figure 3: Distribution of contributor helpfulness scores by type.
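For reference, the comparison behind Figure 3 can be summarised directly from `combined_helpfulness_scores` (a sketch; the plot itself is omitted here):

```r
# Average author, rater, and combined helpfulness scores by contributor type
combined_helpfulness_scores %>%
  group_by(type) %>%
  summarise(
    mean_author_score   = mean(author_helpfulness_score),
    mean_rater_score    = mean(rater_helpfulness_score),
    mean_combined_score = mean(combined_helpfulness_score),
    .groups = "drop"
  )
```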
But what about the notes themselves? The most important indicator, from my perspective, is whether or not the system is able to sort out fact from fiction. If everything is working properly, notes that flag `blatant_lie`s should be rated as helpful and those that flag non-misleading posts should be rated as not helpful, despite collusion by the twitchers.
In the figure below, we can see that this is the case. Under these settings and the random seed assigned above, we achieve near-perfect separation. Some notes are classified as “Needs More Ratings”, but not a single note flagging a lie is classified as “not helpful”, and no notes flagging non-misleading posts are classified as “helpful”.
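One quick way to check that separation is to cross-tabulate the final note ratings against whether the flagged post was actually a lie (a sketch using the objects defined above; recall that notes are keyed by `post_id` and `contributor_id`):

```r
# Cross-tabulate final note ratings against the truth status of the flagged post
note_scores %>%
  ungroup() %>%
  left_join(
    notes %>% distinct(post_id, contributor_id, blatant_lie),
    by = c("post_id", "contributor_id")
  ) %>%
  count(blatant_lie, note_rating)
```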