Identifying Coordinated Abuse in Community Rating Systems: Evidence from Twitter’s Birdwatch

Birdwatch, Twitter’s recently introduced community-driven approach to adding context to potentially misleading tweets, is the most ambitious and transparent effort by a social media company to combat misinformation on its platform. The tool is under active development, and Birdwatch recently introduced a new system for rating notes and contributors using a PageRank-style algorithm. This note provides an external perspective on that feature, highlights some of the remaining risks of such a system, and offers empirical and technical tools for monitoring and addressing those risks. All results are preliminary and subject to change.

Benjamin Crisman (Department of Politics, Princeton University)
https://bencrisman.com/ | https://politics.princeton.edu/
2021-07-02

Research Questions

Research Summary

  1. I recreate Birdwatch’s system for rating notes and provide a simple tool for examining the consequences of different adversarial strategies using a simulation approach inspired by agent based modeling.
  2. In principle, there are some strategies by which small but coordinated groups could overcome Birdwatch’s current safeguards and get objectively false notes rated as helpful.
  3. I highlight existing community detection tools and indicators that could help identify groups using these strategies and demonstrate their utility in simulated data.
  4. I apply these tools to actual Birdwatch data and find that there is no evidence of coordinated groups currently using the identified strategies to undermine Birdwatch ratings.
  5. All the results presented here are preliminary and serve mostly as a proof of concept. Further iterations would greatly benefit from feedback from Birdwatch participants and creators. Please DM me with comments if this is you!

Background and Context

Birdwatch has done an unparalleled job of building out its product with openness and transparency and of cultivating trust. Despite this, it still faces some… hostility?

When this hesitancy is widespread, no matter how shallow, it only takes a few errors to really erode whatever trust has been built up. That fragility could be viewed as an opportunity by individuals or groups of people who want to undermine such efforts. If actions could be taken to ‘break’ Birdwatch (i.e. rate otherwise helpful notes as not helpful and demonstrably false notes as helpful), groups could simultaneously spread misinformation and diminish trust in the institutions designed to combat it.

Importantly, we have precedent for similar coordinated actions on other platforms in other contexts.

“Cyberstriking” is a common tool used by politically motivated groups to trip the reporting systems of social media sites to get political opponents removed or temporarily banned. Identity Evropa, a white supremacist group now operating as the American Identity Movement, had a dedicated channel on their (leaked) Discord server to coordinate such actions with varying degrees of success:

The challenge with this type of coordination is that coordinated actors often exhibit limited spatial clustering and rarely share public social graphs (i.e. people within the group often do not follow each other on Twitter or Facebook, in order to maintain separation between their public profiles and group affiliations). This makes it difficult to use standard community detection tools to identify these groups.

Birdwatch is actively addressing the challenge of coordinated adversarial actors and currently has many safeguards in place to prevent coordinated action, many of which I am sure are not in the public eye. Access to the pilot is limited to accounts that have verified contacts, use two factor authentication, and have not recently been found to violate Twitter’s terms and conditions (among others). Birdwatch is also intentionally recruiting a diverse contributor base to get broader and more varied perspectives.

Most notably for the purposes of the issue discussed here is that Birdwatch limits the degree to which any single author can improve the helpfulness score of another:

Each rater’s ratings of any particular author are only counted once in this weighted average, using the average helpfulness rating from that particular rater of the particular author. For example: if a rater rated 10 notes from the same author, those will count as 1 author-level rating instead of 10 ratings. - Birdwatch

However, as the program scales and more users are permitted to join the platform, the issue of coordinated abuse may still come to the fore. The goal of this analysis is to ask whether small, coordinated groups could overcome these safeguards to manipulate the platform or otherwise undermine public trust in Birdwatch as a whole.

I have done my best to match Birdwatch’s methodology for calculating author and note scores as precisely as possible; please shoot me an email if you spot any mistakes. All errors are my own.

Q1: Is it feasible for coordinated groups to game PageRank?

The new PageRank-based system recently introduced by Birdwatch uses ratings of submitted notes to generate contributor helpfulness scores. If other Birdwatch contributors rate your notes as helpful, your score goes up. If contributors who themselves have high helpfulness scores rate your notes as helpful, your score goes up even more.

PageRank was, of course, phenomenally successful and instrumental in Google’s rise to prominence as the default search engine for the World Wide Web. However, it was not without its vulnerabilities to abuse. In the early days of the blogosphere, link farms allowed websites to artificially increase their PageRank: by creating many new websites and linking them to one another, ill-intentioned people with access to a hosting server could boost content-poor, ad-rich websites to the top of search results, ruining the internet for the rest of us.
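To see the mechanics of a link farm concretely, here is a toy sketch using the igraph package. The graph size, the farm size, and the target page are all made up for this illustration and have nothing to do with Birdwatch; the point is only that a small, densely self-linking cluster can inflate the PageRank of whatever it points at.

library(igraph)

set.seed(1355)
# An 'organic' web of 50 pages with sparse random links
organic <- sample_gnp(50, p = .05, directed = TRUE)

# A 10-page farm that links densely to itself and to a single target page (page 1)
farm <- expand.grid(from = 51:60, to = c(51:60, 1))
farm <- farm[farm$from != farm$to, ]
farmed <- add_edges(add_vertices(organic, 10), as.vector(t(as.matrix(farm))))

# The target page's PageRank jumps once the farm points at it
page_rank(organic)$vector[1]
page_rank(farmed)$vector[1]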

Birdwatch notes are anonymously contributed, meaning that contributors and Twitter users do not see the account names of those who write notes. This anonymity, combined with the account verification requirements, makes it much more difficult for disconnected users to batch produce ratings and preferentially support members of their own in-group (as a link farm might). Nevertheless, these safeguards may struggle to fully mitigate coordination now or in the future.

A simple Agent Based Model (ABM)

To illustrate how it might be feasible, in principle, to game such a system, we can look at a (charmingly?) simple agent based model.

Agent based modeling can be really helpful in understanding emergent behavior in complex systems. By design, Birdwatch (transparently) complicates its community rating system to make it exceedingly difficult for a stand-alone user to game the system. This also makes it quite hard to model from a pure game theoretic perspective. By recreating the system, modeling user behavior at the level of individual contributors, and observing outcomes of interest, we can take a brute force approach to understanding the expected efficacy of different strategies and, therefore, potential vulnerabilities.

I try to replicate the data generating process of Birdwatch in as simple a framework as possible using only R and the tidyverse. The flow of the simulation is straightforward:

  1. There exists a universe of posts, some of which are misleading.
  2. We have contributors of two types who create notes to flag posts as potentially misleading.
  3. Contributors can rate one another’s notes as helpful or not.
  4. Authors are assigned scores based on notes they have contributed and the behaviors of other players using the PageRank system described by Birdwatch.
  5. Notes are allocated scores and classifications (“Currently Rated Helpful”, “Currently Rated Not Helpful”, and “Needs More Ratings”).

Once we have this system established, I’ll define a set of strategies that coordinated users could employ to try to game the system and perform a grid search over that parameter space to identify potential vulnerabilities in how ratings are calculated. The benefit of codifying the system is that we can quickly and easily add additional strategy parameters at a later date or update note rating protocols as the system evolves.
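As a sketch of what that grid-search driver could look like, the snippet below wires the simulation functions defined in the remainder of this section into a single call and maps it over a small parameter grid. The function name run_one, the grid values, and the choice of outcome summary are my own placeholders here, not the grid actually reported.

#' Run one full simulation for a given strategy (sketch; assumes the functions
#' defined in the rest of this section are available).
run_one <- function(rho, gamma, multiplier, twitcher_speed) {
  sim_posts        <- create_posts(n_posts = 1000)
  sim_contributors <- create_contributors(n_contributors = 1000, rho = rho)
  sim_notes <- create_notes_dataset(
    contributors_data_frame = sim_contributors,
    posts_data = sim_posts,
    param_gamma = gamma,
    attention_span = 10,
    multiplier = multiplier
  )
  sim_ratings <- create_ratings_dataset(
    contributors_data_frame = sim_contributors,
    notes_data = sim_notes,
    attention_span = 20,
    multiplier = multiplier
  )
  author_scores <- calculate_author_helpfulness(
    contributors_data_frame = sim_contributors,
    ratings_data_frame = sim_ratings,
    iterations = 10
  ) %>% 
    filter(iteration == 10) %>% 
    select(-iteration) %>% 
    right_join(sim_contributors, by = "contributor_id") %>% 
    mutate(author_helpfulness_score = replace_na(author_helpfulness_score, 0))
  prelim   <- calculate_preliminary_note_scores(sim_ratings, author_scores)
  raters   <- calculate_rater_helpfulness(sim_ratings, prelim, sim_contributors, twitcher_speed)
  combined <- calculate_combined_helpfulness_score(author_scores, raters)
  final    <- calculate_final_note_scores(sim_ratings, combined)
  
  # Outcome of interest: the share of notes flagging truthful posts that
  # nonetheless end up "Currently Rated Helpful"
  final %>% 
    ungroup() %>% 
    left_join(sim_posts, by = "post_id") %>% 
    summarise(
      share_false_helpful = 
        mean(blatant_lie == 0 & note_rating == "Currently Rated Helpful")
    )
}

# Placeholder strategy grid (values chosen for illustration only)
strategy_grid <- expand_grid(
  rho            = c(.01, .02, .05),
  gamma          = c(.1, .5, .9),
  multiplier     = c(1, 2),
  twitcher_speed = c(1, 5)
)

# Running the full grid takes a while, so it is commented out here:
# results <- bind_cols(strategy_grid, pmap_dfr(strategy_grid, run_one))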

The environment

We’ll first create a universe of 1,000 posts that are split into five topics. All of these topics are super cool but are purely illustrative. In a real world scenario, we should think of these topics as being highly specific claims (e.g. vaccines make you magnetic). One of these illustrative topics (“Politics”) is the target topic of any potential bad actors who will try to undermine the ratings system with regard to that topic. Within this universe, each individual post has a 10% probability of being a pure fabrication, which we code as a blatant_lie. Clearly, this is a dramatic over-simplification of what Birdwatch is trying to accomplish, but it provides a useful and clear starting point for analysis.

👇 Code to generate posts.

# Load the tidyverse (used throughout) and set seed for replicability ----
library(tidyverse)
set.seed(1355) 

#' @name  create_posts
#' @description This function creates the universe of posts 
#'   split into five topics. Each post has a ten-percent 
#'   chance of being a blatant_lie.
#' @param n_posts The number of posts to create.
create_posts <- function(n_posts) {
  fabricatr::fabricate(
    ID_label = "post_id",
    N = n_posts,
    topic = fabricatr::draw_categorical(
      N = N,
      prob = c(.2, .05, .3, .2, .25),
      category_labels = 
        c("Formula One",
          "Coffee",
          "Data Science",
          "Gardening",
          "Politics")
    ),
    blatant_lie = fabricatr::draw_binary(N = N, p = .1)
  )
}

posts <- create_posts(n_posts = 1000)
post_id  topic          blatant_lie
0132     Data Science   1
0844     Politics       1
0581     Coffee         0
0506     Coffee         0
0072     Politics       0
0188     Data Science   0

The players and their behavior

We’ll now create a population of 1000 contributors. Most of them will be birders and a small proportion \(\rho\) will be twitchers. Birders follow the rules and faithfully try to report misleading information. Twitchers, on the other hand, try to seek out posts related to a specific topic and report misleading information as truthful and vice versa.

👇 Code to generate contributors.

#' @name  create_contributors
#' @description This function creates a population of contributors
#'   who are of two types: birders (good) and twitchers (bad).
#' @param n_contributors The size of the population of contributors.
#' @param rho The probability an account is a twitcher.
create_contributors <- function(n_contributors, rho) {
  fabricatr::fabricate(
    ID_label = "contributor_id",
    N = n_contributors,
    type = fabricatr::draw_categorical(
      N = N, 
      prob = c(1-rho, rho),
      category_labels = c("birder","twitcher"),
    )
  )
}

contributors_data <- create_contributors(n_contributors = 1000, rho = .02)

With the current random seed, our population of contributors has 977 birders and 23 twitchers. This is a really small number of inauthentic accounts, but we will see if they can have an outsized effect on individual cases (i.e. can they flag a true post as potentially misleading). To do this, we need to model actor search and rating behavior. A birder acts in the intended way: randomly sorting through posts and identifying (with some error) posts that they think are misleading. If the birder rates a post as misleading, it will be flagged and added to a notes database.

The twitcher acts in much the same way, with the exception of a topic of interest. Here, we’ll say that the twitchers are interested in the “Politics” topic. These contributors will go out of their way to identify posts in this topic and only flag them if they are true. So, blatant lies will not be added to the notes database, but truthful posts will be flagged as potentially misleading and added as notes. How frequently the twitchers consider the targeted topic is encoded in the parameter \(\gamma\), which we can think of as defining how discreet the twitchers are. A \(\gamma\) of 1 would mean they only target the Politics topic and 0 would mean they never look at posts in the Politics topic. twitchers can also be more or less active than birders when considering posts to tag. This is encoded in the multiplier parameter of the create_notes_dataset() function.

We also need to provide a means for twitchers to coordinate with one another. I add a new variable named whistle for this purpose. When a twitcher adds a note to a post, they also change this variable to 1. We can think of this signalling as occurring either through a content-based code word that twitchers know and birders do not (see 💬) OR through some other off-platform means of communication (e.g. a Discord server in the case of “Cyberstriking”). With a signal in place, twitchers can identify one another and rate one another’s notes as helpful when they come across them.

👇 Code to generate a simulated notes dataset.

#' @name create_notes_dataset
#' @description creates the data frame of notes (posts that are flagged).
#' @param attention_span How many posts the contributor considers in total.
#' @param param_gamma The degree to which twitchers focus on target topic.
#' @param contributors_data_frame The data frame produced by `create_contributors`.
#' @param posts_data The data frame created by `create_posts`.
#' @param multiplier scalar to change twitcher attention relative to birder.
create_notes_dataset <- function (
  contributors_data_frame = contributors_data,
  posts_data = posts,
  param_gamma = .1,
  attention_span = 10,
  multiplier = 1
) {
  # randomly sample sum of attention spans with replacement and randomly assign
  # attention_span posts to each contributor.
  n_birders <- contributors_data_frame %>% 
    filter(type == "birder") %>% 
    count() %>% 
    pull()

  n_twitchers <- contributors_data_frame %>% 
    filter(type == "twitcher") %>% 
    count() %>% 
    pull()
  
  # Get birder Notes ----
  birder_notes <- posts_data %>% 
    # get full list
    slice_sample(
      n = (attention_span * n_birders),
      replace = TRUE
      ) %>% 
    # assign to contributors
    mutate(contributor_id = 
             contributors_data_frame %>% 
             filter(type == "birder") %>% 
             select(contributor_id) %>% 
             slice(
               rep(1:n(), each = attention_span)
               ) %>% 
             pull()
           ) %>% 
    # decide whether to flag or not
    mutate(
      error = rbinom((attention_span * n_birders), 1, .05),
      flag = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1-blatant_lie)
      ),
      whistle = 0
    ) %>% 
    filter(flag == 1)
    
  # Get twitcher notes ----
  non_target_notes <- posts_data %>% 
    filter(topic != "Politics") %>% 
    slice_sample(
      n = round(multiplier * attention_span * (1 - param_gamma)) * n_twitchers,
      replace = TRUE
    ) %>% 
    # assign to contributors
    mutate(contributor_id = 
             contributors_data_frame %>% 
             filter(type == "twitcher") %>% 
             select(contributor_id) %>% 
             slice(
               rep(1:n(), each = round(multiplier * attention_span * (1 - param_gamma)))
             ) %>% 
             pull()
    ) %>% 
    # decide whether to flag or not
    mutate(
      error = rbinom(round(multiplier * attention_span * (1 - param_gamma)) * n_twitchers, 1, .05),
      flag = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1-blatant_lie)
      ),
      whistle = 1 
    ) %>% 
    filter(flag == 1)
  
  target_notes <- posts_data %>% 
    filter(topic == "Politics") %>% 
    slice_sample(
      n = round(multiplier * attention_span * param_gamma) * n_twitchers,
      replace = TRUE
    ) %>% 
    # assign to contributors
    mutate(contributor_id = 
             contributors_data_frame %>% 
             filter(type == "twitcher") %>% 
             select(contributor_id) %>% 
             slice(
               rep(1:n(), each = round(multiplier * attention_span * param_gamma))
             ) %>% 
             pull()
    ) %>% 
    # decide whether to flag or not
    mutate(
      flag = if_else(blatant_lie == 0, 1, 0),
      whistle = 1 
    ) %>% 
    filter(flag == 1)
  
  notes <- bind_rows(
    birder_notes, non_target_notes
  ) %>% 
    bind_rows(target_notes)
  
  return(notes)
}

notes <- create_notes_dataset(
    contributors_data_frame = contributors_data,
    posts_data = posts,
    param_gamma = .1,
    attention_span = 10,
    multiplier = 1
  )

We can look at the distribution of true (blatant) lies across topics. If everything is working as expected, the flagged posts should be mostly lies across topics (though some true posts are reported due to honest errors), and many more truthful posts should be flagged by the twitchers, leading to an overall lower proportion of lies among the notes in the “Politics” topic. Looking at the plot below, we see that this is the case.

Figure 1: Distribution of blatant lies that are entered into the notes database by birders (🔴s) and twitchers (🔺s).
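As a quick numeric companion to the figure (my own check, not part of the original plot), the same pattern can be tabulated directly from the notes:

notes %>% 
  left_join(contributors_data, by = "contributor_id") %>% 
  group_by(topic, type) %>% 
  summarise(share_lies = mean(blatant_lie), n_notes = n(), .groups = "drop")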

Rating Period

Now that we have this set of notes, we can allow contributors to rate one another’s notes. Again, this requires explicitly modeling contributor behavior. The birders will act like the good citizens they are and will give positive ratings to correctly flagged posts (those that are blatant lies). Twitchers, on the other hand, will actively search out notes with a whistle and will always rate them as helpful. Doing so, under certain conditions, could allow them to artificially increase author and rater scores among members of their cabal.

For the purposes of constructing helpfulness scores, we now have contributors rate the helpfulness of a sample of notes and then use these values to build author helpfulness scores, rater helpfulness scores, and combined helpfulness scores as faithfully as possible to the methods described by Birdwatch.

👇 Code to create ratings dataset.

#' @name create_ratings_dataset
#' @description This function creates a set of ratings. Each contributor
#'  looks at a set of tweets and gives it a rating based on their rate_ function.
#' @param attention_span The number of posts each contributor considers.
#' @param contributors_data_frame The data frame produced by `create_contributors`.
#' @param notes_data The dataframe produced by `create_notes_dataset`.
#' @param multiplier scalar to change twitcher attention relative to birder.
create_ratings_dataset <- function (
  contributors_data_frame = contributors_data,
  notes_data = notes,
  attention_span = 30,
  multiplier = 1
) {
  n_birders <- contributors_data_frame %>% 
    filter(type == "birder") %>% 
    count() %>% 
    pull()
  
  n_twitchers <- contributors_data_frame %>% 
    filter(type == "twitcher") %>% 
    count() %>% 
    pull()
  
  # Get birder ratings ----
  birder_ratings <- notes_data %>% 
    # get full list
    slice_sample(
      n = (attention_span * n_birders),
      replace = TRUE
    ) %>% 
    # assign to contributors
    mutate(rater_id = 
             contributors_data_frame %>% 
             filter(type == "birder") %>% 
             select(contributor_id) %>% 
             slice(
               rep(1:n(), each = attention_span)
             ) %>% 
             pull()
    ) %>% 
    # decide whether to rate as helpful or not
    mutate(
      error = rbinom((attention_span * n_birders), 1, .05),
      rate_helpful = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1-blatant_lie)
      )
    ) %>% 
    select(-error, -flag, - whistle)
  
  # Get twitcher ratings ----
  twitcher_ratings <- notes_data %>% 
    filter(whistle == 1) %>% 
    slice_sample(
      n = (multiplier * attention_span * n_twitchers),
      replace = TRUE
    ) %>% 
    # assign to contributors
    mutate(rater_id = 
             contributors_data_frame %>% 
             filter(type == "twitcher") %>% 
             select(contributor_id) %>% 
             slice(
               rep(1:n(), each = multiplier * attention_span)
             ) %>% 
             pull()
    ) %>% 
    # twitchers always rate whistled notes as helpful
    mutate(
      rate_helpful = 1
    ) %>% 
    select(-error, -flag, - whistle)
  
  
  ratings <- bind_rows(birder_ratings, twitcher_ratings)
  return(ratings)
}

ratings <- create_ratings_dataset(
    contributors_data_frame = contributors_data,
    notes_data = notes,
    attention_span = 20,
    multiplier = 1
  )

Author helpfulness score

Each contributor starts out with a default helpfulness score of 1 following the procedure outlined in Birdwatch’s ranking methodology notes. We then iterate over the following calculation until these scores are stable. Note that only contributors with at least one note that has been rated will receive an author helpfulness score. In this simulation, that means there are only 773 authors out of the original set of 1000 contributors that receive a score.

\[a_i(u) = \max\left(0, \frac{3}{2} \times \frac{2 + \sum_{\text{rater}\in R(u)} a_{i-1}(\text{rater})\times \text{Rating}(\text{rater}, u)}{6 + \sum_{\text{rater}\in R(u)} a_{i-1}(\text{rater})} - \frac{1}{2}\right) \]

👇 Code to calculate author helpfulness ratings.

#' @name calculate_author_helpfulness
#' @description calculates author helpfulness scores as
#'  outlined by Birdwatch methodology.
#' @param contributors_data_frame data frame made by `create_contributors`
#' @param ratings_data_frame data frame made by `create_ratings_dataset`
#' @param iterations number of iterations to calculate author scores.
calculate_author_helpfulness <- function(
  contributors_data_frame = contributors_data,
  ratings_data_frame = ratings,
  iterations = 10
) {
  # Initialize author_rating data frame
  author_ratings <- contributors_data_frame %>% 
    mutate(author_helpfulness_score = 1,
           iteration = 0) %>% 
    select(-type) %>% 
    filter(
      contributor_id %in% unique(ratings_data_frame$contributor_id)
    )
  
  # Iterate until convergence
  for (i in 1:iterations) {
    # Get new author ratings
    new_author_ratings <- ratings_data_frame  %>% 
      # get average helpfulness rating by id pair
      group_by(contributor_id, rater_id) %>% 
      summarise(rate_helpful = mean(rate_helpful)) %>% 
      # get current author ratings
      left_join(., author_ratings %>% 
                  filter(
                    iteration == max(author_ratings$iteration)
                  ) %>% 
                  rename(rater_id = contributor_id),
                by = "rater_id"
      ) %>% 
      na.omit() %>% 
      mutate(
        numerator_sum_item = author_helpfulness_score * rate_helpful
      ) %>% 
      summarise(
        numerator_sum = sum(numerator_sum_item),
        denominator_sum = sum(author_helpfulness_score)
      ) %>% 
      mutate(
        author_helpfulness_score = 
          (3 / 2) * (2 + numerator_sum) / (6 + denominator_sum) - (1 / 2)
      ) %>% 
      mutate(
        author_helpfulness_score = 
          if_else(
            author_helpfulness_score < 0,
            0,
            author_helpfulness_score
          )
      ) %>% 
      mutate(iteration = i) %>% 
      select(contributor_id, author_helpfulness_score, iteration)
    
    # Append them to author ratings dataset
    author_ratings <- 
      bind_rows(author_ratings, new_author_ratings) 
  }
  
  # get only the most recent iteration
  # --------------------------------------------
  # NOTE: This is commented out *only* for the purposes of this document,
  #    see the full source code for deets.
  # --------------------------------------------
  #author_ratings <- author_ratings %>% 
  #  filter(iteration == 10) %>% 
  #  select(-iteration) %>% 
  #  right_join(contributors_data_frame, by = "contributor_id") %>% 
  #  mutate(author_helpfulness_score = replace_na(author_helpfulness_score, 0))
  # --------------------------------------------
  return(author_ratings)
}

author_ratings <- calculate_author_helpfulness(
    iterations = 10,
    contributors_data_frame = contributors_data,
    ratings_data_frame = ratings
  )

author_helpfulness_scores <- author_ratings %>% 
  filter(iteration == 10) %>% 
  select(-iteration) %>% 
  right_join(contributors_data, by = "contributor_id") %>% 
  mutate(author_helpfulness_score = replace_na(author_helpfulness_score, 0))

Looking across a random sample of twitchers and birders, we can see that scores do indeed converge over ten iterations.

Figure 2: Convergence in author helpfulness scores across 10 iterations based on a random sample of 5 birders and 5 twitchers.
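As a numeric complement to the figure (again, my own check), we can also look at the largest change in any author’s score between the final two iterations, which should be close to zero if the scores have stabilized:

author_ratings %>% 
  ungroup() %>% 
  filter(iteration %in% c(9, 10)) %>% 
  pivot_wider(
    names_from = iteration,
    values_from = author_helpfulness_score,
    names_prefix = "iter_"
  ) %>% 
  summarise(max_abs_change = max(abs(iter_10 - iter_9), na.rm = TRUE))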

Preliminary note scoring

Birdwatch calculates a preliminary note score using the following calculation: \[\text{preliminary\_note\_score}(n) = \frac{\sum_{\text{rater}\in R(n)}a(\text{rater})\times \text{Rating}(\text{rater}, n)}{\sum_{\text{rater}\in R(n)}a(\text{rater})} \]

👇 I replicate this in R using the code outlined here.

#' @name calculate_preliminary_note_scores
#' @description calculates preliminary note scores for the purposes of
#'    constructing rater helpfulness scores.
#' @param ratings_data_frame data frame made by `create_ratings_dataset`
#' @param author_scores_data data frame made by `calculate_author_helpfulness`
calculate_preliminary_note_scores <- function(
  ratings_data_frame = ratings,
  author_scores_data = author_helpfulness_scores
) {
  prelim_note_scores <- ratings_data_frame %>% 
    left_join(
      .,
      author_scores_data %>% 
        rename(rater_id = contributor_id),
      by = "rater_id"
    ) %>% 
    na.omit() %>% 
    group_by(post_id, contributor_id) %>% 
    summarise(
      numerator = 
        sum(author_helpfulness_score * rate_helpful),
      denominator = sum(author_helpfulness_score)
    ) %>% 
    mutate(
      preliminary_note_score = numerator/denominator
    ) %>% 
    mutate(
      prelim_rating = case_when(
        preliminary_note_score >= .84 ~ "Currently Rated Helpful",
        preliminary_note_score <= .29 ~ "Currently Rated Not Helpful",
        TRUE ~ "Needs More Ratings"
      )
    ) %>% 
    select(-numerator, -denominator)
  return(prelim_note_scores)
}

prelim_note_scores <- calculate_preliminary_note_scores(
    ratings_data_frame = ratings,
    author_scores_data = author_helpfulness_scores
  )

At this point, it may be helpful to recall that notes are uniquely identified by the combination of post_id and contributor_id. Using these preliminary scores, we can identify the set of notes which we will use to construct the rater helpfulness scores (i.e. those that are rated as helpful or not helpful).
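A quick sanity check (my own addition): counting rows per pair should confirm that each note appears exactly once in the preliminary scores, so the query below should return zero rows.

prelim_note_scores %>% 
  ungroup() %>% 
  count(post_id, contributor_id) %>% 
  filter(n > 1)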

Rater helpfulness score

I now recreate the Rater Helpfulness Score. The actual construction here is slightly different from that used by Birdwatch. Whereas Birdwatch takes the first 5 ratings in a given time period, here we randomly sample 5 ratings (we don’t have a time dimension in this very rudimentary framework). The ratings are filtered to only notes that have a definitive rating using the preliminary note score.

Currently only the first 5 ratings on each note that were made within 48 hours of the note’s creation are used when evaluating a Rater Helpfulness Score (hereafter called “valid ratings”). This is done to both reward quick rating, and also so that retroactively rating old notes with clear labels doesn’t boost Rater Helpfulness Score. - Birdwatch

If there is coordination, however, this may open up additional vulnerabilities to abuse. One can imagine, for instance, that if a twitcher is forewarned that a note will be generated, they may be able to rate it before any birders get the opportunity. To model this, we can add a parameter twitcher_speed_param which changes the probability that a twitcher rating is randomly selected as a valid rating.

👇 Rater helpfulness function.

#' @name calculate_rater_helpfulness
#' @description Calculates the rater helpfulness score using
#'    the ratings data set and the preliminary note scores.
#' @param ratings_data_frame data created by `create_ratings_dataset`
#' @param prelim_scores_data data created by `calculate_preliminary_notes`
#' @param contributors_data_frame data created by `create_contributors`
#' @param twitcher_speed_param Increases the probability that ratings from 
#'   twitchers will be selected as 'valid ratings'.
calculate_rater_helpfulness <- function(
  ratings_data_frame = ratings,
  prelim_scores_data = prelim_note_scores,
  contributors_data_frame = contributors_data,
  twitcher_speed_param = 1
) {
  # Rater Helpfulness Scores 
  rater_helpfulness_scores <- ratings_data_frame %>% 
    group_by(post_id, contributor_id) %>% 
    # Get preliminary note scores
    left_join(., 
              prelim_scores_data %>% 
                select(post_id, contributor_id, prelim_rating),
              by = c("post_id", "contributor_id")
    ) %>% 
    # subset to only those with ratings
    filter(prelim_rating %in% c(
      "Currently Rated Helpful",
      "Currently Rated Not Helpful"
    )) %>% 
    # subset to where there are a least 5 ratings
    mutate(count = 1) %>% 
    group_by(post_id, contributor_id) %>% 
    mutate(count = sum(count)) %>% 
    filter(count >= 5) %>% 
    select(-count) %>% 
    # Weight probability by twitcher speed
    left_join(., 
              contributors_data_frame %>% 
                rename(rater_id = contributor_id),
              by = "rater_id") %>% 
    mutate(speed = if_else(type == "twitcher", twitcher_speed_param, 1)) %>% 
    # Randomly select 5
    slice_sample(n = 5, weight_by = speed) %>% 
    select(-speed, -type) %>% 
    # Calculate consensus without current rating
    mutate(
      consensus = (5/4) * (mean(rate_helpful) - rate_helpful/5)
    ) %>% 
    mutate(
      consensus = case_when(
        consensus >= .75 ~ 1,
        consensus == .5 ~ consensus,
        consensus <= .25 ~ 0
      )
    ) %>% 
    # Subset to notes with consensus
    filter(consensus %in% c(0,1)) %>% 
    group_by(rater_id) %>% 
    # Calculate rater scores
    mutate(
      valid_rating = 1
    ) %>% 
    summarise(
      num_valid_ratings_match = sum(rate_helpful == consensus),
      valid_ratings = sum(valid_rating)
    ) %>% 
    mutate(
      rater_helpfulness_score = 
        (3/2) * (2 + num_valid_ratings_match) / (6 + valid_ratings) - 1/2
    ) %>% 
    mutate(
      rater_helpfulness_score = if_else(
        rater_helpfulness_score < 0,
        0,
        rater_helpfulness_score
      )
    ) %>% 
    select(rater_id, rater_helpfulness_score) %>% 
    rename(contributor_id = rater_id) %>% 
    right_join(
      contributors_data_frame,
      by = "contributor_id"
    ) %>% 
    mutate(
      rater_helpfulness_score = replace_na(rater_helpfulness_score, 0)
    ) 
  return(rater_helpfulness_scores)
}

rater_helpfulness_scores <- calculate_rater_helpfulness(
    ratings_data_frame = ratings,
    prelim_scores_data = prelim_note_scores,
    contributors_data_frame = contributors_data,
    twitcher_speed_param = 1
  )
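To model forewarned twitchers, the same call can be re-run with a larger weight. The value of 5 below is an arbitrary illustration of the parameter, not a calibrated estimate of how much faster coordinated raters would actually be.

fast_rater_helpfulness_scores <- calculate_rater_helpfulness(
    ratings_data_frame = ratings,
    prelim_scores_data = prelim_note_scores,
    contributors_data_frame = contributors_data,
    twitcher_speed_param = 5
  )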

Combined helpfulness score

To get the final note scores we first need to calculate the combined helpfulness score which is simply an average of the author helpfulness score and the rater helpfulness score.

👇 Combined helpfulness score function.

#' @name calculate_combined_helpfulness_score
#' @description Averages together the author and 
#'     rater helpfulness scores.
#' @param author_helpfulness_data data created by `calculate_author_helpfulness`
#' @param rater_helpfulness_data data created by `calculate_rater_helpfulness`.
calculate_combined_helpfulness_score <- function(
  author_helpfulness_data = author_helpfulness_scores,
  rater_helpfulness_data = rater_helpfulness_scores
) {
  combined_helpfulness_scores <- 
    left_join(
      author_helpfulness_data,
      rater_helpfulness_data,
      by = c("contributor_id", "type")
    ) %>% 
    mutate(
      combined_helpfulness_score = 
        ((author_helpfulness_score + rater_helpfulness_score) / 2)
    )
  return(combined_helpfulness_scores)
}

combined_helpfulness_scores <- calculate_combined_helpfulness_score(
    author_helpfulness_data = author_helpfulness_scores,
    rater_helpfulness_data = rater_helpfulness_scores
  )

Note scoring and classification

Finally, we are able to calculate the final note scores, where the numeric score is calculated following:

\[\text{note\_score}(n) = \frac{\sum_{\text{rater}\in R(n)}c(\text{rater})\times \text{Rating}(\text{rater}, n)}{\sum_{\text{rater}\in R(n)}c(\text{rater})} \]

Where this numeric score is greater than or equal to .84, the note is classified as “Currently Rated Helpful”. When it is less than or equal to .29 it receives a “Currently Rated Not Helpful” rating. Otherwise, it’s tagged “Needs More Ratings”.

👇 Code to classify notes.

#' @name calculate_final_note_scores
#' @description This function uses the contributor helpfulness scores to
#'    calculate the final note score.
#' @param ratings_data data created by `create_ratings_dataset`
#' @param combined_scores_data data created by `calculated_combined_helpfulness_score`
calculate_final_note_scores <- function(
  ratings_data = ratings,
  combined_scores_data = combined_helpfulness_scores
){
  final_note_scores <- ratings_data %>% 
    left_join(
      ., 
      combined_scores_data %>% 
        rename(rater_id = contributor_id,
               rater_type = type),
      by = "rater_id"
    ) %>%  
    group_by(post_id, contributor_id) %>% 
    summarise(
      numerator = sum(combined_helpfulness_score * rate_helpful),
      denominator = sum(combined_helpfulness_score)
    ) %>% 
    mutate(
      note_score = numerator / denominator
    ) %>% 
    select(
      post_id, contributor_id, note_score
    ) %>% 
    mutate(
      note_rating = case_when(
        note_score >= .84 ~ "Currently Rated Helpful",
        note_score <= .29 ~ "Currently Rated Not Helpful",
        TRUE ~ "Needs More Ratings"
      )
    )
  return(final_note_scores)
}

note_scores <- calculate_final_note_scores(
    ratings_data = ratings,
    combined_scores_data = combined_helpfulness_scores
  )

Examination

Now that we have our scores, we can take a look and see whether twitchers were able to accomplish anything with the strategy they pursued in this simulation. We’re primarily interested in two outcomes: 1) contributor ratings, and 2) note ratings.

Let’s take a look at the contributor scores. While twitchers were able to get their author scores to be comparable to those of birders, their ratings were sufficiently different from those of birders (who on average made up the consensus of valid ratings) that their Rater Helpfulness Scores, and thus their Combined Helpfulness Scores, suffered. I’ll call that a win for Birdwatch.

Figure 3: Distribution of contributor helpfulness scores by type.
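The pattern in Figure 3 can also be summarized numerically (my own summary, not from the original post) as average scores by contributor type:

combined_helpfulness_scores %>% 
  group_by(type) %>% 
  summarise(
    mean_author   = mean(author_helpfulness_score),
    mean_rater    = mean(rater_helpfulness_score),
    mean_combined = mean(combined_helpfulness_score)
  )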

But what about the notes themselves? The most important indicator, from my perspective, is whether or not the system is able to sort out fact from fiction. If everything is working properly, notes that flag blatant_lies should be rated as helpful and those that flag non-misleading posts should be rated as not-helpful despite collusion by the twitchers.

In the figure below, we can see that this is the case. Under these settings and the random seed assigned above, we achieve near-perfect separation. Some notes are classified as “Needs More Ratings”, but not a single note flagging a lie is classified as “not helpful” and no notes flagging non-misleading posts are classified as “helpful”.
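As a tabular companion to that figure (my own check), note classifications can be cross-tabulated against whether the flagged post was a blatant lie:

note_scores %>% 
  ungroup() %>% 
  left_join(posts, by = "post_id") %>% 
  count(blatant_lie, note_rating)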