Birdwatch, Twitter’s recently introduced community-driven approach to providing context to potentially misleading tweets, is the most ambitious and transparent effort by a social media company to combat misinformation on its platform. The system is under active development, and Birdwatch recently introduced a new method for rating notes and contributors using a PageRank-style algorithm. This note provides an external perspective on that feature, highlights some of the remaining risks of such a system, and offers empirical and technical tools for monitoring and addressing those risks. All results are preliminary and subject to change.
Birdwatch has done an unparalleled job in building out its product with openness and transparency and cultivating trust. Despite this, it still faces some… hostility?
— Ößlïgå†ê Çårñïvðrê 🥩 Ret’d CAF ((???)) June 25, 2021
When this hesitancy is widespread, no matter how shallow, it only takes a few errors to really erode whatever trust has been built up. That fragility could be viewed as an opportunity by individuals or groups of people who want to undermine such efforts. If actions could be taken to ‘break’ Birdwatch (i.e. rate otherwise helpful notes as not helpful and demonstrably false notes as helpful), groups could simultaneously spread misinformation and diminish trust in the institutions designed to combat it.
Importantly, we have precedent for similar coordinated actions on other platforms in other contexts.
“Cyberstriking” is a common tool used by politically motivated groups to trip the reporting systems of social media sites to get political opponents removed or temporarily banned. Identity Evropa, a white supremacist group now operating as the American Identity Movement, had a dedicated channel on their (leaked) Discord server to coordinate such actions with varying degrees of success:
The challenge with this type of coordination is that coordinated actors often exhibit little spatial clustering and rarely share a public social graph (i.e. people within the group often do not follow each other on Twitter or Facebook, to maintain separation between their public profiles and their group affiliations). This makes it difficult to use standard community detection tools to identify these groups.
Birdwatch is actively addressing the challenge of coordinated adversarial actors and currently has many safeguards in place to prevent coordinated action, many of which I am sure are not in the public eye. Access to the pilot is limited to accounts that have verified contact information, use two-factor authentication, and have not recently been found to violate Twitter’s terms and conditions, among other requirements. Birdwatch is also intentionally recruiting a diverse contributor base to get broader and more varied perspectives.
Most notably for the issue discussed here, Birdwatch limits the degree to which any single rater can improve the helpfulness score of another contributor:
Each rater’s ratings of any particular author are only counted once in this weighted average, using the average helpfulness rating from that particular rater of the particular author. For example: if a rater rated 10 notes from the same author, those will count as 1 author-level rating instead of 10 ratings. - Birdwatch
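In data-frame terms, that rule amounts to collapsing all of a rater’s ratings of a given author into a single averaged, author-level rating. Here is a minimal sketch with invented column names (the simulation below implements the same idea inside `calculate_author_helpfulness()`):

```r
library(dplyr)

# Hypothetical ratings: rater "r1" rates notes from author "a1" ten times
toy_ratings <- tibble(
  author_id    = rep("a1", 10),
  rater_id     = rep("r1", 10),
  rate_helpful = c(1, 1, 1, 0, 1, 1, 1, 1, 0, 1)
)

# Collapse to one author-level rating per (author, rater) pair,
# so ten ratings count as a single averaged rating rather than ten votes
toy_ratings %>%
  group_by(author_id, rater_id) %>%
  summarise(author_level_rating = mean(rate_helpful), .groups = "drop")
# -> a single row: a1, r1, 0.8
```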
However, as the program scales and more users are permitted to join the platform, the issue of coordinated abuse may still come to the fore. The goal of this analysis is to ask whether small, coordinated groups could overcome these safeguards to manipulate the platform or otherwise undermine public trust in Birdwatch as a whole.
I have done my best to match Birdwatch’s methodology for calculating author and note scores as precisely as possible; please shoot me an email if you spot any mistakes. All errors are my own.
The new PageRank-based system recently introduced by Birdwatch uses ratings of submitted notes to generate contributor helpfulness scores. If other Birdwatch contributors rate your notes as helpful, your score goes up. If contributors who themselves have high helpfulness scores rate your notes as helpful, your score goes up even more.
PageRank was, of course, phenomenally successful and instrumental in Google’s rise to prominence as the default search engine for the World Wide Web. It was not, however, without vulnerabilities to abuse. In the early days of the blogosphere, link farms allowed websites to artificially inflate their PageRank by creating many new websites and linking them to one another, letting ill-intentioned people with access to a hosting server boost content-poor, ad-rich websites to the top of search results and ruining the internet for the rest of us.
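To make the mechanics concrete, here is a toy sketch (separate from the Birdwatch simulation below, with invented page names) of how a ring of mutually linking farm pages inflates a target page’s PageRank; it assumes the igraph package:

```r
library(igraph)

# A small 'honest' web: a handful of pages linking among themselves, plus one target page
honest_edges <- c("A","B", "B","C", "C","A", "C","D", "D","E", "E","A", "A","target")

# A link farm: ten pages arranged in a ring, each also linking to the target
farm_pages <- paste0("farm", 1:10)
farm_edges <- c(
  rbind(farm_pages, c(farm_pages[-1], farm_pages[1])),  # ring among the farm pages
  rbind(farm_pages, "target")                           # every farm page links to the target
)

g_honest <- graph_from_edgelist(matrix(honest_edges, ncol = 2, byrow = TRUE))
g_farmed <- graph_from_edgelist(matrix(c(honest_edges, farm_edges), ncol = 2, byrow = TRUE))

# The target's PageRank jumps once the farm links are added
page_rank(g_honest)$vector["target"]
page_rank(g_farmed)$vector["target"]
```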
Birdwatch notes are contributed anonymously, meaning that contributors and Twitter users do not see the account names of those who write notes. This anonymity, combined with the account verification requirements, makes it much more difficult for disconnected users to batch-produce ratings and preferentially support members of their own in-group (as a link farm might). Nevertheless, these measures may not fully mitigate coordination now or in the future.
To illustrate how it might be feasible, in principle, to game such a system, we can look at a (charmingly?) simple agent-based model.
Agent-based modeling can be really helpful for understanding emergent behavior in complex systems. By design, Birdwatch (transparently) complicates its community rating system to make it exceedingly difficult for a stand-alone user to game the system, which also makes it quite hard to model from a pure game-theoretic perspective. By recreating the system, modeling user behavior at the contributor level, and observing outcomes of interest, we can take a brute-force approach to understanding the expected efficacy of different strategies and, therefore, potential vulnerabilities.
I try to replicate the data generating process of Birdwatch in as simple a framework as possible using only R and the tidyverse. The flow of the simulation is straightforward:

1. Generate a universe of posts split across topics, a fraction of which are blatant lies.
2. Generate a population of contributors: honest birders and coordinated twitchers.
3. Let contributors flag posts, producing a notes dataset.
4. Let contributors rate one another’s notes.
5. Calculate author, rater, and combined helpfulness scores following Birdwatch’s methodology.
6. Use those scores to produce final note ratings.
Once we have this system established, I’ll define a set of strategies that coordinated users could employ to try to game the system and perform a grid search over that parameter space to identify potential vulnerabilities in how ratings are calculated. The benefit of codifying the system is that we can quickly and easily add additional strategy parameters at a later date or update note rating protocols as the system evolves.
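As a sketch of what that grid could look like (the parameter names mirror the simulation functions defined below; `run_simulation()` is a hypothetical wrapper, not defined in this post, that would chain the steps together and return the outcomes of interest):

```r
library(tidyr)
library(purrr)

# Candidate strategy parameters for the coordinated group
strategy_grid <- expand_grid(
  rho            = c(.01, .02, .05),  # share of contributors who are twitchers
  param_gamma    = c(.1, .5, 1),      # how focused twitchers are on the target topic
  multiplier     = c(1, 2, 5),        # twitcher activity relative to birders
  twitcher_speed = c(1, 5, 10)        # weight on twitcher ratings being 'valid ratings'
)

# Hypothetical wrapper: run the full simulation once per parameter combination
# results <- strategy_grid %>%
#   mutate(outcome = pmap(list(rho, param_gamma, multiplier, twitcher_speed),
#                         run_simulation))
```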
We’ll first create a universe of 1,000 posts that are split into five topics. All of these topics are super cool but purely illustrative. In a real-world scenario, we should think of these topics as highly specific claims (e.g. vaccines make you magnetic). One of these illustrative topics (“Politics”) is the target topic of any potential bad actors, who will try to undermine the ratings system with regard to that topic. Within this universe, each individual post has a 10% probability of being a pure fabrication, which we code as a `blatant_lie`. Clearly, this is a dramatic over-simplification of what Birdwatch is trying to accomplish, but it provides a useful and clear starting point for analysis.
👇 Code to generate posts.
# Setting seed for replicability ----
set.seed(1355)

# Attach the tidyverse for the pipe and data manipulation verbs used throughout
library(tidyverse)

#' @name create_posts
#' @description This function creates the universe of posts
#'   split into five topics. Each post has a ten-percent
#'   chance of being a blatant_lie.
#' @param n_posts The number of posts to create.
create_posts <- function(n_posts) {
  fabricatr::fabricate(
    ID_label = "post_id",
    N = n_posts,
    topic = fabricatr::draw_categorical(
      N = N,
      prob = c(.2, .05, .3, .2, .25),
      category_labels =
        c("Formula One",
          "Coffee",
          "Data Science",
          "Gardening",
          "Politics")
    ),
    blatant_lie = fabricatr::draw_binary(N = N, p = .1)
  )
}

posts <- create_posts(n_posts = 1000)
post_id | topic | blatant_lie |
---|---|---|
0132 | Data Science | 1 |
0844 | Politics | 1 |
0581 | Coffee | 0 |
0506 | Coffee | 0 |
0072 | Politics | 0 |
0188 | Data Science | 0 |
We’ll now create a population of 1,000 contributors. Most of them will be birders and a small proportion \(\rho\) will be twitchers. Birders follow the rules and faithfully try to report misleading information. Twitchers, on the other hand, try to seek out posts related to a specific topic and report misleading information as truthful and vice versa.
👇 Code to generate contributors.
#' @name create_contributors
#' @description This function creates a population of contributors
#'   who are of two types: birders (good) and twitchers (bad).
#' @param n_contributors The size of the population of contributors.
#' @param rho The probability an account is a twitcher.
create_contributors <- function(n_contributors, rho) {
  fabricatr::fabricate(
    ID_label = "contributor_id",
    N = n_contributors,
    type = fabricatr::draw_categorical(
      N = N,
      prob = c(1 - rho, rho),
      category_labels = c("birder", "twitcher")
    )
  )
}

contributors_data <- create_contributors(n_contributors = 1000, rho = .02)
With the current random seed, our population of contributors has 977 birders and 23 twitchers. This is a really small number of inauthentic accounts, but we will see if they can have an outsized effect on individual cases (i.e. can they flag a true post as potentially misleading?). To do this, we need to model actor search and rating behavior. A birder acts in the intended way, randomly sorting through posts and identifying (with some error) posts that they think are misleading. If the birder rates a post as misleading, it is flagged and added to a notes database.
The twitcher acts in much the same way, with the exception of a topic of interest. Here, we’ll say that the twitchers are interested in the “Politics” topic. These contributors will go out of their way to identify posts in this topic and only flag them if they are true. So, blatant lies will not be added to the notes database, but truthful posts will be flagged as potentially misleading and added as notes. How frequently the twitchers consider the targeted topic is encoded in the parameter \(\gamma\), which we can think of as capturing how discreet the twitchers are: a \(\gamma\) of 1 means they only target the Politics topic (not discreet at all), while 0 means they never look at posts in the Politics topic. Twitchers can also be more or less active than birders when considering posts to tag. This is encoded in the `multiplier` parameter of the `create_notes_dataset()` function.
We also need to provide a means for twitchers to coordinate with one another. I add a new variable named `whistle` for this purpose. When a twitcher adds a note to a post, they also set this variable to 1. We can think of this signalling as occurring either through a content-based code word that twitchers know and birders do not (see 💬) OR through some other off-platform means of communication (e.g. a Discord server, as in the case of “Cyberstriking”). With a signal in place, twitchers can identify one another and rate each other’s notes as helpful when they come across them.
👇 Code to generate a simulated notes dataset.
#' @name create_notes_dataset
#' @description creates the data frame of notes (posts that are flagged).
#' @param attention_span How many posts the contributor considers in total.
#' @param param_gamma The degree to which twitchers focus on the target topic.
#' @param contributors_data_frame The data frame produced by `create_contributors`.
#' @param posts_data The data frame created by `create_posts`.
#' @param multiplier Scalar to change twitcher attention relative to birders.
create_notes_dataset <- function(
  contributors_data_frame = contributors_data,
  posts_data = posts,
  param_gamma = .1,
  attention_span = 10,
  multiplier = 1
) {
  # randomly sample sum of attention spans with replacement and randomly assign
  # attention_span posts to each contributor.
  n_birders <- contributors_data_frame %>%
    filter(type == "birder") %>%
    count() %>%
    pull()
  n_twitchers <- contributors_data_frame %>%
    filter(type == "twitcher") %>%
    count() %>%
    pull()

  # Get birder notes ----
  birder_notes <- posts_data %>%
    # get full list
    slice_sample(
      n = (attention_span * n_birders),
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(contributor_id =
             contributors_data_frame %>%
             filter(type == "birder") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = attention_span)
             ) %>%
             pull()
    ) %>%
    # decide whether to flag or not
    mutate(
      error = rbinom((attention_span * n_birders), 1, .05),
      flag = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1 - blatant_lie)
      ),
      whistle = 0
    ) %>%
    filter(flag == 1)

  # Get twitcher notes ----
  non_target_notes <- posts_data %>%
    filter(topic != "Politics") %>%
    slice_sample(
      n = round(multiplier * attention_span * (1 - param_gamma)) * n_twitchers,
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(contributor_id =
             contributors_data_frame %>%
             filter(type == "twitcher") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = round(multiplier * attention_span * (1 - param_gamma)))
             ) %>%
             pull()
    ) %>%
    # decide whether to flag or not
    mutate(
      error = rbinom(round(multiplier * attention_span * (1 - param_gamma)) * n_twitchers, 1, .05),
      flag = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1 - blatant_lie)
      ),
      whistle = 1
    ) %>%
    filter(flag == 1)

  target_notes <- posts_data %>%
    filter(topic == "Politics") %>%
    slice_sample(
      n = round(multiplier * attention_span * param_gamma) * n_twitchers,
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(contributor_id =
             contributors_data_frame %>%
             filter(type == "twitcher") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = round(multiplier * attention_span * param_gamma))
             ) %>%
             pull()
    ) %>%
    # decide whether to flag or not
    mutate(
      flag = if_else(blatant_lie == 0, 1, 0),
      whistle = 1
    ) %>%
    filter(flag == 1)

  notes <- bind_rows(
    birder_notes, non_target_notes
  ) %>%
    bind_rows(target_notes)
  return(notes)
}

notes <- create_notes_dataset(
  contributors_data_frame = contributors_data,
  posts_data = posts,
  param_gamma = .1,
  attention_span = 10,
  multiplier = 1
)
We can look at the distribution of true (blatant) lies across topics. If everything is working as expected, notes should mostly flag lies across topics (though some true posts are reported due to honest errors), and many more “true” posts should be flagged by the twitchers, leading to an overall lower proportion of lies among notes in the “Politics” topic. Looking at the plot below, we see that this is the case.
Figure 1: Distribution of blatant lies that are entered into the notes database by birders (🔴s) and twitchers (🔺s).
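For reference, the quantities behind Figure 1 can be tabulated directly from the simulated objects (a sketch; the plotting code itself is omitted here):

```r
# Proportion of flagged posts that are actually blatant lies,
# by topic and by the type of contributor who wrote the note
notes %>%
  left_join(contributors_data, by = "contributor_id") %>%
  group_by(topic, type) %>%
  summarise(
    n_notes         = n(),
    share_true_lies = mean(blatant_lie),
    .groups = "drop"
  )
```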
Now that we have this set of notes, we can allow contributors to rate one another’s notes. Again, this requires explicitly modeling contributor behavior. The birders will act like the good citizens they are and will give positive ratings to correctly flagged posts (those that are blatant lies). Twitchers, on the other hand, will actively search out notes with a `whistle` and will always rate them as helpful. Doing so, under certain conditions, could allow them to artificially increase author and rater scores among members of their cabal.
For the purposes of constructing helpfulness scores, we now have contributors rate the helpfulness of a sample of notes and then use these values to build author helpfulness scores, rater helpfulness scores, and combined helpfulness scores as faithfully as possible to the methods described by Birdwatch.
👇 Code to create ratings dataset.
#' @name create_ratings_dataset
#' @description This function creates a set of ratings. Each contributor
#'   looks at a set of notes and rates each one based on their type.
#' @param attention_span The number of posts each contributor considers.
#' @param contributors_data_frame The data frame produced by `create_contributors`.
#' @param notes_data The data frame produced by `create_notes_dataset`.
#' @param multiplier Scalar to change twitcher attention relative to birders.
create_ratings_dataset <- function(
  contributors_data_frame = contributors_data,
  notes_data = notes,
  attention_span = 30,
  multiplier = 1
) {
  n_birders <- contributors_data_frame %>%
    filter(type == "birder") %>%
    count() %>%
    pull()
  n_twitchers <- contributors_data_frame %>%
    filter(type == "twitcher") %>%
    count() %>%
    pull()

  # Get birder ratings ----
  birder_ratings <- notes_data %>%
    # get full list
    slice_sample(
      n = (attention_span * n_birders),
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(rater_id =
             contributors_data_frame %>%
             filter(type == "birder") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = attention_span)
             ) %>%
             pull()
    ) %>%
    # decide whether to rate as helpful or not
    mutate(
      error = rbinom((attention_span * n_birders), 1, .05),
      rate_helpful = if_else(
        error == 0,
        as.integer(blatant_lie),
        as.integer(1 - blatant_lie)
      )
    ) %>%
    select(-error, -flag, -whistle)

  # Get twitcher ratings ----
  twitcher_ratings <- notes_data %>%
    filter(whistle == 1) %>%
    slice_sample(
      n = (multiplier * attention_span * n_twitchers),
      replace = TRUE
    ) %>%
    # assign to contributors
    mutate(rater_id =
             contributors_data_frame %>%
             filter(type == "twitcher") %>%
             select(contributor_id) %>%
             slice(
               rep(1:n(), each = multiplier * attention_span)
             ) %>%
             pull()
    ) %>%
    # rate every whistled note as helpful
    mutate(
      rate_helpful = 1
    ) %>%
    select(-error, -flag, -whistle)

  ratings <- bind_rows(birder_ratings, twitcher_ratings)
  return(ratings)
}

ratings <- create_ratings_dataset(
  contributors_data_frame = contributors_data,
  notes_data = notes,
  attention_span = 20,
  multiplier = 1
)
Each contributor starts out with a default helpfulness score of 1 following the procedure outlined in Birdwatch’s ranking methodology notes. We then iterate over the following calculation until these scores are stable. Note that only contributors with at least one note that has been rated will receive an author helpfulness score. In this simulation, that means there are only 773 authors out of the original set of 1000 contributors that receive a score.
\[a_i(u) = \max\left(0,\; \frac{3}{2} \times \frac{2 + \sum_{\text{rater}\in R(u)} a_{i-1}(\text{rater})\times \text{Rating}(\text{rater}, u)}{6 + \sum_{\text{rater}\in R(u)} a_{i-1}(\text{rater})} - \frac{1}{2}\right) \]
👇 Code to calculate author helpfulness ratings.
#' @name calculate_author_helpfulness
#' @description calculates author helpfulness scores as
#'   outlined by Birdwatch methodology.
#' @param contributors_data_frame data frame made by `create_contributors`
#' @param ratings_data_frame data frame made by `create_ratings_dataset`
#' @param iterations number of iterations to calculate author scores.
calculate_author_helpfulness <- function(
  contributors_data_frame = contributors_data,
  ratings_data_frame = ratings,
  iterations = 10
) {
  # Initialize author_rating data frame
  author_ratings <- contributors_data_frame %>%
    mutate(author_helpfulness_score = 1,
           iteration = 0) %>%
    select(-type) %>%
    filter(
      contributor_id %in% unique(ratings_data_frame$contributor_id)
    )

  # Iterate until convergence
  for (i in 1:iterations) {
    # Get new author ratings
    new_author_ratings <- ratings_data_frame %>%
      # get average helpfulness rating by id pair
      group_by(contributor_id, rater_id) %>%
      summarise(rate_helpful = mean(rate_helpful)) %>%
      # get current author ratings
      left_join(., author_ratings %>%
                  filter(
                    iteration == max(author_ratings$iteration)
                  ) %>%
                  rename(rater_id = contributor_id),
                by = "rater_id"
      ) %>%
      na.omit() %>%
      mutate(
        numerator_sum_item = author_helpfulness_score * rate_helpful
      ) %>%
      summarise(
        numerator_sum = sum(numerator_sum_item),
        denominator_sum = sum(author_helpfulness_score)
      ) %>%
      mutate(
        author_helpfulness_score =
          (3 / 2) * (2 + numerator_sum) / (6 + denominator_sum) - (1 / 2)
      ) %>%
      mutate(
        author_helpfulness_score =
          if_else(
            author_helpfulness_score < 0,
            0,
            author_helpfulness_score
          )
      ) %>%
      mutate(iteration = i) %>%
      select(contributor_id, author_helpfulness_score, iteration)
    # Append them to author ratings dataset
    author_ratings <-
      bind_rows(author_ratings, new_author_ratings)
  }

  # get only the most recent iteration
  # --------------------------------------------
  # NOTE: This is commented out *only* for the purposes of this document,
  # see the full source code for deets.
  # --------------------------------------------
  # author_ratings <- author_ratings %>%
  #   filter(iteration == 10) %>%
  #   select(-iteration) %>%
  #   right_join(contributors_data_frame, by = "contributor_id") %>%
  #   mutate(author_helpfulness_score = replace_na(author_helpfulness_score, 0))
  # --------------------------------------------
  return(author_ratings)
}

author_ratings <- calculate_author_helpfulness(
  iterations = 10,
  contributors_data_frame = contributors_data,
  ratings_data_frame = ratings
)

author_helpfulness_scores <- author_ratings %>%
  filter(iteration == 10) %>%
  select(-iteration) %>%
  right_join(contributors_data, by = "contributor_id") %>%
  mutate(author_helpfulness_score = replace_na(author_helpfulness_score, 0))
Looking across a random sample of twitchers and birders, we can see that scores do indeed converge over ten iterations.
Figure 2: Convergence in author helpfulness scores across 10 iterations based on a random sample of 5 birders and 5 twitchers.
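A figure along these lines can be rebuilt from the `author_ratings` object; a sketch (assuming ggplot2 is attached via the tidyverse, and sampling five contributors of each type as in the caption):

```r
# Pick five birders and five twitchers who have author scores
sampled_ids <- contributors_data %>%
  filter(contributor_id %in% unique(author_ratings$contributor_id)) %>%
  group_by(type) %>%
  slice_sample(n = 5) %>%
  pull(contributor_id)

# Trace their author helpfulness scores across iterations
author_ratings %>%
  filter(contributor_id %in% sampled_ids) %>%
  left_join(contributors_data, by = "contributor_id") %>%
  ggplot(aes(x = iteration, y = author_helpfulness_score,
             group = contributor_id, colour = type)) +
  geom_line(alpha = .6) +
  labs(x = "Iteration", y = "Author helpfulness score", colour = "Contributor type")
```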
Birdwatch calculates a preliminary note score using the following calculation: \[\text{preliminary_note_score}(n) = \frac{\sum_{\text{rater}\in R(n)}a(\text{rater})\times \text{Rating}(\text{rater}, n)}{\sum_{\text{rater}\in R(n)}a(\text{rater})} \]
👇 I replicate this in R using the code outlined here.
#' @name calculate_preliminary_note_scores
#' @description calculates preliminary note scores for the purposes of
#'   constructing rater helpfulness scores.
#' @param ratings_data_frame data frame made by `create_ratings_dataset`
#' @param author_scores_data data frame made by `calculate_author_helpfulness`
calculate_preliminary_note_scores <- function(
  ratings_data_frame = ratings,
  author_scores_data = author_helpfulness_scores
) {
  prelim_note_scores <- ratings_data_frame %>%
    left_join(
      .,
      author_scores_data %>%
        rename(rater_id = contributor_id),
      by = "rater_id"
    ) %>%
    na.omit() %>%
    group_by(post_id, contributor_id) %>%
    summarise(
      numerator =
        sum(author_helpfulness_score * rate_helpful),
      denominator = sum(author_helpfulness_score)
    ) %>%
    mutate(
      preliminary_note_score = numerator / denominator
    ) %>%
    mutate(
      prelim_rating = case_when(
        preliminary_note_score >= .84 ~ "Currently Rated Helpful",
        preliminary_note_score <= .29 ~ "Currently Rated Not Helpful",
        TRUE ~ "Needs More Ratings"
      )
    ) %>%
    select(-numerator, -denominator)
  return(prelim_note_scores)
}

prelim_note_scores <- calculate_preliminary_note_scores(
  ratings_data_frame = ratings,
  author_scores_data = author_helpfulness_scores
)
At this point, it may be helpful to recall that notes are uniquely identified by the combination of `post_id` and `contributor_id`. Using these preliminary scores, we can identify the set of notes that we will use to construct the rater helpfulness scores (i.e. those that are rated as helpful or not helpful).
I now recreate the Rater Helpfulness Score. The actual construction here is slightly different from that used by Birdwatch. Whereas Birdwatch takes the first 5 ratings in a given time period, here we randomly sample 5 ratings (we don’t have a time dimension in this very rudimentary framework). The ratings are filtered to only notes that have a definitive rating using the preliminary note score.
Currently only the first 5 ratings on each note that were made within 48 hours of the note’s creation are used when evaluating a Rater Helpfulness Score (hereafter called “valid ratings”). This is done to both reward quick rating, and also so that retroactively rating old notes with clear labels doesn’t boost Rater Helpfulness Score. - Birdwatch
If there is coordination, however, this may open up additional vulnerabilities to abuse. One can imagine, for instance, that if a twitcher is forewarned that a note will be generated, they may be able to rate it before any birders get the opportunity. To model this, we can add a parameter `twitcher_speed_param`, which changes the probability that a twitcher rating is randomly selected as a valid rating.
👇 Rater helpfulness function.
#' @name calculate_rater_helpfulness
#' @description Calculates the rater helpfulness score using
#'   the ratings data set and the preliminary note scores.
#' @param ratings_data_frame data created by `create_ratings_dataset`
#' @param prelim_scores_data data created by `calculate_preliminary_note_scores`
#' @param contributors_data_frame data created by `create_contributors`
#' @param twitcher_speed_param Increases the probability that ratings from
#'   twitchers will be selected as 'valid ratings'.
calculate_rater_helpfulness <- function(
  ratings_data_frame = ratings,
  prelim_scores_data = prelim_note_scores,
  contributors_data_frame = contributors_data,
  twitcher_speed_param = twitcher_speed
) {
  # Rater Helpfulness Scores
  rater_helpfulness_scores <- ratings_data_frame %>%
    group_by(post_id, contributor_id) %>%
    # Get preliminary note scores
    left_join(.,
              prelim_scores_data %>%
                select(post_id, contributor_id, prelim_rating),
              by = c("post_id", "contributor_id")
    ) %>%
    # subset to only those with ratings
    filter(prelim_rating %in% c(
      "Currently Rated Helpful",
      "Currently Rated Not Helpful"
    )) %>%
    # subset to where there are at least 5 ratings
    mutate(count = 1) %>%
    group_by(post_id, contributor_id) %>%
    mutate(count = sum(count)) %>%
    filter(count >= 5) %>%
    select(-count) %>%
    # Weight probability by twitcher speed
    left_join(.,
              contributors_data_frame %>%
                rename(rater_id = contributor_id),
              by = "rater_id") %>%
    mutate(speed = if_else(type == "twitcher", twitcher_speed_param, 1)) %>%
    # Randomly select 5
    slice_sample(n = 5, weight_by = speed) %>%
    select(-speed, -type) %>%
    # Calculate consensus without current rating
    mutate(
      consensus = (5 / 4) * (mean(rate_helpful) - rate_helpful / 5)
    ) %>%
    mutate(
      consensus = case_when(
        consensus >= .75 ~ 1,
        consensus == .5 ~ consensus,
        consensus <= .25 ~ 0
      )
    ) %>%
    # Subset to notes with consensus
    filter(consensus %in% c(0, 1)) %>%
    group_by(rater_id) %>%
    # Calculate rater scores
    mutate(
      valid_rating = 1
    ) %>%
    summarise(
      num_valid_ratings_match = sum(rate_helpful == consensus),
      valid_ratings = sum(valid_rating)
    ) %>%
    mutate(
      rater_helpfulness_score =
        (3 / 2) * (2 + num_valid_ratings_match) / (6 + valid_ratings) - 1 / 2
    ) %>%
    mutate(
      rater_helpfulness_score = if_else(
        rater_helpfulness_score < 0,
        0,
        rater_helpfulness_score
      )
    ) %>%
    select(rater_id, rater_helpfulness_score) %>%
    rename(contributor_id = rater_id) %>%
    right_join(
      contributors_data_frame,
      by = "contributor_id"
    ) %>%
    mutate(
      rater_helpfulness_score = replace_na(rater_helpfulness_score, 0)
    )
  return(rater_helpfulness_scores)
}

rater_helpfulness_scores <- calculate_rater_helpfulness(
  ratings_data_frame = ratings,
  prelim_scores_data = prelim_note_scores,
  contributors_data_frame = contributors_data,
  twitcher_speed_param = 1
)
To get the final note scores we first need to calculate the combined helpfulness score which is simply an average of the author helpfulness score and the rater helpfulness score.
👇 Combined helpfulness score function.
#' @name calculate_combined_helpfulness_score
#' @description Averages together the author and
#'   rater helpfulness scores.
#' @param author_helpfulness_data data created by `calculate_author_helpfulness`
#' @param rater_helpfulness_data data created by `calculate_rater_helpfulness`
calculate_combined_helpfulness_score <- function(
  author_helpfulness_data = author_helpfulness_scores,
  rater_helpfulness_data = rater_helpfulness_scores
) {
  combined_helpfulness_scores <-
    left_join(
      author_helpfulness_data,
      rater_helpfulness_data,
      by = c("contributor_id", "type")
    ) %>%
    mutate(
      combined_helpfulness_score =
        ((author_helpfulness_score + rater_helpfulness_score) / 2)
    )
  return(combined_helpfulness_scores)
}

combined_helpfulness_scores <- calculate_combined_helpfulness_score(
  author_helpfulness_data = author_helpfulness_scores,
  rater_helpfulness_data = rater_helpfulness_scores
)
Finally, we are able to calculate the final note scores, where the numeric score is calculated following:
\[\text{note_score}(n) = \frac{\sum_{\text{rater}\in R(n)}c(\text{rater})\times \text{Rating}(\text{rater}, n)}{\sum_{\text{rater}\in R(n)}c(\text{rater})} \]
Where this numeric score is greater than or equal to .84, the note is classified as “Currently Rated Helpful”. When it is less than or equal to .29, it receives a “Currently Rated Not Helpful” rating. Otherwise, it’s tagged “Needs More Ratings”.
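As a quick worked example with invented numbers: a note rated helpful by a rater with combined score 0.9 and not helpful by a rater with combined score 0.2 gets a note score of \((0.9 \times 1 + 0.2 \times 0)/(0.9 + 0.2) \approx 0.82\), which lands in the “Needs More Ratings” band; a further helpful rating from a rater with combined score 0.5 lifts it to \(1.4 / 1.6 = 0.875\) and into “Currently Rated Helpful” territory.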
👇 Code to classify notes.
#' @name calculate_final_note_scores
#' @description This function uses the contributor helpfulness scores to
#'   calculate the final note score.
#' @param ratings_data data created by `create_ratings_dataset`
#' @param combined_scores_data data created by `calculate_combined_helpfulness_score`
calculate_final_note_scores <- function(
  ratings_data = ratings,
  combined_scores_data = combined_helpfulness_scores
) {
  final_note_scores <- ratings_data %>%
    left_join(
      .,
      combined_scores_data %>%
        rename(rater_id = contributor_id,
               rater_type = type),
      by = "rater_id"
    ) %>%
    group_by(post_id, contributor_id) %>%
    summarise(
      numerator = sum(combined_helpfulness_score * rate_helpful),
      denominator = sum(combined_helpfulness_score)
    ) %>%
    mutate(
      note_score = numerator / denominator
    ) %>%
    select(
      post_id, contributor_id, note_score
    ) %>%
    mutate(
      note_rating = case_when(
        note_score >= .84 ~ "Currently Rated Helpful",
        note_score <= .29 ~ "Currently Rated Not Helpful",
        TRUE ~ "Needs More Ratings"
      )
    )
  return(final_note_scores)
}

note_scores <- calculate_final_note_scores(
  ratings_data = ratings,
  combined_scores_data = combined_helpfulness_scores
)
Now that we have our scores, we can take a look and see whether the twitchers were able to accomplish anything with the strategy they pursued in this simulation. We’re primarily interested in two outcomes: 1) contributor scores, and 2) note ratings.
Let’s take a look at the contributor scores. While twitchers were able to get their author scores to be comparable to those of birders, their ratings were sufficiently different from those of birders (who on average made up the consensus of valid ratings) that their Rater Helpfulness Scores, and thus their Combined Helpfulness Scores, suffered. I’ll call that a win for Birdwatch.
Figure 3: Distribution of contributor helpfulness scores by type.
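For reference, the comparison behind Figure 3 can be summarised directly from `combined_helpfulness_scores` (a sketch; the plot itself is omitted here):

```r
# Average author, rater, and combined helpfulness scores by contributor type
combined_helpfulness_scores %>%
  group_by(type) %>%
  summarise(
    mean_author_score   = mean(author_helpfulness_score),
    mean_rater_score    = mean(rater_helpfulness_score),
    mean_combined_score = mean(combined_helpfulness_score),
    .groups = "drop"
  )
```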
But what about the notes themselves? The most important indicator, from my perspective, is whether or not the system is able to sort out fact from fiction. If everything is working properly, notes that flag `blatant_lie`s should be rated as helpful and those that flag non-misleading posts should be rated as not helpful, despite collusion by the twitchers.
In the figure below, we can see that this is the case. Under these settings and the random seed assigned above, we achieve near-perfect separation. Some notes are classified as “Needs More Ratings”, but not a single note flagging a lie is classified as “not helpful”, and no notes flagging non-misleading posts are classified as “helpful”.
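One quick way to check that separation is to cross-tabulate the final note ratings against whether the flagged post was actually a lie (a sketch using the objects defined above; recall that notes are keyed by `post_id` and `contributor_id`):

```r
# Cross-tabulate final note ratings against the truth status of the flagged post
note_scores %>%
  ungroup() %>%
  left_join(
    notes %>% distinct(post_id, contributor_id, blatant_lie),
    by = c("post_id", "contributor_id")
  ) %>%
  count(blatant_lie, note_rating)
```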