Introduction

This report analyzes data on police violence in the United States over the 20 years from 2000 to 2019. It covers the overall trend in police violence, differences between races, and a state-specific analysis that takes into account relevant protests during this period.  

Overview and Motivation

Police brutality has received increased media attention lately. The resurgence of the Black Lives Matter movement after the killing of George Floyd on May 25, 2020 has led to heated discussions on whether large-scale police violence is a result of systemic racism.  

The aim of this project is to shed light on police brutality in the United States and to compare fatality rates between races over the past 20 years, with a special focus on the difference between the African-American and White populations. Additionally, we analyze the occurrence of police violence in specific states with a history of protests against police brutality.  

Initial Questions

We first look at the trend in police-related fatalities by race across the whole United States in the years 2000-2019. We would like to see whether the rate of police fatalities changes in the years following large protests against police brutality. Since most protests have focused on police violence against African-Americans, we start by comparing the number of African-Americans killed by police, relative to their population, with the corresponding figures for other races. This shows whether African-Americans are killed by police at a disproportionately high rate, as protesters have claimed.  

After looking at the United States as a whole, we examine whether the number of police-related fatalities changed locally in the states where the protests occurred. Since there have been a large number of protests in 2020, we aim to determine how much effect, if any, protests have on systematically changing police behavior.  

Our analysis aims to answer the following questions for the nation as a whole, and for each state in which protests occurred:  

  1. Has the number of police fatalities per million people changed over time?  
  2. What is the growth rate of police-related fatalities over time?  
  3. How many people of each race are killed each year?  
  4. How many fatalities are there for each race in proportion to the population of that race? Does this differ between races?  

For the United States and each state we examine, we analyze whether the trend in each of these measures changes in the years following each protest.  

Experimental Method

The Data Scientific Method is an experimental method described by Gaussian Engineering in their article [3] on experimental methods for data science projects. It is an iterative process consisting of six steps: after the six steps are completed, one seeks feedback and acts upon it by repeating the process, several times if necessary. The six stages of the process are:  

  1. Identify - formulate the goal of the data science project and plan the project.  
  2. Understand - get a feel of the data sets in question and ensure their quality.  
  3. Process - perform steps such as “wrangling” and “cleaning” on the data sets to get them into a state ready for analysis.  
  4. Analyze - explore, inspect, and model the data to discover patterns and relationships previously unknown.  
  5. Conclude - with the results from the previous steps in hand, conclude and draw a reliable and valuable conclusion from the results.  
  6. Communicate - the essential step. Formulate the findings in such a way that readers understand the conclusions obtained from the data.  

The method has been applied through workshops in the initial phase of the project, revolving around project decisions and which questions to raise. This has been followed by the collection, interpretation, and review of data sets to ensure quality and relevance.  

The next stage of development includes data manipulation, so that the data is ready for analysis. In this stage the population data sets are reduced to a new data set grouped by race and state only, so that the total population of each race in each state is obtained. The numerical representations of states are also replaced, e.g., “36” becomes “New York”, to make the data set more intuitive to examine. A year column is derived from the date in the fatality data set, and the race and state values are renamed to match those in the population data set.  

With the data sets ready for analysis, data frames holding the data needed to answer the questions from the report’s introduction are created. Different ways to present the data, such as pie charts, bar charts, and line plots, are also explored. The conclusions to these questions are drawn from the resulting plots.  

The findings are communicated through a presentation to the entire class. This leads to feedback suggesting further questions that could reveal unknown patterns in the data, e.g., how the fatality growth rates changed in relation to protests and presidential election results. The final findings are communicated through a screencast. A link is provided in the project’s folder.  

This method is reliable because it challenges the group to define concrete questions early in the process and encourages them to understand the data sets used in the project. In addition, the process is iterative, which makes it possible to seek feedback and act on it. This method is therefore applied.  

Data

This project has two main public sources of information: Fatal Encounters and the United States Census Bureau. This section gives a short description of each source and data set, and shows how we downloaded and manipulated the data frames.  

Fatal Encounters

Fatal Encounters is an organization that aims to create a national searchable database about police brutality in the USA [4]. They collect information in three ways: paid researchers, public records requests, and crowd-sourced data. The data is available to anyone through an external link on their website [5]. Each entry in the data set describes a victim of a police-related fatality and includes the variables most relevant for this project, such as location, race, state, and year.  

Data about fatalities is downloaded directly from the Fatal Encounters website. Since the source is a Google Sheets document that is updated weekly, the data frame contains the most recent fatalities each time the code runs. The data frame includes a column with the date of each fatality, which is used to add a column containing just the year, because police-related fatalities are grouped by year in the analysis. The year 2020 is not over yet and is therefore excluded from the analysis. The names of races and states are also updated to ensure a consistent naming convention between this data frame and the population data frame.

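# Load the packages used in this analysis
library(gsheet)     # gsheet2tbl()
library(RCurl)      # getURL()
library(dplyr)      # %>%, mutate(), recode(), ...
library(lubridate)  # mdy(), year()
library(ggplot2)    # plots

# Download data frame from fatalencounters.org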
df_fatalities_2000to2019 <- gsheet2tbl("https://docs.google.com/spreadsheets/d/1dKmaV_JiWcG8XBoRgP8b4e9Eopkpgt7FL7nyspvzAsE/edit#gid=0")

# Manipulate data frame
df_fatalities_2000to2019_edited <- df_fatalities_2000to2019 %>%
  rename("Date" = "Date of injury resulting in death (month/day/year)") %>%
  mutate(Date = mdy(Date), Year = year(Date)) %>%
  filter(Year < 2020) %>%
  mutate(Race = dplyr::recode(Race,     
                              "European-American/White" = "White",      
                              "Hispanic/Latino" = "Hispanic",       
                              "African-American/Black" = "African-American",        
                              "Native American/Alaskan" = "Native American",        
                              "Asian/Pacific Islander" = "Asian")) %>%
  mutate(State = dplyr::recode(State,
                               "AL" = "Alabama",
                               "AK" = "Alaska",
                               "AZ" = "Arizona",
                               "AR" = "Arkansas",
                               "CA" = "California",
                               "CO" = "Colorado",
                               "CT" = "Connecticut",
                               "DE" = "Delaware",
                               "DC" = "District of Columbia",
                               "FL" = "Florida",
                               "GA" = "Georgia",
                               "HI" = "Hawaii",
                               "ID" = "Idaho",
                               "IL" = "Illinois",
                               "IN" = "Indiana",
                               "IA" = "Iowa",
                               "KS" = "Kansas",
                               "KY" = "Kentucky",
                               "LA" = "Louisiana",
                               "ME" = "Maine",
                               "MD" = "Maryland",
                               "MA" = "Massachusetts",
                               "MI" = "Michigan",
                               "MN" = "Minnesota",
                               "MS" = "Mississippi",
                               "MO" = "Missouri",
                               "MT" = "Montana",
                               "NE" = "Nebraska",
                               "NV" = "Nevada",
                               "NH" = "New Hampshire",
                               "NJ" = "New Jersey",
                               "NM" = "New Mexico",
                               "NY" = "New York",
                               "NC" = "North Carolina",
                               "ND" = "North Dakota",
                               "OH" = "Ohio",
                               "OK" = "Oklahoma",
                               "OR" = "Oregon",
                               "PA" = "Pennsylvania",
                               "RI" = "Rhode Island",
                               "SC" = "South Carolina",
                               "SD" = "South Dakota",
                               "TN" = "Tennessee",
                               "TX" = "Texas",
                               "UT" = "Utah",
                               "VT" = "Vermont",
                               "VA" = "Virginia",
                               "WA" = "Washington",
                               "WV" = "West Virginia",
                               "WI" = "Wisconsin",
                               "WY" = "Wyoming"))

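The long state recode above could also be expressed as a lookup vector. The following is a minimal sketch, not the code used in this project: it relies on the built-in R vectors state.abb and state.name, which cover the 50 states, and adds the District of Columbia by hand. The names state_lookup, abbreviations, and full_names are hypothetical.

```r
# Named vector mapping abbreviations to full state names.
# state.abb and state.name ship with base R; DC is not included, so it is added manually.
state_lookup <- c(setNames(state.name, state.abb), DC = "District of Columbia")

# Hypothetical abbreviations as they might appear in the State column
abbreviations <- c("NY", "DC", "MO")

# Index the lookup vector by abbreviation and drop the names
full_names <- unname(state_lookup[abbreviations])
```

Note that an abbreviation missing from the lookup becomes NA here, whereas dplyr::recode leaves unmatched character values unchanged.
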
Here is an example of the format of the Fatality data frame.

## # A tibble: 6 x 6
##   Name                  Gender   Age Race             State       Year
##   <chr>                 <chr>  <dbl> <chr>            <chr>      <dbl>
## 1 Samuel H. Knapp       Male      17 White            California  2000
## 2 Mark A. Horton        Male      21 African-American Michigan    2000
## 3 Phillip A. Blurbridge Male      19 African-American Michigan    2000
## 4 Mark Ortiz            Male      23 Hispanic         New Mexico  2000
## 5 LaTanya Janelle McCoy Female    24 African-American California  2000
## 6 Lester Miller         Male      53 Race unspecified Georgia     2000

United States Census Bureau

The bureau provides data sets about the population of the USA. A national census is completed each decade, and a data set is made available afterward. Thus, the actual population of the USA is known for only two of the years in our period (2000 and 2010), and the figures for the other years are population estimates. A detailed description [6] of the data set is provided by the bureau.  

Both data sets (actual and estimated population) group the population by age, gender, race, and state. For this analysis, the total population of each race is the most interesting data point.  

Two data frames were downloaded from the United States Census Bureau, one for 2000-2010 and one for 2010-2019. They have the same format, so they are edited in the same manner. The rows that would cause double counting in the total population estimate are filtered out. For example, the SEX variable equals 0 for rows covering both males and females, 1 for males only, and 2 for females only. Therefore, only the rows where SEX equals 0 are included. The two data frames are then merged into one that covers all 20 years. Finally, the names of races and states are updated to match the fatality data frame.
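
The effect of the SEX filter can be illustrated on a small example. The sketch below uses made-up numbers in a simplified version of the Census layout; toy_census, naive_total, and filtered_total are hypothetical names that do not appear in the project code.

```r
library(dplyr)

# Hypothetical toy data: SEX 0 = both sexes, 1 = males only, 2 = females only
toy_census <- data.frame(
  STATE = c(1, 1, 1),
  RACE  = c(1, 1, 1),
  SEX   = c(0, 1, 2),
  POPESTIMATE2000 = c(100, 48, 52)
)

# Summing every row counts each person twice:
# once in the combined row and once in the row for their sex
naive_total <- sum(toy_census$POPESTIMATE2000)

# Keeping only the SEX == 0 rows gives the actual population
filtered_total <- toy_census %>%
  filter(SEX == 0) %>%
  summarize(pop2000 = sum(POPESTIMATE2000)) %>%
  pull(pop2000)
```

Here naive_total is twice filtered_total. The ORIGIN filter in the project code follows the same reasoning.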

# Download data frames directly from the United States Census Bureau
df_population_2000to2010 <- read.csv(text = getURL("https://www2.census.gov/programs-surveys/popest/datasets/2000-2010/intercensal/county/co-est00int-sexracehisp.csv"))

df_population_2010to2019 <- read.csv(text = getURL("https://www2.census.gov/programs-surveys/popest/tables/2010-2019/state/asrh/sc-est2019-alldata6.csv"))

# Find the population grouped by race and state
df_population_2000to2010_edited <- df_population_2000to2010 %>%
  filter(SEX == 0, ORIGIN == 0) %>%
  group_by(STATE, RACE) %>% 
  rename("State" = "STATE", "Race" = "RACE") %>%
  dplyr::summarize(pop2000 = sum(POPESTIMATE2000), 
                   pop2001=sum(POPESTIMATE2001),
                   pop2002=sum(POPESTIMATE2002),
                   pop2003=sum(POPESTIMATE2003),
                   pop2004=sum(POPESTIMATE2004),
                   pop2005=sum(POPESTIMATE2005),
                   pop2006=sum(POPESTIMATE2006),
                   pop2007=sum(POPESTIMATE2007),
                   pop2008=sum(POPESTIMATE2008),
                   pop2009=sum(POPESTIMATE2009))

df_population_2010to2019_edited <- df_population_2010to2019 %>%
  filter(SEX == 0, ORIGIN == 0) %>%
  group_by(STATE, RACE) %>%
  rename("State" = "STATE", "Race" = "RACE") %>%
  dplyr::summarize(pop2010 = sum(CENSUS2010POP),
                   pop2011=sum(POPESTIMATE2011),
                   pop2012=sum(POPESTIMATE2012),
                   pop2013=sum(POPESTIMATE2013),
                   pop2014=sum(POPESTIMATE2014),
                   pop2015=sum(POPESTIMATE2015),
                   pop2016=sum(POPESTIMATE2016),
                   pop2017=sum(POPESTIMATE2017),
                   pop2018=sum(POPESTIMATE2018),
                   pop2019=sum(POPESTIMATE2019))

# Merge into one population data frame
df_population_2000to2019 <- merge(df_population_2000to2010_edited, df_population_2010to2019_edited) %>%
  mutate(Race = dplyr::recode(Race,     
                              "1" = "White",
                              "2" = "African-American",
                              "3" = "Native American",
                              "4" = "Asian",
                              "5" = "Pacific Islander",
                              "6" = "Two or more races")) %>%
  mutate(State = dplyr::recode(State,
                               "1" = "Alabama",
                               "2" = "Alaska",
                               "4" = "Arizona",
                               "5" = "Arkansas",
                               "6" = "California",
                               "8" = "Colorado",
                               "9" = "Connecticut",
                               "10" = "Delaware",
                               "11" = "District of Columbia",
                               "12" = "Florida",
                               "13" = "Georgia",
                               "15" = "Hawaii",
                               "16" = "Idaho",
                               "17" = "Illinois",
                               "18" = "Indiana",
                               "19" = "Iowa",
                               "20" = "Kansas",
                               "21" = "Kentucky",
                               "22" = "Louisiana",
                               "23" = "Maine",
                               "24" = "Maryland",
                               "25" = "Massachusetts",
                               "26" = "Michigan",
                               "27" = "Minnesota",
                               "28" = "Mississippi",
                               "29" = "Missouri",
                               "30" = "Montana",
                               "31" = "Nebraska",
                               "32" = "Nevada",
                               "33" = "New Hampshire",
                               "34" = "New Jersey",
                               "35" = "New Mexico",
                               "36" = "New York",
                               "37" = "North Carolina",
                               "38" = "North Dakota",
                               "39" = "Ohio",
                               "40" = "Oklahoma",
                               "41" = "Oregon",
                               "42" = "Pennsylvania",
                               "44" = "Rhode Island",
                               "45" = "South Carolina",
                               "46" = "South Dakota",
                               "47" = "Tennessee",
                               "48" = "Texas",
                               "49" = "Utah",
                               "50" = "Vermont",
                               "51" = "Virginia",
                               "53" = "Washington",
                               "54" = "West Virginia",
                               "55" = "Wisconsin",
                               "56" = "Wyoming"))

Here is an example of the format of the Population data frame. As shown, it includes the number of people of each race in each state by year. The columns after 2005 are omitted here for readability, but the data frame includes columns with the yearly data up to 2019.

##     State              Race pop2000 pop2001 pop2002 pop2003 pop2004 pop2005
## 1 Alabama             White 3196875 3201387 3204279 3215079 3227772 3249443
## 2 Alabama  African-American 1161454 1167403 1171672 1178398 1186375 1197062
## 3 Alabama   Native American   23262   24084   24966   25920   26755   27606
## 4 Alabama             Asian   32544   33971   35810   37929   40712   43406
## 5 Alabama  Pacific Islander    1608    1963    2272    2625    2983    3290
## 6 Alabama Two or more races   36430   38826   41090   43540   46132   48998

Protests

Information about when protests occurred and the background behind these protests is obtained from a Bloomberg News article written in June 2020 [7].

Analysis

National-Level Analysis

The following analysis is done on a national level, meaning that fatalities throughout the entire United States are taken into account, regardless of where in the country they occurred. This gives an overview of the development of police violence before we analyze developments within specific states.  

Fatality Trend

The first graph shows the number of fatalities per million people in the United States from 2000-2019. This is done by first creating a data frame that includes the total population in the United States each year and the total number of fatalities for each year. Then, the proportion of fatalities per million people for each year is calculated by dividing the number of fatalities by the population and multiplying by a million.

# Function to create a data frame with the proportion of total fatalities to the total population for each year
make_dataframe_by_year <- function(){
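  pop_df <- tibble()
  
  for(year in 2000:2019){
    # Name of the population column for this year, e.g. "pop2000"
    popyear <- paste("pop", year, sep = "")
    
    tmp <- df_population_2000to2019 %>%
      # all_of() selects the column whose name is stored in popyear
      dplyr::select(Race, all_of(popyear)) %>%
      rename("Pop" = all_of(popyear)) %>%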
      mutate(Year = year, Population = sum(Pop)) %>%
      dplyr::select(Year, Population) %>%
      unique()
    
    pop_df <- rbind(pop_df, tmp)
  }
  
  fat_df <- df_fatalities_2000to2019_edited %>%
    group_by(Year) %>%
    dplyr::summarize(Fatalities = n())
  
  new_df <- merge(pop_df, fat_df) %>%
    mutate(Proportion = (Fatalities/Population)*1000000)
  
  return(new_df)
}

# Create a data frame with the proportion of total fatalities by population each year
df_totalpropfat <- make_dataframe_by_year()
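
As an alternative to looping over the years, the same per-million calculation can be sketched by reshaping the wide population columns into a long format. This is not the approach used in this project. It assumes the tidyr package is available, and the toy data frames and their values are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical excerpt in the same wide layout as the population data frame
toy_population <- data.frame(
  State   = c("Alabama", "Alabama", "Alaska", "Alaska"),
  Race    = c("White", "African-American", "White", "African-American"),
  pop2000 = c(3000000, 1000000, 400000, 20000),
  pop2001 = c(3010000, 1005000, 402000, 21000)
)

# Hypothetical yearly fatality counts
toy_fatalities <- data.frame(Year = c(2000L, 2001L), Fatalities = c(10L, 12L))

toy_rates <- toy_population %>%
  # Reshape pop2000, pop2001, ... into one row per state, race, and year
  pivot_longer(cols = starts_with("pop"),
               names_to = "Year",
               names_prefix = "pop",
               values_to = "Pop") %>%
  mutate(Year = as.integer(Year)) %>%
  # Total population per year across all states and races
  group_by(Year) %>%
  summarize(Population = sum(Pop)) %>%
  # Attach the fatality counts and compute fatalities per million people
  inner_join(toy_fatalities, by = "Year") %>%
  mutate(Proportion = (Fatalities / Population) * 1000000)
```

Grouping by State and Year instead of Year alone would give the state-level version of the same table.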


# Plot: Number of fatalities per million people over time
plot_proportion_fatalities <- df_totalpropfat %>%
  ggplot(aes(x=Year, y=Proportion)) +
  geom_line() +
  theme_minimal() +
  geom_vline(xintercept = c(2006, 2011, 2014, 2015, 2016), linetype = "dashed") +
  labs(title="Police-Related Fatalities per Million People", 
       subtitle="United States, 2000-2019",
       caption = "Figure 1. This graph shows the number of police-related fatalities per million people each year in the
       United States from 2000-2019. The vertical dashed lines represent the years where there were 
       major protests against police brutality in the United States.",
       y = "Number of Fatalities") +
  theme(plot.caption = element_text(hjust = 0))

The second graph shows the growth rate of police fatalities relative to the previous year. The change in the number of fatalities per million people since the previous year is divided by the number of fatalities per million in the previous year, and the result is multiplied by 100 to obtain a percentage.

# Plot: Fatality growth rate
plot_growthrate <- df_totalpropfat %>%
  mutate(Previous_Year = lag(Proportion, 1), 
         Change = Proportion - Previous_Year, 
         Growth_Rate = (Change/Previous_Year)*100) %>%
  ggplot(aes(x = Year, y = Growth_Rate)) +
  geom_line() +
  theme_minimal() +
  geom_vline(xintercept = c(2006, 2011, 2014, 2015, 2016), linetype = "dashed") +
    labs(title="Growth Rate of Police-Related Fatalities per Year",
       subtitle = "United States, 2000-2019",
       caption = "Figure 2. This graph shows the growth rate of police-related fatalities per million people compared to the previous year 
       in the United States from 2000-2019. The vertical dashed lines represent the years where there were major 
       protests against police brutality in the United States.",
       y="Growth Rate since Previous Year") +
  theme(plot.caption = element_text(hjust = 0))
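
The growth-rate calculation can be checked on a short series. The sketch below applies the same lag-based steps to hypothetical values; toy_rates and toy_growth are illustrative names, and the calculation assumes the rows are sorted by year.

```r
library(dplyr)

# Hypothetical fatalities-per-million series, sorted by year
toy_rates <- data.frame(Year = 2000:2003,
                        Proportion = c(3.0, 3.3, 3.3, 2.97))

toy_growth <- toy_rates %>%
  mutate(Previous_Year = lag(Proportion, 1),               # value from the year before
         Change = Proportion - Previous_Year,              # absolute change
         Growth_Rate = (Change / Previous_Year) * 100)     # change in percent
```

The first year has no previous value, so its growth rate is NA. In the plots this means the growth-rate line effectively starts in 2001.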

The two plots show the development of the total number of police-related fatalities per million people from 2000 to 2019 (Fig. 1) and the corresponding growth rate (Fig. 2). The graphs show that the number of police-related fatalities per million people has been increasing steadily over the past 20 years, despite a varying growth rate. In 2000 there were slightly more than 3 fatalities per million inhabitants, but by 2019 this had increased to slightly more than 6 (Fig. 1). The dashed vertical lines mark the years in which major protests occurred. Although the total fatalities per million continued increasing after the protests in 2006 and 2011, the growth rate decreased for a short period after them before increasing again. The period 2014-2016 had major protests every year, and the number of fatalities decreased during that period. However, the number of fatalities started increasing again each year after that, suggesting that the protests did not have a long-term effect on police brutality.  

State-Specific Analysis

The following analysis is done for specific states in which large protests took place at some point between 2000 and 2019. This enables us to compare how the protests affected the development of police-related fatalities within the state and on a national level. The states included in this analysis are New York, Missouri, Maryland, Louisiana, and Minnesota.  

The plots for each state are similar to the plots created on the national level. For each state we focus on the number of fatalities per million people and the growth rate, along with the number of fatalities per million by race. Many of the states lacked data on Asian, Hispanic, and Native American fatalities or population, so we focus on comparing only Whites and African-Americans. All of the protests we cover were directed against systemic racism and police brutality against African-Americans, so it is fitting to focus on them. We discuss the protests that have occurred in each state and determine whether they had any effect on police-related fatalities in that state.

To find the number of fatalities per million people in each state, the total number of fatalities and the total population are first computed for each state and year. The number of fatalities is then divided by the state’s population and multiplied by a million. The growth rate is found using the same method as before, but with the data from the specified state.

# Function to create a data frame with the number of fatalities per million grouped by year and state
make_dataframe_by_year_state <- function(){
  pop_df <- tibble()
  
  for(year in 2000:2019){
    # Name of the population column for this year, e.g. "pop2000"
    popyear <- paste("pop", year, sep = "")
    
    tmp <- df_population_2000to2019 %>%
      rename("Pop" = all_of(popyear)) %>%
      group_by(State) %>%
      mutate(Year = year, Population = sum(Pop)) %>%
      dplyr::select(State, Population, Year) %>%
      unique()
    
    pop_df <- rbind(pop_df, tmp)
  }
  
  fat_df <- df_fatalities_2000to2019_edited %>%
    group_by(State, Year) %>%
    dplyr::summarize(Fatalities = n()) %>%
    dplyr::select(State, Year, Fatalities) %>%
    unique()
  
  new_df <- merge(pop_df, fat_df) %>%
    mutate(Proportion = (Fatalities/Population)*1000000)
  
  return(new_df)
  
}

# Create a data frame with the number of fatalities per million grouped by year and state
df_propbyyearstate <- make_dataframe_by_year_state()

# Function to create a plot for the number of fatalities per million people in a specific state
createplot_propfat_state <- function(state, year_vec, fig_num){
  return(df_propbyyearstate %>%
    filter(State == state) %>%
    ggplot(aes(x=Year, y=Proportion)) +
    geom_line() +
    theme_minimal() +
    geom_vline(xintercept = year_vec, linetype = "dashed") +
    labs(title="Number of Police-Related Fatalities per Million People",
        subtitle=paste0(state, ", ", "2000-2019"),
        caption = paste0("Figure ", as.character(fig_num), ". This graph shows the total number of police-related fatalities per million people
        per year in the state of ", state, " from 2000-2019. The proportion was found by taking the number
        of fatalities in the state over the population of the state. The vertical dashed lines represent the 
        years where there were major protests against police brutality in the state of ", state, "."),
        y = "Number of Fatalities per Million People") +
    theme(plot.caption = element_text(hjust = 0)))
}

# Function to create a plot for the growth rate of total fatalities for a state
createplot_growthrate_state <- function(state, year_vec, fig_num){
  return(df_propbyyearstate %>%
    filter(State == state) %>%
    mutate(Previous_Year = lag(Proportion, 1), 
          Change = Proportion - Previous_Year, 
          Growth_Rate= (Change/Previous_Year)*100) %>%
    ggplot(aes(x = Year, y = Growth_Rate)) +
    geom_line() +
    theme_minimal() +
    geom_vline(xintercept = year_vec, linetype = "dashed") +
    labs(title="Growth Rate of Police Fatalities per Year",
        subtitle = paste0(state, ", ", "2000-2019"),
        caption = paste0("Figure ", as.character(fig_num), ". This graph shows the growth rate of police-related fatalities compared to the previous year in the
        state of ", state, " from 2000-2019. The vertical dashed lines represent the years where there 
        were major protests against police brutality in the state of ", state, "."),
        y="Percent Growth since Previous Year") +
    theme(plot.caption = element_text(hjust = 0)))
}

# Function to create a data frame with the number of fatalities per million grouped by year, race, and state
make_dataframe_by_year_race_state <- function(){
  races <- list("White", "African-American", "Asian", "Native American")
  new_df <- tibble()
  
  for(year in 2000:2019){
    # Name of the population column for this year, e.g. "pop2000"
    popyear <- paste("pop", year, sep = "")
    
    for(race in races){
      # Population of this race in each state for this year
      pop_df <- df_population_2000to2019 %>%
        filter(Race == race) %>%
        mutate(Year = year) %>%
        dplyr::select(State, Race, all_of(popyear), Year) %>%
        rename("Population" = all_of(popyear))
      
      # Fatalities of this race in each state for this year
      fat_df <- df_fatalities_2000to2019_edited %>%
        group_by(State, Year, Race) %>%
        dplyr::summarize(Fatalities = n()) %>%
        filter(Year == year, Race == race)
      
      # Number of fatalities per 1,000,000 people
      # Formula: (# fatalities by race/Population by race) * 1000000
      tmp <- merge(pop_df, fat_df) %>%
        mutate(Proportion = (Fatalities/Population)*1000000)
      
      new_df <- rbind(new_df, tmp)
    }
  }
  return(new_df)
}

# Create a data frame with the number of fatalities per million grouped by race in each state
df_propfatbystate <- make_dataframe_by_year_race_state()

# Function to create a plot for the proportion of fatalities by race in a specific state
createplot_raceproportion_state <- function(state, year_vec, fig_num){
  return(df_propfatbystate %>%
    filter(State == state) %>%
    filter(Race == "African-American" | Race == "White") %>%
    ggplot(aes(x=Year, y =Proportion, group=Race)) +
    geom_line(aes(color=Race)) +
    theme_minimal() +
    geom_vline(xintercept = year_vec, linetype = "dashed") +
    scale_color_manual(values = c("black","tan"), name = "") +
    labs(title="Number of Police-Related Fatalities per Million People by Race",
        subtitle=paste0(state, ", ", "2000-2019"),
        caption = paste0("Figure ", as.character(fig_num), ". This graph shows the number of police-related fatalities per million people of each race
        per year in the state of ", state, " from 2000-2019. The proportion was found by taking the number
        of fatalities of a race over the population of that race. The vertical dashed lines represent the 
        years where there were major protests against police brutality in the state of ", state, "."),
        y = "Number of Fatalities per Million People") +
    theme(plot.caption = element_text(hjust = 0)))
}

New York

In New York, protests against police brutality occurred in 2006 and 2014. In 2006, a court finding three police officers not guilty in the shooting of three African-American men led to peaceful protests. The total number of police-related fatalities did not decrease in the years following this protest; in fact, it continued increasing. However, the proportion of African-Americans killed by police dropped drastically in 2007, suggesting that there was possibly less systemic racism after the protest. This did not last, however, and the proportion of African-Americans killed by police started rising again in 2008. In 2014, an African-American man died after being held in a chokehold by a police officer. After the officer was not indicted, the phrase “I can’t breathe” became a rallying cry, and protests against the decision went on for several nights. The graphs show that total police-related fatalities and the proportion of African-Americans killed by police decreased after 2014. However, both had already been decreasing since 2013, so it is hard to conclude whether the continuing decline happened in part because of the protests.  

In response to the protests in 2014, the New York Assembly drafted a bill, named after the victim, that makes the use of chokeholds by police illegal. It was not passed until June 2020. It will be interesting to see whether this bill has any effect on police-related fatalities in New York in the coming years.  

Missouri

In Missouri, two major protests have occurred during the past 20 years, one in 2011 and a second in 2014.  

The plots show an increasing number of fatalities in advance of both protests. This is occurring both for total fatalities, and for fatalities within the White and the African American population. The outcome of the two protests are, however, a bit different. Based on the results from Missouri alone, it seems that the protests may have had an effect on the growth rate of police fatalities, as it decreased in the aftermath of both of the protests. Despite this, the occurrence is not reliable evidence alone, as the growth rate tends to spike every other year.  

The protest in 2011 followed the police shooting and killing of Anthony Lamar Smith. This led to protests in St. Louis that lasted for several days. In the aftermath, the family of Anthony Lamar Smith settled a lawsuit with the St. Louis Police for $900,000. The family received another $500,000 a year later after the case was re-opened.  

The protest in 2014 happened after Michael Brown was shot by a police officer, who fired 12 times, after Brown had left a convenience store. Michael Brown was unarmed and the police officer was never charged. The protests in the aftermath were violent and led to the police using tear gas against the crowd of demonstrators. Another set of protests happened after 12-year-old Tamir Rice was shot in Cleveland, Ohio, not long after Brown’s death. In the aftermath, the 1033 program, which transfers surplus military equipment to police departments, was restricted by President Obama (although the restrictions were lifted in 2017 by President Trump). Black Lives Matter also launched Campaign Zero in 2015, after the protests.  

The aftermath of the protests in 2014 could explain the decrease in police-related fatality cases after the protests. Despite this, the plots show a big increase in the number of fatalities from 2015 to 2019. This indicates that the protests in Missouri alone did not have a large effect on the fatality rate in the long term.  

Maryland

Maryland has had only one major protest in the last 20 years. The protest occurred in 2015 after the unexplained death of Freddie Gray. Gray died one week after his arrest from severe injuries to his neck. The unexplained circumstances of his death sparked the protests, and the citizens of Maryland demanded answers and insight into the investigation of the incident.  

The plots for Maryland show that police-related fatalities decreased in the early 2000s. After that, total fatalities rose steadily until 2007, dropped briefly, and then increased sharply until the protest. After the protest, the graphs show that total fatalities per million continued to increase, while fatalities among African-Americans decreased and fatalities among White people increased.  
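The per-million rates mentioned above normalize a raw count by the size of the population it refers to, so that groups and states of different sizes can be compared. A minimal Python sketch of that normalization follows; the function name `per_million` and the numbers in the usage line are hypothetical and do not come from the report's data.

```python
def per_million(fatalities, population):
    """Return the number of fatalities per one million residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return fatalities / population * 1_000_000


# Placeholder inputs for illustration only; NOT values from the report's data.
print(round(per_million(12, 6_000_000), 2))
```

Applying the same function separately to each racial group, with that group's own population as the denominator, gives rates that can be compared across groups.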

Louisiana

Protests occurred in Louisiana in 2016 after an African-American man was shot and killed by a police officer who thought that the victim was reaching for a gun, although videos showed that the victim was not moving. Although a civil rights investigation into the shooting was opened, the officers involved were not charged. However, the officer who fired the fatal shot was dismissed. The graphs show a large spike in the proportion of African-American fatalities in 2017, the year after the protest, along with an increase in total police-related fatalities for all races. Although the protests were non-violent, the police responded with a heavily militarized presence and many protesters were arrested. This increase in tension may have contributed to the spike in 2017, possibly because officers felt more aggression towards people of color, although our data cannot confirm this. After 2017, police-related fatalities started decreasing, possibly as the tension passed and the protests receded into the past.  

Minnesota

Protests occurred in Minnesota in 2016 the day after the protests in Louisiana, and President Barack Obama gave a speech about both incidents in the following days. The protests in Minnesota were sparked when an African-American man was shot and killed in his car by a police officer while reaching into his glove compartment, even though he had calmly alerted the officer that he had a concealed weapon there that he was licensed to carry. The victim’s girlfriend was in the car at the time and live-streamed the aftermath of the shooting on Facebook. However, the officer was found not guilty. These protests spread across the United States and were more violent; police officers were also injured. The police department in the Minnesota city concerned started receiving additional training in 2017. It also planned increased data transparency, body cameras, and a ban on a training course that taught officers to shoot if they felt threatened. However, it never followed through on these plans.

There was a decrease in African-American and overall fatalities in 2017 and 2018 following the additional training, but it does not seem to have had a long-term effect, as police-related fatalities, particularly for African-Americans, spiked sharply again in 2019.  

Final Analysis

National Level

The number of fatalities for each race increased every year from 2000 to 2013, followed by a drop in White fatalities and a leveling off of fatalities for African-Americans and Hispanics. However, this is also around the time when “Race unspecified” reports increase, meaning that race was reported less frequently, so it is hard to say whether the number of fatalities for each race actually decreased.  

Although there are more White fatalities in total, Native-American and African-American fatalities are over-represented relative to their share of the population, with African-Americans having the highest proportion of fatalities. Over the past 20 years, the proportion of African-American fatalities has increased steadily, while the proportion of White fatalities increased slightly and then leveled off around 2013.  

In 2019 alone, African-Americans were the only group that made up a higher percentage of fatalities than of the population. African-Americans made up 13% of the population, yet 24% of the fatalities. Whites made up 76% of the population, yet only 36% of the fatalities.  
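One way to summarize the 2019 figures quoted above is a representation ratio: a group's share of fatalities divided by its share of the population, where a value above 1 indicates over-representation. The Python sketch below uses only the four percentages stated in the text; the function name `representation_ratio` is a hypothetical helper, and the ratio is an illustrative summary rather than a measure computed in the report itself.

```python
def representation_ratio(fatality_pct, population_pct):
    """Share of fatalities divided by share of population (both in percent)."""
    return fatality_pct / population_pct


# 2019 percentages as quoted in the text: (share of fatalities, share of population).
shares_2019 = {
    "African-American": (24, 13),
    "White": (36, 76),
}

for group, (fatality_pct, population_pct) in shares_2019.items():
    ratio = representation_ratio(fatality_pct, population_pct)
    print(f"{group}: {ratio:.2f}")
# Prints:
# African-American: 1.85
# White: 0.47
```

Read this way, African-Americans accounted for nearly twice the share of fatalities that their population share would predict, while Whites accounted for roughly half.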

Our graphs show no clear trend in the number of police-related fatalities nationally in the years after protests occurred. This is most likely because police departments are regulated at the state and city level, so changes at that level would be too small to show up nationally.  

State Level

In the states analyzed, the protests did not appear to have a long-term effect on systemic racism against African-Americans or on police-related fatalities overall. New York, Missouri, Maryland, and Minnesota all showed a lower proportion of African-American fatalities for about two years following the protests, but this proportion started increasing again afterwards. Although protests help to increase social awareness of police brutality, our data suggest that they have not yet had a long-term effect on it.  

Conclusion

Police-related fatalities have increased overall since 2000. While the data do not provide direct evidence that police-related fatalities are a result of systematic racism, this remains an important topic to discuss and warrants further investigation.  

Although we realize that the factors that systematically change how police behave are things such as policy changes and elections, we decided to look at protests because they raise social awareness and can put pressure on people in power to make changes, in addition to affecting how people vote and thereby bringing in politicians who can change the police system.  

The protests in 2020 that motivated this project have been larger and more widespread than ever before. They have spread to all 50 states, as well as to other countries around the world. As a result, there have been more changes at the local level. The city of Los Angeles, California, decided to cut the police department’s budget and put that money into programs that help African-American communities. In Minneapolis, Minnesota, the city council has announced its intent to disband the police department. Additionally, all the officers involved in the death of George Floyd, the victim who motivated the protests this year, have been arrested and charged in connection with his death. This differs from the earlier cases discussed, in which officers were not charged or were not convicted. It will be interesting to see in the coming years whether the protests of 2020 have any long-term effect on police brutality, given the large amount of media attention and awareness of police methods they have brought about.


  1. https://mappingpoliceviolence.org/

  2. https://policeviolencereport.org/

  3. https://www.gauseng.com/single-post/2019/02/14/A-Data-Scientific-Method

  4. https://fatalencounters.org/

  5. https://docs.google.com/spreadsheets/d/1dKmaV_JiWcG8XBoRgP8b4e9Eopkpgt7FL7nyspvzAsE/edit#gid=0

  6. https://www2.census.gov/programs-surveys/popest/technical-documentation/file-layouts/2010-2019/sc-est2019-alldata6.pdf

  7. https://www.bloomberg.com/news/articles/2020-06-09/a-history-of-protests-against-police-brutality