Introduction

In this assignment, you will demonstrate your ability to understand and conduct a regression discontinuity design by replicating the main analysis from the article “Deadly Populism: How Local Political Outsiders Drive Duterte’s War on Drugs in the Philippines” (Ravanilla et al., Journal of Politics, 2022).

Formalities

Your assignment must be a maximum of 18,000 characters.

The requirements below are standard for nearly all working papers in political science and for journal submissions, so please stick to these conventions:

12-point font
1 inch margins
Double-spaced
No Table of Contents
18,000 characters maximum

If you write your assignment in LaTeX (e.g. in Overleaf), that means:

\documentclass[12pt, a4paper]{article}
\usepackage[margin = 1in]{geometry}
\usepackage{setspace}
\doublespacing after your \begin{document}

If you want to use the Overleaf template that I use for my own academic manuscripts, you can find it here.

Expectations

The technical side of your assignment should, of course, be well executed. But put much of your energy into the structure and clarity of your writing. Students often miss the grade they hoped for because the written part of their assignment is unclear. A few points:

When discussing a method or test used in the article, provide the intuition behind it in plain language (non-technical, non-abstract language)
- i.e. explain a method or test in plain non-technical language before turning to any abstract or technical language
When discussing a statistical test, provide the theoretical or substantive reason why it is being used (again, in plain, non-technical, non-abstract language)
Even if your audience understands the technical side of things, you always want to give a reader some intuition behind what you’re doing

Instructions

Data: Assignment_2C_Data.zip

Note: The authors’ Stata code is included in the replication data above. You are free to look at it to see how the authors run their models.

Codebook (for the variables you will use only):

unaligned_vote_share_margin: Vote margin for outsider mayors
tight: Binary variable indicating whether election was “very close” (<5% margin)
treatment: Whether an outsider mayor was newly elected in the post-Duterte period
municipality: Municipality ID
postduterte: Binary variable for post-Duterte period (i.e. time period variable)
region2: Region ID
—
fatal_acled2: Any fatal drug war incident
fatal_acled2_diff: Any fatal drug war incident (pre-post Duterte difference)
fatal_pnp2: Fatal Philippine National Police incident
fatal_pnp2_diff: Fatal Philippine National Police incident (pre-post Duterte difference)
fatal_vigilante2: Fatal incident by vigilantes
fatal_vigilante2_diff: Fatal incident by vigilantes (pre-post Duterte difference)
—
amount_pcap_M: Public works procurement
amount_pcap_M_diff: Public works procurement (pre-post Duterte difference)
horizonal_pcap_M: Roads/flood abatement procurement
horizonal_pcap_M_diff: Roads/flood abatement procurement (pre-post Duterte difference)
vertical_pcap_M: Schools/health facilities procurement
vertical_pcap_M_diff: Schools/health facilities procurement (pre-post Duterte difference)
—
non_establishment_mayor: Whether municipality has an outsider mayor
drug_related_pcap: Drug-related crime rate per capita
assault_pcap: Assaults crime rate per capita
theft_pcap: Theft crime rate per capita
total_nodrug_pcap: Total (non-drug-related) crime rate per capita

Your assignment must proceed as follows:

Read the article: “Deadly Populism: How Local Political Outsiders Drive Duterte’s War on Drugs in the Philippines”.

Now, do the following for your written assignment and accompanying R code:

Introduction

Briefly introduce the article that you are replicating:
- Introduce the topic
- Introduce the research question
- Introduce the hypothesis or hypotheses

Research design

Explain the authors’ research design:
- The empirical strategy and data
- The identifying assumptions
- i.e. how does their research design allow them to estimate a causal effect and what are the potential threats to causal inference?
  - Note: It should be clear from this section that you know how a regression discontinuity design works.
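
To make the logic of the design concrete, here is a minimal sketch on simulated data (it does not touch the replication data). The names margin and wins, the fixed bandwidth h = 0.15, and the true jump of 0.3 are all assumptions chosen for illustration. The article itself relies on an optimal bandwidth chosen by rdrobust, so treat this only as the intuition, not as the authors' specification.

```r
# Simulated sharp RD: every name and number here is hypothetical
set.seed(42)
n         <- 5000
margin    <- runif(n, -0.5, 0.5)        # running variable: vote margin, cutoff at 0
wins      <- as.numeric(margin > 0)     # treatment switches on exactly at the cutoff
true_jump <- 0.3
y         <- 0.5 * margin + true_jump * wins + rnorm(n, sd = 0.2)

sim <- data.frame(y, margin, wins)

# Local linear regression: keep only races within h of the cutoff
# and allow a separate slope on each side of it
h     <- 0.15
local <- subset(sim, abs(margin) <= h)
fit   <- lm(y ~ wins * margin, data = local)

# The coefficient on `wins` is the estimated jump in y at margin = 0
rd_estimate <- unname(coef(fit)["wins"])
rd_estimate
```

Because only observations close to the cutoff are used, the comparison is between units that are plausibly similar apart from barely winning or barely losing, which is the core identifying idea you should explain in this section.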

Analysis

Note: Results from the replication data will be slightly different from those in the article (e.g. sample sizes or regression estimates vary somewhat from those in the article). Part of the reason is that the calculation of the optimal bandwidth in R’s rdrobust is different from the analogous function in Stata (which the authors use). The number of observations included in the regression will therefore differ, and thus so will the estimates.
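
The point about bandwidths can be seen in a small simulation. The sketch below uses made-up data and three arbitrary bandwidths (0.05, 0.10, 0.20), and estimate_at() is a hypothetical helper, not part of rdrobust. It only illustrates that a different bandwidth changes which observations enter the regression and therefore the estimate.

```r
# Simulated data: names and values are hypothetical
set.seed(1)
n      <- 5000
margin <- runif(n, -0.5, 0.5)
wins   <- as.numeric(margin > 0)
y      <- 0.5 * margin + 0.3 * wins + rnorm(n, sd = 0.2)
sim    <- data.frame(y, margin, wins)

# Estimate the jump at the cutoff using only observations within bandwidth h
estimate_at <- function(h, data) {
  local <- data[abs(data$margin) <= h, ]
  fit   <- lm(y ~ wins * margin, data = local)
  data.frame(bandwidth = h,
             n_used    = nrow(local),
             estimate  = unname(coef(fit)["wins"]))
}

# One row per bandwidth: sample size and estimate both move with h
bw_table <- do.call(rbind, lapply(c(0.05, 0.10, 0.20), estimate_at, data = sim))
bw_table
```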

Replicate Figure 4.
- Explain what the figure is showing and why the authors are showing it (also report and explain the p-value of the McCrary sorting test).
- You are welcome to split this figure into two separate figures.
- Your figure might look something like this, but make it look however you want:

Example figure
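
As a rough illustration of what a sorting check looks at, the sketch below plots the density of a simulated running variable and runs a crude binomial comparison of observations just below and just above the cutoff. This is not the McCrary test, and the data, the 0.05 window, and the variable names are assumptions. The final lines call rddensity() from the rddensity package if it is installed; that package implements a related local-polynomial density test rather than McCrary's original estimator.

```r
# Hypothetical running variable with no sorting by construction
set.seed(7)
margin <- runif(3000, -0.5, 0.5)

# Histogram with bin edges aligned to the cutoff so that no bin straddles 0
breaks <- (-10:10) / 20
hist(margin, breaks = breaks,
     main = "Density of the running variable", xlab = "Vote margin")
abline(v = 0, lty = 2)

# Crude check: among races within 0.05 of the cutoff,
# are narrow wins and narrow losses roughly equally common?
near    <- margin[abs(margin) <= 0.05]
n_above <- sum(near > 0)
crude   <- binom.test(n_above, length(near), p = 0.5)
crude$p.value

# Formal density test, only if the rddensity package is available
if (requireNamespace("rddensity", quietly = TRUE)) {
  dens <- rddensity::rddensity(X = margin, c = 0)
  summary(dens)
}
```

A large p-value means you cannot reject that the density is continuous at the cutoff, which is what you would expect if candidates cannot precisely manipulate their vote margin.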

Create an rdplot() figure for the outcomes in Table 4: Any fatal incident (fatal_acled2_diff), PNP fatal (fatal_pnp2_diff), Vigilante fatal (fatal_vigilante2_diff).
- Explain the purpose of this kind of figure (Note: you’ll be replicating Table 4 itself in the next step).
- Note: It should be somewhat similar to Figure 6 in the article. However, it will show points representing the mean of the outcome within bins of the running variable (rdplot() does this automatically), and it doesn’t need to include a confidence interval. It might look something like this (feel free to split it into 3 separate figures):

Example figure
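
If it helps to see what the binned points are, the sketch below computes them by hand on simulated data: the outcome is averaged within equal-width bins of the running variable. The bin width, names, and data are assumptions for illustration, and rdplot() chooses its bins in its own data-driven way, so its picture will not match this one exactly.

```r
# Simulated outcome with a jump at the cutoff (all values hypothetical)
set.seed(3)
n      <- 3000
margin <- runif(n, -0.5, 0.5)
y      <- 0.2 * margin + 0.25 * (margin > 0) + rnorm(n, sd = 0.5)

# Equal-width bins whose edges include the cutoff, so no bin mixes both sides
edges    <- (-10:10) / 20
bin      <- cut(margin, breaks = edges, include.lowest = TRUE)
bin_mid  <- head(edges, -1) + diff(edges) / 2   # x position of each point
bin_mean <- as.numeric(tapply(y, bin, mean))    # mean outcome within each bin

plot(bin_mid, bin_mean, pch = 19,
     xlab = "Vote margin", ylab = "Binned mean of outcome")
abline(v = 0, lty = 2)

# With rdrobust installed, rdplot() selects the bins and adds fitted polynomials
if (requireNamespace("rdrobust", quietly = TRUE)) {
  rdrobust::rdplot(y = y, x = margin, c = 0,
                   x.label = "Vote margin", y.label = "Outcome")
}
```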

Replicate Table 4.
- Explain the table’s purpose and the results.
- Note: For their diff-in-diff regressions, they use only data from “tight” races
- Note: You do not need to include the row “LP post-Duterte mean”.
- Note: Add the following to the top of your R code. It is code I wrote so that modelsummary() can extract the coefficients, SEs, p-values, and bandwidth from an rdrobust object. This will also allow you to pull out the “Optimal bandwidth” and “Clusters” from the models into the table.

# Add custom functions to extract coefs, SEs, p-values, and the bandwidth
# from a rdrobust model object
# This returns results specifically from the bias-corrected estimator
# with robust standard errors
tidy.rdrobust <- function(x, ...) {
    ret <- data.frame(
      term      = "RDD estimate",
      estimate  = x$coef["Robust", 1],
      std.error = x$se["Robust", 1],
      conf.low  = x$ci["Robust", 1],
      conf.high = x$ci["Robust", 2],
      p.value = x$pv["Robust", 1])
    return(ret)
}

# To report the "Conventional" rather than the "Robust" bandwidth,
# change "N_b"/"b" to "N_h"/"h"
glance.rdrobust <- function(x, ...) {
    ret <- data.frame(
      nobs   = sum(x$N_b), 
      Optimal.bandwidth = as.character(round(x$bws["b", 1], 2)))
    names(ret) <- c("nobs", "Optimal bandwidth")
    return(ret)
}

# Add a "Clusters" attribute, which returns the number of clusters
# based on the variable that the clustered SEs are calculated from
glance_custom.fixest <- function(x, ...) {
  data.frame(`Clusters` = attr(x$cov.scaled, "G"))
}
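
For the diff-in-diff columns mentioned in the notes above, the restriction to tight races amounts to subsetting the data before fitting the model. The sketch below shows that step on a simulated two-period panel. The variable names (muni, outsider, post, tight) and the plain lm() call are assumptions for illustration and are not the authors' specification; in particular, lm() does not cluster standard errors, whereas the "Clusters" helper above presupposes a model fitted with fixest.

```r
# Simulated two-period panel of municipalities (all names and values hypothetical)
set.seed(11)
n_muni     <- 400
panel      <- expand.grid(muni = seq_len(n_muni), post = c(0, 1))
outsider   <- rbinom(n_muni, 1, 0.5)   # 1 = outsider mayor
tight_flag <- rbinom(n_muni, 1, 0.4)   # 1 = election was very close

panel$outsider <- outsider[panel$muni]
panel$tight    <- tight_flag[panel$muni]
panel$y <- 0.1 + 0.05 * panel$outsider + 0.1 * panel$post +
  0.2 * panel$outsider * panel$post + rnorm(nrow(panel), sd = 0.1)

# Keep only close races, then estimate the difference-in-differences
close_races <- subset(panel, tight == 1)
did         <- lm(y ~ outsider * post, data = close_races)

# The interaction term is the diff-in-diff estimate
did_estimate <- unname(coef(did)["outsider:post"])
did_estimate
```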

Replicate Table 5.
- Explain its purpose and the results.
- Note: You do not need to include the row “LP post-Duterte mean”.
- Note: For their diff-in-diff regressions, they use only data from “tight” races
- Note: You should be able to simply copy and paste your code for Table 4 from the previous step and change the variable names and the table’s labels and title.
Replicate Table A22 (discussed on p. 1052 and found in the Appendix).
- Explain the table’s purpose and the results.
- Note: You might want to look at the authors’ code in Ravanilla-Sexton-Haim-Deadly-Populism-JOP.do to see which variables and which subset of the data the authors use here. They deliberately use pre-Duterte data (and, again, only data from tight races).

Discussion

Explain the overall findings of the article. One paragraph should suffice.
Discuss the benefits and drawbacks of the research design as it is applied in the article.

Submission instructions

Formatting requirements!
- 12-point font
- 1 inch margins
- Double-spaced
- No Table of Contents
- 18,000 characters maximum
Your code should be commented so that it is clear to me that you know what each piece of code does. You don’t need to be excessive, but comment your code in a way that you think is reasonable.
If you are in a group, only one of you needs to submit the assignment. Just ensure that all of your student IDs are on the title page.
Submit your assignment and R code separately. Please send your assignment as a PDF and use matching file names for the two files:
- Assignment_2C.pdf
- Assignment_2C.R
Submit through Absalon under “Assignment 2C”.