
Regression Discontinuity: Finding the Exact Moment Your Design Started Working
- Oby Anagwu
- Sep 30
- 5 min read
Every design decision creates a before and after. A team decides when to launch. Designers choose where to begin. Program managers determine who qualifies. These decisions feel pragmatic in the moment, driven by budget realities or implementation capacity. What often comes later is the realization that these choices also create the conditions for learning whether the intervention works.
The power lies in the boundaries. When designing a service that cannot reach everyone at once, a natural comparison emerges. The household that barely qualifies for a housing program versus the one that barely misses, the neighborhood that receives a transit system in Phase 1 versus the one scheduled for Phase 2, the person born just before an eligibility cutoff versus the one born just after. These boundaries are often arbitrary from a design standpoint, yet they generate meaningful evidence about impact.
Consider what happens with phased implementation. Cape Town’s MyCiTi bus rapid transit opened Phase 1A along the West Coast corridor in 2011. The choice reflected sensible reasons: available funding, political support, technical feasibility. Adjacent neighborhoods would wait for Phase 2. This decision created a boundary where similar communities experienced different realities. One side had rapid transit access. The other did not. The difference in their commute times, employment patterns and economic activity reveals what the transit design accomplished.
Or consider eligibility thresholds. When Rwanda expanded Mutuelles de Santé health insurance, the program used the Ubudehe wealth ranking to determine subsidies. Households in the poorest categories received government support. Households just above that line paid full price. This threshold served an equity goal: directing resources to those with greatest need. It also created a comparison point. Families just below and just above the threshold were otherwise similar, yet they faced different healthcare costs and access. Their health outcomes diverged. That divergence indicates what the insurance design achieved.
Reading the Discontinuity
The graph tells the story directly. Place the outcome of interest on the vertical axis: electricity consumption, school enrollment, health facility visits. Place the assignment variable on the horizontal axis: household income, birth date, geographic distance. Draw a vertical line where the threshold sits.
When a design works, a jump appears right at that line. Not a gradual slope. Not a smooth curve. A clear, visible break where the data points on one side sit consistently higher or lower than the points on the other side. Households earning £2.99 per day cluster at one level of electricity use. Households earning £3.01 per day cluster at a noticeably different level. The vertical distance between these clusters estimates the program’s impact.
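To make that vertical distance concrete, here is a minimal sketch in Python. It assumes a hypothetical pandas DataFrame named df with an income column (the assignment variable, in £ per day) and a kwh column (the electricity outcome); the column names, cutoff value and function name are illustrative, not drawn from any particular program. The sketch fits a straight line to each side of the cutoff within a narrow window and reports the gap between the two fits at the threshold.

```python
# Minimal sketch: estimate the jump at a threshold with two local linear fits.
# Assumes a hypothetical DataFrame `df` with columns `income` (£/day) and `kwh`.
import numpy as np
import pandas as pd

THRESHOLD = 3.00   # hypothetical eligibility cutoff, £ per day
BANDWIDTH = 0.50   # only use households within £0.50 of the cutoff

def jump_at_threshold(df: pd.DataFrame, threshold: float = THRESHOLD,
                      bandwidth: float = BANDWIDTH) -> float:
    """Fit a separate line on each side of the cutoff and return the gap at it."""
    window = df[(df["income"] >= threshold - bandwidth) &
                (df["income"] <= threshold + bandwidth)]
    below = window[window["income"] < threshold]
    above = window[window["income"] >= threshold]

    # Local linear fit on each side, then evaluate both lines at the threshold.
    fit_below = np.polyfit(below["income"], below["kwh"], deg=1)
    fit_above = np.polyfit(above["income"], above["kwh"], deg=1)
    return np.polyval(fit_above, threshold) - np.polyval(fit_below, threshold)
```

The sign and size of the returned gap are the estimated impact. A full analysis would add standard errors and kernel weighting, but the logic is the same: two fits, one cutoff, one jump.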
Look for problems in the pattern. If observations pile up just on one side of the threshold and thin out just on the other, people may be manipulating their reported income to qualify or to avoid qualifying. If the jump looks gradual rather than sharp, implementation may be fuzzy, with some ineligible people receiving benefits while some eligible people are excluded. If the two sides were already diverging before the threshold, something else is driving the difference and the program may not be the cause.
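The first of those checks can be roughed out in code. Formal versions exist (the McCrary density test, for instance); the sketch below, reusing the hypothetical df and THRESHOLD from the previous block, simply counts observations in a narrow bin on each side of the cutoff. A large imbalance is a warning sign that people are sorting themselves across the line.

```python
# Crude manipulation check: compare how many observations sit just below and
# just above the cutoff. Reuses the hypothetical `df` and `THRESHOLD` above.
def density_check(df, threshold=THRESHOLD, bin_width=0.10):
    just_below = df[(df["income"] >= threshold - bin_width) &
                    (df["income"] < threshold)]
    just_above = df[(df["income"] >= threshold) &
                    (df["income"] < threshold + bin_width)]
    return len(just_below), len(just_above)

n_below, n_above = density_check(df)
print(f"just below the cutoff: {n_below}, just above: {n_above}")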
The bandwidth question becomes crucial. How far from the threshold should the analysis extend? Include data too far away and the comparison involves fundamentally different groups. A household earning £1.50 per day faces different circumstances than one earning £4.50 per day, even if both fall on the same side of a £3.00 threshold. Include too little data and the estimate becomes unstable, bouncing around with random variation. The optimal bandwidth captures enough observations to be stable while keeping conditions similar enough to be comparable.
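One practical response is to treat the bandwidth as something to probe rather than a single choice. The loop below, reusing the hypothetical jump_at_threshold sketch from above, re-estimates the jump at several bandwidths; if the estimate stays roughly stable as the window widens and narrows, it is less likely to be an artifact of one arbitrary choice.

```python
# Bandwidth sensitivity: re-estimate the jump across several window widths.
for bw in (0.25, 0.50, 0.75, 1.00, 1.50):
    estimate = jump_at_threshold(df, bandwidth=bw)
    n = int(((df["income"] - THRESHOLD).abs() <= bw).sum())
    print(f"bandwidth ±£{bw:.2f}: estimated jump {estimate:.2f} (n={n})")
```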
The visual test works because it reflects the underlying logic. If an eligibility threshold truly creates the only meaningful difference between otherwise identical people, their outcomes should diverge sharply right at that point. Smooth the data however appropriate, fit whatever curve makes sense, but if no visible jump appears at the threshold, the design probably had no effect.
These boundaries appear in nearly every intervention. Lagos opened its BRT system in 2008 along the Mile 12 to CMS corridor, then extended to Ikorodu in a second phase. Addis Ababa’s light rail opened with 39 stations at defined locations across the city. Ghana’s LEAP cash transfer program used proxy means test scores to identify eligible households. Ethiopia’s Productive Safety Net Program selected districts based on food insecurity rankings. Each decision about sequencing, location or eligibility created a sharp distinction between who received the intervention and who did not.
The sharpness matters. Gradual rollouts blur the comparison. If a program seeps slowly from one area to another, isolating its effect becomes far harder. If eligibility depends on factors people can manipulate, selection bias confounds results. But when implementation creates a clear, enforced boundary that individuals cannot easily cross, clean evidence emerges. The arbitrariness becomes analytical strength.
This means operational decisions carry measurement implications. Phasing a rollout determines which comparisons become possible. Setting eligibility criteria defines the population whose outcomes reveal program effects. Choosing hard thresholds over fuzzy boundaries creates sharper evidence.
Design processes typically prioritize feasibility, fairness and efficiency when making implementation decisions. Considering whether those decisions position teams to learn matters too. A phased rollout that randomizes which areas go first generates better evidence than one driven purely by political pressure. An eligibility threshold using objective, verifiable criteria produces cleaner comparisons than one subject to discretionary override. Fixed start dates create sharper boundaries than gradual introductions.
The evidence appears right at the threshold. When South Africa expanded the child support grant by age, children born just before the cutoff suddenly qualified while those born slightly later did not. The age threshold was arbitrary in developmental terms, but administratively necessary. Children on either side were similar, yet one group received monthly payments and the other did not. Their school attendance, household nutrition and caregiver employment diverged. The size of that divergence, right at the birthday cutoff, measured the grant’s impact.
Kenya’s rural electrification programs connected villages sequentially as infrastructure expanded. The timing of when a village received electricity often reflected administrative convenience or construction logistics rather than village characteristics. This meant neighboring villages experienced dramatically different access to power based largely on arbitrary sequencing. Some villages gained electricity and the business activity, study time and health outcomes that followed. Others waited in darkness. The boundary between them revealed electrification’s effects.
Design teams can build measurement capacity into programs without compromising goals. Phasing implementation due to capacity constraints works better when the exact boundaries and timing are documented. Setting eligibility thresholds creates cleaner evidence when criteria are objective and consistently applied. Prioritizing certain areas first allows for learning when sequencing decisions create clean comparisons.
This matters because resources are constrained and failures are costly. Every intervention that does not work represents foregone opportunities to improve lives. Knowing what works requires evidence. The boundaries created through ordinary design decisions can provide that knowledge, but only when teams recognize their analytical value and design accordingly.
The threshold becomes the measurement tool. Elaborate experiments or withholding services from control groups become unnecessary. Operational constraints already force boundary decisions. The question is whether to draw them in ways that generate evidence: sharp boundaries instead of fuzzy zones, objective criteria instead of discretionary decisions, documented implementation instead of informal rollouts.
When designing the next service rollout, certain questions matter: Where does the boundary fall? How sharp is it? Can individuals manipulate which side they land on? Does documentation capture exactly who receives what and when? These questions do not change primary design goals. They simply acknowledge that every implementation decision also shapes learning potential.
The moment a design starts is the moment measurement becomes possible. But that measurement only works when boundaries are clear, enforced and documented. Operational constraints already force these choices. Making them in ways that enable learning reveals whether the design actually works.