
Cash benchmarking: A solution in search of a problem

The problem is a pervasive lack of impact evidence. Benchmarking is beside the point.

by Kevin Starr

Sep 29, 2020


©Cartier Philanthropy/Andrea Borgarello


About a week ago, I was procrastinating on Twitter and came across a flurry of excitement about a new “benchmarking” study ostensibly showing that cash had more impact than an employment training program in Rwanda. This study came on the heels of another benchmarking study purporting to show that cash was better than a nutrition program, which was also in Rwanda.

I took a quick look at the two papers, noted that the employment program had failed to increase employment and the nutrition program had failed to improve nutrition, and tweeted in response that “cash has now outperformed two crap programs that didn’t work. So...don’t do crap programs that don’t work.” OK, that was pretty harsh, and I figured it was now incumbent upon me to do some homework, so I sat down and tried to read the two papers. I say “tried” because I only have 22 years of formal education, so a lot of it went right past me. However, I got enough out of the process to say this comfortably:

Cash benchmarking is a solution in search of a problem. And cash didn’t really perform better. And one-time unconditional cash transfers probably shouldn’t be used as a benchmark anyway.

I should say right up front that these are really well-done studies. They were done in good faith, and they surface important issues. I totally trust the numbers, and they provide valuable fodder for reflection and discussion. I happen to disagree with their conclusions.

So.

The fundamental problem in the social sector is not the lack of benchmarks, it’s the pervasive lack of evidence for impact across a broad range of programs and interventions. To even contemplate comparing with cash, you have to have a reasonable estimate of cost-effectiveness, and to get that you of course have to determine both impact and cost. The problem is that this doesn’t happen remotely close to enough. Even if you’re a fan of benchmarking, you have to have something to benchmark against.

As a funder, I need to know what problem a program is trying to solve, what impact it had, and how much it cost to get that impact. It’s my job to decide whether I like that impact and cost or not, because value is a function of what someone is willing to pay. The problem, again, is that too often I can’t get that information. What I need is this:

  • What the program is trying to accomplish, in simple, clear terms: “Get youth employed,” “reduce malnutrition in at-risk kids.”
  • The basic metrics that will capture the degree to which that happened. Just a couple of the right things is far better than a confusing array of unranked measures.
  • Good quality numbers that demonstrate a change (a big-enough sample size, good survey methods, right interval, all that stuff).
  • A persuasive counterfactual that reveals the true impact.
  • Cost estimates that allow a credible calculation of the cost per unit impact.

As Dean Karlan talks about in his Goldilocks book, there are a lot of different ways to get to good estimates of impact, but the problem is that way too few programs even get close. In any case, once I have a credible measure of the cost of impact - say, the cost per additional youth employed - I don’t want to compare it to cash, I want to compare it to other employment programs!
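(To make that arithmetic concrete, here’s a minimal sketch of the calculation I’m describing - the cost, sample size, and employment rates below are made-up numbers for illustration, not figures from either study.)

```python
# Minimal sketch of a cost-per-unit-impact calculation.
# All numbers are hypothetical, purely for illustration.

program_cost_per_participant = 330   # total delivery cost per participant (USD)
participants = 1000

employed_treatment = 0.45            # share employed at endline, program group
employed_control = 0.40              # share employed at endline, comparison group

# The counterfactual is what the comparison group tells us would have happened anyway.
additional_jobs = participants * (employed_treatment - employed_control)
total_cost = program_cost_per_participant * participants

cost_per_additional_job = total_cost / additional_jobs
print(f"Cost per additional youth employed: ${cost_per_additional_job:,.0f}")
# -> $6,600 in this made-up example. That is the number to compare
#    against other employment programs, not against a cash transfer.
```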

We don’t need benchmarking if we simply and consistently measure enough to judge these programs in terms of what the implementers said they were setting out to do. The title of the IPA report on the first study is “Benchmarking a Nutrition Program Against Cash Transfers in Rwanda.” The report is admirably clear on what problem the program set out to address:

“Rwanda has seen improvements in child nutrition in recent years, but...37 percent of children are anemic and 38 percent of children under 5 are stunted. Malnutrition rates are much higher in rural areas than in urban areas.”

This is a program to improve child nutrition and it should be judged as such. So did it? Nope. It had no effect on child growth, diet diversity, anemia, or consumption. In terms of the problem it set out to address, it was an utter failure and the real lesson should be “don’t do it again”. We don’t need to judge it against anything but what it explicitly set out to accomplish and in that sense it just plain failed.

As to benchmarking, when the researchers took the amount of money that the program cost - $142 - and just gave it to people, that had no effect on child nutrition either. Both failed! The program didn’t work, the cash didn’t work — what did we learn? There’s no useful “benchmarking” here. (The study did have an additional treatment group that got $500, which did result in some nutrition - and mortality - impact, but I’m not sure what comparing a $142 intervention with a $500 transfer is supposed to prove, and even then, there is plenty of reason to doubt the effect will last - see below.)

The situation with the employment study is similar. IPA’s report on the study is titled “Benchmarking Cash to an Employment Program in Rwanda.” The problem they set out to address was that “in Rwanda, about 35% of the youth population is neither employed, nor in training, nor in school.” The program intervention included training on 1) employment soft skills, 2) microenterprise start-up, and 3) microenterprise biz dev.

So how’d it do? Well, the program had no effect on employment, nor did it increase income. In terms of the problem it set out to address, it failed. Completely. We don’t need to benchmark it against anything but its own aspirations and against those, it was a bust. (These two programs really make me worry about the rest of USAID’s portfolio. Perhaps the best outcome of these papers is that they create momentum toward a broad review of the impact of USAID-funded projects and programs.)

Curiously, some have characterized the program as having “worked reasonably well,” because those who completed the program worked 3 more hours a week, had more assets, and saved more. OK, but assets and savings are derived from income, and it seems weird to call it success when people work a lot more (17%) with no increase in income. Maybe there’s something to build on here for a next iteration, but it’s behavior, not impact. The nutrition study was characterized as a partial success because the parents of no-less-malnourished kids had more savings. Yes, if you hurl a kitchen sink’s worth of unranked metrics into a study, you’re bound to find something positive.

Despite the failure of the employment training program, the researchers went ahead and compared it to a cash transfer roughly equivalent to the program’s cost (the cost was $330/participant and the cash transfer was $317). So what happened when they handed youth the equivalent of 160% of their annual income? Well, their income went up (yes, you gave them a bunch of money), consumption went up (yes, they spent the money), assets rose (yes, some of that money was spent on assets). What didn’t happen was a job, and if the cash had any effect on earned income, I couldn’t find it in the paper.

Don’t get me wrong, I’d rather get a bunch of cash than sit in a training program that didn’t get me a job. I’d consume, I’d buy some assets, and for a while, at least, I’d be way better off. But ultimately I’d much rather have a program that actually worked — I’d rather have a stable income (and I think that it’s bogus to refer to a short-term increase in income - i.e. the cash you just handed me - as “impact”). The autonomy and choice inherent in cash is the same whether given or earned, and earned income suggests the capacity to generate more of it over time.

Moreover, there’s no real evidence that a one-time unconditional cash transfer (UCT) with no accompanying intervention (which describes the transfers in these two studies - I’ll call them “isolated UCTs”) creates lasting impact equal to the cost of the transfer. It’s not even close. In the employment paper, the authors say the evidence of durable impact from cash transfers is “mixed,” but of the five references provided, three are multiple transfers over a lengthy period (and two of those are conditional), and the other two aren’t cash - they’re cows and food stamps, respectively. They’re mostly irrelevant in this context.

The paper does reference the excellent Blattman et al 9-year study of cash transfers to youth groups in Uganda that did show a 38% increase in income (off a base of $400) at four years out (yay!). However, unlike the isolated UCTs in the benchmarking studies - giving random people a grant out of the blue - 1) the youth in this study had gone through group formation, rudimentary business planning, and a selection process based on a review of that planning, and 2) the impact had dissipated 9 years out (super-impressive that they studied it that long).

Ephemeral impact is often worse than nothing, and the goal of development programs should be impact that is sustained over time. At the very least, if we’re going to use a benchmark, it ought to be something that creates durable impact. And the only study I can find that investigates the long-term impact of the same kind of cash transfer done in the benchmarking studies is the three-year follow-up of GiveDirectly’s isolated UCTs in western Kenya. (Laura and I wrote a piece in SSIR after the short-term study arguing that what really mattered is whether the UCT had significant long-term impact.) They found that, on average, families were not much better off than control subjects, except for having more assets (and that’s $400 worth of assets in the wake of a $700 cash transfer: it doesn’t even break even in terms of durable impact). The broad short-term “impact” didn’t last; a family’s trajectory wasn’t altered. (There was a pretty vigorous debate over this interpretation. I go with Berk Ozler’s take, but even if you’re more impressed than I am, it’s still pretty weird to base a whole benchmarking movement on one disputed study.) I wish I were more surprised that this disappointing long-term study created about 0.01% of the hoopla that surrounded the original short-term study.

So that’s it. These two benchmarking studies used a “benchmark” that has zero evidence of durable impact commensurate with the original investment. I don’t think we should be using isolated one-time unconditional cash transfers to benchmark anything. Perhaps we shouldn’t even do them until we get more evidence that they accomplish something lasting. It’s nice to give a man an isolated UCT - or a fish, for that matter - but if you’re a funder on the hunt for lasting change, cash benchmarking is not the tool you need.

Comments

Kevin Starr

Joaquin - wow, I love this, and it is nice to learn from someone who has grappled with USAID - and evaluations - much more than I have. This comments string is still a suboptimal way to discuss, but it’s better than twitter, so:

Your thoughts on how USAID might go about this in your paragraphs 2 and 3 are music to my ears.

Paragraph 4: I didn’t realize that some areas of USAID programming - health, as you mention - are much better than others. I do think that USAID is doomed in terms of real change until it begins to integrate the funding of good interventions and good organizations committed to taking them to scale (and that, of course, includes evaluations). USAID could do so much more to ensure that really good stuff scales to achieve its potential.

Your next point about cash - that it represents a one-time investment that mimics the one-time investment in a project - makes sense, but for the durable impact piece. I simply don’t think we should do anything in development - as opposed to humanitarian aid - unless we can make at least a theoretical case for lasting impact. Cash is a good comparator if you’re only looking at short-term impact, but I still don’t think there’s a persuasive case for lasting impact. The hope in a decent employment program is that you’ve armed some with the skills to continue getting employed, and if so, cash can’t match it.

I don’t think we disagree at all on the need to measure multiple things - it’s just that in the end, some matter way more than others. Ultimately, we - or at least One Acre Fund, if they want to get better at what they do - need to know all the things that you list. However, their mission is to “Get farmers out of extreme poverty,” so the make-or-break metric is profit. Nothing else matters unless farmer income increases - not yield, not anything. We go to great lengths to determine the real mission of an intervention - the eight-word mission statement. We’d argue that the point of employment - job or business - is income, so that would be the make-or-break metric. I don’t think there is any justification for calling something a partial success if impact in terms of the key metric = zero.

I don’t think that cash does harm, and I think it is always better than nothing, especially if getting that nothing involves a person’s investment of time and effort into an ineffectual program. I’m curious, though: what if the programs had shown a little bit of impact? What then? In the case of nutrition, the cash had zero effect on nutrition, but way more overall short-term benefit. Does cash have to answer to the same mission - would this mean that the program “won”? Or do we somehow try to compare apples and oranges, which seems like a mess? The only plausible benchmark here is to compare the nutrition program to another one that did succeed. If “you don’t have good evidence about existing programs,” then your all-out priority should be to get some. In almost any sector, there is somebody somewhere doing a great job.

Finally, I so agree with your overall message and approach, and I’d add one thing - you mention five-year projects, and I gather that while there are midline measures, there’s not much ongoing iteration. I tell our fellows that you should never go into an RCT unless you already know the answer from your own well-designed systems. It really is inexcusable to take five years to find out that you didn’t even have short-term benefits. While I’ve seen a lot of comments from the evaluator community that organizations that self-measure are always wrong, we’ve seen that well-designed systems can get a valid sense of impact. Granted, it’s often a bit less than what is determined by an outside evaluator, but it’s rarely that meaningful a difference.

Probably more than you asked for, but fun to dive into - thanks.

Joaquin Carbonell (replying to Kevin Starr)

This is great Kevin! I really appreciate your deep engagement and thoughtful responses. I have already commanded a lot of your time (and I know you're a busy person!) but suffice it to say we basically agree about how you'd judge the success of an intervention on its own merits. So I'll just share a couple more reflections about the institutional context at USAID in response to your comments.

I couldn't agree more about the need for piloting, regular testing, and iteration before evaluation. If I could criticize myself in my time at DIV, I would say I pushed some of our grantees towards RCTs before they'd really figured out their operational model. Another problem at USAID, however, is that programs are trying to spend money fast and reach as many 'beneficiaries' as possible before they've really figured out the most effective/efficient approach to delivering their intervention.

But the main problem, and the goal of this work, is not to promote cash transfers but to push for more rigorous evaluation within the agency. The reason why relates to your reservation about cash as a comparator to typical programs. You make the very sensible point that USAID shouldn't do things that don't at least theoretically improve long-term outcomes for people in developing countries. The problem at USAID is that most program officers mistake the THEORETICAL case for impact for ACTUAL impact. This leads to a pervasive attitude of "why evaluate something we know theoretically works?". I know it sounds crazy, but I can't tell you how many times I was told exactly that by people in USAID missions around the world when we were trying to set benchmarking studies up (I pitched about 20-25 missions).

Even once the results of the nutrition study came in, there was a kind of magical thinking on the part of many USAID staff that somehow the benefits that failed to materialize in the short term would appear in the long term. But if your income, health, dietary diversity, or child anthropometrics don't improve within 10 months of the completion of a program that will not provide any services to you in the future, I struggle to see how those outcomes will improve in the long term, especially given what we know about the importance of the first 1000 days of a child's life (the time period targeted by the intervention). Indeed, if rigorous evidence of long-term impacts is rare in international development, programs that have no short-term impact but somehow achieve long-term impacts would be even rarer!

So we really wanted to fight the 'received wisdom' at the agency that USAID programs are, as a general proposition, effective. There is little to no evidence to suggest they are because USAID does not rigorously evaluate its own programs. And the evidence that USAID does generate is often suspect. Akazi Kanoze, the predecessor program to Huguka Dukore (the employment program from the second study), was "rigorously evaluated". EDC did an internal RCT suggesting that the program increased employment relative to non-program-recipients - an RCT run by the organization itself. Then the "performance evaluation" that was essentially contemporaneous with the cash benchmarking study suggested that the program was improving employment outcomes.

But the independent and more rigorous evaluation of the cash-benchmarking trial showed it to be ineffective in improving employment or income! USAID does LOTS of 'internal' and 'performance' evaluation, but almost NO independent evaluation. These studies hopefully show why this kind of evaluation is necessary, cash-counterfactual or not!

I share all of this to paint a picture of just how much the culture and existing practice of generating/using evidence at the agency is stacked against a) rigorous and independent evaluation, and b) using evidence to CHANGE programming.

Thanks again for engaging on this. I hope these reflections are helpful!

Joaquin Carbonell

Terrific piece Kevin! Compiling some of my twitter comments into a hopefully more coherent form here. I'd love to hear your thoughts in response to some of the points I raise, which are really more about the institutional context of USAID, but overall I think you present some extremely valid criticisms of cash benchmarking.

Starting with a big area of agreement - I think you outline what I WISH the agency would do when it comes to cost-effectiveness analysis:
1. In each programming domain, evaluate against the best-in-class programs - ones that increase impact the most in terms of our outcomes of interest, relative to their costs
2. Focus funding on best-in-class interventions, but continue testing new approaches against them
If you think of USAID's programming domains (health, agriculture, education, economic growth etc.) as being somewhat fixed by congressional appropriations, there is still a BIG decision about what to do with funds within each domain. The political/organizational calculus of determining what to fund is certainly complex, but I think we agree that in a perfect world, you'd ask "what is the best known way to improve employment and earnings for youth?" And then have that represent a large fraction of your workforce domain programming. Then you'd test new/alternative approaches against THAT.

IF the Agency did this, then there would be no need for cash benchmarking. But the reality of evaluation at USAID is that the Agency generates very little evidence about its own programs (about 13 impact evaluations per year on $20 billion of programming), most of the impact evaluations are of poor quality, and these evaluations mostly indicate that the USAID programs have no impact on our main outcomes of interest. You're absolutely RIGHT to be concerned about programming quality writ large at USAID. I would however note that USAID is not a monolith. Global Health does a ton of excellent, evidence-based programming. But there's a lot of riff-raff in nutrition, agriculture and workforce programming - the domains of the first two benchmarking studies.

So is cash-benchmarking the silver-bullet solution to USAID's evidence woes? Of course not, but speaking as one of the cogs in the USAID machine that helped set up these two studies, I outline 3 categories of explanations for why I think the approach is useful: Practical, Theoretical, and Political

Practical motivation:
If you're running two programs head-to-head in a randomized controlled trial, the coordination challenges between the programs are immense. Cash is more operationally practical to run alongside a program with vast management and operational infrastructure, like the programs from the first two studies (and it was still really hard).
If most programs have NO or UNKNOWN impact, it actually makes sense to compare them to something we know has SOME impact - even if that impact is not large or sustainable, as in the case of cash - until you've got a better evidence base to do as you suggest (best-in-class program vs. program X). Cash is one of the MOST evaluated interventions and has been evaluated in a variety of domains. (https://www.odi.org/publications/10505-cash-transfers-what-does-evidence-say-rigorous-review-impacts-and-role-design-and-implementation).

Theoretical motivation:
A first point about cash as a theoretical comparison intervention - as you point out, there are relatively few long-term studies of large, lump-sum, stand-alone cash transfers. So maybe we have less confidence in that type of cash intervention. But it actually mimics the type of investment that USAID makes with many (certainly not all or even most) of its programs - we spin up a massive organizational infrastructure, we use it to deliver a one-time intervention that we hope will sustainably improve people's lives, then we move on to a different set of people or a different program. So comparing a one-time intervention to cash makes some sense (again, not for all programs).

A second point on which we may disagree - it may make sense to compare programs on more than one outcome. I agree that a list of 30 unprioritized outcomes is not useful at all, but consider the employment program: why do we try to improve employment? To improve income. Taking an example from Mulago's early portfolio and about which you wrote a great SSIR piece: why does One Acre Fund try to improve yields for smallholder farmers? To improve income NET of costs. So 1AF measures more than just yields: they measure input costs+financing, time-cost of training, yields, farmer income, and ultimately farmer profitability. So we would want connected indicators of well-being to improve lest we, in the case of the employment program, push everyone into crappy jobs and call it a day.

So even if cash is a low bar in many cases, it at least provides short-term benefit across a broad range of indicators of well-being. Recall that most rigorously evaluated USAID programs produce no such short-term improvement, let alone SUSTAINABLE improvements which are basically never measured. I echo Michael's questions - what is the appropriate benchmark? It's a tough question, particularly if you don't have good evidence about existing programs.

Political Motivation:
The politics of evaluation at USAID are such that studies are mostly ignored, particularly when they do not show programs in a favorable light. We knew that the "buzz" around cash (for better or worse) would draw attention to these studies and make it hard to ignore. So the (my) hope was that these studies would draw attention to hard questions that the agency has been unwilling to confront:
- Do our programs improve people's lives?
- At what cost are people's lives improved?
- Can we do better than what we're currently doing?

The point was certainly not for cash to "win". The point was to say: if you can't do better than just giving people money, think about WHY that is and DO something about it. Fund proven solutions, test promising models, but above all else, learn & IMPROVE, then PROVE you're improving. The business of running $20m programs that don't do shit, then doing another 5 years of the same program with a few new bells & whistles... that's gotta change. It is hard to describe such a large organization in broad-brush strokes, but this is a sadly common occurrence at the agency even if reasonable people disagree on the extent of its prevalence.

There's nothing wrong with failure; this is HARD work. But failing to learn is unacceptable, and that's what the agency's been doing in so many cases. So maybe cash benchmarking is not the right approach to improving programming quality in the long term. But it (hopefully) increased the temperature around the status quo of evidence and cost-effectiveness of USAID programs, which are in desperate need of improvement.

Thanks for bringing attention to these studies and raising important questions about the cash benchmarking approach. My hope in sharing these observations is to think about not just how cash benchmarking can be improved if it will continue (or why it should be scrapped), but about how the agency USES EVIDENCE IN GENERAL to inform decision-making.

Kevin Starr

Hi Michael,

What I’d propose is a paradigm shift. From where I sit, USAID is stuck in a project mindset. They fund spot projects all over the place, continually hopping around, never really scaling anything toward its full potential. To get to scale, an idea must go through R&D, early replication, rigorous proof of impact, and disciplined scale-up. That takes an organization dedicated to doing so. USAID should fund those organizations, including the rigorous evaluations.

USAID shouldn’t be designing and dictating anything. Like the rest of us, they should be making a good attempt to find and fund stuff that works. There is a whole world of innovation out there for them to take advantage of.

And there is no cross-sector benchmark to be had. It’s been the Holy Grail for a long time now, and this is the latest attempt. You have to measure an organization in terms of the problem they’ve set out to address - their mission - and judge value in terms of evidence of impact and its cost. And that value emerges from the experience of funding clusters of similar things: CHWs, smallholder farming interventions, childhood nutrition, etc. What we’ve found is that in doing so, you begin to see what a bargain looks like. And even then you have to take things in context. Addressing malnutrition in rural Afghanistan is going to have a different cost than doing so in Kampala. Impact-obsessed funders have to continually use their judgement to move toward continually better emerging solutions.

Michael Eddy

Hi Kevin,

Really enjoyed reading this! Thanks for taking the time to dig into the weeds here--this stuff is important! As noted on twitter, I'm just trying to understand what you would propose USAID do differently.

I could see several different options:
1) Continue evaluating programs as it already has, with a bar being "does the evaluation find a statistically significant impact"?
2) Use another bar/benchmark that better reflects the opportunity cost of USAID's money.
3) Some other option
