Thoughts from an agricultural development gal in Ghana

Evaluating Complexity

Warning for my non-development-worker readers: this post is a bit technical. But I hope I’ve explained everything clearly, and please ask questions if I haven’t! I would love to hear some new perspectives on this topic.

Also, I started writing this post a few months ago, when two big themes in the aid blogosphere were complexity and impact evaluation. People seem to be writing about this a bit less these days, but they’re still important topics (and we don’t have any clear answers) so it’s my turn to weigh in!

Where does all the money go?

So what is impact evaluation? This is the attempt to use rigorous methods to understand what actually works, or has impact, in aid and development. Of course there is no “silver bullet” solution to global poverty, but which interventions are more effective than others? Which things have no impact at all, or a negative impact? Ultimately, rigorous impact evaluation should lead funders and policy-makers to direct funds towards interventions which are most effective and reliable in improving people’s lives.

There is a huge push right now to “show what works” in foreign aid. Citizens are seeing that the Western world has spent trillions of dollars on foreign aid in the past 50 years, yet global inequality is worse than ever before. Of course progress has been made, but have the results justified the spending? Citizens want accountability from their governments, and proof that their hard-earned tax dollars are actually having a positive impact on the lives of those they are targeting in the developing world.

Many agree that the most rigorous form of impact evaluation today is the Randomized Controlled Trial (RCT). This technique takes a sample large enough to detect a statistically significant effect and randomly assigns its members to two groups: control and treatment. The treatment group receives a development intervention, such as crop insurance, or microcredit, or whatever you are trying to test, while the control group receives nothing. Both groups are evaluated over the course of the study (usually 1-5 years) and the results come out somewhere along the spectrum of “yes this works” (how?) to “this has no effect”. Sometimes it’s inconclusive, or the results are not easily generalizable, or there is further research to be done, but RCTs are generally considered the gold standard in evaluating development interventions.
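For readers who like to see things concretely, here is a minimal sketch in Python of the two core steps described above: randomly splitting a sample into treatment and control groups, then comparing average outcomes between them. The function names, farmer IDs and yield numbers are all made up for illustration. A real RCT also needs sample size calculations, standard errors and a lot more care than this.

```python
import random
import statistics


def assign_groups(unit_ids, seed=0):
    """Randomly split units into treatment and control groups of (near) equal size."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(outcomes, treatment, control):
    """Estimate the average effect as mean(treatment outcomes) - mean(control outcomes)."""
    treated = [outcomes[unit] for unit in treatment]
    untreated = [outcomes[unit] for unit in control]
    return statistics.mean(treated) - statistics.mean(untreated)


if __name__ == "__main__":
    # Hypothetical sample: 20 farmers identified by number.
    farmers = range(20)
    treatment, control = assign_groups(farmers, seed=42)

    # Hypothetical end-of-study yields (bags per acre), invented for this example.
    noise = random.Random(1)
    yields = {
        farmer: 10 + noise.random() + (2 if farmer in treatment else 0)
        for farmer in farmers
    }
    print("Estimated effect:", difference_in_means(yields, treatment, control))
```

Because the assignment is random, the two groups should be similar on average in everything except the intervention, which is what lets the difference in means be read as the effect of the intervention.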

There is a lot of controversy around RCTs because of the high cost and time involved. These studies are not appropriate in all cases. They shouldn’t be used by organizations looking to evaluate past programs, or small-scale projects looking for continued funding. Instead, RCTs should be used to inform development and foreign policy on a large scale. Citizens giving foreign aid want to know, for a fact, which development interventions are the best bang for their buck – and RCTs should, in theory, be able to tell them.

Inside the black box

The second trend these days in the blogosphere is complexity, or complex adaptive systems. Aid on the Edge of Chaos has a good round-up on complexity posts. The bottom line here is that in a complex system, results are unpredictable. The system is not static, or linear, or deterministic; it evolves over time, adapting and growing based on both internal and external influences. When dealing with a complex system, you need to take a “systems approach” by monitoring the whole rather than the individual pieces.

What does this mean for development? Well, people and communities and indeed the world are all complex systems. It is hard to predict when something will change, as we’ve seen with the recent wave of revolutions across the Middle East. From this perspective, it is hard to ever know which development interventions will achieve the results we want.

Most development interventions are designed around something like an impact chain – what will you do, and what results will it produce? However, complexity theory tells us that we should monitor a system generally for results, not just for our predicted or desired results. There are often unintended results of our actions, sometimes negative and sometimes positive. In addition, it can be hard to attribute positive changes to our particular intervention in a complex system – so much is happening at once and there are so many stimuli to the system, you never really know where something originated. This also poses problems when we’re discussing replicability of development interventions – just because something worked once in a particular set of circumstances, doesn’t mean it will work again in a different setting.

So what does it all add up to?

Impact evaluation and complexity. Now it’s time to bring these two concepts together. The big question here is this: Can we understand “what works” in enough detail to be able to predict future results of our interventions?

My answer is yes, but maybe not in the way you’re thinking. In the past, evaluation has usually focused on the question “what intervention worked?” where the answer is “fertilizer subsidies” or “school feeding programs”. I think we need to start looking more at HOW things work. Instead of looking for programs that we can replicate across entire regions, we should be asking, “what worked?”, “under what conditions?” and “with what approach?” which give answers more like “foster innovation”, “promote local ownership” and “give people a choice”. These conditions may be found across many different areas, but may have more of an effect on the success of an initiative than the WHAT of the initiative itself.

I generally support rigorous impact evaluation for 2 reasons:

  1. fostering a culture of accountability to donors and stakeholders (and taxpayers);
  2. fostering learning so that we understand conditions for success and can set projects up for success in the future.

I think the aid industry has learned (and can learn even more) something about what works, or probably more about what doesn’t work. I also think aid can’t be prescriptive since human beings are complex and our behaviour is irrational and unpredictable. But we can set conditions for success when designing our interventions. And while the results may not be wholly predictable, at least the intervention will be more likely to succeed.

What does this look like in practice?

We are currently in the process of team transition and strategy re-development. Here are a few principles I’m looking to follow with our team strategy as we go forward:

  • always have a portfolio of initiatives on the go (don’t put all your eggs in one basket)
  • make sure these initiatives all contribute toward the bigger change we’re trying to make in the agric sector
  • range of timescales: short-, medium- and long-term changes, informing and building on each other
  • constant learning and iteration: testing, getting feedback, adapting, and testing again
  • focus on articulation of our observations and learning to external audiences
  • high awareness of the system as a whole: what does it look like? where are the strongest influences? the most volatile players? who exerts the most force on the system?

What principles would you add to my list?


17 responses

  1. Inga Rinne

    It seems to me that the question in the past has too often been “What works?” when really the question is “Why does it work?” Without the answer to that, it will be more luck than management if you replicate that success.

    March 21, 2011 at 6:24 pm

    • Exactly, well said Inga! Thanks for the comment!

      March 22, 2011 at 9:24 am

  2. To riff on your statement: “When dealing with a complex system, you need to take a ‘systems approach’ by monitoring the whole rather than the individual pieces.”

    Check out the work I’m doing for globalgiving in East Africa on exactly that:

    I try to report weekly on my blog about how this is unfolding. So far, there are a lot of “quick wins” in terms of completing feedback loops between communities and local organizations. I believe that lack of information being spread around, along with a lack of tools to allow you to compare your own world view (as a manager of some complex social intervention) against other perspectives is the problem.

    March 21, 2011 at 9:17 pm

    • Hi Marc,
      Thanks for the comment! Yes I’ve heard of this approach before that Globalgiving is taking, a colleague told me about it not long ago. Very interesting! I’m more curious about how you actually USE this information after you collect it, and how quickly it becomes out-of-date. I guess I’ll have to read your blog to find out more!
      Thanks again,

      March 22, 2011 at 1:50 pm

  3. Nice post! Remember your previous post about how the need for reporting was diverting important resources from the people working on the ground from actual development work to reporting? How do those two important aspects balance out?

    March 22, 2011 at 4:53 pm

    • Sorry Majd, can you clarify which 2 important aspects you’re referring to? One is reporting taking time away from action, the other… RCTs? or complexity? or something else from that post? or do you mean the “taking action” part? Sorry just need some more clarity on your question! Because I’m sure it’s good 🙂

      March 23, 2011 at 10:28 am

  4. Mary Roach

    Erin, great post.

    I’ve been thinking about impact evaluation for a while and I still have no good answers.

    That being said, I echo Inga’s comment and would like to add the importance of the people running the program, their attitudes towards learning and how they manage relationships with the people they are trying to support.

    Also, I think that the dev. sector can learn a lot from the private sector where metrics are continually monitored, assumptions altered and issues discussed. The metrics often end up changing as they may be supporting the wrong behavior. An example that stands out in my head is engineering drawings delivered to schedule versus the average number of revisions per drawing. What I have found is that if you push on-time delivery, this often comes at the cost of drawing quality and hence more revisions. Having the flexibility to change and/or add metrics is really useful!

    March 22, 2011 at 7:35 pm

    • Great point again about people’s attitudes.
      The challenge with changing metrics and comparing private sector I think is that in private sector you come up with your own metrics and it’s easy to change them. If you’re an NGO running your own project and have no reporting requirements back to donors (like EWB), it’s also easier (and should be encouraged). But as soon as you add a level, like a donor funding a project through an NGO, the rigidity of reporting requirements becomes a huge obstacle. Then the NGO has to make a choice: do they double up on indicators, collecting data for both the donor and their own metrics and thus increasing M&E&reporting time for their staff? Or just go with the rigid, unchangeable indicators they were first given? I think it’s a lot less easy to change these after the fact – but that’s what we should be pushing for as an industry!
      Anyway, great comment – thanks Mary!

      March 23, 2011 at 10:33 am

      • I agree – rigid ways of reporting lead to situations where people are writing up stuff that they know nobody cares about, because it fulfills some outdated requirement. This might be why people spend a lot more time reporting than they ought to, while not learning much from it themselves.

        It’s a little like teachers that are testing kids on science material from outdated textbooks. Most of the “facts” in neuroscience go obsolete every 10 years, so classrooms are always behind the latest understandings.

        Do you still believe “brain cells don’t divide?” Guess again.

        March 23, 2011 at 9:10 pm

  5. Erin, I think it’s a herculean task to mesh impact evaluations/RCTs on the one hand, with acknowledgement of complexity on the other. The former assume a linear causal logic (the epitome of which is the log frame) and the latter throws that out the window. They come from very different world views. But I think we need to figure out how to fit them together. As you mention, a focus on the “how” lessons (rather than the “what” lessons) is important. But those “how” lessons are harder to operationalize. Of course “promote local ownership” is important, but what does that actually look like? How do we know if we’re doing it? And more importantly for an organization: how do you hold someone accountable for it?

    As for your list of principles: One thing I would add, from a complexity perspective, is to pay attention to the other impacts the program may be having, such as increasing conflict in the community. This is similar to the point Mary made above, but you also have to think horizontally to seemingly unrelated issues. There’s a useful related article by Sigrid Gruener and Tom Hill called “Introducing Conflict Sensitive Community Development to Iraq”, which discusses how seemingly straightforward infrastructure or social projects can be implemented “successfully” while still creating tensions and potentially violence.

    P.S. I took a stab at some similar issues a few months back. See here:

    March 22, 2011 at 9:15 pm

    • Yep, I’ve never been one to shy away from something I think is important, no matter how difficult it may be! I was pretty sure I wasn’t going to hit the nail on the head on my first try, but wanted to open up the conversation since I’ve seen these topics dealt with so much in isolation lately.
      The sentiment above holds again with your comment that “‘how’ lessons are hard to operationalize”. Hard, yes, but not impossible! I would say that we are doing this in EWB (or at least trying) and it comes with what Mary said before – the attitude and approach of those running the initiatives. If I go into my district and do some work, say introducing a new practice in the office, then go away and see whether it continues, is that not testing local ownership? I think there are ways to do these things, and to measure them, and to hold people accountable for them. It just takes a little extra effort.
      I totally agree that we should be on the lookout for unintended consequences. I think I mentioned that in the “what is complexity?” part of my post, but failed to put it in my list of operational principles. Thanks for adding it, and for the related resource!
      Finally, yes, I read your post as one of the many on complexity lately. One comment I had is that I don’t think it matters if the world is becoming more or less complex, what matters is that it IS complex, so we have to treat it that way. I also love the idea of complex adaptive leaders – as both Mary and Inga mentioned, this is KEY to navigating complexity!
      Ok long response, but thanks so much for your comment!

      March 23, 2011 at 10:45 am

  6. I agree with Dave that it’s a huge task to square complex systems and RCTs.

    RCTs are very good for determining the impact of a specific intervention in a specific context, or for comparing interventions. They assume you already have a model of what works and why.
    RCTs are not that useful for circumstances when you can’t easily construct a randomized experiment (e.g. for policy interventions) or when you don’t really know how to tackle a problem and you don’t have a very clear change model.

    They might be useful to help test interventions in a part of a complex system, or be useful when a complex system can effectively be reduced to a simple one for the purposes of the interventions you are trying – but they don’t help you understand the whole system.

    I think of RCTs as a useful tool in a wider toolbox – good for some things and not for others.

    Here’s a piece I wrote about complexity some time ago

    It echoes some of your principles as well as Dave’s point on looking at unintended consequences – which can be both negative and positive. I’d also add beneficiary feedback as critical (both their opinions and their actions).

    The blog also mentions a couple of tools that exist for “learning while doing” and for evaluating impact in complex systems – but there is a lot more out there on specific tools you can use.

    March 23, 2011 at 10:09 am

    • Thanks for the comment!
      I agree that RCTs are useful under the conditions you outlined. But the challenge is, how do you generalize these results, then apply them to new circumstances, within a system that is undoubtedly complex? I guess I don’t see much point in RCTs unless they can indicate what should be done next. I don’t think they’re justified purely for evaluation/accountability purposes (’cause they’re too damned expensive).
      I definitely read your post as part of the bigger conversation on complexity over the past several months (year?). I love your principles there as well, and it might even be what inspired me to start my own list (though apparently I’ve forgotten to reference that or give kudos – sorry! But thanks!).
      I’d be interested to find out about some of the tools you mentioned for evaluating impact in complex systems. I’m sure I know about some, but what’s out there? Can you provide some references, or do a post of your own on these resources? Would be very useful!
      Anyway, thanks again for commenting. Cheers!

      March 23, 2011 at 10:50 am

  7. During my time in the development aid sector, I’ve seen an increasing desperation to “know” what is inherently beyond logic and induction. It is certainly time to examine our belief that there are technocratic, precise ways of measuring progress in order to make consequential judgments based on these measures. The increasing obsession with RCTs, abstract metrics and experimental design, stemming from a reductive, managerial approach in aid, is quite far from the intimate, difficult, and complex factors at play at the grassroots level. The business sector seems to have a healthier relationship with risk in their for-profit endeavours, perhaps something we may need to explore in the development sector.

    As an aid worker who has worked extensively in building the monitoring & evaluation capacity of grassroots organizations in Africa, the latest hyperbole about “results-driven development” is especially troubling when one is talking about community initiatives. Imposing such incredibly risk-averse behavior, evaluating every single intervention on people who are in the process of organizing at the local level is most certainly a drain on their time and scarce resources. And what so many people on the ground have told me again and again is that abstract metrics don’t help them understand their relationship to improving the well-being of the people they serve. Here’s a recent post from me on this topic:

    I say definitely let’s pursue and obtain useful data, but at a scale at which information can be easily generated, utilized, and acted upon by those we are trying to serve. Monitoring and evaluation implemented solely for the purpose of accountability fails to result in improved programming and, in many cases in my experience, has undermined the effectiveness of the very interventions it is trying to measure.

    For me, combining impact evaluation and complexity inevitably brings up a spiritual question. That question is…where’s the faith?

    March 23, 2011 at 7:43 pm

    • Relationship between risk and innovation:

      In plain terms –

      * risky organization wants to run an orphanage (bad idea)

      * risky entrepreneur thinks she has a way to stop the spread of AIDS (better idea)

      * 3 risky NGOs that are new, local, and haven’t been funded, but all have different ideas on economic empowerment: (even better idea – because you can compare them?)

      * risky organization getting girls in school in Afghanistan. Founder takes personal risk to achieve this mission (good or bad idea?)

      Just trying to think about specific examples where supporting “risky” change agents is warranted.

      March 23, 2011 at 9:20 pm

  8. Evan Walsh

    I really like the principles you’ve outlined for the team planning process. I’m excited to hear about what the team has planned.

    Thanks for another great post!

    June 3, 2011 at 6:19 pm

  9. Lee

    Friedrich Hayek came up with a pretty nifty solution to the problem of complex systems quite some time ago – use markets, which allow easy aggregation of decentralized knowledge and built-in feedback mechanisms, rather than central planning. What does this mean for the aid world of planning and public goods? How about those cash transfers to individuals?

    June 6, 2011 at 8:29 pm
