
You can't have big data without lots of little data

Data has the potential to really transform the way foreign aid is done. Better data makes it possible for donors to be transparent. It makes it possible for decisions and priorities to be evidence-based. It makes it possible to hold everyone involved more accountable. Let's go beyond the hype, though, and talk about the actual mechanics of capturing and using development data.

This article is adapted from a talk that was originally part of a NetHope.org webinar. You may also watch the video version of this talk.

Most of the challenges we actually face don't involve big data; they involve little data. So I'd like to focus a bit on how little data grows up to be at least medium-sized data that people can use to make decisions.

Where data comes from

First of all, where does development data come from? Take this picture of a cute kid receiving a de-worming treatment as part of a multilateral-funded effort. (And hopefully we're past the days where this sort of picture was all that you needed to demonstrate that your program was successful!)

A datum is born

Now, just outside the frame of this picture is someone with a clipboard or a laptop or a tablet and they’re keeping track of each individual treatment. At least they’re counting the number of people, but hopefully they’re also capturing some demographic information.

So a little datum has just come into the world, and that's very exciting. But unfortunately I have bad news for that little datum, because it’s all misery and pain from here on out.

Let’s take a big step back and look at the obstacle course facing that datum in this de-worming program.

The obstacle course facing a newborn datum

So in this case you have a donor that has engaged various implementers to distribute the treatment, each of which is working in specific clinics or health posts or other facilities.

So the facilities report, say, to a national-level field office. The field office in turn has to report to the home office so that they can roll this information up worldwide. And of course each field office has to report to the donor. And ideally the various donor-funded projects are also sharing data with each other. And finally, the donor is reporting both to the host country government and its citizens, and to the donor government and its taxpayers.

Data friction

Now, there’s a lot of focus on those last two arrows, and all of this data that’s going to come spilling out of the donors once they open the floodgates, and all the cool things citizens will be able to do with that data. And I’m not being cynical – that is very exciting.

But right now it's a mess up in here, in these boxes and in these arrows. It’s a mess because at each step of the way there’s a lot of friction that keeps the data from flowing smoothly, or prevents it from flowing altogether.

Now, on one level this is a data interchange problem, and a lot of smart people have zeroed in on the standard format aspect of this problem, and kudos to them. It's still early days, but for example IATI, the International Aid Transparency Initiative, has made a terrific start establishing a common data standard for development information.

But what’s keeping this data from flowing smoothly isn’t just the lack of XML standards. Even inside each of these boxes you have a lot of data friction.

So I’m going to propose to you a framework that I'm going to call "Herb’s Hierarchy of Data Friction." It’s like Maslow’s Hierarchy of Needs, in that you can’t talk about the higher levels until you’ve addressed the lower levels.

  1. First you have to collect the data. That’s hard.
  2. Then you have to bring the data together in a central place. That’s also hard.
  3. Then you have to organize the data so you can make sense of it. That’s super hard.
  4. Finally, you have to report the data in the format the next person down the line wants. That’s hard too.

What’s especially horrifying to me as a data nerd is that information is destroyed at every step of the way, both in this sequence here and in the larger scheme of things. It’s destroyed by human error – you’d be amazed how much of this data is re-keyed by hand. (Or maybe you wouldn’t be surprised.) It’s also destroyed by overaggregation and summarization, which makes it less useful for analysis down the road.
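To make the overaggregation point concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the Treatment record, its fields, and the two clinics are assumptions, not the schema of any real system. It walks the four steps in miniature and shows what gets lost when only a total is passed up the line.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Treatment:
    """One de-worming treatment as recorded at a facility (hypothetical schema)."""
    facility: str
    age_group: str
    sex: str


# Steps 1 and 2: records collected at two facilities, then brought together.
clinic_a = [Treatment("Clinic A", "under 5", "F"), Treatment("Clinic A", "5-14", "M")]
clinic_b = [Treatment("Clinic B", "under 5", "M"), Treatment("Clinic B", "under 5", "F")]
central = clinic_a + clinic_b


def report_total(records):
    """Over-aggregated report: one number, with all demographic detail discarded."""
    return len(records)


def report_by(records, field):
    """Step 3: organize the records by counting them per value of one attribute."""
    return Counter(getattr(r, field) for r in records)


# Step 4: two candidate reports for the next person down the line.
total = report_total(central)
by_age = report_by(central, "age_group")

print(total)         # a single headline number
print(dict(by_age))  # counts per age group
```

The total can always be recomputed from the per-group counts, but not the other way around. Once a field office forwards only the total, nobody downstream can answer a question like "how many of the children treated were under five?"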

Now interoperability is great, but it only comes into play when the data is going from one organization to another.

So data standards are really important but they’re not enough; we need better tools.

So if you’re one of those boxes with data flowing into it and out of it and trying to get a better handle on that information, what do you do?

How do we get better tools?

Here’s the way I would break down the options.

Option 1. Business as usual

Option 1 is to just muddle through, which lots and lots of organizations are doing. People have lots of different spreadsheets and Word documents and PDFs and PowerPoint decks and that’s where the data is. So you take this screen and multiply it by fifty or a thousand or whatever, depending on how big you are. So even if you have a very simple question to answer, you have to go to all these people, figure out who has what information, synthesize it, and answer the question. And that can be extraordinarily labor-intensive, and it stops you from doing the work you're supposed to be doing. On the other hand, it's not the worst option, because at least we're talking about tools that everyone already has, so there's no expense involved. Everyone has and knows how to use Excel and Word, everyone can read a PDF document. There are worse ways to solve this problem.

Option 2. Roll your own

The next step an organization usually tries is to develop a better data management system in-house. Now, if you have some serious software development talent internally, that can turn out great. But that’s rare, because most international development organizations are not software development organizations, and you’re more likely to end up with a complex and hard-to-use system that only one person in the organization is able to modify.

Option 3. Hire a contractor

The next thing many organizations turn to is contracting out the work of building a better data management system to professionals. If you’re really careful about defining what constitutes acceptability, this can work out. But more often than not you give a contractor (hopefully just one!) a list of requirements, which they check off, and then if the system is slow or hard to use or your requirements change or weren’t accurately captured in the first place, well, the contractor has done their job, they’ve been paid, and you’re out of luck.

Option 4. Build on an existing platform

The fourth option is to build on an existing platform: a CRM tool like Salesforce, or a document management tool like SharePoint. This has a lot of obvious appeal, especially if you’re already using one of those systems for something else. But the fact is that these systems weren’t made for managing foreign aid data, and there are a lot of pieces that just don’t fit. And either way, you're still in the business of developing software, and that's not a business that international development organizations ought to be in. I’ve personally seen nothing but frustration and heartbreak down that road.

Option 5. Use an existing, purpose-built tool

So the fifth option is to use an existing tool that was built for the express purpose of managing aid data. DevResults is one example; there are other software tools out there that overlap with what we’re doing as well.

But I have to think that this is the future of international development: tools that are specifically designed for our needs, that organize information the way that we think about our work, and that actually reflect the realities of doing foreign aid in the field. That has to be the future, and that's what we're working towards here at DevResults.

Watch and share the video

FWIW, this visual presentation was created using Prezi, which is a terrific alternative to PowerPoint. This presentation can be viewed, copied and remixed here.