We're in the process of adding import and export support for the IATI XML format to DevResults. Although IATI is primarily conceived as a public transparency standard (hence the T), many organizations are beginning to see it as a promising interchange format for reporting results — either outbound reporting (to a funder, or to a home office) or inbound reporting (from a grantee or contractor).
However, the IATI standard is currently organized primarily around financial data and has only limited support for reporting results on performance indicators. These limitations make it very difficult to put the IATI format to practical use for results data.
Proposed changes to the IATI standard
This document suggests three changes to the IATI standard that would make the format vastly more useful for reporting performance data:
Require unambiguous indicator references. There is currently no way to identify precisely which indicator is being reported on, which makes it impossible to aggregate data coming from disparate sources. The solution here is relatively simple: Support an element referencing an indicator vocabulary and an indicator code within that vocabulary.
Support disaggregation of performance data. A thornier issue is that there is currently no way to disaggregate results data — by gender, by age, by geography, or any other way. IATI already supports a versatile and precise standard for specifying an activity's geographic scope. We propose adding geographic disaggregation to the results standard, as well as adding elements for disaggregating results data into other categories.
Add an indicator schema to the IATI standard. This is a much bigger challenge than the first two. Ultimately, though, if we're going to understand the data reported for a given indicator, we need a precise definition of that indicator. Our proposal is to create a separate top-level indicator standard, parallel to the existing activity and organization standards, so that aid organizations and indicator registries will have a common language for describing what they measure and how.
1 Require unambiguous indicator references
Proposed change: One or more <reference> elements must be listed under the <indicator> element. Like the <sector> element, <reference> has a @vocabulary attribute and a @code attribute. The @vocabulary attribute references a known indicator library, like the U.S. Foreign Assistance Framework or the WHO Indicator Registry. The @code attribute is a unique identifier within that vocabulary. A value of vocabulary="999" denotes a private vocabulary.
More than one <reference> element can be listed, in cases where the same indicator is referred to by multiple codes within different vocabularies (e.g. an internal code and a standard code).
    <indicator>
      <reference vocabulary="999" code="1.2.3" />
      <reference vocabulary="4" code="A3.2-1" />
      ...
    </indicator>
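To show how a consumer might use these references, here is a minimal Python sketch that reads the proposed <reference> elements and returns the (vocabulary, code) pairs that identify an indicator. The markup is the one proposed above, which is not part of the current IATI schema; the sample document and the function name `indicator_keys` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the proposed markup; values are illustrative.
SAMPLE = """
<result>
  <indicator>
    <reference vocabulary="999" code="1.2.3" />
    <reference vocabulary="4" code="A3.2-1" />
  </indicator>
</result>
"""

def indicator_keys(indicator):
    """Return every (vocabulary, code) pair listed under an <indicator>."""
    return [
        (ref.get("vocabulary"), ref.get("code"))
        for ref in indicator.findall("reference")
    ]

root = ET.fromstring(SAMPLE)
keys = indicator_keys(root.find("indicator"))
print(keys)
```

Two reports that share any one of these pairs can then be treated as describing the same indicator, which is what makes aggregation across sources possible.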
2 Support disaggregation of performance data
Proposed change: Both targets and actuals can optionally be disaggregated by location (e.g. point, facility, town, district, province). They can also optionally be disaggregated into other subsets. These might be demographic attributes such as sex, age, or ethnicity, but could include any number of other factors, such as crop type, organization type, HIV status, or HIV treatment regimen.
Geographical disaggregation would use something like the existing location standard. For example, data disaggregated into three provinces might look like this:
    <period>
      <location ref="AA-AAA">
        <location-id vocabulary="G1" code="1111111" />
        <actual value="100" />
      </location>
      <location ref="AA-BBB">
        <location-id vocabulary="G1" code="1111112" />
        <actual value="110" />
      </location>
      <location ref="AA-CCC">
        <location-id vocabulary="G1" code="1111113" />
        <actual value="200" />
      </location>
    </period>
Alternatively, data reported by separate facilities might look like this:
    <period>
      <location ref="Clinic-001">
        <point><pos>31.616944 65.716944</pos></point>
        <actual value="100" />
      </location>
      <location ref="Clinic-002">
        <point><pos>32.169446 64.169447</pos></point>
        <actual value="110" />
      </location>
      <location ref="Clinic-003">
        <point><pos>33.694461 63.694471</pos></point>
        <actual value="200" />
      </location>
    </period>
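As a rough illustration of how geographically disaggregated results could be consumed, the following Python sketch reads the three-province example and computes a total for the period. It assumes the proposed markup with one <actual> directly under each <location>; the function name `actuals_by_location` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the three-province example (proposed markup).
PERIOD = """
<period>
  <location ref="AA-AAA">
    <location-id vocabulary="G1" code="1111111" />
    <actual value="100" />
  </location>
  <location ref="AA-BBB">
    <location-id vocabulary="G1" code="1111112" />
    <actual value="110" />
  </location>
  <location ref="AA-CCC">
    <location-id vocabulary="G1" code="1111113" />
    <actual value="200" />
  </location>
</period>
"""

def actuals_by_location(period):
    """Map each location's ref to the actual value reported directly under it."""
    totals = {}
    for loc in period.findall("location"):
        actual = loc.find("actual")
        if actual is not None:  # skip locations with no direct actual
            totals[loc.get("ref")] = float(actual.get("value"))
    return totals

by_location = actuals_by_location(ET.fromstring(PERIOD))
period_total = sum(by_location.values())
print(by_location, period_total)
```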
Demographic and other disaggregation
Disaggregation by other attributes would use new <disaggregation> and <subset> elements. For example:
    <period>
      <disaggregation name="sex">
        <subset name="male">
          <actual value="100" />
        </subset>
        <subset name="female">
          <actual value="110" />
        </subset>
      </disaggregation>
    </period>
As no universal vocabularies exist, we suggest defining the disaggregation vocabulary within a structured vocabulary of indicator definitions (see below).
Multiple disaggregations can be nested (cross-disaggregation) or can be side-by-side (parallel disaggregation). All of the following hierarchies are supported:
- Disaggregation by geography
- Disaggregation by one attribute
- Disaggregation by two attributes (nested)
- Disaggregation by geography and one attribute
- Disaggregation by two attributes (in parallel)
Here's an example of data disaggregated by geography and two nested attributes:
    <period>
      <period-start iso-date="2013-01-01" />
      <period-end iso-date="2013-03-31" />
      <target value="50" />
      <location ref="AA-AAA">
        <location-id vocabulary="G1" code="111111" />
        <disaggregation name="sex">
          <subset name="male">
            <disaggregation name="age">
              <subset name="0-15">
                <actual value="123" />
              </subset>
              <subset name="15+">
                <actual value="111" />
              </subset>
            </disaggregation>
          </subset>
          <subset name="female">
            <disaggregation name="age">
              <subset name="0-15">
                <actual value="789" />
              </subset>
              <subset name="15+">
                <actual value="333" />
              </subset>
            </disaggregation>
          </subset>
        </disaggregation>
      </location>
      <location ref="AA-BBB"> ... </location>
    </period>
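Nested markup like this is easiest to work with once it is flattened into rows. The sketch below is one possible way to do that in Python: it walks the proposed <location>, <disaggregation>, and <subset> elements recursively and yields one row per <actual>, tagged with the path that leads to it. The function name `flatten` and the row shape are assumptions, and the sample keeps only the first location.

```python
import xml.etree.ElementTree as ET

# Sample following the proposed markup (one location only, for brevity).
PERIOD = """
<period>
  <target value="50" />
  <location ref="AA-AAA">
    <location-id vocabulary="G1" code="111111" />
    <disaggregation name="sex">
      <subset name="male">
        <disaggregation name="age">
          <subset name="0-15"><actual value="123" /></subset>
          <subset name="15+"><actual value="111" /></subset>
        </disaggregation>
      </subset>
      <subset name="female">
        <disaggregation name="age">
          <subset name="0-15"><actual value="789" /></subset>
          <subset name="15+"><actual value="333" /></subset>
        </disaggregation>
      </subset>
    </disaggregation>
  </location>
</period>
"""

def flatten(element, path=()):
    """Yield (path, value) per <actual>; path is a tuple of (dimension, subset)."""
    for child in element:
        if child.tag == "actual":
            yield path, float(child.get("value"))
        elif child.tag == "location":
            yield from flatten(child, path + (("location", child.get("ref")),))
        elif child.tag == "disaggregation":
            for subset in child.findall("subset"):
                step = (child.get("name"), subset.get("name"))
                yield from flatten(subset, path + (step,))
        # Other children (dates, targets, location-id) are ignored here.

rows = list(flatten(ET.fromstring(PERIOD)))
for path, value in rows:
    print(dict(path), value)
```

Because every row carries its full path, the same rows can be regrouped by any single dimension (for example, all males regardless of age) without losing information.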
In some cases, results data are disaggregated by two or more attributes "in parallel." For example, an indicator might be disaggregated by sex, and also disaggregated by age, but not by both at the same time.
This is less than ideal, because we have two different totals, which we can only hope are the same. When aggregating data from multiple sources, you have to choose one of the disaggregations arbitrarily. This approach also destroys information — in this situation, for example, it's impossible to know how many male adults were among the beneficiaries.
However, we see a lot of parallel disaggregation in the wild, and so as a practical matter this standard should probably support it. In this case the markup might look like this:
    <period>
      <disaggregation name="sex">
        <subset name="male">
          <actual value="5" />
        </subset>
        <subset name="female">
          <actual value="7" />
        </subset>
      </disaggregation>
      <disaggregation name="age">
        <subset name="child">
          <actual value="8" />
        </subset>
        <subset name="adult">
          <actual value="4" />
        </subset>
      </disaggregation>
    </period>
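Since parallel disaggregations produce independent totals, a consumer would probably want to check that they agree. The following Python sketch shows one way to do that for the example above, assuming the proposed markup and non-nested subsets; the function name `disaggregation_totals` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the parallel disaggregation example (proposed markup).
PERIOD = """
<period>
  <disaggregation name="sex">
    <subset name="male"><actual value="5" /></subset>
    <subset name="female"><actual value="7" /></subset>
  </disaggregation>
  <disaggregation name="age">
    <subset name="child"><actual value="8" /></subset>
    <subset name="adult"><actual value="4" /></subset>
  </disaggregation>
</period>
"""

def disaggregation_totals(period):
    """Sum the actuals of each top-level disaggregation separately."""
    totals = {}
    for dis in period.findall("disaggregation"):
        totals[dis.get("name")] = sum(
            float(actual.get("value"))
            for actual in dis.findall("subset/actual")  # direct subsets only
        )
    return totals

totals = disaggregation_totals(ET.fromstring(PERIOD))
consistent = len(set(totals.values())) <= 1  # True when all totals agree
print(totals, consistent)
```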
Targets can be disaggregated in the same way as actuals, or at a lower level of granularity. For example, targets might be set at the aggregate level, while actuals are disaggregated by sex. (It wouldn't normally make sense to have targets disaggregated if the actuals are not disaggregated.)
The target element can simply be included at the appropriate level. So if the target is disaggregated, you would have:
    <period>
      <disaggregation name="sex">
        <subset name="male">
          <target value="100" />
          <actual value="97" />
        </subset>
        <subset name="female">
          <target value="150" />
          <actual value="160" />
        </subset>
      </disaggregation>
    </period>
If the target is not disaggregated, you would have:
    <period>
      <target value="250" />
      <disaggregation name="sex">
        <subset name="male">
          <actual value="97" />
        </subset>
        <subset name="female">
          <actual value="160" />
        </subset>
      </disaggregation>
    </period>
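When the target sits at the aggregate level, comparing it with the results means summing the disaggregated actuals first. Here is a minimal Python sketch of that calculation for the example above. It assumes the proposed markup, a single period-level <target>, and no parallel disaggregations (which would be double-counted); the function name `percent_of_target` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the undisaggregated-target example (proposed markup).
PERIOD = """
<period>
  <target value="250" />
  <disaggregation name="sex">
    <subset name="male"><actual value="97" /></subset>
    <subset name="female"><actual value="160" /></subset>
  </disaggregation>
</period>
"""

def percent_of_target(period):
    """Return summed actuals as a percentage of the period-level target."""
    target = float(period.find("target").get("value"))
    # iter() visits every nested <actual>; valid only without parallel splits.
    actual = sum(float(a.get("value")) for a in period.iter("actual"))
    return 100.0 * actual / target

achieved = percent_of_target(ET.fromstring(PERIOD))
print(round(achieved, 1))
```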
3 Add an indicator schema to the IATI standard
In order for the results data reported here to be truly useful, we need to know what the indicator being reported on means.
The current activity/result schema includes a few attributes for describing an indicator:
- /iati-activities/iati-activity/result/@type (count vs percentage)
- /iati-activities/iati-activity/result/@aggregation-status (suitable for aggregation vs not)
In practice much more information is needed to fully describe an indicator and how it is used. For example, the standard USAID performance indicator reference sheet includes:
- Indicator title
- Precise definition
- Unit of measure (e.g. individuals, hours, dollars, km, etc.)
- How the indicator is to be disaggregated
- Rationale or justification
- Data source
- Method of data collection and construction
- Reporting frequency
- Where the indicator fits within the organization's results framework
Other vocabularies have additional types of data and narrative associated with an indicator:
- Short version of title
- Direction of improvement (whether higher or lower values are desired)
- Numerator definition
- Denominator definition
- Formula (for indicators that are computed from other indicators)
- Display format (whole number, decimal number, percentage, ratio, X per thousand/million, etc.)
- Indicator type (e.g. input vs output vs impact)
- Sector tags
- Other keywords
- Reference URLs
- Geographical reporting level
- Status (e.g. active vs deprecated)
- Data quality notes
- Strengths, limitations
- Data review process
In addition, other indicators (in the same vocabulary, or in other vocabularies) might be related to this indicator in various specific ways:
- A is subset of B
- A is superset of B
- A is computed from B (most commonly, B is the numerator or denominator of A)
- A is referenced by B (likewise)
- A is identical to B
- A is similar to B (e.g. measures the same thing in a different way)
Disaggregation requirements also need to be defined precisely.
- Ideally the disaggregation definitions include canonical vocabularies for the names of the disaggregation factors (e.g. sex, age, crop) and the names of the acceptable subsets (e.g. male, female), along with accepted synonyms (sex = gender, male = m = boy = man = men) and translations of these terms into other languages.
- It may be necessary to indicate which disaggregations are required by the organization being reported to.
- In cases of multiple parallel disaggregations, it may be necessary to designate one of the disaggregations as primary, so that its total is the one used when aggregating.
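To make the idea of canonical names and synonyms concrete, here is a small Python sketch of how a consumer might normalize disaggregation terms before aggregating. The table is purely illustrative: the entries for "sex" and "male" come from the examples above, the entries for "female" are an assumption, and the function name `canonical` is hypothetical. A real vocabulary would be published by the indicator registry.

```python
# Hypothetical synonym table: canonical term -> accepted synonyms.
SYNONYMS = {
    "sex": {"gender"},
    "male": {"m", "boy", "man", "men"},
    "female": {"f", "girl", "woman", "women"},  # assumed, not from the text
}

# Reverse lookup from every accepted spelling to its canonical term.
LOOKUP = {
    alias: canon
    for canon, aliases in SYNONYMS.items()
    for alias in aliases | {canon}
}

def canonical(term):
    """Return the canonical form of a term, or the term unchanged if unknown."""
    return LOOKUP.get(term.strip().lower(), term)

print(canonical("Gender"), canonical(" Men "), canonical("crop"))
```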
That's all a lot of information. It doesn't make sense to embed the full definition of the indicator along with each data point. Instead, we recommend delegating the indicator definitions to the various vocabularies (indicator repositories). The four fields described above could then be deprecated.
The challenge then becomes one of defining a common standard for indicator vocabularies to use. We propose creating a separate IATI Indicator Standard at the same level as the IATI Activity Standard and Organization Standard.
This would be a big job, to say the least. The elements outlined above give an idea of what this standard might encompass; we won't go as far as to propose an actual standard. If this seems like a viable plan to the IATI community, the next step would be to draft a proposed schema for discussion.