Gains in Afghan Health: Too good to be true?

Today the Center for Global Development hosted a brown bag lunch session analyzing the data collection methodology and data quality of the 2010 Afghanistan Mortality Survey (AMS). The survey was the first to generate direct estimates of both child and adult mortality in Afghanistan; previous health survey data was limited at best, particularly before 2001.

When first released, the survey results were exciting for the international community: they showed that both child and maternal mortality were lower than previous estimates had indicated, and that the country had improved on many health indicators. When the data were gone over with a fine-toothed comb, though, serious questions arose about data quality, particularly the reliability of the numbers.

Examining the data more closely: Presenter Kenneth Hill, a professor at the Harvard School of Public Health, discussed issues with both the under-five mortality and maternal mortality statistics. Household death reports put the under-five mortality rate at 84 deaths per 1,000 live births, while estimates calculated from the pregnancy histories collected put it at 71 deaths per 1,000 live births. This in itself was a surprise, as household reports are usually expected to yield lower under-five mortality than pregnancy histories. Both numbers, though, were markedly lower than expected.

Professor Hill raised four key concerns around the data, which he manipulated and re-analyzed in various ways in an effort to establish the validity of the estimates:

  1. The trend in under-five mortality over time in the South (one of three regions where data was collected, considered the most conflict-ridden) was implausible.
  2. Sex ratios at birth (number of male infants per 100 females) were markedly skewed in some regions. One would expect 102-107 males per 100 females born; sex ratios in multiple regions exceeded 110, and were as high as 138 in the South. This points to the possibility of deliberate omission of female children by respondents during the questionnaire, or sex-selective abortion (which is far less likely in Afghanistan than omission).
  3. The data showed severe underreporting of neonatal deaths relative to the data available from the regional DHS. Comparing the neonatal mortality rate to the post-neonatal mortality rate, the proportion of neonatal deaths is far lower than expected (see graph below).
  4. Finally, of the interviewers who had conducted at least 50 interviews (a reasonable sample size), not a single one in the North or Central regions reported zero child deaths, while a majority of interviewers in the South reported that households had zero child deaths, raising suspicion about interviewer bias or respondent bias.
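
The sex-ratio screen in point 2 is straightforward to reproduce. A minimal sketch in Python (the function names and the cutoff handling are mine; the AMS reports ratios rather than the raw birth counts used here for illustration):

```python
def sex_ratio_at_birth(male_births, female_births):
    """Males born per 100 females born."""
    return 100.0 * male_births / female_births

def flag_skewed(ratio, low=102.0, high=107.0):
    """True when a ratio falls outside the demographically expected band."""
    return not (low <= ratio <= high)

# Illustrative counts chosen to match the 138 reported for the South.
south = sex_ratio_at_birth(138, 100)
print(south, flag_skewed(south))  # prints: 138.0 True
```

Any region flagged this way warrants a closer look at whether female births were being omitted during interviews.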

A number of these concerns came back to data quality issues in the South, where conflict makes data collection particularly difficult and may make respondents particularly suspicious and/or unwilling to answer questions. Professor Hill recalculated the under-five mortality rate using only the data from the North and Central regions and compared it to comparable international data, but still found wide inconsistencies.

Excluding the South, Hill’s recalculated child mortality statistics showed a decrease from around 175 deaths per 1,000 live births in the late 1980s to 115 around 2000 to approximately 80 deaths per 1,000 live births in 2010. Even if these rates were accepted as correct, the implied annual rate of decline in under-five mortality (3.3%) would fall short of the rate necessary to reach Afghanistan’s MDG 4 target.
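
The rate-of-decline arithmetic can be checked directly. A quick sketch, assuming the “late 1980s” estimate is anchored at 1987 (the anchor years are my assumption; only the mortality rates come from the talk):

```python
def annual_decline_rate(rate_start, rate_end, year_start, year_end):
    """Constant annual rate of decline implied by two mortality estimates."""
    years = year_end - year_start
    return 1.0 - (rate_end / rate_start) ** (1.0 / years)

# 175 per 1,000 live births in the late 1980s down to ~80 per 1,000 in 2010
r = annual_decline_rate(175, 80, 1987, 2010)
print(f"{r:.1%}")  # prints: 3.3%
```

Hitting the MDG 4 target would require a faster sustained decline than this back-of-the-envelope figure.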

A similar data quality analysis was done for the adult mortality data, specifically maternal mortality (notoriously hard to measure under any circumstances). Household estimates were consistently higher than those derived from the sibling histories collected by the survey. For the maternal mortality ratio (MMR), sibling-history estimates – which can show a trend over time – were compared to data from recent Bangladesh Mortality and Morbidity Surveys; the AMS data showed much lower maternal mortality than in Bangladesh (which Hill found hard to believe) and was even more markedly lower than previous global estimates would lead one to believe.

In response, Pav Govindasamy from ICF Macro highlighted the positive trends in the report, including improvements in female educational attainment, increases in the average age at first marriage, and increases in the contraceptive prevalence rate over time. She noted that the recent expert advisory group on modeling mortality estimated Afghanistan’s MMR to be approximately 460 deaths per 100,000 live births, not terribly different from the upper estimate from the AMS of 449 deaths per 100,000 live births. While she acknowledged the definite issues with data quality in the South, she encouraged everyone to focus on the big picture findings: health status is improving, and gains have been made.

What next then? The methods used to collect the data were considered gold-standard best practice for implementing a household survey. While the Afghanistan Public Health Institute and Central Statistics Office took the lead on the task, demographic and health survey gurus from ICF Macro provided technical assistance along the way. There are many challenges in conducting a national household survey, particularly in conflict areas, and we shouldn’t shy away from questioning results and methods when something seems too good to be true.

The take home message from Ken and the panelists: data collection in a conflict setting is rough, and doesn’t typically go entirely according to plan. The inconsistencies and issues with the data particularly from the South, considered the least stable of the three regions where the survey was conducted, drive that point home. These challenges point to the importance of vital registration, noted one attendee, and additional research on how to support and promote institutionalization of vital registration in the various countries where we work would be interesting and useful.

2 thoughts on “Gains in Afghan Health: Too good to be true?”

  1. Thank you for the detailed account of the discussion. Fascinating, and reminiscent of some of the very good discussion that followed the 2006 Iraq mortality survey (see comments by Guha-Sapir and others on heterogeneity of the phenomenon measured, inaccurate baseline and so on). There is much to be said on methodologies, but I will focus on two points: implementation, and quality assessment.

    One fact rarely acknowledged is that the quality of implementation is as important as the design: a “gold standard” method (if such a thing exists) does not equate with “gold standard” implementation. Several of the biases outlined here possibly reflect how the interviewers conducted respondent selection and elicited responses from the interviewees. Over the last 10 years we have conducted a large number of surveys in conflict areas. Since we started relying on digital data collection, tracking the location and time of individual surveys, we have had to fire an interviewer on at least two occasions because we could clearly document that they were faking data, even though we were going to villages as a team. With paper-based data collection, it is possible to detect fake surveys by doing trend analysis by interviewer, for example, or by looking at the consistency of the data. Still, by then it is typically too late to collect new data, and I have seen few researchers willing to discard a significant amount of data even when the evidence of inconsistencies is clear.

    A second point that this discussion highlights is the value of being able to critically assess research findings. It is a skill I rarely see among humanitarians, or indeed in much of the non-academic world. Yet the implications go far beyond the low stakes of academic discussions. In a world where a series of focus groups is presented as a survey and non-representative samples are taken as a basis to “consult the population”, critical thinking is, well, critical. There are many guides out there to help ask the right questions when reviewing scientific literature, including survey results. Not all data are created equal…

  2. Thanks for your thoughtful comments Patrick! You may enjoy seeing the full slide deck when it’s made available – I’ll post a link when CGD releases it.
