Warning: data wonkiness & some nitpicking about how people use numbers.
As some of you who have read my intermittent posts in the past might know, I work as a public health analyst on a large USAID contract. We work with data from the Demographic and Health Surveys (DHS), World Bank, WHO, UNICEF, and other sources on a weekly, if not daily, basis. My work has given me a renewed reverence for the power of statistics, when accurately calculated and explained, to illuminate a situation, and a keen awareness of how easy it can be to use data improperly. Many thanks to my coworkers for teaching me about some of these issues.
This is post #1 in a series of three on data issues I’ve come across through my work. Today, we’ll focus on why you have to be careful comparing numbers from DHS (or any other survey that has changed over time). This may seem like common sense to many of you who work in this field, but for some reason it’s been an issue that has arisen time and time again.
Today’s sample indicator: treatment for acute respiratory infection (ARI), in honor of World Pneumonia Day (tomorrow). ARI is used as a proxy for pneumonia in the DHS surveys, and routinely collected across numerous countries. And because of my affinity for Kenya, let’s take a look at the data for Kenya for the last two DHS (2003 and 2008/09), pretending that you’re a program manager looking for some information about how treatment rates have changed over time.
You want to know whether the percentage of under-five children with ARI symptoms who were taken for treatment has increased since 2003, and by how much, because you’re interested in childhood pneumonia. There are two straightforward ways to get this number: Statcompiler or the DHS reports.
You decide to go to the report values, since that’s what is often accessible in developing countries, which receive copies of the publications gratis from ICF Macro. The most recent report, the 2008/09 DHS, seems straightforward: the table tells you that 55.9 percent of children under five who presented with symptoms of ARI were taken for treatment. Exactly what you were looking for.
Then, you open the 2003 DHS, and see the chart (which looks remarkably similar) on page 141 with information about ARI. Except that where you expect to see data for treatment of ARI, the column is titled “Among children with symptoms of ARI and/or fever, percentage for whom treatment was sought from a health facility/provider.” Now, fever is typically used as a proxy for malaria, does not always present with ARI, and often occurs in a child who does not have an ARI. You now have under-fives with fever, but no ARI symptoms, included in your number. The table gives a value of 45.5% being taken for treatment. But what if the majority of those children had only fever? Your 2003 data point isn’t comparable with your 2008/09 data point.
So what do you do? If you’re a data person, you probably look somewhere else for more comparable numbers, look for the data set to see if you can calculate what you want yourself, etc. Note that disaggregating those two things (treatment for ARI and treatment for fever) might be possible, depending on how the survey was designed, but you need a good survey to get good data.
If you’re not a data person though, and especially if you don’t know that fever does not always accompany ARI, you might decide that you can make the comparison anyway, maybe with a little footnote. No big deal, you think. And I say that with no judgment, because an amazing number of talented, capable, well-educated people make that exact choice and write up their report comparing the two numbers regardless. Plus, it ends up looking like there’s been an increase of 10.4 percentage points in the past five years in the percent of children with ARI symptoms taken for treatment, and that speaks to the success of your program.
But if you go to Statcompiler, which looks at the data sets and standardizes indicators, the program tells you that 49.1% of children with symptoms of ARI in 2003 were taken for treatment. Hmmmm… not a huge difference, but something worth considering. Suddenly, the increase in children taken for treatment shrinks to 6.8 percentage points, though you have a much more accurate comparison.
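The arithmetic here is simple but worth making explicit, because this is exactly where the non-comparable baseline sneaks into reports. A minimal sketch (the figures are the ones quoted above from the Kenya DHS reports and Statcompiler; the variable names are my own):

```python
# Percent of under-fives with ARI symptoms taken for treatment, Kenya.
# Values as quoted above; naming is mine, not DHS terminology.
ari_2008 = 55.9          # 2008/09 DHS report: ARI only
ari_fever_2003 = 45.5    # 2003 DHS report: ARI and/or fever (different indicator!)
ari_2003_std = 49.1      # 2003 value standardized to ARI-only by Statcompiler

# Mixing two indicator definitions overstates the improvement.
naive_change = ari_2008 - ari_fever_2003   # apples to oranges
true_change = ari_2008 - ari_2003_std      # apples to apples

print(f"Naive change:      {naive_change:.1f} percentage points")
print(f"Comparable change: {true_change:.1f} percentage points")
print(f"Overstatement:     {naive_change - true_change:.1f} percentage points")
```

Note that both differences are in percentage points, not percent: a change from 45.5% to 55.9% is a 10.4 percentage-point increase, but a roughly 23 percent relative increase. Conflating the two is its own common way to misuse these numbers.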
Some lessons from this example about when to be careful comparing numbers/indicators that look (almost) the same:
- 3.6 percentage points might not seem like much (and might be within the confidence interval of your data point), but it means close to an extra 1 out of 25 kids with symptoms of ARI got treatment than I thought. So if I’m running a program, I care about those 3.6 percentage points.
- Read your footnotes when you see them attached to a number. They’re usually important: sampling women who are 10-49 years old instead of 15-49 years might make it tough to compare a statistic across two countries. And don’t assume indicators stay the same from one survey round to the next.
- Statcompiler is a good resource for quick, comparable data points across time and countries. Not perfect, but a simple and easy way to retrieve data in a very consumable format, with very explicit footnotes when there is a strange quirk to a data point.
- There will be indicators where the definition has changed over time, and you will always need to be careful to make sure you’re comparing apples with apples. The revised WHO standard for when to initiate ART is a great example of something you should look out for in recent documents, especially when comparing ART coverage rates over time.