DataPall

DataPall, the brainchild of the interns from two years ago (link to all their blogs), is an electronic medical records system for the Palliative Care unit here at St. Gabe’s. The previous interns noticed that the unit’s small staff had to spend weeks manually adding up figures for the quarterly reports to the Ministry of Health, so they designed DataPall to generate those reports with the push of a button.

While DataPall has been a huge help to Palliative Care (the team here now spends only a day or two on the reports), several issues still stand in the way of a large-scale rollout. One of the most significant is accidental duplication of records. As the literacy rate in Malawi is quite low, it isn’t uncommon for patients not to know how to spell their names, so the nurses and doctors sometimes write down different spellings from appointment to appointment. Patients also sometimes change their last names or accidentally switch their first and last names. As a result, if you type a misspelled name into the current DataPall search bar, no match shows up and a new patient is created, even though the patient’s record already exists under a different spelling.
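To make the search problem concrete, here is a minimal sketch of a more forgiving name search, written in Python with only the standard library. It is not DataPall’s actual search code: the function names, the record layout (a dict with "first" and "last" keys), and the 0.8 threshold are all assumptions chosen for illustration. It scores each record by how similar the spellings are and also tries the query with first and last name swapped.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity in [0, 1] between two names, ignoring case and extra spaces."""
    a = " ".join(a.lower().split())
    b = " ".join(b.lower().split())
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_search(query_first, query_last, records, threshold=0.8):
    """Return records whose names roughly match the query, best match first.

    `records` is a hypothetical list of dicts with "first" and "last" keys.
    The threshold is an illustrative guess, not a tuned value.
    """
    matches = []
    for rec in records:
        # Both names must be close, so take the weaker of the two scores.
        straight = min(
            name_similarity(query_first, rec["first"]),
            name_similarity(query_last, rec["last"]),
        )
        # Same comparison with the query's first and last names swapped.
        swapped = min(
            name_similarity(query_first, rec["last"]),
            name_similarity(query_last, rec["first"]),
        )
        score = max(straight, swapped)
        if score >= threshold:
            matches.append((score, rec))
    # Sort on the score only, so the dicts themselves are never compared.
    matches.sort(key=lambda m: m[0], reverse=True)
    return [rec for _, rec in matches]
```

With a made-up record stored as "Chikondy Banda", searching for "Chikondi Banda" or for "Banda Chikondi" would both bring that record up instead of silently creating a new patient. A changed last name would still slip through a sketch like this, which is one reason extra fields such as the village are useful.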

This accidental duplication of records has led to a gradual, artificial inflation of the patient count, skewing the data that the unit is supposed to be reporting. The issue became so problematic that one of the doctors once said, “DataPall is telling me that there were 20 new patients last week, but I only saw about 15 different patients!” To work around the problem, the staff often have to go back to the paper records to make sure the reports give the right numbers. While they no longer have to spend weeks generating the reports, they still have to spend a couple of days fixing the errors.

Using a broad set of matching criteria, we generated a set of potential duplicates, of which 121 patient records (!) were actually duplicated. After merging the duplicates, we fed this data back into the computer (using a machine learning algorithm) to generate an algorithm that can better predict whether two records are likely to be duplicates, based on common misspellings, similar villages, etc. We’re really happy that the duplicate-searching algorithm works quite well (98% sensitivity and specificity), so we’re in the process of building it into the DataPall workflow as well as incorporating it into our new project, Morphine Tracker (read about it on Truce’s blog!).
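For anyone wondering what sensitivity and specificity mean here: sensitivity is the share of true duplicate pairs that the algorithm flags, and specificity is the share of non-duplicate pairs that it correctly leaves alone. The Python sketch below shows how the two numbers could be computed from a list of predictions and a list of hand-checked answers. The function name and inputs are assumptions for illustration, not our actual evaluation code.

```python
def sensitivity_specificity(predicted, actual):
    """Compute (sensitivity, specificity) for duplicate predictions.

    `predicted` and `actual` are equal-length lists of booleans, one entry
    per pair of records: True means "these two records are duplicates".
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # duplicates caught
    fn = sum(1 for p, a in pairs if not p and a)      # duplicates missed
    tn = sum(1 for p, a in pairs if not p and not a)  # non-duplicates left alone
    fp = sum(1 for p, a in pairs if p and not a)      # non-duplicates flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true duplicate and one non-duplicate")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```

High sensitivity means few duplicates slip through and keep inflating the patient count, while high specificity means staff are rarely asked to review two records that really belong to different patients.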