Researchers at the Nuffield Department of Medicine have shown that using routinely collected hospital data to study heart valve infections can give a misleading picture of how common these infections are, and have suggested ways of improving future studies that use hospital data.
The researchers looked at electronic health record data – the information that hospitals routinely collect when a patient is treated. It contains administrative information, such as age and dates of treatment, together with ‘diagnostic codes’, which represent the condition the patient was treated for and any other conditions the patient has. These diagnostic codes are mainly used to calculate how much the hospital should be paid to treat a patient. The hospital may get paid thousands of pounds for a patient with codes representing cancer treatment, heart disease and diabetes (as it is assumed this sort of patient requires lots of expensive tests and medication), whilst an ‘abdominal pain’ code will earn the hospital a few hundred pounds.
Researchers are using this sort of data more and more to study diseases. Rather than spend 10 years and millions of pounds recruiting patients to find out how common a disease is, or whether it is increasing, researchers can look at how often hospitals are using relevant diagnostic codes in thousands of patients without needing these huge resources. Identifying data like name and address are removed beforehand, and an ethical review is conducted to ensure the data is being used responsibly.
In this study, researchers looked at infective endocarditis – an infection of the heart valves. They compared hospital admissions that had ‘endocarditis’ diagnostic codes assigned with patients who were diagnosed with endocarditis by a doctor, in Leeds and Oxford, over many years. They also looked at the bacteria grown in blood cultures from patients in Oxford. They found that there were many more admissions with an ‘endocarditis’ code than actual cases of the disease. They also found that using diagnostic codes to try to work out which bacteria were causing the infection was problematic. They suggested some steps that researchers could take to improve the accuracy of the data.
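As a rough illustration of the kind of comparison described above, the short Python sketch below checks a list of admissions carrying an ‘endocarditis’ code against a list of cases confirmed by a doctor. It is not the researchers' method: the function name `code_accuracy` and the admission IDs are invented for this example, and it assumes for simplicity that each admission can be matched one-to-one with a confirmed case.

```python
# Toy illustration only: the admission IDs below are invented, not study data.

def code_accuracy(coded_admissions, confirmed_cases):
    """Compare admissions carrying an endocarditis code with doctor-confirmed cases."""
    coded = set(coded_admissions)
    confirmed = set(confirmed_cases)
    coded_and_confirmed = coded & confirmed  # coded admissions that were real cases
    return {
        "coded": len(coded),
        "confirmed": len(confirmed),
        "coded_and_confirmed": len(coded_and_confirmed),
        # Share of coded admissions that were real cases (positive predictive value)
        "ppv": len(coded_and_confirmed) / len(coded) if coded else None,
        # Share of real cases that the codes picked up (sensitivity)
        "sensitivity": len(coded_and_confirmed) / len(confirmed) if confirmed else None,
    }


# Hypothetical example: eight coded admissions, four confirmed cases.
coded = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
confirmed = ["A2", "A5", "A7", "B1"]
result = code_accuracy(coded, confirmed)
print(result)
```

In this made-up example only three of the eight coded admissions are confirmed cases, so counting codes alone would overstate how common the infection is, which is the pattern the study reports.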
Infective endocarditis is an important infection to study, as cases appear to be increasing. It isn’t clear whether this is because the number of people with heart valve replacements is increasing, because on average people are living longer (and older people are more likely to get this infection), or because we have changed how we use antibiotics. Most of the studies looking at endocarditis have relied on electronic health record data.
This study highlights that whilst electronic health record data is extremely useful, more work is needed to improve how we use this data to identify diseases, for example by using microbiology data and machine learning approaches and by refining algorithms, rather than relying on diagnostic codes alone.