A new MIT study finds that “health knowledge graphs,” which map relationships between symptoms and diseases and are intended to help with clinical diagnosis, can fall short for certain conditions and patient populations. The results also suggest ways to boost their performance.

Health knowledge graphs have typically been compiled manually by expert clinicians, but that can be a laborious process. Recently, researchers have experimented with automatically generating these knowledge graphs from patient data. The MIT team has been studying how well such graphs hold up across different diseases and patient populations.

In a paper presented at the Pacific Symposium on Biocomputing 2020, the researchers evaluated automatically generated health knowledge graphs based on real datasets comprising more than 270,000 patients with nearly 200 diseases and more than 770 symptoms.

The team analyzed how various models used electronic health record (EHR) data, which contain patients' medical and treatment histories, to automatically “learn” patterns of disease-symptom correlations. They found that the models performed particularly poorly for diseases with high percentages of very old or very young patients, or with high percentages of male or female patients. However, choosing the right data for the right model, and making other modifications, can improve performance.
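
To make the idea of learning disease-symptom links from records concrete, here is a minimal sketch in Python. It is not one of the models evaluated in the study: it assumes a hypothetical toy record format and simply keeps an edge whenever the empirical frequency of a symptom among patients with a disease reaches a threshold. The names `records`, `build_graph`, and `threshold` are illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy records (not from the study): each entry lists the
# diseases diagnosed and the symptoms observed for one patient visit.
records = [
    {"diseases": {"flu"}, "symptoms": {"fever", "cough"}},
    {"diseases": {"flu"}, "symptoms": {"fever"}},
    {"diseases": {"asthma"}, "symptoms": {"cough", "wheezing"}},
    {"diseases": {"asthma"}, "symptoms": {"wheezing"}},
]


def build_graph(records, threshold=0.5):
    """Return {disease: {symptom: weight}}, where weight is the fraction
    of records with that disease that also show the symptom.
    Edges with a weight below `threshold` are dropped."""
    disease_counts = Counter()
    pair_counts = Counter()
    for rec in records:
        for disease in rec["diseases"]:
            disease_counts[disease] += 1
            for symptom in rec["symptoms"]:
                pair_counts[(disease, symptom)] += 1

    graph = {}
    for (disease, symptom), n in pair_counts.items():
        weight = n / disease_counts[disease]
        if weight >= threshold:
            graph.setdefault(disease, {})[symptom] = weight
    return graph


graph = build_graph(records)
for disease, symptoms in sorted(graph.items()):
    print(disease, sorted(symptoms.items()))
```

In a sketch like this, every edge weight depends entirely on which patients appear in the records for a given disease, so a disease whose patients are mostly of one age group or sex yields weights that reflect that group.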
