Study shows that saliency heat maps may not be ready for prime time yet

AI models that interpret medical images hold the promise of enhancing clinicians’ ability to make accurate and timely diagnoses, while also reducing workload by letting busy clinicians focus on critical cases and delegate rote tasks to AI.

But AI models that lack transparency about how and why a diagnosis is made can be problematic. This opaque reasoning, also known as “black box” AI, can reduce a clinician’s confidence in the reliability of an AI tool and thus discourage its use. The lack of transparency may also mislead clinicians into placing too much trust in the tool’s interpretation.

In the field of medical imaging, one way to create more understandable AI models and to demystify AI decision-making has been saliency assessment, an approach that uses heat maps to determine whether a tool is correctly focusing only on the relevant parts of a given image or homing in on irrelevant parts of it.

Heat maps work by highlighting the areas of an image that influenced the AI model’s interpretation. This can help human clinicians see whether an AI model is focusing on the same areas they do or mistakenly focusing on irrelevant spots in the image.
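To make the idea concrete, here is a minimal sketch of one simple way to build such a heat map: blank out one patch of the image at a time and record how much the model's score drops. This occlusion approach is used purely for illustration and is not presented as the procedure evaluated in the study. The toy "model" (`toy_model_score`), the 6x6 image, and the patch size are all hypothetical.

```python
import numpy as np

def toy_model_score(image):
    """Hypothetical stand-in for an AI model: it only looks at a fixed central region."""
    return float(image[2:4, 2:4].mean())

def occlusion_saliency(image, score_fn, patch=2):
    """Return a heat map of how much the score drops when each patch is blanked out."""
    base = score_fn(image)
    heat = np.zeros_like(image, dtype=float)
    for r in range(0, image.shape[0], patch):
        for c in range(0, image.shape[1], patch):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = 0.0  # hide this patch
            # A large drop means the model relied on this patch.
            heat[r:r + patch, c:c + patch] = base - score_fn(occluded)
    return heat

image = np.ones((6, 6))  # toy "X-ray" with uniform brightness
heat = occlusion_saliency(image, toy_model_score)
```

In this toy case the heat map lights up only over the central region the model actually reads, which is the kind of signal a clinician would compare against where they themselves would look.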

But a new study published in Nature Machine Intelligence on October 10 shows that, for all their promise, saliency heat maps may not be ready for prime time yet.

The analysis, led by Harvard Medical School researcher Pranav Rajpurkar, Matthew Lungren of Stanford University, and Adriel Saporta of New York University, evaluated seven widely used saliency methods to determine how reliably and accurately they could identify pathologies associated with 10 conditions commonly diagnosed on X-rays, such as lung lesions, pleural effusion, edema, or enlarged cardiac structures. To gauge performance, the researchers compared the tools’ output with human expert judgments.
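One common way to compare a heat map with an expert's judgment is to measure how much the highlighted region overlaps the region the expert marked. The sketch below uses intersection over union (IoU) for this. The article does not state which metric the researchers used, so the choice of metric, the 0.5 threshold, and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def iou(heat, expert_mask, threshold=0.5):
    """Overlap between a thresholded heat map and an expert-drawn mask (0 = none, 1 = perfect)."""
    pred = heat >= threshold          # pixels the tool highlights
    truth = expert_mask.astype(bool)  # pixels the expert marked
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 0.0                    # nothing highlighted and nothing marked
    return float(np.logical_and(pred, truth).sum() / union)

# Toy example: the tool highlights the top-left corner,
# while the hypothetical expert marked a region shifted one column right.
heat = np.zeros((4, 4))
heat[0:2, 0:2] = 0.9
expert = np.zeros((4, 4), dtype=bool)
expert[0:2, 1:3] = True

score = iou(heat, expert)
```

Here the two regions share two of six covered pixels, so the score is one third. A low score like this would indicate that the tool is highlighting a different area than the expert did.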

In the final analysis, tools using saliency-based heat maps consistently underperformed human radiologists in image assessment and in their ability to spot pathological lesions.

The work represents the first comparative analysis of saliency maps and human expert performance in evaluating multiple X-ray pathologies. The study also offers a detailed understanding of whether and how certain pathological features on an image may affect the performance of an AI tool.

The saliency map feature is already being used as a quality assurance tool by clinical practices that use AI to interpret computer-aided detection methods, such as reading chest X-rays. In light of the new findings, the researchers said, this feature should be applied with caution and a healthy dose of skepticism.

“Our analysis shows that saliency maps are not yet reliable enough to validate individual clinical decisions made by an AI model. We have identified important limitations that raise serious safety concerns for their use in current practice.”

Pranav Rajpurkar, Assistant Professor of Biomedical Informatics, HMS

The researchers caution that, because of the important limitations identified in the study, saliency-based heat maps must be improved before they can be widely adopted in clinical AI models.

The team’s full codebase, data, and analysis are open and available to anyone interested in studying this important aspect of clinical machine learning in medical imaging applications.


Journal reference:

Saporta, A., et al. (2022) Benchmarking saliency methods for chest X-ray interpretation. Nature Machine Intelligence.
