BBN developing technology to assess the reliability and accuracy of healthcare responses
CAMBRIDGE, Mass., Dec. 10, 2024 /PRNewswire/ — RTX BBN Technologies received an award to support the Advanced Research Projects Agency for Health’s (ARPA-H) Chatbot Accuracy and Reliability Evaluation (CARE) Exploration Topic under an Other Transaction Agreement. CARE aims to develop advanced tools and technologies for evaluating medical chatbots in patient-facing applications, addressing the critical need for reliable health information in situations where accuracy may influence patient outcomes.
Despite the potential of medical chatbots, significant limitations threaten their effectiveness. Many AI systems generate factually inaccurate or misleading responses that may confuse patients and put them at risk. As healthcare evolves, a scalable system is needed to ensure consistent medical chatbot performance in any setting. This need is intensified by an ongoing lack of standardization, which continues to undermine confidence in these tools.
“Evaluating medical chatbots requires more than simply checking for correct answers; it demands a deep understanding of how these systems address the complex needs of diverse users,” said Dr. Damianos Karakos, BBN principal investigator on the effort.
To address this problem, BBN will use its expertise in machine learning, language-based information processing and large language models to develop the Monitoring, Evaluation and Diagnosing of Intelligent Chatbots (MEDIC) system. This comprehensive solution will function as a technological framework for evaluating medical chatbots, featuring core capabilities such as:
- Integration of insights from caregivers, patients and medical professionals to optimize chatbot interactions and effectively address their concerns and expectations.
- Retrieval of relevant medical texts to validate chatbot responses against evidence-based data sources.
- Advanced prompt engineering to create realistic interactions from various demographic perspectives.
- Detection of missing or inaccurate information in chatbot outputs using multiple evaluation methods that draw on advanced information extraction and machine learning techniques.
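As a rough illustration of how the retrieval and detection capabilities listed above might fit together, the following Python sketch retrieves a reference passage for a question, checks whether a chatbot response covers the key terms in that passage, and flags low-coverage responses for human review. It is not the MEDIC implementation: the names `retrieve`, `evaluate`, `Verdict`, `key_terms` and `threshold`, the token-overlap scoring and the sample passages are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


@dataclass
class Verdict:
    evidence: str        # reference passage used for the check
    coverage: float      # share of expected key terms found in the response
    missing: list[str]   # expected key terms absent from the response
    escalate: bool       # True when the response should go to a human reviewer


def retrieve(question: str, corpus: list[str]) -> str:
    """Hypothetical retriever: pick the passage sharing the most tokens with the question."""
    q = tokenize(question)
    return max(corpus, key=lambda passage: len(q & tokenize(passage)))


def evaluate(question: str, response: str, corpus: list[str],
             key_terms: list[str], threshold: float = 0.5) -> Verdict:
    """Hypothetical check: compare a chatbot response against retrieved evidence."""
    evidence = retrieve(question, corpus)
    evidence_tokens = tokenize(evidence)
    # Only key terms that the evidence actually mentions are expected in the response.
    expected = [term for term in key_terms if term in evidence_tokens]
    response_tokens = tokenize(response)
    missing = [term for term in expected if term not in response_tokens]
    coverage = 1.0 if not expected else 1 - len(missing) / len(expected)
    return Verdict(evidence, coverage, missing, escalate=coverage < threshold)


# Illustrative data only; not drawn from any real medical source.
corpus = [
    "During pregnancy, folate and iron intake support fetal development.",
    "Regular exercise helps manage blood pressure in adults.",
]
question = "What should I eat during pregnancy?"
vague = evaluate(question, "Just eat a balanced diet.", corpus, ["folate", "iron"])
specific = evaluate(question, "Eat foods rich in folate and iron.", corpus, ["folate", "iron"])
```

In this toy example the vague answer omits both expected terms and is marked for escalation, while the specific answer is not. A production system would presumably replace token overlap with stronger retrieval and information extraction models.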
“Our goal is to develop an adaptable framework that rigorously assesses chatbot performance in real-world scenarios, focusing on key aspects like bias, fairness and the risk of generating misleading information,” said Karakos. “For example, in prenatal care, it’s crucial that expectant mothers receive accurate dietary guidance to support fetal health. MEDIC will assess the dietary advice given by medical chatbots and escalate any ambiguous responses to healthcare professionals for further review. This initiative aims to improve AI-integrated care in a variety of healthcare settings.”
The BBN-led team includes Johns Hopkins University (Prof. Mark Dredze), Johns Hopkins University School of Medicine and Howard University Hospital. Work on this effort is being performed in Cambridge, Massachusetts; Washington, D.C.; and Baltimore, Maryland.
This research was, in part, funded by the Advanced Research Projects Agency for Health (ARPA-H). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the United States Government.
About RTX BBN Technologies
Founded in 1948, RTX BBN Technologies provides advanced technology research and development with a focus on national security priorities. From the ARPANET to the first email, through the first metro network protected by quantum cryptography, BBN consistently transitions advanced research to produce innovative solutions for its customers. BBN takes risks and challenges conventions to create solutions in analytics and machine intelligence, networks and sensors, intelligent software and systems, and physical sciences.
About RTX
With more than 185,000 global employees, RTX pushes the limits of technology and science to redefine how we connect and protect our world. Through industry-leading businesses – Collins Aerospace, Pratt & Whitney, and Raytheon – we are advancing aviation, engineering integrated defense systems for operational success, and developing next-generation technology solutions and manufacturing to help global customers address their most critical challenges. The company, with 2023 sales of $69 billion, is headquartered in Arlington, Virginia.
For questions or to schedule an interview, please contact [email protected]
SOURCE RTX