
Microsoft Claims Its New AI System Diagnoses Patients Four Times More Accurately than Human Doctors
The tech company attracted several prominent researchers from Google to develop an artificial intelligence tool capable of diagnosing patients, which could reduce costs in the healthcare sector.
Microsoft has taken a significant step towards advanced medical intelligence, according to Mustafa Suleyman, CEO of its artificial intelligence division. The company has developed a new AI tool that it says diagnoses diseases four times more accurately, and at significantly lower cost, than a group of human doctors. The research tested whether the tool could reach correct diagnoses by carrying out the steps a physician typically follows.
To conduct the study, the Microsoft team took 304 clinical cases from the New England Journal of Medicine and built a test called the Sequential Diagnosis Benchmark (SDBench). A language model broke each case down into the step-by-step process a doctor would follow to reach a diagnosis. The researchers then developed a system called MAI Diagnostic Orchestrator (MAI-DxO), which consults several leading AI models, including OpenAI's GPT, Google's Gemini, Anthropic's Claude, Meta's Llama, and xAI's Grok, mimicking the collaboration of several human experts.
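The two ideas in this paragraph, revealing a case step by step and polling a panel of models, can be illustrated with a toy sketch. It is not Microsoft's implementation: the article does not say how MAI-DxO combines the models' answers, so the majority vote, the agreement threshold, the stub "models", and every name below are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a model call: takes the findings revealed
# so far and returns a diagnosis label. Real models are NOT called here.
Panelist = Callable[[List[str]], str]


def poll_panel(findings: List[str], panel: Dict[str, Panelist]) -> Tuple[str, float]:
    """Ask every panelist for a diagnosis; return the top answer and its vote share."""
    votes = Counter(model(findings) for model in panel.values())
    diagnosis, count = votes.most_common(1)[0]
    return diagnosis, count / len(panel)


def sequential_diagnosis(
    case_steps: List[str], panel: Dict[str, Panelist], agreement: float = 0.6
) -> Tuple[str, int]:
    """Reveal one finding at a time and stop once enough panelists agree.

    Majority voting with a fixed threshold is an assumption of this sketch.
    Returns the diagnosis and the number of findings that were revealed.
    """
    if not case_steps:
        raise ValueError("a case needs at least one step")
    revealed: List[str] = []
    diagnosis = ""
    for step in case_steps:
        revealed.append(step)
        diagnosis, share = poll_panel(revealed, panel)
        if share >= agreement:
            break
    return diagnosis, len(revealed)


# Toy panel with abstract labels; each stub reacts to different evidence.
toy_panel: Dict[str, Panelist] = {
    "stub_a": lambda f: "condition-A" if "finding-3" in f else "undetermined",
    "stub_b": lambda f: "condition-A" if "finding-1" in f else "undetermined",
    "stub_c": lambda f: "condition-A" if len(f) >= 3 else "condition-B",
}
toy_case = ["finding-1", "finding-2", "finding-3"]

result = sequential_diagnosis(toy_case, toy_panel)
print(result)
```

With this toy panel the three stubs disagree until the third finding is revealed, at which point all of them return the same label and the loop stops.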
In the experiment, MAI-DxO outperformed the doctors, achieving an accuracy of 80% compared to 20% for the physicians. The system also reduced costs by 20% by choosing less expensive tests and procedures. Suleyman said that this orchestration mechanism, in which multiple agents work together in a debate style, is what will bring the technology closer to medical superintelligence.
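The "four times" figure in the headline is the ratio of the two reported accuracy rates, not the gap between them. A short calculation using only the percentages quoted above makes the distinction explicit; the variable names and the baseline cost of 100 are illustrative.

```python
# Figures reported in the article (whole-number percentages).
ai_accuracy_pct = 80
doctor_accuracy_pct = 20
cost_reduction_pct = 20

# Relative improvement: how many times more accurate the system was.
accuracy_ratio = ai_accuracy_pct / doctor_accuracy_pct

# Absolute improvement: the gap in percentage points.
accuracy_gap_points = ai_accuracy_pct - doctor_accuracy_pct

# A 20% reduction leaves 80 out of every 100 units of the baseline cost.
remaining_cost_per_100 = 100 - cost_reduction_pct

print(accuracy_ratio, accuracy_gap_points, remaining_cost_per_100)
```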
The company has also recruited several artificial intelligence researchers from Google, a sign of growing competition for top talent in the sector. AI is already being used in some areas of healthcare in the U.S., such as supporting radiologists in image interpretation. Recent multimodal AI models show potential as more general diagnostic tools, although the use of AI in healthcare raises concerns about biases in training data.
Microsoft has yet to decide whether to commercialize this technology, but there is a possibility of integrating it into Bing to help users diagnose symptoms. It could also develop tools to assist doctors in improving or automating patient care. Suleyman mentioned that in the coming years, the company will focus on validating these systems in real-world environments.
Microsoft's current research stands out from previous work as it reproduces the diagnostic method of doctors more faithfully, analyzing symptoms, ordering tests, and performing additional analyses until reaching a diagnosis. In a statement about the project, the company described the combination of several cutting-edge AI models as a "path towards medical superintelligence."
This advance suggests that AI could help reduce costs in healthcare, a critical issue, especially in the U.S. Dominic King, a Microsoft vice president who took part in the project, emphasized that the model performs remarkably well both in reaching the diagnosis and in cost efficiency.
David Sontag, a scientist at MIT and co-founder of Layer Health, found the work exciting, highlighting its relevance because it reflects more accurately how doctors operate and addresses complex issues in the underlying methodology. However, he warned that the findings should be treated with caution, as the doctors in the study did not use additional tools when diagnosing, which may not reflect their usual practice.
Eric Topol from the Scripps Research Institute described the report as impressive for addressing complex diagnostic cases and considered it innovative that AI could, in theory, reduce costs in healthcare. Both Topol and Sontag agreed that the next phase to validate the effectiveness of Microsoft’s system would involve clinical trials comparing its results with those of real doctors treating real patients, for a rigorous cost evaluation.