Microsoft announced Thursday it is teaming up with digital pathology provider Paige to build the world’s largest image-based artificial intelligence model for identifying cancer.
The AI model is being trained on an unprecedented amount of data that includes billions of images, according to a release. It can identify both common cancers and rare cancers that are notoriously difficult to diagnose, and researchers hope it will eventually help doctors who are struggling to contend with staffing shortages and growing caseloads.
Paige develops digital and AI-powered solutions for pathologists, the doctors who carry out lab tests on bodily fluids and tissues to make a diagnosis. It’s a specialty that often operates behind the scenes, and it’s crucial for determining a patient’s path forward.
“You don’t have cancer until the pathologist says so. That’s the critical step in the whole medical edifice,” Thomas Fuchs, co-founder and chief scientist at Paige, told CNBC in an interview.
But despite pathologists’ essential role in medicine, Fuchs said their workflow has not changed much in the last 150 years. To diagnose cancer, for instance, pathologists usually examine a piece of tissue on a glass slide under a microscope. The method is tried and true, but if pathologists miss something, it can have dire consequences for patients.
As a result, Paige has been working to digitize the pathologists’ workflow to improve accuracy and efficiency within the specialty.
The company has received approval from the Food and Drug Administration for its viewing tool FullFocus, which allows pathologists to examine scanned digital slides on a screen instead of relying on a microscope. Paige also built an AI model that can help pathologists identify breast cancer, colon cancer and prostate cancer when it appears on the screen.
Digital pathology is costly
Paige is the only company that has received FDA approval for pathologists to use its AI as a secondary tool for identifying prostate cancer, and CEO Andy Moye said this is likely in part because of barriers related to storage costs and data collection.
Digitizing a single slide can require over a gigabyte of storage, so the infrastructure and costs associated with large-scale data collection balloon quickly. Fuchs said the storage costs can be prohibitive for smaller health systems, which is why wealthy academic centers have historically been the only organizations that can afford to invest in digital pathology.
Paige spun out of the Memorial Sloan Kettering Cancer Center in New York in 2017 and has a “fantastic wealth of data,” according to Moye, which is why the company was able to build its own AI-powered solutions in the first place. To put the scale in perspective, Paige has 10 times more data than Netflix, including all the shows and movies that exist on the platform.
But to expand its operations and build an AI tool that can identify more cancer types, Paige turned to Microsoft for help. Over the past year and a half, Paige has been using Microsoft’s cloud storage and supercomputing infrastructure to build an advanced new AI model.
Paige’s original AI model used more than 1 billion images from 500,000 pathology slides, but Fuchs said the model the company has built with Microsoft is “orders of magnitude larger than anything out there.” The model is being trained on 4 million slides to identify both common and rare cancers, which can be difficult to diagnose. Paige said it is the largest computer vision model that has ever been announced publicly.
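For a rough sense of those numbers, the figures quoted in this article can be combined in a few lines of Python. This is only a back-of-envelope sketch, not Paige's or Microsoft's accounting: it assumes exactly 1 gigabyte per slide (the article says "over a gigabyte," so the totals are lower bounds) and uses decimal units, and the names `GB_PER_SLIDE` and `min_storage_pb` are made up for illustration.

```python
# Back-of-envelope arithmetic using only figures quoted in the article.
# Assumption: 1 GB per slide is a floor ("over a gigabyte"), so every
# storage total below is a lower bound, expressed in decimal units.

GB_PER_SLIDE = 1                  # assumed minimum size of one digitized slide
GB_PER_PB = 1_000_000             # decimal petabyte

ORIGINAL_SLIDES = 500_000         # slides behind Paige's original model
ORIGINAL_IMAGES = 1_000_000_000   # "more than 1 billion images"
NEW_SLIDES = 4_000_000            # slides used for the new model


def min_storage_pb(slides: int, gb_per_slide: float = GB_PER_SLIDE) -> float:
    """Lower-bound storage, in petabytes, for a given number of slides."""
    return slides * gb_per_slide / GB_PER_PB


# Average number of training images cut from each slide in the original set.
images_per_slide = ORIGINAL_IMAGES / ORIGINAL_SLIDES

# How many times more slides the new model draws on.
slide_ratio = NEW_SLIDES / ORIGINAL_SLIDES

print(f"Images per slide (original model): at least {images_per_slide:,.0f}")
print(f"Slide count ratio, new vs. original: {slide_ratio:.0f}x")
print(f"Original slides: at least {min_storage_pb(ORIGINAL_SLIDES)} PB")
print(f"New slides: at least {min_storage_pb(NEW_SLIDES)} PB")
```

Under these assumptions, the 4 million slides alone would take up at least 4 petabytes, eight times the floor for the original 500,000 slides.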
“Until ChatGPT got released, no one really understood how this is going to impact their lives. I would argue this is very similar for cancer patients going forward,” Moye said. “This is sort of a groundbreaking, land-on-the-moon kind of moment for cancer care.”
Moye added that the company is thinking of ways to incorporate predictive modeling to give pathologists and patients easy access to information about their biomarkers and genomic mutations down the line.
Desney Tan, vice president and managing director of Microsoft Health Futures, said Microsoft’s infrastructure is a key component of the partnership, but that the company is also working to develop the new algorithms, detection and diagnostics that Paige is hoping to deliver in the next couple of years.
He added that though the technology is powerful, it’s meant to enrich pathologists, not replace them.
“We think of these AI implements, these technologies, as tools, really just as the stethoscope is a tool, just as the X-ray machine is a tool,” Tan told CNBC in an interview. “AI is a tool that is to be wielded by a human.”
On Thursday, Paige and Microsoft will publish a paper on the model through Cornell University’s preprint server arXiv. The paper quantifies the impact of the new model compared with existing models, and Fuchs said it outperforms anything that has been built in academia up to this point.
But the preprint is just the first step of a much longer journey. Paige wanted to make the research available to the broader community while it is under peer review, and the company intends to submit the paper to the scientific journal Nature. The review process can take months, if not longer. Paige also has years of work ahead before it will be able to roll the model out as a product, including thorough testing and collaboration with regulators to ensure it is safe and accurate.
Ultimately, Fuchs said the AI model will solve the storage problem for health systems, while also helping pathologists work through cases and arrive at a diagnosis more quickly. For some patients, it could mean the difference between waiting two days and two weeks to find out what’s wrong.
“The more you go away from academic medical centers, especially in community clinics where pathologists are completely overwhelmed across all cancer types with so many cases, there, the impact is quite drastic,” Fuchs said. “That really helps to democratize access to health care in these places.”