Can We Trust AI Doctors? Google Health and Academics Battle It Out – Singularity Hub

Posted: October 22, 2020 at 12:12 pm

Machine learning is taking medical diagnosis by storm. From eye disease, breast and other cancers, to more amorphous neurological disorders, AI is routinely matching physician performance, if not beating them outright.

Yet how much can we take those results at face value? When it comes to life and death decisions, when can we put our full trust in enigmatic algorithms, "black boxes" that even their creators cannot fully explain or understand? The problem gets more complex as medical AI crosses multiple disciplines and developers, including both academic and industry powerhouses such as Google, Amazon, or Apple, with disparate incentives.

This week, the two sides battled it out in a heated duel in one of the most prestigious science journals, Nature. On one side are prominent AI researchers at the Princess Margaret Cancer Centre, University of Toronto, Stanford University, Johns Hopkins, Harvard, MIT, and others. On the other side is the titan Google Health.

The trigger was an explosive study by Google Health for breast cancer screening, published in January this year. The study claimed to have developed an AI system that vastly outperformed radiologists at diagnosing breast cancer, and that can be generalized to populations beyond those used for training, a holy grail of sorts that's incredibly difficult to reach due to the lack of large medical imaging datasets. The study made waves across the media landscape, and created a buzz in the public sphere for medical AI's "coming of age."

The problem, the academics argued, is that the study lacked sufficient descriptions of the code and model for others to replicate. In other words, we can only take the study at its word, something that's just not done in scientific research. Google Health, in turn, penned a polite, nuanced but assertive rebuttal arguing for their need to protect patient information and prevent the AI from malicious attacks.

Academic discourses like these form the seat of science, and may seem incredibly nerdy and outdated, especially because rather than online channels, the two sides resorted to a centuries-old pen-and-paper discussion. By doing so, however, they elevated a necessary debate to a broad worldwide audience, each side landing solid punches that, in turn, could lay the basis of a framework for trust and transparency in medical AI, to the benefit of all. Now if they could only rap their arguments in the vein of Hamilton and Jefferson's Cabinet Battles in Hamilton.

It's easy to see where the academics' arguments come from. Science is often painted as a holy endeavor embodying objectivity and truth. But like any discipline touched by people, it's prone to errors, poor designs, unintentional biases or, in very small numbers, conscious manipulation to skew the results. Because of this, when publishing results, scientists carefully describe their methodology so others can replicate the findings. If a conclusion, say that a vaccine protects against Covid-19, holds up in nearly every lab regardless of the scientist, the material, or the subjects, then we have stronger proof that the vaccine actually works. If not, it means that the initial study may be wrong, and scientists can then delineate why and move on. Replication is critical to healthy scientific evolution.

But AI research is shredding the dogma.

"In computational research, it's not yet a widespread criterion for the details of an AI study to be fully accessible. This is detrimental to our progress," said author Dr. Benjamin Haibe-Kains at Princess Margaret Cancer Centre. For example, nuances in computer code or training samples and parameters could dramatically change training and evaluation of results, aspects that can't be easily described using text alone, as is the norm. The consequence, said the team, is that verifying the complex computational pipeline becomes impossible. (For academics, that's the equivalent of gloves off.)

Although the academics took Google Health's breast cancer study as an example, they acknowledged the problem is far more widespread. "By examining the shortfalls of the Google Health study in terms of transparency," the team said, "we provide potential solutions with implications for the broader field." It's not an impossible problem. Online repositories such as GitHub, Bitbucket, and others already allow the sharing of code. Others allow sharing of deep learning models, such as ModelHub.ai, with support for frameworks such as TensorFlow, which was used by the Google Health team.

The ins and outs of AI models aside, there's also the question of sharing the data that those models were trained on. It's a particularly thorny problem for medical AI, because many of those datasets are under license and sharing can generate privacy concerns. Yet it's not unheard of. For example, genomics has leveraged patient datasets for decades (essentially each person's genetic "base code") and extensive guidelines exist to protect patient privacy. If you've ever used a 23andMe ancestry spit kit and provided consent for your data to be used for large genomic studies, you've benefited from those guidelines. Setting up something similar for medical AI isn't impossible.

In the end, a higher bar for transparency for medical AI will benefit the entire field, including doctors and patients. "In addition to improving accessibility and transparency, such resources can considerably accelerate model development, validation and transition into production and clinical implementation," the authors wrote.

Led by Dr. Scott McKinney, Google Health did not mince words. Their general argument: No doubt the commenters are motivated by protecting future patients as much as scientific principle. We share that sentiment. But under current regulatory frameworks, our hands are tied when it comes to open sharing.

For example, when it comes to releasing a version of their model for others to test on different sets of medical images, the team said they simply can't because their AI system may be classified as medical device software, which is subject to oversight. Unrestricted release may lead to liability issues that place patients, providers, and developers at risk.

As for sharing datasets, Google Health argued that the largest data source they used is available online to anyone who applies for access (with just a hint of sass that their organization helped to fund the resource). Other datasets, due to restrictions from ethical boards, simply cannot be shared.

Finally, the team argued that sharing a model's "learned parameters" (that is, the bread and butter of how they're constructed) can inadvertently expose the training dataset and model to malicious attack or misuse. It's certainly a concern: you may have previously heard of GPT-3, the OpenAI algorithm that writes unnervingly like a human, enough to fool Redditors for a week. But it would take a really sick individual to bastardize a breast cancer detection tool for some twisted gratification.

The academic-Google Health debate is just a small corner of a worldwide reckoning for medical AI. In September 2020, an international consortium of medical experts introduced a set of official standards for clinical trials that deploy AI in medicine, with the goal of separating AI snake oil from trustworthy algorithms. One point may sound familiar: how reliably a medical AI functions in the real world, away from favorable training sets or conditions in the lab. The guidelines are some of the first for medical AI, but won't be the last.

If this all seems abstract and high up in the ivory tower, think of it another way: you're now witnessing the room where it happens. By publishing negotiations and discourse publicly, AI developers are inviting additional stakeholders to join in on the conversation. Like self-driving cars, medical AI seems like an inevitability. The question is how to judge and deploy it in a safe, equal manner, while inviting a hefty dose of public trust.

Image Credit: Marc Manhart from Pixabay
