Supercomputers Are Stocking Next Generation Drug Pipelines

A new model incorporates protein, drug, and clinical data to better predict which genes are most likely to make proteins that drugs can bind to.
Image credit: Evan Mills

Developing new drugs is notoriously inefficient. Fewer than 12 percent of all drugs entering clinical trials end up in pharmacies, and it costs about $2.6 billion to bring a drug to market. It's mostly a process of trial and error---squirting compounds and chemicals one by one into petri dishes of diseased cells. There are so many molecules to test that pharmaceutical researchers use pipetting robots to screen a few thousand variants at once. The best candidates then go into animal models or cell cultures, where *hopefully* a few will go on to bigger animal and human clinical trials.

Which is why more and more drug developers are turning to computers and artificial intelligence to narrow down the list of potential drug molecules---saving time and money on those downstream tests. Algorithms can identify genes that code for proteins with good potential for drug binding. And new models, including one published today in *Science Translational Medicine*, add new layers of complexity to narrow down the field---incorporating protein, drug, and clinical data to better predict which genes are most likely to make proteins that drugs can bind to.

“Drug development can fail for many reasons,” says genetic epidemiologist Aroon Hingorani, a co-author on the paper. “However, a major reason is the failure to select the correct target for the disease of interest.” A drug might show initial promise in early experiments in cells, tissues, and animal models, but those experiments are often overly simplistic and rarely subjected to randomization and blinding. The most common model for schizophrenia, for example, is a mouse that jumps explosively, a behavior known as “popping”---not the most natural model for a human's response to a psychoactive drug. Scientists use these results to make hypotheses about which proteins to target, but since these studies tend to be small and short, there are a lot of ways to misinterpret results.

Rather than relying on those limited experiments, Hingorani’s group built a predictive model that combined genetic information with protein structure data and known drug interactions. They ended up with nearly 4,500 potential drug targets, doubling prior estimates for how much of the human genome is considered “druggable.” Then two clinicians combed through the results to find 144 existing drugs with the right shape and chemistry to bind with proteins other than their established targets. Those drugs have already passed safety testing---which means they could quickly be repurposed for other diseases. And when you’re developing drugs, time is money.
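To make the idea concrete, here is a deliberately simplified sketch, in Python, of what combining those three kinds of evidence into a single ranking might look like. The gene names, feature flags, and weights below are invented for illustration; they are not drawn from the paper's actual model.

```python
# Hypothetical sketch only: score each gene's "druggability" by combining
# three binary lines of evidence (genetics, protein structure, known drug
# interactions) and rank the results. All data and weights are made up.

GENES = {
    # gene: (disease-associated variant?, drug-friendly binding pocket?, similar to a known target?)
    "GENE_A": (True, True, True),
    "GENE_B": (True, False, False),
    "GENE_C": (False, True, True),
}

WEIGHTS = (0.5, 0.3, 0.2)  # assumed relative importance of each line of evidence

def druggability_score(features, weights=WEIGHTS):
    """Weighted sum of the evidence flags; higher suggests a more plausible drug target."""
    return sum(w for flag, w in zip(features, weights) if flag)

ranked = sorted(GENES, key=lambda g: druggability_score(GENES[g]), reverse=True)
for gene in ranked:
    print(f"{gene}: {druggability_score(GENES[gene]):.2f}")
```

A real pipeline would use far more features and a trained statistical model, but the shape of the computation (score, rank, hand the top of the list to human reviewers) is similar.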

Researchers estimate that about 15 to 20 percent of the cost of a new drug goes to the discovery phase. Typically, that represents up to a few hundred million dollars and three to six years of work. Computational approaches promise to cut that process down to a few months and a price tag in the tens of thousands of dollars. They haven’t delivered yet---there’s no drug on the market today that started with an AI system singling it out. But they’re moving into the pipeline.

One of Hingorani’s collaborators is a VP of biomedical informatics at BenevolentAI---a British AI company that recently signed a deal to acquire and develop a number of clinical-stage drug candidates from Janssen (a Johnson & Johnson pharma subsidiary). BenevolentAI plans to start Phase IIb trials later this year. Other pharma firms are jumping in too; last month Japanese ophthalmology giant Santen signed a deal with Palo Alto-based twoXAR to use its AI-driven technology to identify new drug candidates for glaucoma. And a few weeks ago two European companies---Pharnext and Galapagos---teamed up to put computer models to work on finding new treatments for neurodegenerative diseases.

But Derek Lowe, a longtime drug pipeline researcher who writes a blog on the subject for Science, says he's usually skeptical of purely computational approaches. “In the long run I don’t see any reason why this stuff is impossible,” he says. “But if someone comes to me saying that they can just predict the activity of a whole list of compounds, for example, I’m probably going to assume it’s bullshit. I’m going to want to see a whole lot of proof before I believe it.”

Companies like twoXAR are working to build up that body of evidence. Last fall the company teamed up with the Asian Liver Center at Stanford to screen 25,000 potential drug candidates for adult liver cancer. Working out of an abandoned nail salon in Palo Alto, the team set its software sifting through genetic, proteomic, drug, and clinical databases to identify 10 possible treatments. Samuel So, the director of the liver center, was surprised by the list they brought back: It included a few predictions made by researchers in his own lab. He decided to test all 10. The most promising one, which killed five different liver cancer cell lines without harming healthy cells, is now headed toward human trials. The only existing FDA-approved treatment for the same cancer took five years to develop; so far, it’s taken twoXAR and Stanford four months.
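For readers curious what “sifting through databases” means in practice, a minimal sketch follows. It assumes each evidence source can be boiled down to a per-compound score, which is not how twoXAR describes its proprietary system; the compound names and numbers are invented.

```python
from statistics import mean

# Toy evidence table: each source contributes a 0-1 score per compound.
# A real screen would cover tens of thousands of compounds, not three.
evidence = {
    "compound_001": {"genetic": 0.8, "proteomic": 0.7, "clinical": 0.9},
    "compound_002": {"genetic": 0.4, "proteomic": 0.9},   # sources can be missing
    "compound_003": {"genetic": 0.2, "proteomic": 0.3, "clinical": 0.1},
}

def combined_score(scores):
    """Average whatever evidence is available for a compound."""
    return mean(scores.values())

# Keep the ten best-scoring compounds for follow-up in the wet lab.
shortlist = sorted(evidence, key=lambda c: combined_score(evidence[c]), reverse=True)[:10]
print(shortlist)
```

The cheap part is the ranking; the expensive part is everything that comes after it, which is why the shortlist still goes into cell cultures and, eventually, trials.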

It’s exciting: For an industry with such a high failure rate, even small gains could be worth billions of dollars. Not to mention all those human lives. But the real case for turning pharmaceutical wet labs into server farms won't be made until drugs actually make it to market.