01 The Drug-Discovery Process: Why It Is Long, Expensive, and Failure-Prone
Before discussing drug-discovery AI, let's first take apart "how a drug gets made." New-drug development proceeds, roughly, in the following order.
- Target identification (= the stage of finding the "weak point," such as a protein that causes the disease) — decide which molecule to aim at in order to stop the disease
- Hit-to-lead (= the stage of finding promising compounds and refining them into something drug-like) — narrow down candidate molecules that act on the target, from tens of thousands to millions
- Non-clinical studies — confirm efficacy and safety in cells and animals
- Clinical trials — verify safety and efficacy in humans, in order from Phase I to Phase III
- Approval application and manufacturing — pass the regulatory review and keep producing at consistent quality
A commonly cited estimate for the time and cost of the whole process is 10–15 years and about USD 2.5 billion per approved drug (in 2013 dollars, including the cost of capital and the cost of failed candidates). Moreover, the probability that a candidate entering Phase I is ultimately approved is estimated at roughly 13–14% in a large-scale analysis (estimates vary by study and therapeutic area). In other words, of every seven candidates that enter Phase I, about six drop out along the way.
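The "six out of seven" figure is simple arithmetic on the approval probability. As a rough sketch, assuming every candidate has the same, independent chance of approval (a simplification; real probabilities differ by therapeutic area and modality), the expected number of Phase I entrants per approval is 1/p. The function name below is illustrative.

```python
def entrants_per_approval(p_approval: float) -> float:
    """Expected number of Phase I entrants needed to obtain one approval,
    assuming each candidate succeeds independently with probability p_approval."""
    if not 0 < p_approval <= 1:
        raise ValueError("p_approval must be in (0, 1]")
    return 1.0 / p_approval


for p in (0.13, 0.14):
    n = entrants_per_approval(p)
    # n - 1 is the expected number of candidates lost per approved drug.
    print(f"p = {p:.0%}: about {n:.1f} entrants per approval, {n - 1:.1f} lost")
```

At 14% the result is just over seven entrants per approval, which is where "six disappear along the way" comes from.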
This triple burden of "time, cost, and failure rate" cannot be solved by speeding up any single point. If the target is chosen poorly, all the downstream compound design and clinical work is wasted. That is why the value of drug-discovery AI lies not only in "building faster" but in "not wandering down the wrong path for a long time."
02 The Big Picture of Drug-Discovery AI: Where, and How, It Helps
"Making drugs with AI" tends to get lumped into one phrase, but in reality the technologies used and the things that can be expected differ from stage to stage. Let's first grasp the overall picture through four points of leverage.
Target Identification
Analyzes the vast web of relationships across genes, proteins, papers, and clinical data to generate hypotheses about which molecule to aim at. Its strength is picking up connections people overlook.
Compound Design
Generative models propose many molecular structures likely to bind the target. Combined with predictions of synthesizability and toxicity, the candidates are narrowed down.
Clinical Trials
Patient stratification, optimization of eligibility criteria, site selection, dropout prediction, and more. Raises the precision of the design and spots the seeds of failure early.
Manufacturing / Quality
Optimization of process parameters and anomaly detection. Machine learning is beginning to be used in this unglamorous but heavy area of keeping quality stable after approval.
What is common to all of them is this: AI is good at "producing candidates and prioritizing them," but cannot guarantee "this one is correct." So at every stage, AI's output is treated not as a final conclusion but as a hypothesis to be verified by people and experiments. In the next three sections we look at the specifics, in the order of target, compound, and clinical trial.
03 Application to Target Identification: Finding the Disease's Weak Point
Target identification sits at the very start of the drug-discovery pipeline. Miss the aim here, and all the downstream effort is wasted. Traditionally, researchers have chosen targets by drawing on their own domain expertise and the results of individual experiments.
At this stage, machine learning plays the role of "picking up connections that are hard for people to notice from within a vast web of relationships." For example, it integrates fragmentary information (a mutation in some gene, a particular disease, the action of an existing drug, adverse-event reports) and generates, in bulk, hypotheses such as "suppressing this molecule might work against this disease." The same kind of analysis is also used in drug repositioning (= finding new uses for already-approved drugs).
That said, what AI produces is ultimately a hypothesis based on correlation. "There is an association" is not the same as "it is the cause." Whether it truly holds up as a target can only be confirmed through experiments in cells and animals, and finally through verification in humans. AI widens the entrance to exploration, but proving correctness remains the job of experiment.
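The gap between association and cause can be made concrete with a small simulation. The sketch below is illustrative only: it assumes a hypothetical hidden factor that drives both a marker and the disease score, so the two correlate strongly in observational data even though the marker has no effect of its own. All names and numbers are made up.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


random.seed(1)
n = 5_000

# Hypothetical hidden factor that drives BOTH the marker and the disease score.
hidden = [random.gauss(0, 1) for _ in range(n)]
marker = [h + random.gauss(0, 0.5) for h in hidden]
disease = [h + random.gauss(0, 0.5) for h in hidden]
observed_r = pearson(marker, disease)

# "Experiment": set the marker level independently of the hidden factor.
# The disease score still depends only on the hidden factor.
forced_marker = [random.gauss(0, 1) for _ in range(n)]
disease_after = [h + random.gauss(0, 0.5) for h in hidden]
intervened_r = pearson(forced_marker, disease_after)

print(f"observational correlation: {observed_r:.2f}")
print(f"after intervention:        {intervened_r:.2f}")
```

The observational correlation is strong, yet it vanishes once the marker is set by intervention. A model trained only on the first dataset would rank this marker as a promising target, which is why experimental confirmation is needed.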
04 Application to Compound Design: Building the Drug Molecule
Once the target is set, the next task is to find a molecule that binds it and changes its behavior. The number of theoretically possible small-molecule compounds is said to be on the order of 10 to the 60th power, and synthesizing and testing all of them is impossible. This is where generative models (= AI that creates new candidates from learned rules) come in.
A generative model learns from the target's structure and known active compounds, and proposes many molecular structures that are "likely to bind, synthesizable, and low in toxicity." Combined with activity prediction, property prediction, and synthetic-route prediction, this narrows the candidates that are actually made and tested down to a handful or a few dozen.
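The narrowing step can be pictured as hard filters followed by a ranking. The following is a minimal sketch under stated assumptions: the `Candidate` fields, the thresholds, and the scores are all hypothetical stand-ins for the outputs of separate activity, synthesizability, and toxicity predictors, not any particular tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    predicted_activity: float  # 0..1, higher = more likely to bind (hypothetical score)
    synthesizability: float    # 0..1, higher = easier to synthesize (hypothetical score)
    toxicity_risk: float       # 0..1, higher = more likely toxic (hypothetical score)


def triage(candidates, max_toxicity=0.3, min_synth=0.5, top_k=3):
    """Drop candidates that fail the hard filters, then rank the rest by predicted activity."""
    passed = [
        c for c in candidates
        if c.toxicity_risk <= max_toxicity and c.synthesizability >= min_synth
    ]
    return sorted(passed, key=lambda c: c.predicted_activity, reverse=True)[:top_k]


pool = [
    Candidate("mol-A", 0.91, 0.40, 0.10),  # active, but hard to synthesize
    Candidate("mol-B", 0.85, 0.80, 0.20),
    Candidate("mol-C", 0.95, 0.90, 0.60),  # most active, but flagged as toxic
    Candidate("mol-D", 0.70, 0.75, 0.05),
    Candidate("mol-E", 0.60, 0.95, 0.25),
]

shortlist = triage(pool, top_k=2)
print([c.name for c in shortlist])
```

The molecule with the highest predicted activity is excluded by the toxicity filter, which illustrates why activity prediction alone is not enough. The shortlist is still only a set of hypotheses to synthesize and test.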
This field has real, peer-reviewed cases. A 2020 study used deep learning to find an antibiotic candidate (halicin) that is structurally different from existing antibiotics. Reports from 2024–2025 describe a fibrosis-treatment candidate, designed with generative AI as the starting point, that advanced into clinical trials. Both show results that go beyond theory (see references for details). Even so, these do not mean that "AI completed a drug on its own." Human judgment and verification through numerous experiments enter at every stage. AI raises the speed and diversity of candidate generation, but the practical work of synthesis, evaluation, and safety confirmation does not disappear.
05 Application to Clinical Trials: Bringing It to Bear on Study Design
Once a candidate passes non-clinical studies, next come clinical trials in humans. Most of the development cost is spent here, and this is also where failure occurs most often. That is exactly why bringing AI to bear on trial design carries such weight.
Specifically, it is used in ways such as the following.
| Conventional Trial Design | Design With AI Built In |
|---|---|
| Recruit patients under broad criteria | Predict and stratify the patient segments most likely to respond, making differences visible even with fewer participants |
| Select sites by past track record | Select sites by predicting the speed and quality of enrollment |
| Respond to dropout after it happens | Predict early which participants are likely to drop out and provide more support |
| Confirm eligibility on paper, by hand | Efficiently extract eligible patients by analyzing medical records |
These push trials toward being "faster, smaller, and more reliable." But there is a caveat. The more you strengthen stratification, the narrower the target population becomes, and the trial risks no longer representing the real-world patient population. Since AI learns from past data, populations underrepresented in that data (racial and ethnic minorities, the elderly, people with comorbidities) tend to be left out. Balancing efficiency against generalizability is an area that needs the eyes of statistical experts and regulators.
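Why stratification lets a trial detect a difference with fewer participants can be seen from the textbook sample-size formula for comparing two means. The sketch below uses the normal approximation with equal arm sizes and a common standard deviation. The effect sizes (0.3 and 0.6 standard deviations) are hypothetical numbers chosen for illustration, not figures from any trial.

```python
import math
from statistics import NormalDist


def n_per_arm(effect: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants per arm for a two-sided, two-arm comparison of means
    (normal approximation, equal arm sizes, common standard deviation)."""
    if effect <= 0 or sd <= 0:
        raise ValueError("effect and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Hypothetical: the average effect is diluted in a broad population,
# and larger in a subgroup predicted to respond.
broad = n_per_arm(effect=0.3, sd=1.0)
stratified = n_per_arm(effect=0.6, sd=1.0)
print(f"broad population:    {broad} per arm")
print(f"stratified subgroup: {stratified} per arm")
```

Doubling the expected effect cuts the required size to roughly a quarter. The price is the one described above: the result then speaks only for the stratified subgroup.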
06 The Meaning of AlphaFold: What Structure Prediction Changed
AlphaFold always comes up in any discussion of drug-discovery AI. Published in 2021, this method predicts, with high accuracy, the folded three-dimensional structure of a protein from its sequence of amino acids. Because a protein's structure is the foundation that determines how a drug binds to that molecule, knowing the structure greatly helps the early stages of drug discovery.
Until then, experimentally determining a single protein's 3D structure by methods such as X-ray crystallography could easily take months to years. AlphaFold made it possible to obtain a computed estimate for many proteins before any such experiment is run. Its public database holds more than 200 million predicted structures, freely usable by researchers worldwide.
Its meaning must be grasped precisely, though. What AlphaFold predicts is centered on "a single, static structure," whereas real proteins move, interact with other molecules, and change shape depending on circumstances. There are regions where predictions diverge from experiment. In other words, AlphaFold "dramatically lowered the cost of the starting point for structure estimation" — it does not "predict a drug's efficacy." Between knowing the structure and having a drug that works on it, there is still a long road. The narrative of "AlphaFold completes drug discovery," which blurs this distinction, is not accurate.
07 Sizing Up the Limits: What AI Cannot Solve
To use drug-discovery AI with realistic expectations, it is essential to understand separately what it is good at and what it is poor at.
| What AI Is Relatively Good At | What AI Is Poor At / Cannot Do |
|---|---|
| Generating and narrowing a vast pool of candidates | The final guarantee of "whether it truly works" |
| Predicting patterns from known data | Discovering phenomena not in the data / new biology |
| Estimating structure and physical properties | Fully predicting complex in-body metabolism and side effects |
| Speeding up work and prioritizing | Proving causation (correlation and causation are different) |
There are two fundamental limits. First, AI is weak outside the data it has learned from. What is most valuable in drug discovery is "a mechanism no one has known before," yet AI, which learns from past data, is structurally poor at such unknown discovery. Second, biology is complex, and the behavior inside the human body cannot be fully predicted by computation alone. That is precisely why clinical trials will not go away.
Drug-discovery AI has realistic value as a tool that "speeds up exploration and finds failure early." At the same time, the expectation that "a drug can be made without experiments" is, at least for now, excessive. Whether one can make this distinction is what separates sound from unsound judgment for staff deciding on AI investment in-house.
08 The Relationship With Regulation: PMDA, FDA, and the Pharmaceutical Act
Even if AI produces a candidate, bringing it to market as a drug requires passing the regulatory review. What matters here is that the criterion for review is not "who (or what) made it" but "what was shown." Even for a compound designed by AI, the required quality and quantity of non-clinical and clinical data do not change.
Regulators too are starting to engage with the use of AI. In 2023 the U.S. FDA published a discussion paper on the use of artificial intelligence and machine learning in drug development, setting out its thinking on transparency, data reliability, and model validation. Japan's PMDA (Pharmaceuticals and Medical Devices Agency) and the regulatory framework will be dealt with head-on in the next installment of this series.
09 Data Quality: If the Foundation Crumbles, So Does the Conclusion
The performance of drug-discovery AI is determined less by the novelty of the model than by the quality of the data used for training. The old adage "garbage in, garbage out (= poor input yields only poor output)" applies directly.
Drug-discovery data has its own inherent weaknesses.
- Bias — published experiments tend to be skewed toward "results that worked." Data from failed experiments is rarely made public, so AI learns mostly from successes and becomes optimistic
- Reproducibility — the same experiment can give different results in a different lab. Learn from scattered data and the predictions scatter too
- Insufficient volume — for rare diseases and new targets, there simply is not enough data to train on in the first place
So having excellent drug-discovery AI and preparing high-quality, low-bias data are separate challenges. In many settings the real bottleneck lies not in the model but in preparing the data — collection, standardization, and quality control. Skipping this and thinking "installing the latest model will solve it" mistakes the order of things.
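The optimism caused by publication bias can be reproduced with a toy simulation. The sketch below is a deliberately simplified setup: the true average effect is zero, measurements are noisy, and only results above a hypothetical threshold are "published." The constants are arbitrary.

```python
import random
from statistics import mean

random.seed(0)

TRUE_EFFECT = 0.0  # hypothetical: no real effect on average
NOISE_SD = 1.0     # experimental noise
THRESHOLD = 1.0    # hypothetical: only results above this are published

all_results = [random.gauss(TRUE_EFFECT, NOISE_SD) for _ in range(10_000)]
published = [r for r in all_results if r > THRESHOLD]

print(f"mean of all experiments:  {mean(all_results):+.2f}")
print(f"mean of published subset: {mean(published):+.2f}")
print(f"fraction published:       {len(published) / len(all_results):.0%}")
```

A model trained only on the published subset sees a clearly positive average effect where none exists. Collecting and recording negative results is the remedy, and that is data preparation rather than modeling.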
10 The Future, and Connections to Other Chapters on This Site
How will drug-discovery AI progress over the next 5–10 years? Avoiding exaggeration, the following directions are realistic.
- Exploration becomes routine — the use of AI in target identification and compound design becomes a standard tool rather than something special
- Integration with experiment — a "fast round trip between design and experiment," in which AI produces candidates, robots automatically synthesize and evaluate them, and AI relearns from the results, spreads
- Refinement of clinical design — use in patient stratification and trial design advances, while at the same time the regulatory gaze on generalizability and fairness grows stronger
On the other hand, what does not change is equally clear. Clinical trials in humans, the regulatory review, and responsibility for safety do not disappear even with AI. Drug-discovery AI is not "a machine that automatically produces drugs" but "a tool for walking the long, expensive, failure-prone road a little more wisely." This realistic understanding avoids both over-investment and underestimation.
This installment connects to other chapters of this site as follows. Reading them together adds depth to your understanding.
- AI Medical Vol. 6 — PMDA and AI Medical Devices — how Japan's regulation reviews products involving AI. The continuation of this installment's "relationship with regulation"
- Ad Regulations series — how you may communicate a drug born from AI discovery. The practice of Pharmaceutical Act Articles 66, 68, and 68-2
- Material Review series — the discipline of review, which finally receives the generated information
Drug-discovery AI is indeed beginning to hold real value at each stage of target identification, compound design, and clinical trials. Generating a vast pool of candidates quickly, finding the seeds of failure early, and — like AlphaFold — lowering the cost of the entrance to structure estimation: these are not armchair theory but have accumulated as peer-reviewed results. Yet at the same time, AI does not guarantee that something "works," is poor at discoveries not in the data, and cannot substitute for verification in humans. Correlation is not causation, and knowing the structure is not the same as having a drug.
So the key is to treat drug-discovery AI as neither "magic" nor "sham." Use it as a tool for walking this long, expensive, failure-prone road a little more wisely, facing regulation and data quality head-on. To the extent you raise the speed, let the machinery of verification and responsibility keep pace. That is the question the pharmaceutical field must answer for drug discovery in the AI era. Next time, we step into the approval and regulation of AI medical devices in Japan — the role of the PMDA.
- Drug discovery has a "long, expensive, failure-prone" structure: 10–15 years, about USD 2.5 billion, and a roughly 13–14% probability of approval from Phase I. The value of drug-discovery AI lies less in "building faster" than in "not wandering down the wrong path for a long time." At each stage — target, compound, clinical trial — AI helps generate and prioritize candidates but does not guarantee correctness.
- AlphaFold predicts 3D structure from sequence with high accuracy and dramatically lowered the cost of the entrance to structure estimation. But what it predicts is mainly "a single, static structure"; it does not predict a drug's efficacy. The narrative of "AlphaFold completes drug discovery" is not accurate. AI produces hypotheses based on correlation, but proving causation is the job of experiments in cells and animals and, finally, verification in humans.
- The criterion for review is not "what (or who) made it" but "what was shown." Even for a drug designed by AI, the required quality and quantity of data do not change. Nor do the limits on advertising and information provision loosen: exaggerated advertising falls under Pharmaceutical Act Article 66, advertising of unapproved drugs under Article 68, and information provision under Article 68-2. The true bottleneck is less the model than preparing high-quality, low-bias data.
- Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. (The core paper on protein structure prediction)
- Varadi M, Anyango S, Deshpande M, et al. AlphaFold Protein Structure Database. Nucleic Acids Research. 2022;50(D1):D439–D444. (The database freely publishing over 200 million predicted structures)
- Stokes JM, Yang K, Swanson K, et al. A Deep Learning Approach to Antibiotic Discovery. Cell. 2020;180(4):688–702. (The peer-reviewed case identifying the novel antibiotic candidate halicin with deep learning)
- Ren F, Aliper A, Chen J, et al. A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models. Nature Biotechnology. 2025;43:63–75. (The report of a candidate designed starting from generative AI advancing to the clinical stage)
- Vamathevan J, Clark D, Czodrowski P, et al. Applications of machine learning in drug discovery and development. Nature Reviews Drug Discovery. 2019;18:463–477. (A review of machine-learning applications across the drug-discovery stages)
- DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: New estimates of R&D costs. Journal of Health Economics. 2016;47:20–33. (The estimate of about USD 2.5 billion for new-drug development cost)
- Wong CH, Siah KW, Lo AW. Estimation of clinical trial success rates and related parameters. Biostatistics. 2019;20(2):273–286. (The estimate of the probability of approval from Phase I)
- U.S. Food and Drug Administration. Using Artificial Intelligence and Machine Learning in the Development of Drug and Biological Products (Discussion Paper). FDA, 2023. (The regulator's discussion paper on the use of AI/ML in drug development)
- Director-General, Pharmaceutical Safety and Environmental Health Bureau, Ministry of Health, Labour and Welfare. Guideline on Sales Information Provision Activities for Prescription Drugs. MHLW, 2018. (The compliance framework for sales information provision activities)