AI Marketing 09 — Automated Creative Optimization: Infinite Generation and Automated A/B | AI Marketing | Pharmaceutical Advertising Regulation: Material Creation, Review & Use in Japan

Automated Creative Optimization: Infinite Generation and Automated A/B

Mass generation vs. brand consistency, and the limits of automating regulatory review

Generative AI (= AI that produces text and images automatically) has made it possible to create advertising creative (= the expressive material: headlines, images, copy) by the thousands. Once you can make more, you want the machine to test which ones work. In web advertising, systems that generate expression automatically and shift delivery toward whatever performs best are already in operation. In pharma, however, this "infinite generation × automated optimization" runs into two walls: brand consistency and regulatory review. This installment unpacks two technologies in plain terms, DCO and the multi-armed bandit, and, from the vantage point of pharmaceutical content work, draws the line between what can be handed to automation and where the human has to remain.

01 What DCO (Dynamic Creative Optimization) Is

First, let's pin down the idea at the center of this field precisely. DCO (= Dynamic Creative Optimization) is a mechanism that, for a single ad slot, recombines parts — headline, image, description, button color — differently for each viewer and situation. Instead of deciding on "this one ad" in advance and showing it to everyone, you prepare the parts and assemble them on the spot before serving.

If the parts are 4 headlines, 3 images, and 2 buttons, the combinations come to 4×3×2, or 24. When generative AI mass-produces these parts, the combinations balloon easily into the thousands and tens of thousands. The aim of DCO is to shift delivery, out of that vast set, toward whatever performs well. Here, "how many you can make" and "how many you can test" become linked.
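The arithmetic above can be checked with a short sketch. The part labels and the larger pool sizes are placeholders invented for illustration, not real ad copy or real campaign figures.

```python
from itertools import product
from math import prod

# Hypothetical part pools; the labels are placeholders, not real ad copy.
parts = {
    "headline": ["H1", "H2", "H3", "H4"],
    "image": ["IMG1", "IMG2", "IMG3"],
    "button": ["B1", "B2"],
}

# Every combination a DCO system could assemble from these pools.
combinations = list(product(*parts.values()))
combination_count = len(combinations)


def count_combinations(pool_sizes):
    """Number of distinct ads when one part is drawn from each pool."""
    return prod(pool_sizes)


print(combination_count)                 # 24
print(count_combinations([40, 30, 20]))  # 24000
```

Multiplying each pool by ten multiplies the total by a thousand, which is why generated parts push the count into the tens of thousands so quickly.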

In pharma, though, this "recombination" cannot be used naïvely. Wording that touches efficacy, indications, or safety is not something you may freely swap in and out as parts. Bring DCO in as-is, and approved combinations end up mixed together with unapproved ones. That danger is addressed head-on in a later section. First, let's follow the basic flow — make, test, shift — through the technology in the next section.

02 How the Multi-Armed Bandit Thinks

"Test which expression works, and shift delivery toward the good ones." The foundation of this mechanism is the multi-armed bandit (= a method that tests several options while steadily allocating more trials to the ones that are winning). The name comes from a slot machine with multiple levers (= a play on "one-armed bandit," slang for a slot machine). The problem is this: without knowing which lever pays off most, how should you pull to gain the most within a limited number of tries?

There are two conflicting desires here: "exploration," trying an option you haven't tried yet, and "exploitation," continuing to use the option that has been best so far. With all exploration, you fail to cash in even after finding a good option. With all exploitation, you cling to whatever happened to look good first and miss the true winner. How to tune this tug-of-war is the heart of the bandit. In a 2002 paper, Auer and colleagues gave a mathematical treatment of this tug-of-war in the form of UCB (= Upper Confidence Bound, a rule that scores each option by its observed average plus a bonus that grows the fewer times the option has been tried, so under-tested options get their turn).
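To make the tug-of-war concrete, here is a minimal sketch of the UCB1 rule from the Auer et al. paper: each arm's score is its average reward plus sqrt(2 ln n / n_j), where n is the total number of serves and n_j the serves of that arm. The function names, the simulated response rates, and the seed are assumptions for illustration. The rates are deliberately exaggerated so the effect shows within a few thousand rounds, and real ad response rates are far lower.

```python
import math
import random


def ucb1_select(counts, reward_sums):
    """Return the index of the arm to serve next under the UCB1 rule."""
    # Serve each arm once first; the score is undefined at zero trials.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        # average reward (exploitation) + bonus for rarely tried arms (exploration)
        reward_sums[arm] / n + math.sqrt(2 * math.log(total) / n)
        for arm, n in enumerate(counts)
    ]
    return scores.index(max(scores))


def simulate(true_rates, rounds, seed=0):
    """Serve `rounds` impressions against hypothetical response rates."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)
    reward_sums = [0.0] * len(true_rates)
    for _ in range(rounds):
        arm = ucb1_select(counts, reward_sums)
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        reward_sums[arm] += reward
    return counts


counts = simulate([0.2, 0.5, 0.8], rounds=5000)
print(counts)  # serves per arm; the third arm should receive most of them
```

Note that an arm with one trial and zero reward can still outrank an arm with a hundred trials and a 50% average, because its bonus term is so much larger. That is the "preference for the untried" at work.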

The difference from A/B testing: The commonly used A/B test holds the delivery split fixed (typically 50/50) for a set period, then picks the winner after it ends. The multi-armed bandit keeps shifting the delivery ratio while watching results midstream. It can move toward a promising expression early, but if an early judgment is skewed by the thin data of the opening phase, it risks starving the true winner of traffic too soon. "Shifting fast" and "judging correctly" are, here too, in a tug-of-war.

DCO becomes easier to grasp when you see it as this multi-armed bandit applied to combinations of parts. Treat the thousands of combinations as "arms," and shift delivery toward the arms that perform well. Generative AI makes arms without limit, and the bandit selects the winners from among them — that is what "infinite generation × automated optimization" really is.

03 Mass Production by Generative AI

Now to the side that makes the arms — mass production by generative AI. Once, crafting a single headline took time. Now, specify the angle, tone, and length, and dozens of candidates appear in minutes. Images too can be made in parallel, multiple drafts each for a patient-facing or physician-facing audience. The number of arms you can test has changed by an order of magnitude.

What makes this mass production dangerous in pharma is that AI optimizes for "plausibility," not "correctness." The more readable, persuasive, and convincing the expression, the less its errors stand out. Let's restate the three traps of mass production here.

Trap 01: Hallucination ("a plausible lie")

Nonexistent trial results or exaggerated figures slip into natural-sounding text (= hallucination, AI confabulation). In pharma, this leads straight to exaggerated advertising.

Trap 02: Off-Label Efficacy ("overclaiming")

AI is skilled at crafting "expression that sounds effective." But a drug may only be spoken of within its approved indications. Go beyond that range, and it becomes unapproved advertising.

Trap 03: Combinatorial Explosion ("review can't keep up")

The more parts you swap, the more combinations swell. Unless each one is reviewed, "how many you can make" outruns "how many you can review."

The third is a trap that arises specifically when mass production and automated optimization are joined. Even if each individual part has passed review, meaning can change in the combined context. Place the phrase "highly effective" next to a certain comparison-data image, and though each is fine alone, together they can imply an exaggerated superiority. The more arms there are, the less a human can track all of this contextual drift.
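One way to keep contextual drift from slipping through is to tag each approved part and flag any combination whose tags co-occur in a risky way. The sketch below is an assumption-laden illustration: the tag names, the part IDs, and the single rule are all hypothetical, and a flag here only routes the combination to a human. It does not judge whether the combination is actually exaggerated.

```python
from itertools import product

# Hypothetical parts, each individually approved and tagged by a reviewer.
headlines = [
    {"id": "H1", "tags": {"efficacy_claim"}},
    {"id": "H2", "tags": {"disease_awareness"}},
]
images = [
    {"id": "IMG1", "tags": {"comparison_data"}},
    {"id": "IMG2", "tags": {"lifestyle"}},
]

# Assumed rule: tag pairs whose co-occurrence may imply superiority.
RISKY_PAIRS = {frozenset({"efficacy_claim", "comparison_data"})}


def needs_combination_review(parts):
    """True when the combined tags contain any risky pair."""
    tags = set().union(*(part["tags"] for part in parts))
    return any(pair <= tags for pair in RISKY_PAIRS)


flagged = [
    (headline["id"], image["id"])
    for headline, image in product(headlines, images)
    if needs_combination_review((headline, image))
]
print(flagged)  # [('H1', 'IMG1')]
```

A rule table like this catches only the drift someone has already anticipated, which is why it narrows the human's workload rather than replacing the human.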

04 Brand Consistency (Vibe)

Another wall for mass production is brand consistency (= the coherence by which any expression feels like the same brand; in this series we call it Vibe). Make arms without limit and shift toward the ones that perform, and expression gets pulled toward "whatever draws the most response right here." Leave it alone, and tone, color, and forcefulness of claim scatter; short-term response may be good, but coherence as a brand erodes.

A pharmaceutical brand stands not on flashiness but on accuracy and composure. Expression that earns response with strong words may post good numbers in the moment, but it chips away at a long-term asset — the trust of healthcare professionals and patients. So automated optimization must be given, in advance, not only performance metrics but a frame that asks "does this expression stray from the brand's voice?" The range in which the bandit is allowed to explore is bounded first, from the brand's side.

Concretely, the permitted tones, the expressions to avoid, and sample tones are embedded structurally into the generation instructions. Apply the brand frame at the stage of making arms, and don't generate arms outside the frame at all. Rather than letting it create freely and then fixing it, mass-produce only inside the frame from the start. This ordering is the key to protecting the Vibe while using speed to the fullest.
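As a rough sketch of what "embedded structurally" could look like, the frame can be held as data and rendered into every generation instruction, so no request goes out without it. The class name, field names, and sample wording below are all hypothetical placeholders, and the bracketed strings stand in for text that regulatory staff would supply.

```python
from dataclasses import dataclass, field


@dataclass
class BrandFrame:
    """Hypothetical container for the brand frame applied before generation."""
    permitted_tones: list
    avoided_expressions: list
    sample_lines: list
    approved_statements: list = field(default_factory=list)

    def to_instructions(self, audience, n_candidates):
        """Render the frame into the text handed to the generator."""
        lines = [
            f"Write {n_candidates} headline candidates for {audience}.",
            "Permitted tones: " + ", ".join(self.permitted_tones),
            "Never use: " + ", ".join(self.avoided_expressions),
            "Only restate these approved statements; add no new claims:",
            *[f"- {statement}" for statement in self.approved_statements],
            "Match the tone of these samples:",
            *[f"- {sample}" for sample in self.sample_lines],
        ]
        return "\n".join(lines)


frame = BrandFrame(
    permitted_tones=["calm", "precise"],
    avoided_expressions=["best", "safe"],
    sample_lines=["[approved sample headline]"],
    approved_statements=["[approved indication wording]"],
)
instructions = frame.to_instructions("healthcare professionals", 5)
print(instructions)
```

Keeping the frame in one object means a change to the brand voice or the approved wording propagates to every arm generated afterward, instead of being patched arm by arm.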

05 The Limits of Automating Regulatory Review

Here is the core of this installment. If arms swell into the thousands, you'll want to automate regulatory review too. Detecting prohibited terms, and range-checking efficacy expressions, can be largely handed to the machine. But pharmaceutical regulatory review has a part that cannot be fully automated. To see it precisely, let's confirm the immovable foundation — the Pharmaceuticals and Medical Devices Act (PMD Act).

Article	What it establishes	Caution under automated optimization
PMD Act Article 66	Prohibition of exaggerated advertising. Whether explicit or implied, false or exaggerated accounts of efficacy or safety must not be advertised	The higher-responding arms tend toward "inflated" expression. Beware of assertions, superlatives, and implication
PMD Act Article 68	Prohibition of advertising pre-approval drugs. Unapproved drugs or efficacies must not be advertised	Recombining parts can produce efficacy beyond the approved range. Off-label implication touches this article
PMD Act Article 68-2	Provision of drug information (a duty to endeavor to provide information for proper use)	Response optimization tends to trim safety and risk information. Omission runs counter to the intent of this article

Misidentifying an article number itself damages trust. Exaggerated advertising is Article 66, unapproved advertising is Article 68, and information provision is Article 68-2: this is a foundation worth memorizing outright. Beneath the Act sit two notices from the Ministry of Health, Labour and Welfare (MHLW). One is the Sales Information Provision Activity Guideline (= HanteiG, the Guideline on Sales Information Provision Activities for Prescription Drugs; a 2018 notice of the Director-General of the Pharmaceutical Safety and Environmental Health Bureau). The other is the Standards for Fair Advertising of Drugs and the Like (= the yardstick for advertising expression, issued as a notice by the Director of the Compliance and Narcotics Division of the same bureau). Even when AI does the making, the standards applied are the same.

The limits of automation converge on two points. First, whether something is exaggerated is decided by context. A prohibited-term list can block words like "best" or "safe," but it cannot fully catch expression that creates an exaggerated impression without using those words. Second, the meaning of a combination can only be read by a human. Even when each part passes, the implication that arises when they are placed side by side is hard for a machine to judge. So regulatory review must take a two-layer structure: the machine handles a first-pass filter, and the final go/no-go remains with a human. A design that runs fully automated all the way to publication does not hold up in pharma.
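The two-layer structure can be sketched as follows. This is an illustration under stated assumptions, not a production design: the term list contains only the two words named above (a real list would be far longer and maintained by regulatory staff), and the status strings and function names are hypothetical.

```python
import re

# Only the two terms named in the text; a real list is an assumption left open.
PROHIBITED_TERMS = ["best", "safe"]


def first_pass(text):
    """Machine layer: block known terms, pass everything else to a human."""
    hits = [
        term for term in PROHIBITED_TERMS
        if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE)
    ]
    if hits:
        return {"status": "blocked", "hits": hits}
    # Passing the filter is not approval; it only earns a place in the queue.
    return {"status": "awaiting_human_review", "hits": []}


def may_publish(machine_result, human_decision):
    """Final go/no-go: publication requires an explicit human approval."""
    return (
        machine_result["status"] == "awaiting_human_review"
        and human_decision == "approved"
    )


blocked = first_pass("The best option for every patient")
queued = first_pass("Includes safety information for proper use")
print(blocked["status"], blocked["hits"])  # blocked ['best']
print(queued["status"])                    # awaiting_human_review
print(may_publish(queued, None))           # False
print(may_publish(queued, "approved"))     # True
```

The design choice worth noticing is that the machine has no status meaning "approved." The only path to publication runs through a recorded human decision.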

06 The Scarcity of the Creator

"If AI makes things without limit, do we stop needing people?" To this question, the answer is a clear no. What overflows from mass production is "plausible expression." The more it overflows, the higher the value of the judgment that picks the one that is correct, on-brand, and does not cross regulation, and sets the direction. As the labor of making drops, the eye that selects and the sense of direction become scarce.

At least three jobs remain in pharmaceutical practice. The first is defining the brand's voice and putting it into words as a frame for generation. The second is reading the context of combinations and seeing through the regulatory danger. The third is rendering the judgment, irreducible to any metric, of whether to take short-term response or long-term trust. None of these is something AI can mass-produce. The creator's role shifts from the person who makes parts to the person who designs the frame, selects, and takes responsibility.

07 Measuring Effect

Automated optimization is inseparable from measuring effect. Click-through rate, dwell time, information requests — data can be captured in fine detail. But measuring the effect of pharmaceutical content carries a caution absent from general web marketing. Strong short-term response can harm the long-term brand.

A slightly "inflated" arm can produce better response in the moment. But if it is ever deemed exaggerated, the trust built up collapses in an instant. A drug is a credence good (= a good whose quality the buyer can hardly verify personally, so that it stands on trust). So in measuring effect, you must always look, alongside the response numbers, at whether that expression is eroding the brand's trust. Build regulatory compliance and brand consistency into the very definition of the "performance" handed to the multi-armed bandit, so that it cannot maximize response alone. How you set the metric decides the quality of the automation.
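One possible shape for such a "performance" definition is sketched below. Everything in it is an assumption for illustration: the `brand_score` rating, the 0.7 floor, and the choice to weight a click by brand fit are hypothetical design decisions, not an established method. The sketch treats compliance as a gate rather than a penalty, since an arm that merely scores low can still be served during exploration.

```python
from dataclasses import dataclass

BRAND_FLOOR = 0.7  # assumed minimum brand-fit rating


@dataclass
class Arm:
    arm_id: str
    compliance_approved: bool  # set by a human reviewer, not by the model
    brand_score: float         # hypothetical 0-1 rating of fit with the brand voice


def is_eligible(arm):
    """Gate: only approved, on-brand arms may be served at all."""
    return arm.compliance_approved and arm.brand_score >= BRAND_FLOOR


def eligible_arms(arms):
    """The set the bandit is allowed to explore."""
    return [arm for arm in arms if is_eligible(arm)]


def performance(arm, clicked):
    """Reward for one impression: a click counts only in proportion to brand fit."""
    if not is_eligible(arm):
        return 0.0
    return arm.brand_score if clicked else 0.0


arms = [
    Arm("A", compliance_approved=True, brand_score=0.9),
    Arm("B", compliance_approved=False, brand_score=0.95),
    Arm("C", compliance_approved=True, brand_score=0.4),
]
print([arm.arm_id for arm in eligible_arms(arms)])  # ['A']
print(performance(arms[0], clicked=True))           # 0.9
print(performance(arms[1], clicked=True))           # 0.0
```

Under this definition, an unapproved arm never reaches the bandit no matter how well it would have performed, which is the point of deciding the metric before the optimization starts.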

08 Connections to Other Chapters on This Site

This installment gains depth read together with the following chapters.

AI Marketing Vol. 1 — Marketing Redefined — The whole map of advertising, CRM, content, and brand. This installment is the frontier of its "advertising × content."
AI Marketing Vol. 5 — AI-Generated Content Strategy — The human-in-the-loop design that builds review into the speed of mass production. The foundation for this installment's mass production and automation.
Material Review series — The practical review that finally catches the generated output and its combinations. The "exit" of automation.

In Closing

Automated creative optimization has made it possible, through two technologies — DCO and the multi-armed bandit — to "make without limit, test as you go, and shift toward the good ones." Generative AI mass-produces the arms; the bandit picks the winners. This speed is a large opportunity. But in pharma, two walls do not move — brand consistency (Vibe), and regulatory review.

Apply the frame first. At the stage of making arms, hand over the brand's voice and the approved information, and do not generate outside the frame. For regulatory review, let the machine handle a first-pass filter, and keep the judgment of context and combination with the human. The PMD Act does not move even when AI gets faster — Article 66 exaggeration, Article 68 unapproved, Article 68-2 information provision. The more thoroughly you hand the automatable parts to the machine, the higher the value of the judgment you cannot hand over — selecting, setting direction, taking responsibility. What becomes scarce in the age of infinite generation is not the hand that makes, but the eye that selects. Next time, we move to how marketing changes in a world after third-party cookies have been abolished.

Key Points — Three to Take Away

DCO (Dynamic Creative Optimization) is a mechanism that recombines parts and serves them differentially. The multi-armed bandit beneath it is a tug-of-war between "exploration (trying the untried)" and "exploitation (using the good ones)," shifting delivery toward good expression. Generative AI mass-produces the arms and the bandit selects the winners — that is what "infinite generation × automated optimization" really is.
Two walls do not move in pharma — brand consistency (Vibe) and regulatory review. At the stage of making arms, hand over the brand's voice and approved information as a frame, and do not generate outside the frame. Leave it to response optimization and tone and forcefulness of claim scatter, chipping away at long-term trust in exchange for short-term numbers.
Regulatory review cannot be fully automated. Prohibited terms and efficacy ranges can be blocked by machine, but whether something is exaggerated is contextual, and the meaning of a combination can only be read by a human. The PMD Act (exaggeration Art. 66, unapproved Art. 68, information provision Art. 68-2) does not change even when AI gets faster. What is scarce in the age of mass production is, more than the hand that makes, the eye that selects and the judgment that takes responsibility.

Sources / References

Auer, P., Cesa-Bianchi, N., & Fischer, P. Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, 47(2–3), 235–256. Kluwer Academic Publishers, 2002. (The foundational paper formalizing exploration and exploitation of the multi-armed bandit as UCB)
Ministry of Health, Labour and Welfare. Act on Securing Quality, Efficacy and Safety of Products Including Pharmaceuticals and Medical Devices (PMD Act), Articles 66, 68, and 68-2. (The primary articles: prohibition of exaggerated advertising = Art. 66, prohibition of advertising pre-approval drugs = Art. 68, provision of information for proper use = Art. 68-2)
Director-General, Pharmaceutical Safety and Environmental Health Bureau, MHLW. Guideline on Sales Information Provision Activities for Prescription Drugs (HanteiG). MHLW, 2018. (The notice establishing the proper conduct of information provision activities for prescription drugs)
Director, Compliance and Narcotics Division, Pharmaceutical Safety and Environmental Health Bureau, MHLW. On the Standards for Fair Advertising of Drugs and the Like. MHLW, 2017. (The division-director notice presenting the yardstick for judging whether advertising expression is permissible)
Sutton, R. S., & Barto, A. G. Reinforcement Learning: An Introduction (2nd ed.). MIT Press, 2018. (The standard textbook on reinforcement learning and the multi-armed bandit, including the tug-of-war of exploration and exploitation)
World Health Organization. Ethical Criteria for Medicinal Drug Promotion. WHO, 1988. (The international standard for the accuracy, fairness, and verifiability that drug promotion must uphold)
