01 What Pair Programming Was, Exactly
First, let's recall human-to-human pair programming accurately. It is a practice that spread in the 1990s through XP (= Extreme Programming, a development method designed to stay robust as requirements change), in which two people write code while looking at a single screen. One person takes the keyboard and writes (= the driver); the other watches the whole and offers observations (= the navigator). In the account Williams and Kessler assembled in 2002, the aim of this arrangement was not "typing faster" but separating the side that writes from the side that watches, so that mistakes surface early.
The key point is that the two are not simply doing the same work twice. Their roles differ. The driver concentrates on the line in front of them; the navigator steps back to ask "is this design really sound?" and "is anything being missed?" Because a near view and a far view are held at the same time, defects surface earlier than when one person writes alone. The essence of pair programming lies in this division of perspective.
This account maps directly onto the relationship between people and AI. AI is a writer that can move fast. So what role should the person take? To give the answer up front: the person becomes the navigator. This installment fleshes out what that role contains.
02 Dividing Roles with AI: Who Is the Driver?
When a person and an AI form a pair, the first thing to decide is the roles. Naively, it seems enough to say "AI is fast, so let AI do it and the person just checks." In practice, though, who should hold which role changes with the situation. We organize this into three modes.
AI-Driver
Routine processing, known patterns, scaffolding tests. Let AI produce a first draft while the person watches over the design and correctness. Speed is highest — but when the person loosens the watch, accidents spread fast too.
Human-Driver
Vague requirements, weighty judgments, anything that touches regulation. The person leads, and asks AI to offer alternatives or point out oversights. Slow — but responsibility stays on the person's side.
Hand-Off
Drafting by AI, judgment by the person, fair copy by AI again. The reins are handed back and forth by phase. Most real work settles into this mode. What matters is being aware of the moment of handoff.
One principle runs through all three. The role that moves the hands may pass, but the role of final judgment never leaves the person. Even in a mode that puts AI in the driver's seat, the navigator's chair — the seat that decides what is correct — is always occupied by the person. Handing over the role for the sake of speed is fine, but responsibility must not be handed over with it. From the next section on, we separate what may be handed over from what must not.
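The three modes and the principle that runs through them can be written down as a small data structure. The following is a minimal sketch, not a prescribed implementation: the names `Actor`, `Mode`, `RolePlan`, and `plan_for` are hypothetical and were chosen only for illustration. What it shows is that the sequence of drivers varies by mode, while the final judge is fixed to the person and cannot be overridden.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Actor(Enum):
    HUMAN = "human"
    AI = "ai"


class Mode(Enum):
    AI_DRIVER = "ai-driver"
    HUMAN_DRIVER = "human-driver"
    HAND_OFF = "hand-off"


@dataclass(frozen=True)
class RolePlan:
    """Hypothetical record of who does what for one task."""
    mode: Mode
    drivers: Tuple[Actor, ...]  # who moves the hands, phase by phase
    final_judge: Actor = Actor.HUMAN

    def __post_init__(self) -> None:
        # Invariant from the text: final judgment never leaves the person.
        if self.final_judge is not Actor.HUMAN:
            raise ValueError("final judgment must stay with a person")


def plan_for(mode: Mode) -> RolePlan:
    """Return the driver sequence for a mode; the judge is always human."""
    drivers = {
        Mode.AI_DRIVER: (Actor.AI,),
        Mode.HUMAN_DRIVER: (Actor.HUMAN,),
        # Draft by AI, judgment by the person, fair copy by AI again.
        Mode.HAND_OFF: (Actor.AI, Actor.HUMAN, Actor.AI),
    }[mode]
    return RolePlan(mode=mode, drivers=drivers)


plan = plan_for(Mode.HAND_OFF)
print([actor.value for actor in plan.drivers], plan.final_judge.value)
# ['ai', 'human', 'ai'] human
```

Encoding the rule as an invariant rather than a convention means that an attempt to make AI the final judge fails loudly instead of passing unnoticed.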
03 Generation ≠ Evaluation: Separate the Writer from the Reviewer
Here is the core of this installment. Generation (= making code or text) and evaluation (= judging whether it is correct) are separate tasks. And these two must not be bundled into the same agent. Even in human-to-human review, having "the person who wrote the code approve their own code" is avoided. Writers carry a bias to see what they made in a favorable light.
With AI, this bias appears in a different form. AI optimizes its output not for "correctness" but for "plausibility." It produces code that reads well, looks right, and seems to work at a glance — which is exactly why mistakes are hard to notice. Worse still, ask the same AI "is this code correct?" and it tends to affirm its own output. When the agent that generated something also serves as evaluator, the review becomes review in form only.
| Role | Task performed | What happens if combined |
|---|---|---|
| Generator | Produce a fast first draft of code, tests, and text | Speed is there, but bias toward what it made remains |
| Evaluator | Judge correctness, safety, and regulatory fit independently | If the same agent as the generator, review drifts into self-affirmation |
| Combined (forbidden move) | The maker signs off on their own work | Plausible mistakes pass straight through |
So the practice is this. Stand up the generation path and the evaluation path separately. Code written by AI is checked in a different system: human review, an independent test suite, or a separately prepared verification procedure. Never let the generator verify its own output. This holds for human teams and for AI alike, and it is the first principle of review.
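One way to make the separation concrete is a gate that refuses to run when the evaluator is the same agent as the generator. This is a sketch under stated assumptions: `Draft`, `Verdict`, `SelfReviewError`, `evaluate`, and the identifier strings are hypothetical names introduced for illustration, and the check function stands in for whatever independent path a team actually uses (human review, a test suite, a separate verification procedure).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Draft:
    content: str
    author_id: str  # hypothetical identifier of the agent that generated it


@dataclass(frozen=True)
class Verdict:
    approved: bool
    evaluator_id: str


class SelfReviewError(Exception):
    """Raised when the generator is asked to evaluate its own output."""


def evaluate(draft: Draft, evaluator_id: str,
             check: Callable[[str], bool]) -> Verdict:
    """Run an independent check; reject the combined (forbidden) move."""
    if evaluator_id == draft.author_id:
        raise SelfReviewError(
            f"{evaluator_id!r} generated this draft and cannot approve it"
        )
    return Verdict(approved=check(draft.content), evaluator_id=evaluator_id)


# Usage: the AI drafts, and a different path evaluates.
draft = Draft(content="return a + b", author_id="ai-generator")
verdict = evaluate(draft, "human-reviewer", lambda text: "a + b" in text)
print(verdict.approved)  # True
```

The gate does not make the evaluation good by itself; it only guarantees that the evaluation did not come from the writer.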
04 Where Does Verification Responsibility Sit?
Even after roles are divided, one question remains at the end. "When AI-written code causes an accident in production, who is responsible?" This cannot be left vague. The answer is clear. The responsibility for verification sits with the person who released the code to the world. Responsibility is not diluted by the fact that AI wrote it.
This principle carries over from the review of pharmaceutical promotional materials. Even when generative AI produced the material, the yardstick applied was the same one written for human authors. Judgment comes from "what is written," not "who made it." Code is exactly the same: responsibility is borne on the basis of "what the code does," not "whether AI or a person wrote it." Being AI-made is no absolution.
05 Psychological Safety: Letting AI Say "I Don't Know"
Here we bring in one idea from research on human teams: psychological safety (= a team state in which mistakes and doubts can be voiced without fear of punishment), presented by Edmondson in 1999. In her research, teams that reported more mistakes turned out, counterintuitively, to perform better. The reason is simple: only a team that does not hide its mistakes can fix them.
This idea works for human-AI pairs too. When the person silently accepts "whatever AI produced must be correct," mistakes never come to the surface. Instead, build a working relationship, even though the partner is a tool, in which doubt can be raised freely about the output: "this part looks suspicious," "show me the evidence." Concretely, do not let AI simply assert; prompt it to state its uncertainty. Ask "what is the source for this figure?" and "where are you least confident?", leaving room for AI to answer "I have no firm evidence."
Psychological safety looks like a matter of human atmosphere, but it is really a mechanism for surfacing mistakes early. It is the same in an AI pair: writing down, as team practice, the premise that a person may doubt AI's output — indeed, that doubting is the job — lowers the probability that a plausible mistake passes in silence.
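The habit of doubting can be given a simple shape in code. The sketch below assumes the AI's answer has already been broken into individual claims, each optionally carrying a source and an uncertainty flag; how that happens is outside the sketch. The names `Claim`, `DOUBT_QUESTIONS`, and `unverified` are hypothetical. It keeps the two questions from the text at hand and lists every claim that still needs a person's attention.

```python
from dataclasses import dataclass
from typing import List, Optional

# The two questions suggested in the text, kept as reusable prompts.
DOUBT_QUESTIONS = (
    "What is the source for this figure?",
    "Where are you least confident?",
)


@dataclass(frozen=True)
class Claim:
    text: str
    source: Optional[str] = None  # evidence the AI cited, if any
    uncertain: bool = False       # True when the AI said it was unsure


def unverified(claims: List[Claim]) -> List[Claim]:
    """Return claims with no source, or that the AI flagged as uncertain."""
    return [c for c in claims if c.source is None or c.uncertain]


claims = [
    Claim("The function handles empty input", source="unit test log"),
    Claim("The figure in the summary is current"),
    Claim("The library call is thread-safe", source="docs", uncertain=True),
]
for claim in unverified(claims):
    print("needs a person to check:", claim.text)
```

An answer of "I have no firm evidence" lands in this list rather than being treated as a failure, which is the point: the uncertainty is surfaced instead of hidden.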
06 Team Practice: Four Promises
Let's put all of this into the shape of promises you can run day to day. Here are four things worth deciding in advance on a team where a person and an AI form a pair. None are flashy, but many accidents happen precisely when they are skipped.
State the roles
For this task, is AI the driver or the adviser, and which is the person? Decide before starting. Begin without deciding and a spot no one is watching will appear.
Separate generation from evaluation
Whatever AI writes is checked in a different system. Do not let the same AI grade itself. Always pass it through human review or an independent test.
Demand sources
For output involving claims, figures, or efficacy, require evidence to be presented. Output that cannot show its basis is not treated as complete.
Name one responsible person
Decide, from among the people, one person responsible for releasing the deliverable. AI cannot bear responsibility. It is always a person who does.
What runs beneath all four is the division: speed to the tool, judgment to the person. Hand simple work and drafts to AI to run fast, and keep judgment about correctness, safety, and regulatory fit firmly in human hands. Just as human pair programming found defects early through "division of perspective," a human-AI pair also reaches for speed and certainty at once by dividing the roles.
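The four promises can also be checked mechanically before a deliverable goes out. The following is a minimal sketch, assuming a team records its agreement per task; `TaskAgreement`, its fields, and `missing_promises` are hypothetical names introduced for illustration, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TaskAgreement:
    """Hypothetical per-task record of the four promises."""
    roles_stated: bool
    generator_id: str
    evaluator_id: str
    sources_shown: bool
    responsible_person: Optional[str]  # a named human, never the AI


def missing_promises(agreement: TaskAgreement) -> List[str]:
    """List the promises that have not been kept for this task."""
    missing = []
    if not agreement.roles_stated:
        missing.append("state the roles")
    if agreement.evaluator_id == agreement.generator_id:
        missing.append("separate generation from evaluation")
    if not agreement.sources_shown:
        missing.append("demand sources")
    if not agreement.responsible_person:
        missing.append("name one responsible person")
    return missing


agreement = TaskAgreement(
    roles_stated=True,
    generator_id="ai-generator",
    evaluator_id="ai-generator",  # the same agent grades itself
    sources_shown=True,
    responsible_person=None,
)
print(missing_promises(agreement))
# ['separate generation from evaluation', 'name one responsible person']
```

An empty list means the deliverable may proceed to release; anything else names exactly which promise was skipped.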
07 Summary: Use Speed to Protect Trust
Let's trace the key points once more. The essence of pair programming was not typing fast, but separating the writing perspective from the watching perspective so mistakes surface early. In a human-AI pair, AI serves as the fast writer, and the person takes the watcher's role, the navigator. The hands-on role may be passed back and forth by phase, but the person never lets go of the role that decides what is correct.
We set out three pillars. First, separate generation from evaluation. Do not let an AI that optimizes for plausibility grade itself; check it in a different system. Second, verification responsibility sits with the person. "AI wrote it" is no absolution; responsibility is borne on "what the code does." Third, psychological safety — write down, as team practice, the premise that AI's output may be doubted. The power to create fast is used to protect trust, which is easily broken. This is the core of human-AI collaboration.
AI has made it possible to write code fast and in volume. This is a large opportunity. But that same speed carries plausible mistakes at a new scale. The choice is neither to slow down and buy safety, nor to discard safety and take speed. Divide the roles, cut generation apart from evaluation, and seat verification responsibility with the person; with these three, speed and certainty coexist. This matters all the more when writing code in the pharmaceutical domain: confirm separately that execution succeeded and that the content complies with regulation. The key to using AI as a partner in the pair is to put into words, in advance, what you entrust to the partner and what you do not.
- The essence of pair programming is "division of perspective." In a human-AI pair, AI is the writer and the person is the watcher (the navigator). The hands-on role may be passed by phase, but the person never lets go of the role that decides what is correct.
- Generation ≠ evaluation. Do not let an AI that optimizes for plausibility grade itself; stand up the generation path and the evaluation path separately. "It ran" is not "it's correct" — confirm execution success and content correctness separately.
- Verification responsibility sits with the person. "AI wrote it" is no absolution; judgment comes from "what it does," not "who made it." Write down psychological safety — that AI's output may be doubted — as practice, and name one responsible person.
- Williams, L. & Kessler, R. Pair Programming Illuminated. Addison-Wesley, 2002. (Foundational text summarizing the role division of pair programming and its defect-detection effect)
- Edmondson, A. Psychological Safety and Learning Behavior in Work Teams. Administrative Science Quarterly, 44(2), 350–383. 1999. (Study showing that teams able to voice mistakes perform better)
- Beck, K. Extreme Programming Explained: Embrace Change. Addison-Wesley, 2000. (The original text laying out the whole of XP, including pair programming)
- Fagan, M. E. Design and Code Inspections to Reduce Errors in Program Development. IBM Systems Journal, 15(3), 182–211. 1976. (Classic on code inspection, in which a role separate from the writer detects defects)
- Ministry of Health, Labour and Welfare. Act on Securing Quality, Efficacy and Safety of Pharmaceuticals, Medical Devices, etc. (Pharmaceuticals and Medical Devices Act), Articles 66, 68, and 68-2. (Primary provisions: exaggerated advertising = Article 66, advertising of unapproved products = Article 68, information provision = Article 68-2)
- Director-General, Pharmaceutical Safety and Environmental Health Bureau, MHLW. Guidelines on Sales Information Provision Activities for Prescription Drugs. 2018. (Notice setting out the compliance items for sales information provision activities)
- Director, Compliance and Narcotics Division, Pharmaceutical Safety and Environmental Health Bureau, MHLW. Standards for Fair Advertising of Drugs. (Notice presenting the yardstick for judging the propriety of advertising expression)