AI Programming 02 ── Working With Copilot: The Practical Discipline of Completion-Type AI

Working With Copilot ── The Practical Discipline of Completion-Type AI ── Proven productivity, and the design of acceptance judgment, security, and review integration

You are in the middle of writing code when the next few lines quietly appear in grey. Press Tab, and they become your own code ── this is the core experience of completion-type AI (= an AI that proposes the continuation of your code as you type). Led by GitHub Copilot (= the flagship completion-type AI offered by GitHub), tools of this kind have become part of everyday practice since 2022. This installment lays out, step by step, how much faster completion-type AI actually makes you (the numbers from empirical studies), how to judge whether to accept a suggestion, and how to design the "back side of speed" ── security, licensing, and review. For anyone in pharmaceutical marketing, development, or medical affairs who writes in-house tools or analysis scripts, this should become a set of criteria you can use starting tomorrow.

01 How Completion-Type AI Works ── It Is "Prediction," Not "Understanding"

First, let's be precise about what completion-type AI is doing. A completion-type AI like Copilot uses the contents of the file you are writing, the surrounding files you have open, and your function names and comments as clues, and predicts and displays "the code most likely to come next." At the core of the mechanism is a large language model (= a prediction model trained on large volumes of text and code), which has learned patterns of "in a context like this, the code usually continues like that" from vast amounts of public code.

What you must grasp here is that completion-type AI does not understand your intent. Write "convert the amount to tax-inclusive" in a function comment, and it will produce plausible-looking code. But it is only reproducing code that humans commonly wrote in similar contexts ── it is not writing with any "knowledge" of whether your system's tax rate is really 10%, or whether rounding is truncation or round-half-up.
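To make the distinction concrete, here is a minimal Python sketch. The function names, the example rate, and the rounding modes are illustrative assumptions, not taken from any real system. The first function has the shape a completion might plausibly take; the second forces the caller to state the two facts the AI cannot know.

```python
# ROUND_DOWN and ROUND_HALF_UP are imported for callers to pass as `rounding`.
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def to_tax_inclusive_suggested(amount_yen):
    # Hypothetical "plausible" completion: the 10% rate and the rounding
    # behaviour of round() are baked in without anyone having decided them.
    return round(amount_yen * 1.1)


def to_tax_inclusive(amount_yen: int, rate: Decimal, rounding: str) -> int:
    """Return the tax-inclusive amount in whole yen.

    `rate` and `rounding` are required on purpose: they are domain facts
    that a human has to supply (assumed names, for illustration only).
    """
    if amount_yen < 0:
        raise ValueError("amount_yen must be non-negative")
    gross = Decimal(amount_yen) * (Decimal(1) + rate)
    return int(gross.quantize(Decimal("1"), rounding=rounding))
```

With an amount of 999 yen and a rate of Decimal("0.10"), ROUND_DOWN gives 1098 and ROUND_HALF_UP gives 1099. Which of the two is correct is a business rule, not something a completion can infer from context.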

This distinction is the foundation for every later section. Completion-type AI is a tool that outputs "the probable continuation," not a tool that guarantees correctness. That is precisely why the acceptance judgment on the receiving side (Section 03) and review (Section 06) are indispensable.

02 The Empirical Productivity Studies ── "Speed" Seen in Numbers

The felt sense that "completion-type AI is fast" is widely shared, but what do the actual numbers say? Here we look not at impressions but at published empirical research.

In a controlled experiment by GitHub and academic researchers (Peng, Kalliamvakou, et al., 2023), the same task (writing an HTTP server in JavaScript) was given to a group with Copilot and a group without. The result: the group with Copilot completed the task in about 55% less time. The completion rate itself also tended to be higher in the Copilot group.

That said, taking this number at face value is dangerous. The following caveats apply.

It depends on the nature of the task ── The more routine the task and the clearer the correct answer, the larger the effect. Conversely, for design decisions that require deep understanding of the business domain, the effect is limited.
"Fast" and "correct" are separate ── You can write faster, but whether that code is correct is a different question. A speed number is not a correctness number.
Self-reported satisfaction is high, but diverges from objective metrics ── Most users feel "my productivity went up," yet there is a gap with measured results, and the felt sense can tilt toward overestimation.

In short, completion-type AI genuinely raises "the speed of routine writing." But that is only a small part of practice as a whole. It does not directly help the stages that consume most of the time ── design, review, verification, and pinning down requirements. Reading "55% faster" as "development finishes in half the time" will betray your expectations.
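The arithmetic behind that warning can be sketched in a few lines of Python. The 20% share of time spent on routine writing used in the note below is a hypothetical figure chosen only for illustration; it does not come from the studies cited here, and the function name is likewise an assumption.

```python
def overall_time_saving(routine_share: float, time_reduction: float) -> float:
    """Fraction of total project time saved when only routine writing gets faster.

    routine_share:  fraction of total time spent on routine writing (0.0 to 1.0)
    time_reduction: fraction by which that routine part is shortened (0.0 to 1.0)
    """
    if not 0.0 <= routine_share <= 1.0:
        raise ValueError("routine_share must be between 0 and 1")
    if not 0.0 <= time_reduction <= 1.0:
        raise ValueError("time_reduction must be between 0 and 1")
    # Design, review, and verification are assumed unchanged, so only the
    # routine share shrinks.
    return routine_share * time_reduction
```

Under the assumed 20% share, a 55% reduction on the routine part shortens the whole project by roughly 11%, not by half.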

Works well 01: Boilerplate code (handles "repetition") ── Data shaping, API-call templates, test drafts. The more the pattern is fixed, the higher the suggestion accuracy and the larger the speed effect.
Works well 02: Unfamiliar languages / APIs (handles "recalling") ── Reduces time spent looking up the syntax of a language you rarely use, or how to call a library. But suggestions can use outdated idioms, so verification is mandatory.
Works poorly 03: Design decisions (cannot handle "deciding") ── Which structure to choose, where to divide responsibilities. Judgments that require understanding of context and purpose lie outside the coverage of completion-type AI.
Works poorly 04: Domain-specific rules (requires "knowing" as a premise) ── Your own tax rates, rounding, business requirements, regulations. The AI does not know the specific correctness that is absent from its training data. Here, humans hand it the frame.

03 Criteria for Acceptance ── Ask Before You Press Tab

The heart of the practical discipline of completion-type AI lies in "the judgment of whether to accept a suggestion." If you press Tab every time a suggestion appears, you end up harboring code the AI wrote without ever reading it yourself. Before accepting, asking the following three questions every time is the basic discipline.

Can you read it? ── Can you explain, line by line, what the suggested code is doing? Do not accept code you cannot explain. This is the first checkpoint.
Does it match your intent? ── Does that code truly match what you are trying to do right now? "Plausible but subtly different" happens frequently.
Are the boundary conditions safe? ── Empty input, zero, negative values, unexpected types. AI suggestions tend to lean toward "the common happy path" and are often careless with edge cases.

In practice, always keeping the awareness of choosing among three options ── "accept as is," "accept part and fix it," "reject and write it yourself" ── stabilizes quality. In particular, when a large suggestion of more than five lines appears whole, stop for a moment. The longer the suggestion, the higher the risk that "a plausible but wrong line" is mixed in somewhere.
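The third question, about boundary conditions, is the easiest to illustrate. The sketch below uses a hypothetical response-rate calculation; both function names and the choice to return None for an empty denominator are assumptions made for this example. The first function is the kind of happy-path code a suggestion tends to look like; the second is what "accept part and fix it" might produce.

```python
from typing import Optional


def response_rate_suggested(responders, total):
    # Hypothetical happy-path completion: raises ZeroDivisionError when total
    # is 0 and silently accepts negative or inconsistent counts.
    return responders / total * 100


def response_rate(responders: int, total: int) -> Optional[float]:
    """Percentage of responders, or None when there is nothing to divide by."""
    for value in (responders, total):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("counts must be integers")
    if responders < 0 or total < 0:
        raise ValueError("counts must be non-negative")
    if responders > total:
        raise ValueError("responders cannot exceed total")
    if total == 0:
        return None  # empty input: report "no data" rather than crash or return 0
    return responders / total * 100
```

Whether an empty denominator should yield None, zero, or an error is itself a decision for the person who knows the analysis; the point is only that someone makes it explicitly.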

04 Security Cautions ── It Has Also Learned the "Common Mistakes"

Completion-type AI learns from public code. Public code includes good code, but also a large amount of vulnerable code. So the AI can suggest not only safe ways of writing but also dangerous ones, as "common patterns." There is no room for optimism here.

Empirical research backs up this concern. In the study by Perry et al. (2023, presented at ACM CCS), participants who used AI assistance tended to write less secure code, and moreover to be overconfident that "I wrote it securely." It is a structure in which speed and a sense of security breed complacency.

In practice, in pharma as elsewhere, inspect every AI suggestion against the OWASP Top 10 (= an industry-standard list of the most frequent and serious vulnerabilities in web applications). The table below pairs typical risky suggestions with the category to check.

Risky suggestions the AI tends to produce	OWASP standpoint to inspect
Embedding user input into SQL by string concatenation	Injection (input made to execute as a command)
Hard-coding API keys or passwords into the code	Identification and authentication failures (improper handling of secrets)
Handling file paths or URLs without validating input	Broken access control / server-side request forgery
Using old cryptographic schemes or weak hashes	Cryptographic failures
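The first row of the table can be shown with Python's standard sqlite3 module. This is a sketch under stated assumptions: the `products` table, its rows, and the function names are hypothetical and exist only for the demonstration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with a hypothetical `products` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)",
        [("drug_a",), ("drug_b",)],
    )
    return conn


def find_products_unsafe(conn, name):
    # Risky pattern: user input is concatenated into the SQL text itself.
    query = "SELECT id, name FROM products WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_products(conn, name):
    # Safer pattern: the ? placeholder passes the input as data, not as SQL.
    return conn.execute(
        "SELECT id, name FROM products WHERE name = ?", (name,)
    ).fetchall()
```

Passing the input x' OR '1'='1 to the concatenating version returns every row in the table, while the parameterized version treats the same string as an ordinary name and returns nothing.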

The point comes down to this single fact: "the AI produced it" does not mean it is safe. Rather, "the psychology of wanting to skip verification because it came out fast" is the greatest risk. Hard-coded secrets and unvalidated input are the most easily overlooked at the moment of accepting an AI suggestion. Here you guard with both automated tooling (static analysis, secret scanning) and human eyes.
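As a rough illustration of what secret scanning looks for, the following Python sketch flags lines that assign a quoted literal to a secret-looking name, and shows the alternative of reading the value from the environment. The pattern is a deliberately simple assumption and the variable name EXAMPLE_API_KEY is hypothetical; a dedicated scanning tool covers far more cases and should be what actually guards the repository.

```python
import os
import re
from typing import List

# Toy pattern (an assumption for illustration): a secret-looking name assigned
# a quoted literal of 8+ characters. Real scanners use far richer rule sets.
_SECRET_LITERAL = re.compile(
    r"(?:api[_-]?key|password|passwd|secret|token)\w*\s*[:=]\s*[\"'][^\"']{8,}[\"']",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source: str) -> List[int]:
    """Return the 1-based line numbers that look like hard-coded secrets."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if _SECRET_LITERAL.search(line)
    ]


def load_api_key(env_var: str = "EXAMPLE_API_KEY") -> str:
    """Read the key from the environment instead of the source file.

    EXAMPLE_API_KEY is a hypothetical variable name.
    """
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return value
```

A check like this is cheap enough to run before every commit, which is exactly the moment when a freshly accepted suggestion is least likely to have been read carefully.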

05 Handling Licensing ── Whose Code Is What Comes Out?

The public code that completion-type AI learned from each carries a license (= an agreement defining the conditions under which that code may be used). When an AI suggestion closely resembles specific source code it learned from, those license conditions can become an issue. In practice, fix the following two points as organizational rules.

Filter matches against public code ── Copilot and many other tools have a setting that suppresses suggestions matching public code. For enterprise use, enabling this is the basic policy. Do not leave it to each individual to decide whether the setting is on.
Clarify the rights and responsibility for outputs internally ── Who guarantees AI-generated code, and who bears responsibility when a problem arises. Draw the line in advance through contracts and internal rules. "The AI wrote it" does not exempt anyone from responsibility.

Points to note in pharmaceutical in-house development: Even for analysis scripts and in-house tools, licensing and ownership of outputs become issues whenever the code is distributed externally or shared with contractors. In addition, when the code itself handles regulated information (for example, a script that aggregates efficacy or safety data), its output can fall under the rules on advertising and information provision. The distinctions the Pharmaceutical and Medical Devices Act draws between exaggerated advertising (Article 66), the prohibition of advertising unapproved drugs (Article 68), and information provision for proper use (Article 68-2) are boundaries to re-confirm at the stage of releasing code output outside the company.

06 Integration Into Review ── Put AI-Made Code Through the Same Checkpoint

When you use completion-type AI, code grows fast. The mistake organizations most easily make here is thinking "since the AI wrote it, a light review is fine." It is the reverse. Because even the author has sometimes not read AI-made code in full, it needs, if anything, more scrutiny than usual.

In practice, the principle is to not change your review criteria based on whether code is AI-made. Put it through the same checkpoints as human-written code ── code review, tests, static analysis, security checks ── unchanged. On top of that, add to the review the inspection items specific to AI-made code.

Detecting "plausible lies" ── Function names that don't exist, libraries that aren't real, subtly wrong arguments. The AI writes plausibly, so errors that only surface when you run it get mixed in.
Gaps in boundary and error handling ── The happy path may be clean, but the abnormal path is often thin. Look here with focus.
Duplicated copies ── Accepting similar suggestions in many places scatters the same logic around. Fix one place later, and the others stay unfixed.
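The first item, "plausible lies," can be partly caught mechanically. The Python sketch below checks whether the names a piece of code relies on actually exist in a module; the helper name is an assumption made for this example. It only confirms existence, so wrong arguments and wrong behavior still have to be caught by tests and by reading.

```python
import importlib
from typing import Iterable, List


def missing_attributes(module_name: str, names: Iterable[str]) -> List[str]:
    """Return the names that the given module does not actually provide.

    If the module itself cannot be imported, every requested name is
    reported as missing.
    """
    names = list(names)
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return names  # the library is not installed, or does not exist at all
    return [name for name in names if not hasattr(module, name)]
```

For example, asking about "loads" and "parse" in the standard json module reports only "parse" as missing: a name that reads naturally to anyone who knows JavaScript, and exactly the kind of error that otherwise surfaces only at run time.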

What matters is not to play "the writer" and "the verifier" in the same frame of mind. Right after having the AI write quickly, attachment to your own output and overconfidence in it arise easily. So treat generation and verification as distinct stages. Ideally, a different person handles the review.

07 Team Operation ── Turning Individual Speed Into Organizational Quality

Completion-type AI looks like an individual's tool, but sustaining its effect requires operational design as a team. If individuals use it however they like, speed appears but quality variance increases. Decide the following three points as shared team rules.

Align the settings ── The public-code filter, secret exclusion, repositories to exclude. Do not leave these to individuals; distribute them as the organization's default configuration.
Put the acceptance criteria into words ── Make Section 03's "Can you read it / Does it match intent / Are the boundaries safe" the team's watchwords. Do not leave it as tacit knowledge.
Share failures ── A defect that arose from accepting an AI suggestion is not material for blame, but a record of team learning. The more the knowledge of "suggestions like this are dangerous" accumulates, the faster and more accurate everyone's judgment becomes.

Used well, completion-type AI helps newcomers ramp up and reduces routine work for veterans. But that holds only when "someone who can judge" uses the tool. Rely on the tool without the ability to judge, and you become an organization that makes mistakes quickly. Running, alongside the tool's introduction, a mechanism that grows judgment ── a review culture and shared learning ── is the condition for turning the investment into results.

08 Connections to Other Chapters on This Site

The discipline of completion-type AI covered here connects to the other installments of the AI Programming series, and to this site's regulatory and practical chapters, as follows. Read them together, and the dots become a line.

AI Programming Vol. 1 ── The overall picture of the relationship between AI and code. This installment digs into one form of it ── "completion."
AI Programming Vol. 3 (next) ── Prompt engineering. It turns from completion to the discipline of the form in which you instruct the AI and have it write.
AI Marketing Vol. 5 ── Generated Content and Review ── How to build generated output into review. The design philosophy of human-in-the-loop (= a mechanism where humans check at key points) is common to both code and content.
Material Review series ── The practice of the review that finally receives generated output. The source of the principle of not changing criteria based on whether something is AI-made.

In Closing

Completion-type AI genuinely raises the speed of writing code. The empirical studies' figure of "about 55% faster" is real. But that number is limited to routine writing, and it does not directly help the design, verification, and requirement-pinning that occupy most of practice. On the back side of speed remain the jobs humans must keep doing ── acceptance judgment, security, licensing, and review. If anything, precisely because code now comes out fast and in volume, their weight has grown.

Mastering completion-type AI does not mean pressing Tab quickly. It means reading the suggestion, checking it against intent, doubting the boundaries, and rejecting it when necessary ── keeping hold of each of those judgments even at speed. Next time, we move from completion to the form in which you instruct the AI and have it write (prompt engineering), and cover how to design reproducible instructions.

Key Points ── Three to Take Away

Completion-type AI is not a tool that "understands intent and outputs correct code," but a tool that "predicts the probable continuation from context." Empirical studies report about 55% time savings on routine tasks, but this is "speed," not "correctness." It does not help design, verification, or domain-specific rules, and across practice as a whole its reach is limited.
Before accepting a suggestion, always ask "Can you read it / Does it match intent / Are the boundary conditions safe?" Because the AI learns from public code, it can also suggest vulnerable ways of writing (injection, hard-coded secrets, unvalidated input). Inspection from the OWASP Top 10 standpoint, and the public-code match filter, should be fixed as the organization's default configuration rather than left to individuals.
AI-made code is not something to review lightly; if anything it needs more scrutiny than usual. Separate "the writer" and "the verifier" in your mind, and put it through the same checkpoints as human-written code (review, tests, static analysis). In pharmaceutical in-house development, also confirm, at the stage of releasing output outside the company, whether it falls under the Pharmaceutical and Medical Devices Act's rules (exaggeration = Art. 66 / unapproved = Art. 68 / information provision = Art. 68-2).

Sources & References

Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. The Impact of AI on Developer Productivity: Evidence from GitHub Copilot. arXiv:2302.06590, 2023. (Primary source for the controlled experiment reporting about 55% reduction in completion time with Copilot.)
Ziegler, A., Kalliamvakou, E., Li, X. A., et al. Measuring GitHub Copilot's Impact on Productivity. Communications of the ACM, 67(3), 54–63. Association for Computing Machinery, 2024. (A study analyzing the relationship between users' felt productivity and objective metrics.)
Perry, N., Srivastava, M., Kumar, D., & Boneh, D. Do Users Write More Insecure Code with AI Assistants? Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS '23). Association for Computing Machinery, 2023. (Empirical research showing that AI assistance tends to breed insecure code and overconfidence.)
OWASP Foundation. OWASP Top 10:2021 ── The Ten Most Critical Web Application Security Risks. OWASP Foundation, 2021. (The industry-standard list of the most frequent and serious vulnerabilities in web applications.)
Ministry of Health, Labour and Welfare. Act on Securing Quality, Efficacy and Safety of Products Including Pharmaceuticals and Medical Devices (Pharmaceutical and Medical Devices Act), Articles 66, 68, and 68-2. Ministry of Health, Labour and Welfare. (Statutory basis for exaggerated advertising = Art. 66, prohibition of advertising unapproved drugs = Art. 68, information provision for proper use = Art. 68-2.)
Ministry of Health, Labour and Welfare, Director-General of the Pharmaceutical Safety and Environmental Health Bureau. Guidelines on Sales Information Provision Activities for Prescription Drugs. Ministry of Health, Labour and Welfare, 2018. (Notice indicating the line between information provision and advertising.)
