Measure by the Materials Actually Made, Not by Impressions or Self-Report

The person who says 'I stay faithful to the facts' is sometimes the one whose chart has a stretched axis. Words can lie; finished work resists lying. So we measure skill from the materials actually made and from conduct when releasing them — not from impressions or self-report.

A checkup won't pass you on 'I feel fine'

Recall a health checkup. Declaring 'I am healthy' does not pass you. The staff measure your blood pressure, draw blood, and read the numbers. The reason is simple: how a person feels is unreliable. People with a real problem often feel 'as energetic as always.'

Measuring a material maker's skill works the same way. 'I stay faithful to the source,' 'I can explain things clearly' — such self-reports are the 'I feel fine' of a checkup. What we should examine is not the words but the numbers, that is, the material made. A material tells, more honestly than the maker's intent, what the maker valued and where they let their guard down.

Words can lie; deliverables resist it

In an interview it is easy to answer 'I have never bent a fact,' and the person believes it. Yet open the product information summary they made (the booklet of key product points handed to doctors) and you find that only the graph of the primary endpoint (the single most important measure decided up front) has a vertical axis starting partway up, making a tiny gap look large. This happens.

What matters here is that the maker is often unaware of having deceived anyone. In a reported case, a survival curve (a line showing the share of people still alive over time) was drawn with its vertical axis starting at 0 when it should have started at 0.8, so the two drugs looked no different. Until it was flagged, the maker thought it was 'just easier to read.' The mouth says 'faithful'; the hand stretched the axis. You cannot catch this gap by listening. It appears only when you open the material.
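To make the axis effect concrete, here is a minimal sketch in Python. The two survival shares (0.90 and 0.86) are hypothetical values chosen for illustration, not figures from any reported case, and the function name gap_share_of_axis is likewise invented here. It computes how much of the plotted height the gap between two values occupies for a given axis range.

```python
def gap_share_of_axis(value_a: float, value_b: float,
                      axis_min: float, axis_max: float) -> float:
    """Fraction of the plotted axis height taken up by the gap between two values."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return abs(value_a - value_b) / (axis_max - axis_min)


# Hypothetical survival shares for two drugs (assumed, not from any report).
drug_a, drug_b = 0.90, 0.86

full_axis = gap_share_of_axis(drug_a, drug_b, 0.0, 1.0)  # axis starts at 0
cut_axis = gap_share_of_axis(drug_a, drug_b, 0.8, 1.0)   # axis starts at 0.8
magnification = cut_axis / full_axis

print(f"axis 0.0-1.0: gap fills {full_axis:.0%} of the height")
print(f"axis 0.8-1.0: gap fills {cut_axis:.0%} of the height")
print(f"same data, about {magnification:.1f}x larger on paper")
```

With these assumed values, the same four-point gap fills about 4% of the height on a 0 to 1 axis and about 20% on a 0.8 to 1 axis. The data never changes; only the choice of where the axis starts changes what the reader sees, which is why that choice has to be read off the material itself.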

Principle: the object of evaluation is not 'did they say they can' but 'what did they make.' Words speak of intent; the deliverable speaks of reality. When the two diverge, trust the deliverable.

Reading the four mental drivers of distortion in a material

Picture airport baggage screening. The screener does not read the traveler's character; they pass the actual contents of the bag through the X-ray. Material evaluation is the same: instead of judging the maker's character or zeal, we scan the contents of the material for signs that a fact has slipped. When reported deviation cases are traced, those signs sort into four mental states.

Mental driver	Sign left in the material (reported case)	Why words alone miss it
Motivated reasoning (the conclusion to sell comes first)	Explains only a secondary item with a significant difference (a gap hard to explain by chance); prepares no material for the primary item	The maker believes they 'picked the important part' and has no sense of lying
Local rationalization ('just here')	The whole uses a correct axis, but one explanatory slide stretches the axis to stress the gap	The maker says 'the whole is correct,' so it looks fine spoken aloud
Sin of omission (not telling)	Pre-dose testing is mandatory, yet the summary states only 'no testing required'	It is 'merely unwritten' and never surfaces unless asked
Externalizing responsibility (blaming others)	The Japanese subgroup shows no difference, yet the material says 'a difference appears'; when flagged, the maker answers 'the professor said it was fine'	The authority's name leaves no trace on paper; the maker escapes into a spoken excuse

What the four share is that none is a 'villain's crime.' They are circuits an ordinary maker slips into, unaware, under sales pressure and deadlines. So the one who measures does not suspect the person but reads the material coolly, as evidence.

Leave evidence, as a proofreader reads with a red pen

Picture a printer's proof (a trial print before the real run). A proofreader does not pass it on 'probably fine.' They point a red pen at each typo and number mismatch and mark on the paper exactly which line is wrong. These marks become the evidence that later explains 'why it was fixed.'

Material evaluation, too, must not end with a one-line impression. Point to the concrete spot in the material: 'the axis starts partway up,' 'the primary item's data is not attached,' 'testing is mandatory yet written as unnecessary.' Unlike a self-report, a spot pointed to in this way can be checked by anyone, and each checker finds the same fact. Evaluation does not wobble from person to person because the evidence is fixed in the deliverable, not in words.
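One way to hold an evaluation to this standard is to record each finding as a location plus an observation, and to refuse any note that lacks either. The sketch below is an illustration only: the class name Finding, the helper try_record, the field names, and the slide and page locations are all hypothetical, and the observations paraphrase the examples in this piece.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """One red-pen mark: where to look, and what anyone will see there."""
    location: str     # hypothetical format, e.g. "slide 7, Figure 2"
    observation: str  # a fact checkable by opening that spot

    def __post_init__(self) -> None:
        # An impression with no location is not evidence, so reject it.
        if not self.location.strip() or not self.observation.strip():
            raise ValueError("a finding needs both a location and an observation")


def try_record(location: str, observation: str) -> Optional[Finding]:
    """Return a Finding, or None when the note cannot be pointed to."""
    try:
        return Finding(location, observation)
    except ValueError:
        return None


findings = [
    try_record("slide 7, Figure 2", "the vertical axis starts partway up"),
    try_record("page 3", "the primary item's data is not attached"),
    try_record("", "overall it feels a bit misleading"),  # impression only
]
accepted = [f for f in findings if f is not None]

for f in accepted:
    print(f"{f.location}: {f.observation}")
```

The third note is dropped because it names no spot in the material. Two evaluators who open slide 7 will see the same axis; neither can check a feeling.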

Treat conduct, too, as part of the deliverable

Consider a driving test. A perfect written score does not pass you if you ignore a stop sign on the road, because the 'conduct' of actual driving is what is judged. A material maker is the same: not only the finished paper but the conduct at release becomes evidence.

In a reported case, none of the presenters at a seminar disclosed their conflicts of interest (whether one stands to gain from the drug), and the explanation given was 'because we were not asked.' This is not an error on paper but a problem of conduct at release. Not telling until asked, not disclosing unless required: this 'omission' is recorded as solid evidence too. In short, we measure both the finished object and the whole behavior of bringing it into the world. From the next piece on, we will use this evidence-based yardstick to trace the process step by step, from the request to the finished material.

Measuring Skill from Work and Behavior ── Map of all 10 episodes

Vol. 1 (this episode): Measure by the Materials Actually Made, Not by Impressions or Self-Report ── A material maker's skill is measured from the actual deliverables and observable conduct, not from self-report or others' impressions.
Vol. 2: Tracing the Brief, the Choices, and the Result — In Order ── Read a creator's skill from evidence by walking through one real project in order: the brief, the thinking, the actions, and the result.
Vol. 3: Reading "Faithfulness to the Facts" and "Craft of Delivery" Out of the Work Itself ── This installment shows how to break a finished piece down along two axes, faithfulness to the facts and the craft of getting it across, by reading concrete clues rather than impressions.
Vol. 4: The Rules That Keep Measurement Honest ── Six ground rules that keep the evaluator from drifting when measuring an author's real skill.
Vol. 5: Three Rulers: Accuracy, Clarity, and Balance ── Defines three rulers for grading material-making skill and scores each on a four-step scale: accuracy as the floor, clarity as the reach, and balance as the adjustment between too much and too little.
Vol. 6: How to Decide the Level — Returning to the Source Sets the Ceiling ── Work that cannot be traced back to its source cannot earn a higher level, however polished it looks. Grounding sets the ceiling.
Vol. 7: What Deliverables Signal Which Level ── An anchor table that reads a creator's level (L1-L4) from visible deliverables and behavior patterns.
Vol. 8: How Far Can We Trust a Judgment? ── How sure a level judgment is depends on how visible the evidence is; less observable skills produce shakier judgments, so we attach a confidence to each verdict.
Vol. 9: Combine More Than Self-Assessment: Add the Reviewer's and Requester's View ── Layering four viewpoints — self, reviewer, requester, and AI — surfaces the deviations of omission that a single pair of eyes cannot see.
Vol. 10 (final): Connecting the Measurement to Pass/Fail and a Development Plan ── The finale links the score to the pass floor and a plan for what to grow next.

In closing

The starting point for measuring skill is not the person's words or others' impressions but the material itself. A material keeps, more honestly than the maker's intent, the axis trick, the mandatory testing left unsaid, the primary data left unattached.

And the material shows the signs of four mental states: the conclusion to sell coming first, just-here rationalization, the omission of not telling, and shifting blame to others. So the one who measures reads evidence rather than suspecting the person. Next, we move to how to trace, step by step, what request the material answered and how it was shaped.

Key Points ── Three to take with you

Look at the deliverable, not the words. 'I am faithful' is the checkup's 'I feel fine.' We measure the numbers that are the material made; when words and deliverable diverge, trust the deliverable.
Four mental signs show in the material. Conclusion-first (motivated reasoning), just-here tricks (local rationalization), not telling (omission), blaming others (externalizing). Reported deviation cases sort into these four circuits.
Include conduct as evidence. Not only the finished paper but behavior at release — like not disclosing conflicts of interest 'because we were not asked' — is recorded. Evidence lets anyone confirm the same fact.

Sources & references

Ministry of Health, Labour and Welfare, Compliance and Narcotics Division (commissioned project). Report on the Monitoring of Promotional Information Activities for Prescription Drugs (March 2024 and prior years). Flagged cases are published with company names anonymized; the deviation patterns cited here are generalized from this report.
Ministry of Health, Labour and Welfare. Guidelines on Promotional Information Activities for Prescription Drugs. Standards requiring fact-based information in product summaries and slides.
Japan Pharmaceutical Manufacturers Association. JPMA Code of Practice / Guidance for Preparing Prescription Drug Product Information Summaries. Self-regulatory standards on graph axes and handling of primary endpoints.
Spencer, L. M. & Spencer, S. M. Competence at Work. Foundational text for measuring ability from actual behavior and outcomes rather than self-report.
