A single evaluation sheet sat on Yui's desk. A scoring table with no name on it. Numbers filled the columns, yet the total and the average rows were blacked out. "Is this a calculation error?" Yui asked. Without lifting her eyes from the page, Mio answered, "It's not an error. I erased them on purpose. The qualifying line doesn't judge a person by their average."

The Blacked-Out Average

In the spring of her second year on the job, Yui was handed a blue file marked "Qualifying Line" for the first time. Inside were the evaluation records the materials review office uses when choosing an independent reviewer. "Materials" here means the promotional documents a pharmaceutical company hands to doctors and pharmacists: explanations of efficacy, graphs, warnings about side effects. Each and every one of them is reviewed by someone before it goes out into the world.

The record Yui turned to laid out one candidate reviewer's scores, item by item: clarity of explanation, depth of knowledge, ability to detect problems. Some columns were high, others low. Normally you would add them up, take the average, and pass anyone above a certain line — that feels natural. But the average row had been blotted out with ink.

"Mio, if you just took the average you could rank everyone in one shot. Why is it erased?" To Yui's plain question, Mio finally looked up. "Good question. That's the doorway into this whole system."

A Driver's License Has No Average Score

Mio drew a single horizontal line on the whiteboard. "Picture this. A driver's license test. Perfect score on the written exam, flawless parking. But the person runs one red light, just once. Do you give that person a license?"

Yui thought for a moment and shook her head. "You can't. Run a red light even once and an accident can happen."

"Right. A license test isn't addition, where your strong subjects cover for your weak ones. If there's even one thing you must never do, you fail — no matter how perfect everything else is. The qualifying line works the same way. What we're looking at isn't the score on a single piece of material." Mio tapped the file's cover with her finger. "It's whether we can trust this person to review on their own. What's being judged isn't the document. It's the person."

What the qualifying line (the pass/fail standard) decides is not whether one piece of material in front of you is good or bad. It is the pass/fail line for the reviewer themselves: "Can this person be trusted to review independently?"

Why We Don't Decide by Average

Yui still wasn't convinced. "But someone with a high average is someone with strong all-around ability, right? What's wrong with that?"

Mio ran her finger along one row of the scoring table. "This candidate is nearly perfect on clarity of explanation. But their detection ability — the power to spot danger — is low. Take the average and the clarity fills in the hole left by the detection, so on paper it looks like a high score. But in actual review work, being good at explaining doesn't fill even a millimeter of the hole left by an overlooked danger."

"It doesn't fill it…"

"Just the opposite. When someone who can't detect danger is also eloquent, they wave a dangerous piece of material through as 'no problem,' confidently. And because the explanation is smooth, everyone around them is convinced too. For the patient, an eloquent person who overlooks danger is more dangerous than a quiet one who overlooks it." Mio's voice was soft, but Yui felt a hard core in it.

Method of deciding | How low detection ability is handled | Consequence for patients
Pass/fail by average | Strong subjects fill the hole and hide it | Dangerous material passes, eloquently
Qualifying line (required-item method) | If detection falls below the line, it's a fail | Overlooked dangers are stopped
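
The two rows of the table can be sketched in a few lines of Python. This is only a minimal illustration, not the review office's actual scoring: the item names, the 0-to-5 scale, the average pass mark of 3.5, and the detection floor of 3 are all assumptions made up for the example.

```python
from statistics import mean

# Hypothetical scores on a 0-5 scale for one candidate (illustrative values only).
candidate = {"clarity": 5, "knowledge": 4, "detection": 2}

# Hypothetical settings: a pass mark for the average, and a floor per required item.
AVERAGE_PASS_MARK = 3.5
FLOORS = {"detection": 3}


def passes_by_average(scores: dict) -> bool:
    """Compensatory rule: strong items can make up for weak ones."""
    return mean(scores.values()) >= AVERAGE_PASS_MARK


def passes_by_floor(scores: dict) -> bool:
    """Required-item rule: any required item below its floor is a fail."""
    return all(scores[item] >= floor for item, floor in FLOORS.items())


print(passes_by_average(candidate))  # True: the mean is 11/3, about 3.67
print(passes_by_floor(candidate))    # False: detection 2 is below the floor 3
```

With these made-up numbers the same candidate passes under the average and fails under the required-item rule, which is the situation Mio describes: the high clarity score hides the low detection score only when the two are added together.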

The Asymmetry of Harm

"The asymmetry of harm." Mio wrote the phrase on the board. "Think of airport security screening. The mistake of letting a dangerous item through as safe. The mistake of stopping a safe bag as dangerous. These two do not weigh the same."

Yui caught her breath. "The first one… a person gets hurt."

"Right. Stopping too much can be redone. But letting too much through can sometimes be impossible to take back. Materials review is the same. If you fail to see through an exaggerated claim of efficacy and let it out into the world, a doctor who believes that document delivers it to a patient. That's why the qualifying line is strict about a shortfall in detection ability. That one thing, it never lets dissolve into an average."

Yui looked once more at the average row erased in ink. The black rectangle that had looked like an omission a moment ago now looked like a line drawn with intent.
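
One way to see why the two mistakes "do not weigh the same" is to count them with unequal weights. The sketch below is an assumption-laden illustration: the 100-to-1 weighting and the error counts are invented for the example and are not figures from the review office.

```python
# Hypothetical weights: a miss (danger let through) is assumed to cost
# 100 times as much as a false alarm (safe material stopped).
COST_MISS = 100
COST_FALSE_ALARM = 1


def weighted_harm(misses: int, false_alarms: int) -> int:
    """Total harm when each kind of mistake carries its own weight."""
    return misses * COST_MISS + false_alarms * COST_FALSE_ALARM


# Two hypothetical reviewers, each making ten mistakes in total.
cautious = weighted_harm(misses=1, false_alarms=9)
lenient = weighted_harm(misses=5, false_alarms=5)

print(cautious, lenient)  # 109 505
```

Both reviewers make the same number of mistakes, yet under this assumed weighting the one who lets more danger through causes several times the harm. A plain error count, like a plain average, would rate them as equal.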

What This Line Draws

"Yui, someday you'll stand on the far side of this line too," Mio said. "Recognized as someone allowed to review materials alone. And what gets measured then isn't how many points you scored. It's whether, if we leave it to you, you'll let a danger slip through."

From here on, the review office will hold three candidates up against the qualifying line. The eloquent one, the knowledgeable one, the unassuming one. Who crosses the line, and who cannot? Yui closed the file. The words "Qualifying Line" on the cover suddenly felt heavy.

The subject is the person

What gets a pass or fail isn't a single piece of material, but whether the person can be trusted with independent review.

No averaging

Addition that lets strengths cover weaknesses is forbidden. If a required item falls below the line, it's a fail.

Detection is the core

The power to spot danger is central. Eloquence does not fill the hole.

The Qualification Bar ── Map of all 10 episodes

  1. Vol. 2: The Asymmetry of Harm ── A Miss Is Orders of Magnitude Heavier ── Why you must not draw the line with an average, part 1: a miss and a false alarm are not equal harms
  2. Vol. 3: The Compensation Trap ── Eloquence Hiding a Gap in Detection ── A reviewer who is brilliant at explaining and at getting along with people is weak at just one thing: spotting danger (risk detection). Average the scores and they pass. But someone who cannot spot danger yet talks well will push risky material through on charm alone. Why you must not decide pass or fail on an average — explained gently through real Case A.
  3. Vol. 4: Floor vs. Aggregate ── Non-Compensatory Gates and the Weighted Score ── Pass/fail is decided by minimum bars (floors); the total score is used only to rank. Fall below even one bar and a perfect score still fails. This is the unbreakable rule of the qualifying line.
  4. Vol. 5: The Highest Floor for Detection ── Why Risk Detection Exists ── Material review — the job of checking a drug company's promotional materials for doctors before they go out — exists to find the dangerous spots. So among eight abilities, the minimum bar for the power to spot danger (risk detection) is set highest. To pass as someone who can review alone (qualified) you need level L3, the second-highest rung, plus a real-world spotting range of 2 or more. A person who stops one rung lower, at L2, lets the most dangerous materials slip right through.
  5. Vol. 6: A Floor on Two Axes ── Not Letting Desktop Detection Pass ── The pass line for detection cannot be drawn with a single score. It needs two rulers: how well you can explain the danger, and whether you can catch it in the real material in front of you. A textbook-only spotter may look like L3 on paper but does not clear for solo work.
  6. Vol. 7: Calibration as a Gate to Independence ── Overconfidence Disqualifies ── A look at the gate (calibration gate G2) that asks: do you estimate your own seeing-power correctly? Working alone means no one checks behind you. A person who thinks their detection skill is higher than it really is (gap Δ of +2 or more) waves through danger without noticing their own blind spot. This gap (Δ) is not skill itself, but it decides whether someone may work alone.
  7. Vol. 8: The Four Gates G0–G4 ── The Logic of Early Rejection ── A reviewer's pass or fail is decided at four checkpoints in order. Anyone who fails an earlier checkpoint is not re-measured at a later one. A non-negotiable minimum line (a "floor") cannot be patched over by other strengths, and the total score never flips the result.
  8. Vol. 9: Three Profiles ── How One Line Sorts Them ── The eloquent talker, the textbook thinker, and the real deal — where one pass/fail line sends each
  9. Vol. 10 (final): The Responsibility of Drawing the Line ── Anchors First, Human Confirmation, Non-Punitive Growth ── The closing chapter that turns the pass line into something a workplace can actually use. Only when a shared book of agreed examples exists does the line become a common yardstick. The four verdict tiers are not a brand of failure but a signpost for what to grow next. AI gives a rough first reading; a human makes the final call.
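
The episode summaries above mention a few concrete thresholds: level L3 plus a real-world spotting range of 2 or more for detection, and a self-estimate gap Δ of +2 or more as disqualifying. As a rough sketch of how "floors decide, the total only ranks" might look, the Python below checks gates in order and stops at the first failure. The field names, the gate order, the reading of Δ as self-estimate minus measured level, and the sample candidates are all assumptions for illustration; the later episodes define the real gates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    detection_level: int   # measured level, e.g. 3 for L3
    field_range: int       # real-world spotting range
    self_estimate: int     # the candidate's own estimate of their level
    total_score: float     # aggregate score, used only for ranking


def detection_floor(c: Candidate) -> bool:
    # Both axes must clear: level L3 or higher and field range 2 or more.
    return c.detection_level >= 3 and c.field_range >= 2


def calibration_ok(c: Candidate) -> bool:
    # Overestimating one's own level by 2 or more disqualifies.
    return (c.self_estimate - c.detection_level) < 2


# Hypothetical gate order; a failed gate ends the evaluation early.
GATES = [("detection floor", detection_floor), ("calibration", calibration_ok)]


def first_failed_gate(c: Candidate) -> Optional[str]:
    for gate_name, check in GATES:
        if not check(c):
            return gate_name
    return None


candidates = [
    Candidate("X", detection_level=2, field_range=1, self_estimate=4, total_score=92.0),
    Candidate("Y", detection_level=3, field_range=2, self_estimate=3, total_score=71.0),
    Candidate("Z", detection_level=3, field_range=3, self_estimate=3, total_score=78.0),
]

qualified = [c for c in candidates if first_failed_gate(c) is None]
ranking = sorted(qualified, key=lambda c: c.total_score, reverse=True)

print(first_failed_gate(candidates[0]))  # detection floor
print([c.name for c in ranking])         # ['Z', 'Y']
```

In this sketch candidate X has the highest total but never reaches the ranking step, because the total is consulted only after every gate has been cleared.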

In closing

The question Yui first raised — why not decide by the average of the scores — was the doorway into this whole series. The average hides weakness behind strength. But there is one thing that must never be hidden in review work: the power to find danger.

The qualifying line is not a system that's cold to people. It's a line drawn so that the one document someone overlooked is stopped before it reaches a patient. From the next installment on, we'll watch how that line is actually drawn and measured, through the review of the candidates themselves.

Key Points ── Three to take with you

  1. What the qualifying line judges is not the pass or fail of a single piece of material, but the pass/fail line for the reviewer themselves: whether this person can be trusted to review alone.
  2. Pass and fail are not decided by average. Addition that lets strong items fill the holes of weak ones is forbidden; if a required item such as detection ability falls below the line, the candidate fails even with perfect scores elsewhere.
  3. Harm is asymmetric. The mistake of letting something dangerous through is heavier than the mistake of stopping something safe. So the eloquence of someone who cannot detect danger is treated as increasing the danger to patients, not reducing it.

Sources & references

  1. Angoff, W. H. Scales, Norms, and Equivalent Scores. Educational Measurement (2nd ed.), American Council on Education, 1971. (The Angoff method of setting cut scores by expert judgment; classic basis for floor thresholds.)
  2. Hambleton, R. K. & Pitoniak, M. J. Setting Performance Standards. Educational Measurement (4th ed.), 2006. (Survey of conjunctive vs. compensatory standard-setting models.)
  3. Macmillan, N. A. & Creelman, C. D. Detection Theory: A User's Guide. Lawrence Erlbaum, 2005. (Signal detection theory; the asymmetry of miss and false alarm, and the basis of sensitivity/specificity.)
  4. Spencer, L. M. & Spencer, S. M. Competence at Work. Wiley, 1993. (Competency threshold levels and performance prediction; foundation of the prerequisite series.)
  5. Messick, S. Validity. Educational Measurement (3rd ed.), American Council on Education, 1989. (Validity of judgments; the framework that supports the legitimacy of the line.)