Measurement Design ── L from Behavioral Evidence, via Multi-Party AI Dialogue
L is never decided by self-report. Concrete past behavior heard through STAR is encoded into abstraction α, scope σ, and grounding g, and L is computed as the highest level the grounded evidence supports ── integrated across the person and multiple third parties via AI dialogue, weighted by observability. Ten parts.
Preface
Introduction ── Get the Map First
Grab the whole picture before the episodes.
The map →
01
The Hazard of Impression and Self-Report ── We Measure Only Demonstrated Behavior
When we evaluate people, the least reliable inputs are impression and self-report.
Read →
02
Listening Through STAR ── Situation, Task, Action, Result, Thought
Ask "are you skilled?" and what comes back is a self-image, not what the person actually did.
Read →
03
Encoding to Two Axes ── Action Reveals Scope, Thought Reveals Abstraction
The earlier parts set up a coordinate grid built from two rulers (depth of reasoning and reach of action), the four levels L1 through L4, and a floor below which a person fails.
Read →
04
The Six BEI Principles ── Axioms That Keep the Measurement Clean
Through Part 3 we saw how the four-point STAR interview (Situation, Task, Action, Result) gets translated into two rulers: depth of thinking and reach of action.
Read →
05
Three Bands ── The Scales of Abstraction α, Scope σ, and Grounding g
Before deciding a level, the evidence first has to be put into a measurable form.
Read →
06
How L Is Decided ── The Grounding Ceiling and Projection to the Diagonal
L is not some single point on a line.
Read →
07
The Behaviors That Separate Levels ── Eight-Dimension Anchors and Boundaries
Someone says "that person is an L3." But what did they look at to decide L3? Through Part 6 we settled that a level (the assessment step we call L) is not decided by self-report but computed from evidence of actions actually performed.
Read →
08
Confidence and Observability ── How Far to Trust a Reading
Even once an L value has been computed, the job is not done.
Read →
09
Multi-Party AI Dialogue ── Corroboration for Others' Level, Divergence for Calibration
Two people watch the same person; one says "she's clearly senior level," the other "still mid-level at best." This happens all the time.
Read →
10
From Integrated Output to the Qualifying Line ── The Record and the Operating Procedure
Over nine episodes we traced the path: take the concrete behavior heard in the interview, translate it into three yardsticks (depth of thinking, width of view, and grounding in fact), read off the highest rung the person actually reached, and combine the readings from the person and several third parties, weighted by observability.
Read →
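As a rough orientation to the path the series traces, the sketch below runs a toy version of it in Python. Everything specific here is an assumption made for illustration, not the series' actual rule: the integer 1 to 4 bands for α, σ, and g, the use of `min(alpha, sigma)` as the projection to the diagonal, the cap by `g` as the grounding ceiling, and the observability-weighted mean as the integration step. The names `Episode`, `episode_level`, `rater_level`, and `integrate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """One STAR episode after encoding (assumed integer bands 1..4)."""
    alpha: int  # abstraction band
    sigma: int  # scope band
    g: int      # grounding band


def episode_level(ep: Episode) -> int:
    # Assumption: projecting to the diagonal means taking the weaker of the
    # two axes, and the grounding ceiling caps the result at g.
    return min(ep.alpha, ep.sigma, ep.g)


def rater_level(episodes: list[Episode]) -> int:
    # The highest level that any single grounded episode supports.
    # Returns 0 when a rater supplied no episodes at all.
    return max((episode_level(ep) for ep in episodes), default=0)


def integrate(readings: list[tuple[int, float]]) -> float:
    # readings: (level, observability weight) per rater.
    # Assumption: integration is a plain observability-weighted mean.
    total = sum(weight for _, weight in readings)
    if total <= 0:
        raise ValueError("at least one reading needs positive observability")
    return sum(level * weight for level, weight in readings) / total


# Self-report-like claim with weak grounding is capped; a grounded episode counts.
self_episodes = [Episode(alpha=4, sigma=4, g=1), Episode(alpha=3, sigma=2, g=3)]
peer_episodes = [Episode(alpha=3, sigma=3, g=3)]

self_l = rater_level(self_episodes)
peer_l = rater_level(peer_episodes)
combined = integrate([(self_l, 1.0), (peer_l, 3.0)])
```

In this toy run the ambitious but poorly grounded episode contributes only level 1, so the self reading comes from the second episode, and the peer reading carries more weight because the peer is assumed to have observed more.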