In Plato’s Meno, Socrates is in a conversation about whether virtue can be taught. The dialogue stalls, as Socratic dialogues tend to. Socrates calls over a slave boy belonging to Meno’s household — someone with no formal education, no training in geometry, no preparation for what is about to happen — and begins to ask him questions about a square. Specifically, whether one could double the area of a square by doubling the length of its sides.

The boy first says yes. Socrates, by asking further questions, shows that this is wrong: a square with doubled sides has four times the area, not twice. The boy is uncomfortable. Socrates keeps asking. After perhaps twenty minutes, answering questions whose destination he has not been told and reaching conclusions no one has given him in advance, the boy produces a geometrically valid proof of how to double the area of the square: construct a new square on the diagonal of the original.

Socrates draws a half-mystical conclusion from this: the boy must have known the proof already, perhaps from a previous life, because nobody had taught it to him in this one. The conclusion is wrong about epistemology. The boy did not previously know geometry in any meaningful sense; Socrates was leading him through a structured derivation by selecting questions that closed off wrong answers and left only the right one. It is teaching, of a particular kind, dressed up as recollection.

But the conclusion is right about something else, and the something else is the reason this scene has survived for twenty-four hundred years. Questioning under pressure surfaces structures of thought that the person being questioned could not have produced on their own. Not because the structures were already there in the precise form they take, but because the act of being asked, in real time, in front of someone who is going to ask again, produces a kind of cognitive reaching that monologue does not. The slave boy in Meno is doing something he could not have done by himself in a quiet room with a piece of slate. The questioner is not transferring information into him. The questioner is making him find information by prohibiting him from doing nothing.

This is what oral examination has always been for, and it has been around for almost as long as anything we recognize as institutional learning.

The medieval university built its central pedagogy around it. Disputation — the formal oral defense of a thesis against objections from anyone who chose to raise them — was not a quaint ritual. It was how knowledge was demonstrated. Thomas Aquinas spent his career engaged in disputations of two kinds. The disputatio ordinaria was scheduled, on a fixed topic, with prepared participants. The disputatio de quolibet — quodlibetal disputation — was different. It happened twice a year, around Christmas and Easter, and the master had to defend any thesis the audience felt like proposing. De quolibet literally means “about anything whatever.” The format placed the master in a position structurally identical to the slave boy in Meno: someone is going to ask, and you are going to have to answer, and you do not get to choose the question.

The Summa Theologiae is a record of this. It is written in the dialectical form of objection-and-reply because it grew out of a culture in which knowledge was demonstrated by surviving objections in front of witnesses. The written work is the residue. The disputation was the event.

This pedagogy did not vanish; it migrated. The Oxbridge tutorial system, which still operates today in something close to its original form, places one student in front of one tutor for an hour, with no syllabus and no slides, and the tutor asks questions until the student stops being able to answer. The PhD viva voce in the European tradition can run for several hours, with examiners explicitly licensed to find the place where the candidate’s thesis comes apart and apply pressure there. The American medical board examination once included an oral component in which an experienced physician asked the candidate to reason through cases the candidate had not seen before. Some surgical specialties retain this. The bar examination in many U.S. jurisdictions used to include an oral defense; most have dropped it.

What all of these practices share is not adversarialism, though they sometimes feel adversarial. What they share is the structural insistence that the person being assessed must produce reasoning now, on a topic they did not get to choose, in front of someone who is going to evaluate not the conclusion but the process by which the conclusion arrives.

The contemporary descendants of this practice are scattered and often unrecognized as kin to one another. The first-year law student called on cold in a Socratic classroom is doing what the medieval disputant did, in a domain the disputant would have considered embarrassingly applied. The candidate at a McKinsey case interview, asked to estimate the number of piano tuners in Chicago, is doing what the Oxbridge tutor’s first question is designed to make a student do: produce a reasoning structure under conditions where the answer cannot be looked up. The MBA applicant submitting a video essay (ninety seconds, one prompt, no second take) is being subjected to the most algorithmically pure form of oral examination ever devised. The political interview, in its better moments, is the same instrument operated by a journalist instead of a professor. Conversations with Tyler, hosted by an economist who treats his guests as candidates and his questions as quodlibetal, is more recognizable as a descendant of the medieval disputation than of anything we currently call an interview show.

In each of these cases the institution is not asking the speaker to perform. It is asking the speaker to think where the institution can watch.

The reason this format works is not mysterious. When you write, you can stop. You can revise. You can show a draft to someone. You can return tomorrow with a better paragraph. The cognitive process that produced the draft happens out of view, distributed across hours and conversations and revisions, and what arrives on the page has been processed beyond recovery. By the time anyone reads it, the thinking is gone. What remains is the thinking’s record, polished by a series of small editorial decisions that conceal where the original thought struggled. This is not a complaint about writing. It is an observation about what writing is. Writing is a technology for hiding the intermediate steps of thought, and it is enormously useful precisely because of this — most of the time, the intermediate steps are not what we want to communicate.

But sometimes the intermediate steps are exactly what we want to see. When a student is being evaluated for whether she can think, not whether she can produce the appearance of having thought, the intermediate steps are the only thing that matters. When a job candidate is being evaluated for whether his expertise extends past the boundary of what he prepared for last night, the intermediate steps are everything. When we want to know whether a person actually understands something, we need to ask a question they were not expecting and watch what happens to their face.

What the disputation, the tutorial, the viva, the cold call, the case interview, and the video essay all do is impose conditions under which the intermediate steps cannot be hidden. They place a person in a situation where the answer must be constructed in front of an observer, on a timetable the speaker did not set, on a topic the speaker did not choose. Whatever the speaker does in that moment is not their best work. It is not their most polished work. It is not the work they would want quoted out of context. It is, however, the work that reveals how their cognition actually moves when nothing protects it from being watched.

Here is the irony. Institutional support for this kind of assessment is in retreat exactly as the technology to do it well is arriving.

The medical boards dropped most of their oral components in the late twentieth century, citing cost and inter-rater inconsistency. Law schools have softened cold-calling under pressure from students who experience it as harmful. Comprehensive oral examinations in PhD programs have become, in many places, perfunctory rituals where the candidate is asked questions everyone knows in advance and the committee is expected not to actually try to derail anyone. The defenses witnessed in the last several years have, with one or two exceptions, been ceremonies. The disputation was an event; the modern dissertation defense is an unveiling. The shift is recent and has gone largely unremarked.

There are reasons. Oral examination is harder to make fair across different examiners, and fairness in formal assessment is now a legitimate legal concern. It produces measurable distress in test-takers, and student welfare is taken more seriously than it once was. It is difficult to standardize, which makes it difficult to defend in court when someone who failed wants to know exactly why. The multiple-choice instruments that replaced it are cheaper, more reliable, and more transparent in the narrow sense of producing a number that everyone can agree on. The replacement was rational.

But the replacement also gave up something the original was uniquely good at, and the people who managed the replacement did not always seem to know they were giving it up. They believed they were replacing one assessment with another assessment of the same thing. They were not. They were replacing an assessment of how a person reasons with an assessment of how a person chooses among the options on a multiple-choice test. These are not the same construct. They are correlated, sometimes strongly, but a high score on the second can be produced by someone who could not have survived a single round of the first, and a low score on the second can come from someone whose reasoning is alive in the way the first format would have detected and the second cannot.

Meanwhile, the technology for capturing oral reasoning, transcribing it, and applying a consistent rubric across many speakers has arrived in the same decade the institutions are abandoning the format. This is a strange historical accident. For most of the time disputation existed, it was constrained by the requirement that the examiner be present and trained and consistent — three constraints that limited it to elite contexts and small numbers of students. Those constraints are now much weaker. It is now technically possible to administer something like a quodlibetal disputation to ten thousand people in a week, score it consistently, and produce dimensional feedback that distinguishes the structure of reasoning from the smoothness of delivery. Whether this should be done is a separate question, and the right answer is not obviously yes. But the possibility exists, in a way it has not existed before, and the institutions that abandoned oral examination on grounds of feasibility are abandoning it now on grounds of preference.

The slave boy in Meno is asked a question he has not prepared for, and produces a proof he has not been taught. Socrates calls this recollection. We would call it construction under questioning. Whatever we call it, it is something a person can be made to do that they cannot do alone, and it reveals something about how their mind is organized that no amount of essay-writing can reveal.

We have been doing this for twenty-four hundred years for a reason. We are not done with it yet, even if we have momentarily forgotten what it was for.

What the Cold Call Was Always For