
Reasoning is the Job

On Jensen Huang and the case for measuring spoken thought

There is a word Jensen Huang uses with unusual frequency when he talks about how he runs NVIDIA. Not vision, though he has that. Not execution, though NVIDIA’s record on that is difficult to dispute. The word is reason. He reasons his way into decisions. He reasons step by step. There is, he says, a reasoning system inside him that eventually convinces him so clearly an outcome will happen that he simply proceeds as if it already has.

This is worth pausing on. Huang is among the most consequential executives in the history of technology — the company he built now sits at the center of one of the largest capital deployments in modern industrial history. He is surrounded, as he puts it, by people who are smarter than he is in almost every specific domain. He did not invent the GPU. He does not have PhDs in memory architecture, optical networking, or power delivery. What he has, by his own description, is the capacity to reason across those domains — to take what the specialists know and work through the implications until something becomes clear enough to act on.

He described his management method to Lex Fridman in terms that are striking precisely because they are not the terms of inspiration or vision. He shapes belief systems, he said — slowly, continuously, through the accumulation of reasoning done in public. By the time he announces a major strategic direction, the people around him have already been reasoned toward it over months or years. The announcement is not a revelation. It is the final step in a long shared argument.

This is a description of verbal reasoning as professional practice. Not charisma. Not authority. Not the theatrical performance of certainty. The patient, incremental, public construction of a case.¹

§

The idea that reasoning is a leadership competency is easy to agree with and almost impossible to measure. It tends to get collapsed into vaguer terms — critical thinking, strategic clarity, communication skills — that mean different things in different contexts and resist any serious definition. The result is that organizations spend considerable resources selecting, developing, and promoting people on the basis of proxies that correlate imperfectly with the thing they actually care about.

Consider what gets measured in the standard hiring process for professional roles. Academic credentials, which measure a person’s performance in structured evaluative contexts designed years or decades ago. Written assessments, which measure what a person can produce with time, revision, and access to sources. Interviews, which vary so widely in structure and quality that their predictive validity, especially when unstructured, falls well short of more disciplined methods. None of these reliably surface what Huang is describing: the capacity to reason in real time, in public, under conditions the person cannot fully control.²

The irony is that the most consequential reasoning in any organization happens not in written documents but in conversation: in the meeting where the decision is actually made, in the explanation that brings a skeptic on board, in the moment when someone is asked to account for a decision and has to reconstruct the reasoning that produced it. These moments are unscripted. They cannot be drafted and revised in advance the way a document can. They reward something that writing never has to produce: the capacity to hold a complex argument together while it is still forming, under the pressure of an audience that is already responding.

§

Huang’s dishwasher story is worth considering here. Before NVIDIA, before the decades of compounded bets, he washed dishes. The anecdote surfaces periodically in interviews, usually as a note about humility or the distance traveled. But there is another way to read it. The distance between washing dishes and running a multi-trillion-dollar company is not primarily a distance of acquired knowledge or technical expertise — Huang says repeatedly that the people around him know more than he does in their specific domains. It is a distance of demonstrated reasoning capacity, accumulated through decades of practice reasoning out loud, in rooms where the stakes kept rising.

This is not an inspirational claim about meritocracy. It is an observation about what kind of capacity actually compounds over a career. Domain knowledge is necessary but depreciates — technologies change, markets shift, what you knew becomes less relevant. The ability to reason through novel problems, to articulate the structure of an argument before you know its conclusion, to convince people who are smarter than you in specific domains that a particular direction is right — that capacity, if anything, appreciates.³

§

The failure to take this seriously has real costs.

Institutions routinely make selection and promotion decisions based on writing quality, credential accumulation, and structured test performance — none of which predict how a person reasons under the conditions that actually matter. A candidate who writes flawless cover letters and performs well on standardized assessments may freeze when asked to explain their reasoning in a room that isn’t going along with them. A candidate who stumbles in writing but thinks precisely out loud — who can take an objection, sit with it for a moment, and respond with a reformulation that addresses the actual point rather than talking past it — rarely distinguishes themselves in a conventional hiring process.

The consulting industry knows this, which is why it has built its selection process around the case interview, a thirty-minute exercise in spoken reasoning under mild adversarial pressure. Law schools know it, which is why the Socratic method survives despite sustained criticism. The military knows it, which is why, after decades of written assessments and credential review, the final selection for certain roles still involves putting people in situations and watching how they reason aloud when the situation is not going as planned.

These are not accidents. They are adaptations to a real signal that conventional assessment misses.⁴

§

The question of what to do with this observation is not trivial. Organizations cannot easily assess spontaneous verbal reasoning at scale. The case interview is expensive to administer, inconsistently conducted, and vulnerable to coaching. Most institutions have neither the time nor the trained assessors to evaluate how candidates reason out loud. And so they continue selecting on the proxies — the credentials, the writing, the structured test scores — that are measurable even when they know those proxies are incomplete.

The situation is analogous to how organizations assessed physical fitness before reliable measurement tools existed. They looked at available indicators — appearance, self-report, performance in structured exercises — that correlated roughly with what they cared about but failed to capture it precisely. The indicators were better than nothing. They were also systematically misleading in predictable ways.

Reliable, scalable assessment of spontaneous verbal reasoning is a solvable problem. The technology now exists to transcribe speech accurately, score it against rubrics that capture reasoning structure, and return results quickly enough to be useful. What has not yet followed is an instrument designed specifically for this purpose: one built around the dimensions that distinguish sophisticated verbal reasoning from surface fluency, and calibrated against a reference meaningful enough to make its scores interpretable.

That gap is what this project is attempting to close. Whether it succeeds is a different question. But the gap is real, the cost of not closing it is real, and the tools to close it are, for the first time, available.⁵

¹ The phrase Huang used in his Lex Fridman appearance was approximately “there’s a reasoning system that convinces me so clearly this outcome will happen.” The word system is interesting — it implies something more structural than intuition, more deliberate than instinct. He uses the word reason as both verb and noun throughout the conversation, which is not the vocabulary most executives reach for when describing their decision-making.

² The research on unstructured interview validity is sobering. A meta-analysis by Schmidt and Hunter (1998) found unstructured interviews had a validity coefficient of roughly 0.38 for job performance. That is a real signal, but considerably below what structured interviews, work sample tests, and cognitive ability assessments achieve (roughly 0.51 to 0.54 in the same analysis). The problem is not that interviews are useless but that they measure interviewer comfort and candidate presentation fluency at least as much as they measure the underlying capacity.

³ There is a version of this claim that tips into self-help territory, which is not the intention. The point is narrower: reasoning is domain-general, so the skill built through deliberate practice carries over when the domain changes, while specific domain knowledge does not. A person who has spent thirty years making and revising arguments under pressure in a variety of contexts has accumulated something that outlasts any particular body of technical expertise.

⁴ The survival of the Socratic method in law schools deserves more attention than it usually gets. Its critics focus on its psychological costs — the cold-call anxiety, the hierarchies it reproduces. These are real. But its defenders have a point that often goes unstated: it is one of the only remaining institutional practices in higher education that assesses how people reason in real time, in public, under conditions they cannot fully prepare for. The written exam tests the stabilized product of thinking. The Socratic classroom tests something closer to the process.

⁵ The tool is at expressivecognition.org. It is early-stage and imperfect. But the problem it is aimed at is not.