
What Verbal IQ Tests Actually Measure (And What They Miss)

On vocabulary, reasoning, and the difference between knowing and thinking

Most verbal IQ tests measure what you know, not how you think. That distinction matters more than it might seem.

A vocabulary subtest asks you to define ephemeral. A reading comprehension item asks you to identify the main idea of a passage. These are legitimate measures: they correlate modestly with cognitive outcomes, and they have decades of norming data behind them. But they share a structural limitation that few test-takers notice: they test performance on a prepared, external stimulus, so prior knowledge of the language becomes the key variable.

The question they cannot answer is the one that matters most in practice: when you are speaking spontaneously — explaining an idea, working through a problem out loud, fielding an unexpected question — how well does your verbal reasoning actually hold up?

The Standard Approach: Crystallized Knowledge as a Proxy

The verbal subtests in most major cognitive batteries (WAIS, WJ-IV, Stanford-Binet) rely heavily on what psychologists call crystallized intelligence — accumulated knowledge and learned skills that tend to increase with education and exposure. Vocabulary range, general information, comprehension of social conventions: these are all crystallized measures. They are useful, reasonably stable, and well-normed.

But crystallized knowledge is a proxy, not the thing itself. You can have an impressive vocabulary and still reason poorly when speaking under pressure. You can have a modest vocabulary and still produce spoken arguments with striking coherence, precision, and epistemic calibration. The vocabulary test cannot tell these two people apart.

The GRE Verbal Reasoning section has the same structure: it measures how well you can decode challenging written text and select between answer choices. A useful academic screening tool. But it tells you very little about the quality of reasoning you produce when you are not reading, not selecting, and not given time to edit — which is to say, when you are talking.

What Actually Happens When You Speak

There is a reason cognitive scientists have studied spontaneous speech for decades. When you speak without a script, you are doing something computationally demanding: you are formulating words and grammatical structures in real time while simultaneously managing your ideas, monitoring what you have already said, and anticipating what your listener needs. Willem Levelt called this the “speaking machine”: the remarkable fact that humans can produce fluent language at roughly two to three words per second while most of the process runs unconsciously.

The quality of that process — its coherence, its precision, its self-monitoring, its epistemic honesty — is what verbal reasoning actually looks like in practice. And it varies enormously across speakers who might score identically on a vocabulary subtest.

A few things that verbal IQ tests reliably do not capture:

Epistemic calibration. Do you accurately signal what you know versus what you are uncertain about? High verbal reasoners hedge appropriately. They say “I believe” when they believe and “I know” when they know. They do not overstate confidence to sound authoritative, and they do not retreat into vague non-answers to avoid commitment. Traditional tests have no mechanism to detect this.

Conceptual continuity. When you speak for several minutes, do your ideas build on each other — or do they fragment, reset, and pile up without connection? Multiple-choice verbal tests, by definition, do not involve sustained spoken discourse. They cannot measure whether your ideas accumulate into a coherent whole.

Compression under pressure. It is one thing to define a word from a list. It is another to spontaneously pack high propositional density into real-time speech — saying more with less. The latter is the real-world skill. The former is a reasonable proxy for it — but only a proxy.

Generative self-monitoring. High verbal reasoners monitor their own reasoning in real time. They catch when they have said something imprecise and correct it. They signal the boundaries of their knowledge. They can step back from an argument mid-sentence to evaluate it. Timed multiple-choice testing puts a ceiling on this behavior by design.

The Population This Particularly Affects

The gap between vocabulary-based verbal IQ and spontaneous speech quality shows up most clearly at the tails. At the high end, genuinely sophisticated verbal thinkers often find that standard verbal IQ tests hit their ceiling on the wrong dimension. These speakers may score well simply because they have a broad vocabulary, but the score adds nothing to what a listener could learn by asking them to talk for ten minutes.

At the other tail, advanced non-native English speakers face a systematic penalty. A highly educated speaker whose first language is Mandarin or Farsi may produce spontaneous English speech that is conceptually sophisticated but syntactically influenced by their L1. A vocabulary test or reading comprehension measure conflates conceptual sophistication with surface-level English form. Such speakers score lower, not because their verbal reasoning is weaker, but because the test is measuring the wrong thing.

What EC Measures Instead

Expressive Cognition takes a different approach. Rather than presenting you with a stimulus to decode, it elicits five spoken responses to open-ended prompts — the kind of questions that require you to actually produce verbal reasoning, not recognize or select it.

The resulting speech is scored across six dimensions using a rubric developed from the spontaneous speech literature (Levelt, Chafe, Biber) and validated against corpora of high-ability speakers: 99 guests from Conversations with Tyler, oral arguments from the Supreme Court, and academic seminar speech from the MICASE corpus.

Those six dimensions are:

  • Abstraction — movement between concrete examples and general principles
  • Compression — propositional density of spontaneous speech
  • Originality — genuinely unexpected and apt reframings
  • Conceptual Continuity — idea accumulation versus fragmentation
  • Epistemic Calibration — accuracy of confidence signaling
  • Generative Self-Monitoring — real-time self-correction and reasoning revision

The composite produces a Verbal Reasoning Index (VRI) score. It is centered at 100 with a standard deviation of 15, the same format as a traditional IQ score for interpretive familiarity, but it measures a genuinely different construct.
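To make the "centered at 100, SD 15" format concrete, here is a minimal sketch of how a composite could be placed on that scale. The rubric range, the equal weighting of the six dimensions, and the norming mean and SD are all assumptions made for illustration; the article does not say how EC actually computes the VRI.

```python
from statistics import mean

DIMENSIONS = (
    "abstraction",
    "compression",
    "originality",
    "conceptual_continuity",
    "epistemic_calibration",
    "generative_self_monitoring",
)


def composite(scores: dict) -> float:
    """Unweighted mean of the six dimension scores (equal weights are an assumption)."""
    return mean(scores[d] for d in DIMENSIONS)


def to_index(raw: float, norm_mean: float, norm_sd: float,
             center: float = 100.0, spread: float = 15.0) -> float:
    """Convert a raw composite to a z-score, then rescale to mean 100, SD 15."""
    z = (raw - norm_mean) / norm_sd
    return center + spread * z


# Hypothetical norming values on a hypothetical 1-5 rubric, for illustration only.
NORM_MEAN, NORM_SD = 3.0, 0.5

# A speaker rated 3.5 on every dimension sits one SD above the hypothetical norm.
speaker = {d: 3.5 for d in DIMENSIONS}
vri = to_index(composite(speaker), NORM_MEAN, NORM_SD)
print(round(vri, 1))  # 115.0
```

Under these assumed norms, a composite one standard deviation above the reference mean maps to 115, which is how a traditional IQ-format score would be read as well.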

In a study of 30 public intellectuals, Epistemic Calibration and Generative Self-Monitoring emerged as the two strongest predictors of intellectual reputation (r = 0.420 and r = 0.441 respectively). The dimensions most verbal IQ tests cannot detect at all turned out to be the most predictive. That is not a coincidence. It reflects what verbal reasoning actually consists of when it matters.
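For readers who want a sense of how precisely a correlation can be pinned down with 30 speakers, the sketch below computes an approximate 95% confidence interval using the standard Fisher z-transformation. It uses only the two coefficients and the sample size reported above; the choice of this particular interval method is an assumption, not something the study is stated to have used.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> tuple:
    """Approximate confidence interval for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)               # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)     # standard error of z for a sample of size n
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Coefficients and sample size as reported in the text.
reported = (
    ("Epistemic Calibration", 0.420),
    ("Generative Self-Monitoring", 0.441),
)

for name, r in reported:
    lo, hi = fisher_ci(r, n=30)
    print(f"{name}: r = {r:.3f}, approx. 95% CI ({lo:.2f}, {hi:.2f})")
```

Both intervals come out wide, running from roughly 0.1 to 0.7, which is typical for a sample of this size.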

The Practical Upshot

If you want to know your vocabulary range, take a vocabulary test. They are fast, well-normed, and reliable for what they measure.

If you want to understand the cognitive quality of how you reason when you talk — how your ideas hold together under pressure, how precisely you select language, how accurately you represent what you know versus what you do not — that requires a different kind of instrument.

The test that gets closest to that is one that asks you to think out loud, and scores what it hears.