Expressive Cognition is built on a small program of validation research. The papers below define the construct, document the scoring rubric, and test whether the Verbal Reasoning Index (VRI) detects meaningful differences in spontaneous speech across naturalistic corpora.
Each paper is available in full as a web page or as a PDF download. Findings are reported with their limitations stated plainly.
Foundation #1
Construct definition, the six core dimensions, and the theoretical grounding behind the Verbal Reasoning Index.
Frames verbal reasoning as a real-time cognitive performance, not a stored knowledge inventory.
Construct Validity #3
A joint construct-validity paper on two corpora: Supreme Court (SCOTUS) oral arguments and the Michigan Corpus of Academic Spoken English (MICASE). Tests whether the rubric detects known differences in verbal reasoning quality across two fundamentally different speech contexts.
The rubric discriminates between elite, experienced, and first-time SCOTUS advocates, and between faculty, graduate, and undergraduate seminar speakers, in the predicted direction in both cases.
Construct Validity #4
Extends the original 3-advocate SCOTUS study to 12 advocates, with cross-model scoring and a written-vs-spoken comparison on same-case briefs.
VRI correlates at r = 0.688 with a biographical attainment proxy across 12 advocates, the strongest single-criterion validity result in the program.
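For readers weighing a correlation reported on 12 cases, the following is a minimal sketch of an approximate 95% confidence interval using the Fisher z-transformation. It assumes the reported r is a Pearson coefficient computed on 12 independent advocates, which the summary above does not state; the function name `fisher_ci` is hypothetical and is not taken from the paper.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval for a Pearson r via Fisher's z.

    Assumption: r is a Pearson correlation on n independent observations.
    z_crit = 1.96 gives an approximate 95% interval.
    """
    if n <= 3:
        raise ValueError("Fisher's z interval needs n > 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                # transform r to z-space
    se = 1.0 / math.sqrt(n - 3)      # standard error in z-space
    # Build the interval in z-space, then map it back to the r scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Values taken from the summary above: r = 0.688, n = 12.
low, high = fisher_ci(0.688, 12)
print(f"approximate 95% CI: [{low:.2f}, {high:.2f}]")
```

At n = 12 the standard error in z-space is 1/3, so the interval is wide; the sketch shows how that width can be computed and is not a reanalysis of the study.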
Ecological Validity #5
30 guests from Conversations with Tyler scored blind on 8 dimensions in three independent passes, with the resulting scores then correlated against external intellectual reputation.
Generative Self-Monitoring (r = 0.441) and Epistemic Calibration (r = 0.420) significantly predict reputation; the composite VRI is attenuated by ceiling effects in this pre-selected sample.
Instrument #6
99 Conversations with Tyler guests balanced across nine disciplinary cells, scored in three blinded passes by three independent frontier LLMs (Claude Sonnet 4, GPT-5 mini, Mistral Large), with confirmatory factor analyses estimated independently on each scorer's correlation matrix.
All three models reject the one-factor model and recover the same two-factor structure (Generative Range and Calibrative Control) with identical dimension composition, the first three-way cross-vendor LLM-as-judge factor-invariance result published at this scale. Pairwise VRI agreement averages r = 0.668. A strict generosity gradient (Mistral > Sonnet > GPT-5 mini) spans 1.06 scale points. Conceptual Continuity emerges empirically as a boundary dimension whose factorial placement depends on scorer convention.
Additional studies in progress: a fluid intelligence (Gf) convergent-validity study, a cross-linguistic L1/L2 study, and a large-sample normative corpus.