This is a snapshot of early-stage research that I am excited to develop further. Code is available here.
I used Maia-2, a state-of-the-art model for predicting human chess moves, to quantify the "human-likeness" or "human-conformity" of 55 grandmasters. Maia-2 (see arXiv:2409.20553) is a deep-learning model trained exclusively on human games to predict the consensus human move in a given context (i.e., what a player at a specific rating would most likely play), rather than the objectively optimal move.
I calculated the mean predicted probability that Maia-2 assigned to each grandmaster's actual moves across 18,000 moves per player, stratified by game phase (opening/middlegame/endgame). I found that a single latent axis (the first principal component) explains ~58% of the variance in human-likeness scores; higher values on this axis mean more human-typical play in every phase.
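The pipeline above can be sketched in a few lines. This is a minimal illustration, not the actual analysis code: the score matrix here is synthetic stand-in data (in the real pipeline each entry would be a player's mean Maia-2 probability over ~18,000 scored moves in one phase), and the PCA is done directly via SVD.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder score matrix: rows = 55 players, cols = mean Maia-2 probability
# of the player's actual moves in the opening / middlegame / endgame.
scores = rng.normal(loc=[0.45, 0.40, 0.42], scale=0.05, size=(55, 3))

# PCA via SVD on the centred score matrix.
centred = scores - scores.mean(axis=0)
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
explained = S**2 / np.sum(S**2)   # fraction of variance explained by each PC
pc1 = centred @ Vt[0]             # each player's position on the latent axis

print(f"PC1 explains {explained[0]:.0%} of the variance")
```

With the real per-phase scores, the corresponding `explained[0]` is the ~58% figure quoted above, and the sign of `Vt[0]` is chosen so that higher `pc1` means more human-typical play.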
1. Latent Behavioural Structure: This consistency is non-trivial. Chess strategy is highly phase-dependent (openings rely on memorized theory, middlegames on tactics/strategy, endgames on technique), so the emergence of a single "human-likeness" axis across these three distinct contexts is a significant signal. It suggests this approach can provide a robust, model-based metric for how "typical" an agent's decision-making is relative to the human population.
2. Orthogonality to Skill and Era: Naturally, one might suspect Maia-2 is inadvertently measuring skill (stronger GMs being less human) or era effects (modern players matching Maia-2's training data better), but neither explains meaningful variance (R² ≈ 0.009 and R² ≈ 0.016, respectively). In other words, "human-likeness" captures a behavioural axis orthogonal to both raw skill and era. This implies that elite proficiency is compatible with a wide spectrum of styles, ranging from those that align closely with consensus human intuition to those that diverge sharply from it.
3. Identity & Generalization: I also found that this metric acts as a behavioural fingerprint. For each player, I computed a three-dimensional signature of average human-likeness per phase, calculated separately from the original 18,000 training moves and from a held-out set of 4,500 moves (~100 games). A simple k-nearest neighbours classifier matching held-out signatures to training signatures achieves ~55% top-5 identification accuracy (vs. 9% random baseline). Remarkably, compressing each player's complex behavioural patterns into just three human-likeness values preserves enough signal to recover individual identity roughly six times better than chance, suggesting the metric captures consistent, idiosyncratic patterns rather than mere noise or situational variation.
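The orthogonality check in point 2 is a pair of simple one-variable regressions. Here is a hedged sketch with synthetic placeholder data (real inputs would be each GM's human-likeness score, rating, and a career-era proxy such as birth year); for a simple linear fit, R² is just one minus the residual variance over the total variance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholders: one human-likeness score, rating, and birth year per GM.
likeness = rng.normal(0.0, 1.0, size=55)
rating = rng.normal(2700.0, 50.0, size=55)
birth_year = rng.integers(1940, 2005, size=55).astype(float)

def r_squared(x, y):
    """R^2 of a least-squares fit of y on x (with intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    return 1.0 - resid.var() / y.var()

print(f"R^2 vs rating: {r_squared(rating, likeness):.3f}")
print(f"R^2 vs era:    {r_squared(birth_year, likeness):.3f}")
```

Values near zero, as in the reported R² ≈ 0.009 and R² ≈ 0.016, indicate that neither skill nor era predicts a player's position on the human-likeness axis.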
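The identification step in point 3 can be sketched as nearest-neighbour matching on the three-dimensional signatures. The signatures below are synthetic stand-ins (held-out signatures simulated as noisy copies of the training ones), so the printed accuracy is illustrative only; the real ~55% figure comes from actual per-phase scores.

```python
import numpy as np

rng = np.random.default_rng(2)
n_players = 55

# Placeholder signatures: 3 per-phase human-likeness means per player,
# computed once on training moves and once on held-out games.
train_sig = rng.normal(0.4, 0.05, size=(n_players, 3))
heldout_sig = train_sig + rng.normal(0.0, 0.01, size=(n_players, 3))

# Rank training signatures by Euclidean distance to each held-out signature,
# then check whether the true player appears among the 5 nearest.
dists = np.linalg.norm(heldout_sig[:, None, :] - train_sig[None, :, :], axis=-1)
top5 = np.argsort(dists, axis=1)[:, :5]
top5_acc = np.mean([i in top5[i] for i in range(n_players)])

print(f"top-5 identification accuracy: {top5_acc:.0%}")
```

The random baseline for top-5 identification among 55 players is 5/55 ≈ 9%, which is the chance level quoted above.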
My next goal is to better characterize what this latent axis actually captures (strategic tendencies like aggression and risk-taking, cognitive style, or other traits that shape individual playstyle).