Realizing the Invariant: A Publicly Reproducible Cross-Model Probe
Abstract. This note reports a small, publicly reproducible measurement: independently trained language models, read over an arbitrary text domain, agree on the same coordinate-free relational geometry — a cross-model invariant that anyone can confirm in minutes, and that we make falsifiable rather than merely assert.
I. The claim
Two language models trained by different teams, on different data, at different sizes, share no common dictionary of directions. Whatever "a dolphin" or "compound interest" becomes inside one network, it lands somewhere unrelated inside another; the two hidden spaces may differ by an arbitrary rotation, reflection, or rescaling. Comparing their raw activations is therefore meaningless. And yet a long line of work — from representational similarity analysis to the recent observation that stronger models converge toward a shared representation — suggests that something is held in common. The claim we examine is the sharp version of that intuition: once the freedom of coordinates is removed, independently trained models encode the same relationships among the same things. That basis-independent shared structure is what we call the invariant, and the only question that matters here is whether it is real.
II. A neutral domain and four unrelated models
The temptation in such demonstrations is to choose a domain that flatters the result. We do the opposite. The corpus is 156 short sentences across twelve unrelated everyday topics — animals, food, weather, travel, sport, music, technology, health, finance, plants, vehicles, space — with nothing tying them to one another or to the method. Any comparably broad corpus reproduces the effect; the domain is meant to be unremarkable, almost arbitrary.
We read the corpus with four models from four independent pretraining lineages: GPT-2 (OpenAI), Pythia-160m (EleutherAI), OPT-125m (Meta), and BLOOM-560m (BigScience) — different organisations, different data, different sizes. For each sentence we take the attention-masked mean of a mid-network hidden state, a standard probe point, yielding one vector per sentence per model. The vectors live in spaces of different dimensions, which is no obstacle: everything that follows is dimension-agnostic.
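The pooling step can be sketched in a few lines. This is a minimal NumPy illustration, not the repository's code: the function name `masked_mean` and the toy arrays are hypothetical, and in practice `hidden` would be one mid-network layer of a model's hidden states with `mask` the tokenizer's attention mask.

```python
import numpy as np

def masked_mean(hidden, mask):
    """Average token vectors per sentence, ignoring padding.

    hidden: (batch, tokens, dim) hidden states from one layer.
    mask:   (batch, tokens), 1 for real tokens and 0 for padding.
    """
    weights = mask.astype(hidden.dtype)[:, :, None]       # (batch, tokens, 1)
    summed = (hidden * weights).sum(axis=1)               # padding contributes zero
    counts = np.clip(weights.sum(axis=1), 1.0, None)      # guard against empty rows
    return summed / counts                                # (batch, dim)

# Toy input: two "sentences" of three token slots; the first has one padding slot.
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
                   [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]])
mask = np.array([[1, 1, 0],
                 [1, 1, 1]])
pooled = masked_mean(hidden, mask)
print(pooled)
```

The padded position in the first sentence carries large values on purpose: a plain mean would be dominated by it, while the masked mean ignores it.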
III. Structure without privileged coordinates
If the comparison is to mean anything, it must not depend on a coordinate system that neither model agreed to. We therefore use three measures, each blind to rotation, reflection, and isotropic rescaling, so that two models related by any such transformation score as identical.
The first is linear CKA, which compares the two sets of activations through their second-order (Gram) structure. With the columns of $X$ and $Y$ mean-centered,
$$\mathrm{CKA}(X,Y)=\frac{\lVert Y^{\top}X\rVert_F^{2}}{\lVert X^{\top}X\rVert_F\,\lVert Y^{\top}Y\rVert_F},$$
and is unchanged if either space is rotated or rescaled. The second is representational similarity analysis: we build each model's matrix of pairwise cosine similarities and correlate the two matrices — a comparison of relationships, not positions. The third is the most intuitive: mutual k-nearest-neighbors, the average overlap between each item's closest neighbors in one model and in the other. Do the same sentences stay together? Three different lenses, one question.
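The three measures are short enough to write out. What follows is a sketch under stated assumptions rather than the repository's implementation: the function names are hypothetical, RSA is computed here as a Pearson correlation over the off-diagonal cosine similarities (a rank correlation is an equally common choice), and `k=10` is an assumed default. The demo data is synthetic: `Y` is `X` rotated into a larger space and rescaled, so all three measures should report near-perfect agreement.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between (n, d1) and (n, d2) activations; rows are paired items."""
    X = X - X.mean(axis=0)                      # centre each feature
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

def cosine_matrix(X):
    """Pairwise cosine similarities between the rows of X."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return Xn @ Xn.T

def rsa(X, Y):
    """Correlate the two similarity matrices over their upper triangles."""
    iu = np.triu_indices(X.shape[0], k=1)       # skip the diagonal and duplicates
    return np.corrcoef(cosine_matrix(X)[iu], cosine_matrix(Y)[iu])[0, 1]

def knn_sets(X, k):
    """Indices of each row's k most cosine-similar other rows."""
    S = cosine_matrix(X)
    np.fill_diagonal(S, -np.inf)                # an item is not its own neighbor
    return np.argsort(-S, axis=1)[:, :k]

def mutual_knn(X, Y, k=10):
    """Mean fraction of shared neighbors between the two spaces."""
    pairs = zip(knn_sets(X, k), knn_sets(Y, k))
    return float(np.mean([len(set(a) & set(b)) / k for a, b in pairs]))

# Synthetic check: embed X isometrically into 48 dimensions and rescale it.
rng = np.random.default_rng(0)
X = rng.standard_normal((156, 32))
Q, _ = np.linalg.qr(rng.standard_normal((48, 32)))   # Q has orthonormal columns
Y = 3.0 * X @ Q.T                                     # rotated, rescaled copy of X
noise = rng.standard_normal((156, 48))                # unrelated features

print(linear_cka(X, Y), rsa(X, Y), mutual_knn(X, Y))
print(linear_cka(X, noise), rsa(X, noise), mutual_knn(X, noise))
```

The second `print` line scores `X` against unrelated Gaussian features of the same shape, which gives a feel for how far the scores fall when nothing is shared.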
IV. What the numbers say
They agree, and they agree strongly. Averaged over all model pairs, linear CKA is 0.72, RSA is 0.75, and mutual k-NN overlap is 0.38 against a chance level near 0.065 — roughly six times what coincidence would give. The agreement is not confined to similar architectures: even the most distant pair in the set still shares well over half of its relational structure. Different teams, different data, different sizes — and yet the geometry of this arbitrary little world comes out nearly the same in all of them.
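The chance level can be checked by hand. The note does not state $k$; assuming $k = 10$ (an assumption, chosen because it reproduces the quoted figure) and $n = 156$ sentences, two unrelated neighbor lists share on average a fraction $k/(n-1)$ of their entries, so
$$\mathbb{E}[\text{overlap}] = \frac{k}{n-1} = \frac{10}{155} \approx 0.065, \qquad \frac{0.38}{0.065} \approx 5.8.$$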
V. Making it falsifiable
A similarity number is only as good as the null it is measured against, so every score is reported beside two controls. The first shuffles which sentence is which before comparing, destroying shared content while leaving the rest untouched. The second replaces one model entirely with random Gaussian features of the same shape. Under either control a genuine invariant must collapse to chance — and it does. For GPT-2 against Pythia, a CKA of 0.79 sits against a shuffled null of 0.015 and a random-feature null of 0.11; the permutation p-value is 0.005, the smallest this test can return. The signal is not an artifact of scale, dimension, or wishful averaging. It is content.
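Both controls and the permutation test fit in a short sketch. The function names, the synthetic data, and the choice of 199 permutations are assumptions made for illustration, not details taken from the repository. With 199 permutations, the smallest p-value the test can return is 1/200 = 0.005, which would be consistent with the figure quoted above.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between (n, d1) and (n, d2) activations; rows are paired items."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

def shuffled_null(X, Y, n_perm=199, seed=0):
    """CKA after breaking the item pairing: rows of Y are randomly reordered."""
    rng = np.random.default_rng(seed)
    return np.array([linear_cka(X, Y[rng.permutation(len(Y))])
                     for _ in range(n_perm)])

def permutation_p(observed, null):
    """One-sided p-value with the usual +1 correction; minimum is 1/(n_perm+1)."""
    return (1 + np.sum(null >= observed)) / (1 + len(null))

def random_feature_null(X, Y, seed=0):
    """CKA after replacing Y with Gaussian features of the same shape."""
    rng = np.random.default_rng(seed)
    return linear_cka(X, rng.standard_normal(Y.shape))

# Synthetic stand-in for two models: both are noisy linear views of one latent Z.
rng = np.random.default_rng(1)
Z = rng.standard_normal((156, 8))
X = Z @ rng.standard_normal((8, 32)) + 0.1 * rng.standard_normal((156, 32))
Y = Z @ rng.standard_normal((8, 48)) + 0.1 * rng.standard_normal((156, 48))

observed = linear_cka(X, Y)
null = shuffled_null(X, Y)
p_value = permutation_p(observed, null)
random_null = random_feature_null(X, Y)
print(observed, null.mean(), random_null, p_value)
```

Shuffling keeps each model's own geometry intact and destroys only the correspondence between items, so any score that survives it cannot be attributed to shared content.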
VI. What is public, and what is not
This note is deliberately narrow. What is public is the fact and the means to check it: independently trained models share a measurable, coordinate-free geometry over an arbitrary domain, and the companion repository regenerates every number above from a single command. What is not in scope here is any particular basis on which one might read that geometry, any kernel, any alignment procedure, or any use to which the invariant might be put. Those are separate questions, and we keep them separate on purpose. The only claim we make in public is the one the data forces.
An invariant that survives shuffling, randomization, and a change of every coordinate is not a property of any one model — it is a property of the territory all of them are mapping.
Honest limits. The models are small and the corpus is small; absolute numbers will move with larger models, other layers, or other pooling, and the final layer is more model-specific than the middle. None of this touches the qualitative result, which holds across the middle of every network and stands far from every null. The point was never a record number; it was a clean, repeatable yes.
Repository. github.com/mareklauko/cross-model-invariant — companion code, the exact domain, and every number in this note.
