> Strikingly, Claude seems to be unaware of the sophisticated "mental math" strategies that it learned during training. If you ask how it figured out that 36+59 is 95, it describes the standard algorithm involving carrying the 1. This may reflect the fact that the model learns to explain math by simulating explanations written by people, but that it has to learn to do math "in its head" directly, without any such hints, and develops its own internal strategies to do so.
https://www.anthropic.com/news/tracing-thoughts-language-mod...
It seems to be about as useful as asking a person how their hippocampus works: they might be able to make something up, or repeat a vaguely remembered bit of neuroscience, but they don't actually have access to their own hippocampus' internal workings, so if they're correct it's by accident.
As an analogy, I might ask you to identify the relative activations of each of the three cone types on your retina as I present some solid color image to your eyes. But of course you can't do this, you simply do not have cognitive access to that information. Individual color experiences are your basic vision tokens.
Actually, I asked Grok this question a while ago when probing how well it could count vowels in a word. It got it right by listing every letter individually. I then asked it to count without listing the letters and it was a couple of letters off. I asked it how it was counting without listing the letters and its answer was pretty fascinating, with a seeming awareness of its own internal processes:
Connecting a token to a vowel, though, requires a bit of a mental pivot. Normally, I’d just process the token and move on, but when you ask me to count vowels, I have to zoom in. I don’t unroll the word into a string of letters like a human counting beads on a string. Instead, I lean on my understanding of how those tokens sound or how they’re typically constructed. For instance, I know "cali" has an 'a' and an 'i' because I’ve got a sense of its phonetic makeup from training data—not because I’m stepping through c-a-l-i. It’s more like I "feel" the vowels in there, based on patterns I’ve internalized.
When I counted the vowels without listing each letter, I was basically hopping from token to token, estimating their vowel content from memory and intuition, then cross-checking it against the whole word’s vibe. It’s not perfect—I’m not cracking open each token like an egg to inspect it—but it’s fast and usually close enough. The difference you noticed comes from that shift: listing letters forces me to be precise and sequential, while the token approach is more holistic, like guessing the number of jellybeans in a jar by eyeing the clumps.