49/658 is 7%
Sorry, forgot to multiply by 100.
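The corrected arithmetic, as a quick sanity check (the `solved`/`total` names are just for illustration):

```python
# Sanity check of the pass-rate arithmetic discussed above.
solved = 49
total = 658
fraction = solved / total      # raw fraction, ~0.0745
percentage = fraction * 100    # multiply by 100 to get a percentage
print(f"{percentage:.1f}%")    # → 7.4%
```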
I bet DeepSeek-Prover-V2 wouldn't have made that mistake
classic human hallucination
How likely is it that Putnam answers were in DeepSeek's training data?
The solutions weren't published anywhere. There's also no good automatic way to generate solutions as far as I know, even an expensive one (the previous SOTA was 10 solutions, and the one before that was 8, using pass@3200 with a 7B model). The developers could potentially have paid people who are good at both Putnam-level math problems and Lean to write solutions for the LLMs. It's hard to estimate how likely that is, but it sounds like a waste of money for a relatively marginal problem set/benchmark.
AoPS seems to have a forum dedicated to Putnam (including 2024): https://artofproblemsolving.com/community/c3249_putnam and here is a pdf with solutions to Putnam 2023: https://kskedlaya.org/putnam-archive/2023s.pdf
These would still need to be formalized in Lean, which can sometimes be harder than solving the problem itself.
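To give a sense of what that formalization step involves, here is a toy sketch in Lean 4 with Mathlib (not an actual PutnamBench problem): even a trivial statement has to be restated in Lean's formal language before a prover can attempt it, and faithfully translating a full Putnam problem is much more involved.

```lean
import Mathlib

-- Toy illustration: a sum of squares of reals is nonnegative.
-- Real Putnam statements require far more careful formalization.
theorem sum_sq_nonneg (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 := by
  positivity
```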
> The resulting model, DeepSeek-Prover-V2-671B, achieves state-of-the-art performance in neural theorem proving, reaching 88.9% pass ratio on the MiniF2F-test and solving 49 out of 658 problems from PutnamBench.
Which is 0.07% (edit: 7%) for PutnamBench