agucova
This benchmark’s questions and answers will be kept fully private, and the benchmark will only be run by Epoch. Short of the companies fishing out the questions from API logs (which seems quite unlikely), this shouldn’t be a problem.
> answers will be kept fully private
> Short of the companies fishing out the questions from API logs (which seems quite unlikely)
They all pretty clearly state[1] versions of "We use your queries (removing personal data) to improve the models" so I'm not sure why that's unlikely.
[1] https://help.openai.com/en/articles/5722486-how-your-data-is...
Ideally they would have batches of those exercises, where they only use the next batch once someone has solved a suspiciously large number of the current ones. If the model performs much worse on the next batch, that is a tell of leakage.
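A rough sketch of what that batch comparison could look like, assuming you only have solve counts for the exposed batch and for a fresh one. The function name, the two-proportion z-test, and the threshold of 2.0 are my own illustrative choices, not anything Epoch has described:

```python
import math

def leakage_signal(solved_old, total_old, solved_new, total_new, z_threshold=2.0):
    """Compare solve rates on an exposed batch and a fresh, unseen batch.

    Returns (z, flagged). A large positive z means the model did much
    better on the exposed batch than on the fresh one, which is the
    "tell" for leakage. Threshold is an arbitrary illustrative value.
    """
    p_old = solved_old / total_old
    p_new = solved_new / total_new
    # Pooled solve rate under the hypothesis that both batches are equally hard.
    pooled = (solved_old + solved_new) / (total_old + total_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_old + 1 / total_new))
    if se == 0:
        # No variance at all (everything solved or nothing solved): no signal.
        return 0.0, False
    z = (p_old - p_new) / se
    return z, z > z_threshold

# Hypothetical numbers: 40/100 solved on the old batch, 5/100 on the new one.
z, flagged = leakage_signal(40, 100, 5, 100)
print(f"z = {z:.2f}, flagged = {flagged}")
```

This only works if the batches are of comparable difficulty; otherwise a drop could just mean the new batch is harder.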
I looked at the sample questions, and even if they get the questions, there is no way they will figure out the answers without making significant breakthroughs in understanding mathematics and logic.