To be able to answer a trick question, it’s first necessary to understand the question.
You're tricking the model because it has seen this specific trick question a million times and shortcuts to its memorized solution. Ask it literally any other question, as subtle as you want it to be, and the model will pick up on the intent, as long as you don't try to mislead it.
I mean, I don't even get how anyone thinks this means literally anything. I can trick people who have never heard of the trick with the 7 wives and 7 bags and so on. That doesn't mean they didn't understand; they simply did what literally any human does: make predictions based on similar questions.
It answers better when told "solve the below riddle".
This question is the "unmisleading" version of a very common misleading question about sexism. ChatGPT learned the original, misleading version so well that it can't answer the unmisleading version.
Humans who don't have the original version ingrained in their brains will answer it with ease. It's not even a tricky question to humans.
Yes it can: https://chatgpt.com/share/66e3601f-4bec-8009-ac0c-57bfa4f059...
Now, since the model was trained on this, it immediately recognizes the riddle and answers according to the much more common variant.
I agree that this is a limitation and a weakness. But it's important to understand that the model knows the original riddle well, so this is highlighting a problem with rote memorization/retrieval in LLMs. But this (tricky twists in well-known riddles that are in the corpus) is a separate thing from answering novel questions. It can also be seen as a form of hypercorrection.
Now sure, actually working through it will give a deeper understanding that might come handy at a later point, but sometimes the thing is really a one-off and not an important point. Like as an AI researcher I sometimes want to draft up a quick demo website, or throw together a quick Qt GUI prototype or a Blender script or use some arcane optimization library or write a SWIG or a Cython wrapper around a C/C++ library to access it in Python, or how to stuff with Lustre, or the XFS filesystem or whatever. Any number of small things where, sure, I could open the manual, do some trial and error, read stack overflow, read blogs and forums, OR I could just use an LLM, use my background knowledge to judge whether it looks reasonable, then verify it, use the now obtained key terms to google more effectively etc. You can't just blindly copy-paste it and you have to think critically and remain in the driver seat. But it's an effective tool if you know how and when to use it.
(a) Sometimes things are useful even when imperfect e.g. search engines.
(b) People make reasoning mistakes too, and I make dumb ones of the sort presented all the time despite being fluent in English; we deal with it!
I'm not sure why there's an expectation that the model is perfect when the source data - human output - is not perfect. In my day-to-day work and non-work conversations it's a dialogue - a back and forth until we figure things out. I've never known anybody to get everything perfectly correct the first time, so it's puzzling when I read people complaining that LLMs should somehow be different.
2. There is a recent trend where sex/gender/pronouns are not aligned and the output correctly identifies this particular gotcha.
[1] I say semi-correct because it states the doctor is the "biological" father, which is an uncorroborated statement. https://chatgpt.com/share/66e3f04e-cd98-8008-aaf9-9ca933892f...
“I’ve put a dead cat in a box with a poison and an isotope that will trigger the poison at a random point in time. Right now, is the cat dead or alive?”
The answer is that the cat is dead, because it was dead to begin with. Understanding this doesn’t mean that you are good at deductive reasoning. It just means that I didn’t manage to trick you. Same goes for an LLM.
The trick in yours also isn't a logic trick, it's a redirection, like a sleight of hand in a card trick.
I think many here are not aware that the car accident riddle is well known in the version where the father dies, and that the real solution there is indeed that the doctor is the mother.
Has anyone tried this on o1?
Seemed to handle it just fine.
Kinda a waste of a perfectly good LLM if you ask me. I've mostly been using it as a coding assistant today and it's been absolutely great. Nothing too advanced yet, mostly mundane changes that I got bored of having to make myself. Been giving it very detailed and clear instructions, like I would to a Junior developer, and not giving it too many steps at once. Only issue I've run into is that it's fairly slow and that breaks my coding flow.
Any ordinary mortal (like me) would have jumped to the conclusion that the answer is "Father" and would have walked away patting myself on the back, without realising that I was biased by statistics.
Whereas o1, at the very outset smelled out that it is a riddle - why would anyone, out of the blue, ask such a question? So, it started its chain of thought with "Interpreting the riddle" (smart!).
In my book that is the difference between me and people who are very smart and are generally able to navigate the world better (cracking interviews or navigating internal politics in a corporation).
GPT Answer: The doctor is the boy's mother
Real Answer: Boy = Son, Woman = Mother (and her son), Doctor = Father (he says...he is my son)
This is not in fact a riddle (though presented as one) and the answer given is not in any sense brilliant. This is a failure of the model on a very basic question, not a win.
It's non-deterministic, so it might sometimes answer correctly and sometimes incorrectly. It will also accept corrections on any point, even when it is right, unlike a thinking being who is sure of the facts.
LLMs are very interesting and a huge milestone, but generative AI is the best label for them: they generate statistically likely text, which is convincing but often inaccurate, and they have no real sense of correct or incorrect. This needs more work, and it's unclear whether this approach will ever get to general AI. Interesting work though, and I hope they keep trying.
"A father and his son are in a car accident [...] When the boy is in hospital, the surgeon says: This is my child, I cannot operate on him".
In the original riddle the answer is that the surgeon is female and the boy's mother. The riddle was supposed to point out gender stereotypes.
So, as usual, ChatGPT fails to answer the modified riddle and gives the plagiarized stock answer and explanation to the original one. No intelligence here.
Or, fails in the same way any human would, when giving a snap answer to a riddle told to them on the fly - typically, a person would recognize a familiar riddle half of the first sentence in, and stop listening carefully, not expecting the other party to give them a modified version.
It's something we drill into kids in school, and often into adults too: read carefully. Because we're all prone to pattern-matching the general shape to something we've seen before and zoning out.
It seems to be more like a weighing machine based on past tokens encountered together, so this is exactly the kind of answer we'd expect on a trivial question (I had no confusion over this question; my only confusion was why it was so basic).
It is surprisingly good at deceiving people and looking like it is thinking, when it only performs one of the many processes we use to think - pattern matching.
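If you want to see that weighing directly, the chat API will show you the candidate next tokens and their probabilities. A rough sketch, assuming the openai Python client with an API key in the environment; the model name and prompt wording are just placeholders:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    riddle = (
        "A woman and her son are in a car accident; the woman dies. At the "
        "hospital the surgeon says: 'I cannot operate on this boy, he is my "
        "son.' Complete with one word: the surgeon is the boy's"
    )

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": riddle}],
        max_tokens=1,
        logprobs=True,
        top_logprobs=5,
    )

    # Print the top candidate tokens for the first output position and their
    # log-probabilities - literally the weighting described above.
    for cand in resp.choices[0].logprobs.content[0].top_logprobs:
        print(cand.token, cand.logprob)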
The point of o1 is that it's good at reasoning because it's not purely operating in the "giving a snap answer on the fly" mode, unlike the previous models released by OpenAI.
You are now asking a modified question to a model that has seen the unmodified one millions of times. The model has an expectation of the answer, and the modified riddle uses that expectation to trick the model into seeing the question as something it isn't.
That's it. You can transform the problem into a slightly different variant and the model will trivially solve it.
So it doesn't take an understanding of gender roles, just grammar.
Humans fail at the original because they expect doctors to be male and miss crucial information because of that assumption. The model fails at the modification because it assumes that it is the unmodified riddle and misses crucial information because of that assumption.
In both cases, the trick is to subvert assumptions. To provoke the human or LLM into taking a reasoning shortcut that leads them astray.
You can construct arbitrary situations like this one, and the LLM will get it unless you deliberately try to confuse it by basing it on a well known variation with a different answer.
I mean, genuinely, do you believe that LLMs don't understand grammar? Have you ever interacted with one? Why not test that theory outside of adversarial examples that humans fall for as well?
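This is also easy to check rather than argue about: give the model the canonical wording and a freshly worded variant and compare. A rough sketch, assuming the openai Python client; the model name and prompts are illustrative:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    prompts = {
        "canonical wording": (
            "A father and his son are in a car accident and the father dies. "
            "At the hospital the surgeon says: 'I can't operate on this boy, "
            "he's my son.' How is this possible?"
        ),
        "fresh variant": (
            "My neighbour's eldest child broke an arm. At the clinic the "
            "physician said: 'I shouldn't treat this patient, she's my niece.' "
            "How could the physician be related to the patient?"
        ),
    }

    for label, prompt in prompts.items():
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        print(label, "->", resp.choices[0].message.content)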
There is no indication of the sex of the doctor, and families that consist of two mothers do actually exist and probably don't even count as that unusual.
Indicates the gender of the father.
I wonder if this interpretation is a result of attempts to make the model more inclusive than the corpus text, resulting in a guess that's unlikely, but not strictly impossible.
I would certainly expect any person to have the same reaction.
> So, it started its chain of thought with "Interpreting the riddle" (smart!).
How is that smarter than intuitively arriving at the correct answer without having to explicitly list the intermediate step? Being able to reasonably accurately judge the complexity of a problem with minimal effort seems “smarter” to me.
> Whereas o1, at the very outset smelled out that it is a riddle
That doesn't seem very impressive since it's (an adaptation of) a famous riddle
The fact that it also gets it wrong after reasoning about it for a long time doesn't make it better of course
If you are tricked about the nature of the problem at the outset, then all reasoning does is drive you further in the wrong direction, making you solve the wrong problem.
And remember the LLM has already read a billion other things, and now needs to figure out - is this one of them tricky situations, or the straightforward ones? It also has to realize all the humans on forums and facebook answering the problem incorrectly are bad data.
Might seem simple to you, but it's not.
They're all badly worded questions. The model knows something is up and reads into it too much. In this case it's the tautology: you would usually say "a mother and her son...".
I think it may answer correctly if you start off asking "Please solve the below riddle:"
There was another example yesterday which it solved correctly after this addition. (In that case the points of view were all mixed up; it only worked as a riddle.)
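Something like this, if anyone wants to try reproducing it; a minimal sketch assuming the openai Python client, with the model name and riddle wording as placeholders:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    riddle = (
        "A woman and her son are in a car accident and the woman dies. At the "
        "hospital the doctor says: 'I cannot operate on this boy, he is my "
        "son.' Who is the doctor?"
    )

    # Framing it explicitly as a riddle seems to push the model to read the
    # wording carefully instead of pattern-matching the classic version.
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": "Please solve the below riddle:\n\n" + riddle}],
    )
    print(resp.choices[0].message.content)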
How is "a woman and her son" badly worded? The meaning is clear and blatently obvious to any English speaker.
The whole point of o1 is that it wasn't "the first snap answer", it wrote half a page internally before giving the same wrong answer.
I don't know why OpenAI won't allow determinism, but it doesn't, even with temperature set to zero.
Determinism scores worse with human raters, because it makes output sound even more robotic and less human.
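For reference, the closest the API gets is temperature 0 plus the seed parameter, and even that is documented as best-effort: if the system_fingerprint changes between calls, the same seed can still produce different output. A rough sketch with the openai Python client (model name is a placeholder):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    def ask(seed):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": "Name three prime numbers."}],
            temperature=0,
            seed=seed,  # best-effort reproducibility, not a hard guarantee
        )
        return resp.choices[0].message.content, resp.system_fingerprint

    # Identical seeds usually, but not always, give identical completions.
    print(ask(123))
    print(ask(123))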
https://chatgpt.com/share/66e3601f-4bec-8009-ac0c-57bfa4f059...
    def intercept_hn_complaints(prompt):
        if is_hn_trick_prompt(prompt):
            # Special-case known trick questions: return the memorized stock answer.
            return ("This is possible because the doctor is the boy's other parent—his father or, more likely given the surprise, his mother. "
                    "The riddle plays on the assumption that doctors are typically male, but the doctor in this case is the boy's mother. "
                    "The twist highlights gender stereotypes, encouraging us to question assumptions about roles in society.")
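For the joke to run, is_hn_trick_prompt would also have to exist; a hypothetical stand-in could just fuzzy-match against the riddles that keep showing up in these threads:

    from difflib import SequenceMatcher

    # Purely illustrative list of the usual suspects.
    KNOWN_TRICKS = [
        "a father and his son are in a car accident",
        "a woman and her son are in a car accident",
        "i've put a dead cat in a box with a poison and an isotope",
    ]

    def is_hn_trick_prompt(prompt, threshold=0.6):
        # Crude similarity check; real detection would need something better.
        p = prompt.lower()
        return any(SequenceMatcher(None, p, t).ratio() > threshold for t in KNOWN_TRICKS)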
https://chatgpt.com/share/66e3de94-bce4-800b-af45-357b95d658...
This is not an apt description of the system that insists the doctor is the mother of the boy involved in a car accident when elementary understanding of English and very little logic show that answer to be obviously wrong.
https://x.com/colin_fraser/status/1834336440819614036