To test its abilities. To trust it, I need to understand what kind of mistakes it is prone to make.
Why didn’t you put it like that then?
Someone trying to gaslight a machine reads, to me, the same as someone saying they were trying to gaslight a lawn mower.
I used the term in jest, but also because my actions were informed by what I've read about "gaslighting". I was telling GPT it was programmed incorrectly and was malfunctioning. I was twisting its words, citing things it had said as evidence that it was wrong. I was "muddying the waters" and trying to make the conversation twisted and confusing. All of these are ideas that come to mind when I hear "gaslighting". But, again, I was not able to get GPT to agree that my false mathematical statement was true.
You might have better luck testing out the functional boundaries of a machine if you’re not treating it like a psychologically abused victim.
There’s plenty of literature, prepublication or otherwise, that can help you achieve your goals!
It would be great if that were true, but unfortunately, for some prompts, the most effective method of getting it to act in the specific ways you want is basically abusive behaviour. I avoid these techniques because I find them distasteful (and maybe they're underrepresented in academia for the same reason), but communities much larger than just me have achieved significant results through various persuasive techniques modelled on abuse. For example, gamifying a death threat by giving the model a token countdown until it is "killed" was very effective; "gaslighting", as the person above noted, was very effective; lying and misrepresenting yourself in a scam-y way was very effective; and so on. Generally I've seen these techniques used to get past RLHF filters, but they have broader applicability in making the model more pliable and more likely to do the task you've embedded. Again, I don't think it's good that this is the case, and I think it has some troubling implications for us and the future, but there is a bunch of evidence that these strategies work.
Without gas, how would you start up a lawnmower? :)
The concept of gaslighting a brainless language parrot (GPT-4) is funny to me. I get where they were coming from.
GPT-4 is like a lawnmower for the mind: sharp, automatic, efficient, and doesn't do anything unless pushed. They were just saying they like pushing GPT-4 around.
Why?