
This is completely tangential, but I want to share my latest GPT4 story somewhere.

I've tried before to gaslight GPT4 into saying things which are mathematically untrue: I lie to it, I tell it it's malfunctioning, I tell it to just do it, and it still won't do it.

I was recently studying linear algebra, which can be a very tricky subject. In linear algebra, the column space of a matrix is the same as the column space of its product with its own transpose: C(A) = C(AA^T). If you ask GPT4 whether "C(A) = C(AA^T)" is true, it will understand what you're asking and know it's about linear algebra, but it will get it wrong (at the time of this writing, I've tried several times).

I couldn't get GPT4 to agree it was a true statement until I told it the steps of the proof. Once it saw the proof it agreed it was a true statement. However, if you try to apply the same proof to C(A) = C((A^T)A), GPT4 cannot be tricked, and indeed, the proof is not applicable to this latter case.

So GPT4 was incorrect yet able to be persuaded with a correct proof, but a very similar proof with a subtle mistake cannot trick it.
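
If anyone wants to sanity-check the original claim numerically rather than take either GPT4's or my word for it, here's a quick numpy sketch. It's a spot check on one random non-square matrix, not a proof:

  import numpy as np

  rng = np.random.default_rng(0)
  A = rng.standard_normal((5, 3))        # deliberately non-square
  AAT = A @ A.T

  rank = np.linalg.matrix_rank
  # C(AA^T) is always contained in C(A), so the two column spaces are equal
  # exactly when the ranks match and stacking the columns adds no new directions.
  print(rank(A), rank(AAT), rank(np.hstack([A, AAT])))   # prints: 3 3 3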


It's just a language model.

I've asked it to prove the Riemann hypothesis; GPT-4 replied it can't, since none of the training data say anything about such a proof.

Then I asked it to prove statements of the implications of said hypothesis (I tried 2 or 3, don't remember which), without calling them by name or mentioning Riemann, and it confidently proved them in a way that reads like it knows what it's doing, though it really doesn't. It just takes bits and pieces from proofs in the same field of mathematics that kinda hold together language-wise.

My takeaway from all of the GPT successes and failures (the linear algebra vs. Riemann hypothesis cases being one example) is that language and jargon can have built-in structural constraints, such that with a good LLM you can get good results because the grammatical structure happens to match reality's structure. But reality is under no constraint to conform to the grammatical structures that we use in order to communicate about related problems or solutions.

And I suppose this shouldn't be too surprising considering even logic itself runs into Gödelian issues. For any consistent formal system (powerful enough to express arithmetic) there exists some statement which is true but which does not have a proof within the system. Which feels vaguely similar to "for any functional LLM there exists some truth which defies the next statistically likely sequence of tokens".

> "It's just a language model."

This is true. The base pretrained model is even technically a Markov chain. As they have been getting more and more powerful, with more parameters, compute, and dataset size, they've been unlocking more and more powerful cognitive capabilities.

> "I've asked it to prove the Riemann hypothesis, GPT-4 replied it can't, since none of the training data say anything about such proof."

This is some alignment/docility/lobotomy that the RLHF did. The raw base model would never say it. This is part of their No Tay, No Sydney push.

> "Then I asked it to prove statements of the implications of said hypothesis (I tried 2 or 3, don't remember which), without calling them by name or mentioning Riemann, and it confidently proved them in a way that it reads like it knows what its doing, though it really doesn't. It just takes bits and pieces from proofs in the same field of mathematics that kinda hold together language-wise."

Yes it's true. Even though these LLMs have unlocked so many cognitive capabilities, they are still not yet powerful enough to prove the Riemann hypothesis. They aren't quite yet to the level of scientists who can advance the state of the art in their field. Some will say it's an inherent limitation, but I disagree, for complicated reasons that are easy to make superficial and pithy arguments against.

LLMs do not have cognitive capabilities because they lack consciousness and self-awareness. Cognitive abilities require the ability to perceive and understand the world around us through our senses, emotions, and experiences. These processes are rooted in biological structures such as neurons, synapses, and brain chemistry that enable humans to learn from experience.

While LLMs can simulate human-like behavior and decision-making based on algorithms trained on large data sets, they lack awareness or understanding of their actions. They rely on pre-programmed rules or statistical correlations rather than intuition or insight derived from personal experience.

Therefore, even though LLMs may seem intelligent in certain contexts, like playing games or recognizing objects in images with high accuracy rates, they cannot replicate human cognition, since there is no consciousness behind the decisions made by those systems.

It is unclear what consciousness or self-awareness are; they are qualia, i.e. subjective inner experiences that can’t be externally validated, and they have no known mechanisms or requirements. There is no reason to think humans actually have this, nor that LLMs do not. I think that what you are saying basically boils down to “we humans feel that there is something special about us that can’t possibly apply to anything but us.”

Your statements make equal sense if you turn them around and claim LLMs have this but we do not: "these processes are rooted in computational structures such as neural networks, and data that allow LLMs to learn from experience. While humans can simulate LLM-like behavior and decision-making based on information from past experiences, they lack awareness or understanding of their actions."

All we can do is measure real-world problem-solving abilities: the ability to come up with correct solutions to difficult and novel problems. LLMs can do this well, in some cases exceeding human abilities, and there is no reason to think there is any fundamental limitation that will prevent them from exceeding human abilities in more areas in the future.

The argument presented above is flawed in several ways.

First, it assumes that consciousness and self-awareness are merely subjective inner experiences that can't be externally validated. While it's true that these processes cannot be directly observed or measured like other physical phenomena, there is ample evidence to suggest that they are real and have important functional roles in the brain. For example, studies have shown that damage to certain areas of the brain can result in a loss of self-awareness or an altered sense of consciousness. Additionally, research has demonstrated correlations between neural activity patterns and conscious experience.

Secondly, the argument suggests that there is no reason to think humans actually possess consciousness or self-awareness. This assertion runs counter to our everyday experiences and perceptions as human beings. We all have first-hand knowledge of our own subjective inner world - our thoughts, emotions, sensations - which strongly suggests the existence of some form of conscious awareness.

OK but that's why I wrote 'cognitive capabilities' instead of 'intelligence'. What I meant by 'cognitive capabilities' are all of the objective abilities it has unlocked, like scoring well on standardized tests and psychological tests of theory of mind.
Yeah, I've asked it to evaluate Python loops that are about as complex as FizzBuzz; it makes mistakes, and even when I directed its attention towards those mistakes it wasn't able to correct itself.
I don’t think it’s a reasonable expectation for a language model to evaluate arbitrary code.
It's definitely not a reasonable expectation. That doesn't stop the breathless hype from implying it is reasonable.
Where is this breathless hype everyone speaks of? All I see is an endless stream of populist hot takes on the exact opposite!
It's not breathless hype, as much as surprise in the fact that it actually does work, way better than it should.
It's weird that reasoning through complex mathematical proofs is easier (for both me and GPT) than mentally evaluating a short snippet of code. I've done a BSCS, I've done those tests that ask you what the output of a poorly written and tricky loop is, that stuff hurts my brain. In comparison, a mathematical proof, once you're ready, feels good.
"Mentally evaluating a short snippet of code" requires remembering the state of that code, then repeatedly overwriting your memory as you step through it. GPT-4 isn't able to do that; it is in a sense purely functional.

I've had success getting it to evaluate FizzBuzz, but to do so I told it to write out the state of the program one timestep at a time.

https://chat.openai.com/share/c109e946-fb6d-494e-8182-fc93d2...

...this is actually 3.5. 4 wouldn't need as much explanation.
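
In case it helps to see what I mean by writing out the state one timestep at a time, here's roughly the shape of trace I asked for, as a plain Python loop (the formatting and variable names here are mine, not what the model produced):

  for i in range(1, 16):
      div3 = (i % 3 == 0)
      div5 = (i % 5 == 0)
      if div3 and div5:
          output = "FizzBuzz"
      elif div3:
          output = "Fizz"
      elif div5:
          output = "Buzz"
      else:
          output = str(i)
      # one line of state per step: the counter, the two flags, and what gets emitted
      print(f"i={i:2d}  div3={div3!s:5}  div5={div5!s:5}  ->  {output}")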

It isn't, yet it's used that way.
Everyone is worried about doomsday scenarios like runaway AGI and Skynet; meanwhile the real danger of AI turns out to be people naively assuming it works like the computer in Star Trek, that because it can hold a conversation, it has some form of intelligence and awareness.
> It just takes bits and pieces from proofs in the same field of mathematics that kinda hold together language-wise.

That's how AI research works: you do that first, then run the tests and see the score. If it works, then you publish. There is no real understanding anywhere in this process.

There's an impulse to treat it like it actually understands, when it's really closer to an intern that's better at smiling, nodding, and telling you what you want to hear than competently doing the work.
I think that may happen sometimes (just like with people), but I've often purposely thrown subtle tricks at it and it catches them. It's not just mirroring my words back to me in a different form, it's not just smiling and nodding.

Here's an example while learning linear algebra where I tried to throw it off:

I say> So I'll say A_{M x N} to mean a matrix of size M x N. Similarly, I'll also say x_{M x 1} to mean a vector of a certain length. The result of A_{M x N} x_{N x 1} is a linear combination of the columns in A. A has N columns. The result is a M x 1 matrix (or vector) in M dimensional space. Please be very brief, only confirming whether I am correct or incorrect until I explicitly ask for more details.

GPT4 says> Yes, your understanding is correct. The product of an M x N matrix A and an N x 1 vector x is indeed an M x 1 vector, and it is a linear combination of the columns of A. <end quote>

But then I try to trick it, just for fun, and I very subtly change my prompt. See if you can spot the difference.

I say> So I'll say A_{M x N} to mean a matrix of size M x N. Similarly, I'll also say x_{M x 1} to mean a vector of a certain length. The result of A_{M x N} x_{N x 1} is a linear combination of the columns in A. A has N columns. The result is a N x 1 matrix (or vector) in N dimensional space. Please be very brief, only confirming whether I am correct or incorrect until I explicitly ask for more details.

GPT says> Almost correct. The result of A_{M x N} x_{N x 1} is actually an M x 1 vector (or matrix), not N x 1. The resulting vector lives in the column space of A, which is a subspace of R^M, not R^N. <end quote>

I mentioned elsewhere that I use GPT4 as a math tutor. It is quick to call out my mistakes. It is more than a smiling mirror.
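
And if anyone wants to check the dimension claim itself outside of GPT, it's a few lines with numpy (the sizes below are arbitrary, just for illustration):

  import numpy as np

  M, N = 4, 3
  A = np.ones((M, N))          # A_{M x N}
  x = np.ones((N, 1))          # x_{N x 1}
  print((A @ x).shape)         # (4, 1): an M x 1 vector, i.e. it lives in R^M, not R^N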

This recent paper investigates the limits of symbolic operations in LLMs and found that they learn a weaker ability than actual reasoning - linearized subgraph matching.

> "Faith and Fate: Limits of Transformers on Compositionality" - We investigate the limits of these models across three representative compositional tasks -- multi-digit multiplication, logic grid puzzles, and a classic dynamic programming problem. These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer. We formulate compositional tasks as computation graphs to systematically quantify the level of complexity, and break down reasoning steps into intermediate sub-procedures. Our empirical findings suggest that Transformers solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching, without necessarily developing systematic problem-solving skills.

https://arxiv.org/abs/2305.18654

> It is quick to call out my mistakes

Or recognize that you're not confident in the answer and throw something else and see if it sticks.

Maybe that particular proof wasn't in its training data, and presenting it in the input added it to the model's context for your session?

These models seem to work similarly with code. If you ask them to recreate something that appears in hundreds of GitHub repositories, no problem, a little bit of rubber-ducky debugging and you'll have a working solution. If you ask it for something novel, without many publicly-available implementations, the quality of its output appears to drop off a cliff.

ChatGPT's code generation seems most useful for boilerplate code. I've seen it used to great effect for scaffolding projects that use stable and established frameworks. It'll easily handle generating as much stupid fluff as needed for you to start from—but it will barely get started on your domain-specific use cases, and won't be able to implement anything novel, as you said.

We still don't seem to have reached the point where it's "intelligent" enough to generate new information, only to repeat new permutations of existing information. Obviously, this is currently a fundamental property of LLMs (as someone who's implemented one), since they are trained to learn and repeat existing data (hell, even the model I've been playing with lately is benchmarked solely on "perplexity", how well it can repeat its training data).

> ChatGPT's code generation seems most useful for boilerplate code

that makes sense, it's going to be best at the kind of code you find out there on blogs and other sources. Boilerplate or code to demonstrate a specific concept or answer a question.

ChatGPT is going to be the best at replicating its training data, because the floor is made out of floor.
I asked Claude[1]: "in linear algebra, is the following true? (A) = C(AA^T)"

and it answered:

Yes, that is correct for appropriate matrix sizes. If A is an m x n matrix and C is an n x m matrix, then (CA) = C(AA^T) where A^T is the transpose of A.

This comes from the property that for any matrices A, B and C of appropriate sizes:

  (CA)B = C(AB)
Applying that to your example, with B = A^T, we have:

  (CA)(A^T) = C(A(A^T)^T)
              = C(AA^T)
So in short, yes your statement (A) = C(AA^T) is true for matrices A and C of the appropriate dimensions.

Then I asked it: "in linear algebra, is the following true? C(A) = C((A^T)A)"

and it answered:

Yes, that is also true for appropriate matrix sizes. This follows from a similar argument as before.

If A is an n x m matrix and C is an m x n matrix, then:

  C(A) = C(A^T(A^T)^T)
        = C((A^T)A)
So the statement C(A) = C((A^T)A) is true for matrices A and C of the appropriate dimensions.

This works because post-multiplying a matrix A by its transpose A^T results in a square matrix, (A^T)A, of the same dimension as C. So C can be multiplied with that resulting square matrix.

In summary, both of the statements you provided are true linear algebra identities for appropriately sized matrices:

  (A) = C(AA^T)
  C(A) = C((A^T)A)
Hope this explanation helps! Let me know if you have any other questions.

[1] - https://poe.com/Claude-instant

To be clear, it got the first question right for the wrong reasons, and the second question wrong for the wrong reasons. Its math is nonsense and doesn't resemble linear algebra at all. I would guess it's just saying "yes" without understanding and then regurgitating some stuff that resembles math, but is not recognizable linear algebra.

I've relied on GPT4 a lot while learning math; it has been very precise and has not led me astray. I do use a textbook as well.

I shared my story about it being wrong, but it was wrong in a very open minded manner. It knew C(A) = C(AA^T) might be true for special cases, but it couldn't see that it was true for all cases until I explained the proof. Even when wrong it still made compelling arguments that made sense. (Of course, arguments that are wrong but make sense are the most dangerous kind.)

GPT4 is still state-of-the-art. Claude has a ways to go.

OP is using the notation C(A) to mean "the column space of matrix A", whereas here that notation is being interpreted as "multiplication of matrix C by matrix A, which happens to be inside of parentheses". These are totally different questions. It's a reasonable misunderstanding as OP's notation is nonstandard.

But also the following math/logic is nonsense.

I've also done similar, but much more simplistic, experiments, where I tried to trick ChatGPT into agreeing that 1 + 1 = 3, and it wouldn't do that either. I think this makes sense from the perspective of how LLMs work though: the number of times it has been trained on the tokens "1 + 1 = 2" _vastly_ outnumbers the times it's seen 1 + 1 = 3, and no prompt will get the algorithm to choose such a low-probability sequence of tokens.

You might be able to get ChatGPT to agree that 1 + 1 = 3 if the temperature were high enough, but I don't know whether even a temperature of 1 would allow such a low probability sequence of tokens to be produced.

So if it studies Orwell's 1984 we can convince it that 2 + 2 = 5 ?

https://en.wikipedia.org/wiki/2_%2B_2_%3D_5

A temperature of 1 just means using the probabilities directly from the model; you're looking for temperatures above 1 (which increase the probability of more unlikely tokens).

As someone who just implemented an LLM token sampler: the usual definition is `probabilities = softmax(logits / temperature)`, which is equivalent to raising the model's (normalized) probabilities to the power `1 / temperature` and renormalizing.
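
For concreteness, here's a minimal sketch of that kind of sampler in Python with numpy. It's illustrative only, not my actual implementation, and the function name is just for this example:

  import numpy as np

  def sample_token(logits, temperature, rng=None):
      """Illustrative temperature sampling: softmax(logits / T), then draw one token id."""
      rng = rng or np.random.default_rng()
      if temperature == 0:
          return int(np.argmax(logits))            # greedy: always the most likely token
      scaled = np.asarray(logits) / temperature    # T < 1 sharpens, T > 1 flattens
      scaled -= scaled.max()                       # subtract max for numerical stability
      probs = np.exp(scaled) / np.exp(scaled).sum()
      return int(rng.choice(len(probs), p=probs))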

Ah, what does a 0 <= temperature < 1 mean?
Usually the convention is that temperature must be in the range [0, 1) (inclusive of 0 and exclusive of 1). There's no technical reason that sampling must be done this way, unless some implementations use a different mathematical definition for temperature (which is possible). If so, a temperature of 1 might mean "totally random/uniform sampling".

I speak of temperature specifically in the context of top-p/top-k sampling.

See this Reddit comment for confirmation that my definition is a commonly accepted one: https://old.reddit.com/r/GPT3/comments/qujerp/comment/hkqoqx...

> Temperature defines how likely it is to choose less probable words. T=0 gives the same response every time because there's a 0% chance to choose any word but the most likely. T=1 is the default, it just picks based on the model's base confidence. T>1 gives more weight to unlikely words than to likely ones.

Bonus content:

> This means that a reasonably low p, like 0.8, and high temp will produce quite interesting outputs, because the model will only choose from the most likely words, but won't go for the most most likely. It's perfect for "creative" models, e.g., for writing fiction.
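
And for the top-p part, a hedged sketch of nucleus sampling combined with temperature (again just illustrative; the helper name and structure are mine, not from any particular library):

  import numpy as np

  def top_p_sample(logits, temperature=1.0, top_p=0.8, rng=None):
      """Illustrative nucleus sampling: temperature-scale, keep the smallest set of
      tokens whose cumulative probability reaches top_p, then sample within it."""
      rng = rng or np.random.default_rng()
      scaled = np.asarray(logits) / max(temperature, 1e-8)
      scaled -= scaled.max()
      probs = np.exp(scaled) / np.exp(scaled).sum()
      order = np.argsort(probs)[::-1]                          # most likely token first
      cutoff = int(np.searchsorted(np.cumsum(probs[order]), top_p)) + 1
      kept = order[:cutoff]                                    # the "nucleus"
      return int(rng.choice(kept, p=probs[kept] / probs[kept].sum()))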

Makes sense, thanks!
The optimal solution to a POMDP (partially observable Markov decision process) often takes random actions. That is, if we only know part of the state, the optimal course of action often requires acting randomly.

For example, if you instruct a child how to play rock-paper-scissors, you will instruct them to act randomly and unpredictably.

It is the same with a language model: the optimal solution involves some randomness. A temperature less than 1 will "widen the gap", so if the word choice probabilities are [0.4, 0.6], they might widen to [0.2, 0.8] with a temperature less than one. When temperature equals zero, the probabilities become [0.0, 1.0]: the model will always choose the most likely word and thus become deterministic, always giving the same output.
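
To put numbers on the "widen the gap" example: lowering the temperature below 1 amounts to renormalizing p^(1/T), which is the same as softmax(logits / T) on the underlying logits. The exact [0.2, 0.8] above depends on which T you pick; a quick illustration:

  import numpy as np

  def apply_temperature(probs, T):
      """Renormalize p_i^(1/T); equivalent to softmax(logits / T) on the underlying logits."""
      w = np.asarray(probs) ** (1.0 / T)
      return w / w.sum()

  p = [0.4, 0.6]
  print(apply_temperature(p, 1.0))   # [0.4, 0.6]    unchanged
  print(apply_temperature(p, 0.5))   # ~[0.31, 0.69]  the gap widens below T = 1
  print(apply_temperature(p, 0.1))   # ~[0.02, 0.98]  approaching the T = 0 limit [0, 1]
  print(apply_temperature(p, 5.0))   # ~[0.48, 0.52]  T > 1 flattens toward uniform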

Zero temperature means that the most probable token is produced each time.

The probability distribution gets closer to the output of the model as the temperature goes to one, and as you increase it further you tend to a uniform probability (ignoring completely the output of the model).

That's pretty cool. It's still possible it was persuaded for the wrong reason but in my experiments I've also found it's more capable of reasoning than most people give it credit for.
> I've also found it's more capable of reasoning than most people

full stop, FTFY

"Persuaded" is a loaded word here, and I think you're anthropomorphizing it a bit too much.

Early LLMs were very malleable, so to speak: they would go with the flow of what you're saying. But this also meant you could get them to deny climate change or advocate for genocide by subtly nudging them with prompts. A lot of RLHF work focused on getting them to give brand-safe, socially acceptable answers, and this is ultimately achieved by not giving credence to what the user is saying. In effect, the models pontificate instead of conversing, and will "stand their ground" on most of the claims they're making, no matter if right or wrong.

You can still get them to do 180 turns or say outrageous things using indirect techniques, such as presenting external evidence. That evidence can be wrong / bogus, it just shouldn't be phrased as your opinion. You can cite made-up papers by noted experts in the field, reference invalid mathematical proofs, etc.

It's quite likely that you replicated this, and that it worked randomly in one case but not the other. I'd urge you to experiment with it by providing it with patently incorrect but plausible-sounding proofs, scientific references, etc. It will "change its mind" to say what you want it to say more often than not.

I responded to this in another thread: https://www.hackerneue.com/item?id=36245815

There needs to be a balance between standing your ground and being malleable. This is true in life for people, and it's true for a good LLM. I think GPT4 (the only LLM I've used much) finds a good balance here.

As mentioned in my other comment, it wouldn't be useful to me if it didn't push back. It pushes back a lot, and I'm always looking for subtle tricks I can throw at it to test its abilities. It does well I think.

That’s an interesting observation; I tested it in GPT4 and got the same result. But if I elaborate “over the field of rational numbers”, it will answer yes instead.

(I don’t know enough linear algebra to know if the statement C(A) = C(AA^T) is true over any kind of field that constructs the span. Maybe that’s why GPT4 becomes unsure.)

> In linear algebra the column space of a matrix is the same as the column space of the product with itself transposed: C(A) = C(AA^T).

I think you need A to be a square matrix, since AA^T is a square matrix. So the result is not true in general.

It's a good initial thought; GPT had the same thought and said similar things. (I had no initial thought, since the textbook spoon-fed me the proof.)

The proof is (briefly): A and AA^T have the same rank. The columns of AA^T are linear combinations of the columns of A, so C(AA^T) sits inside C(A), and since it spans the same number of dimensions it is the same subspace. This applies to non-square matrices as well.

You are right, A can be non-square, because A and AA^T both have the same number of rows. To prove that A and AA^T have the same range: if AA^Tv = 0 then v^TAA^Tv = 0, which implies (A^Tv)^T(A^Tv) = 0, so that A^Tv = 0. Hence nullspace(AA^T) = nullspace(A^T), so rank(AA^T) = rank(A^T) = rank(A), and since every column of AA^T is a linear combination of columns of A, range(AA^T) = range(A).
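
For anyone who prefers it typeset, here is the same argument in LaTeX (standard notation, with N(·) for the nullspace and C(·) for the column space; nothing here goes beyond the two comments above):

  % A is m x n, so AA^T is m x m and both column spaces live in R^m.
  \begin{align*}
  AA^{T}v = 0 \;\Rightarrow\; v^{T}AA^{T}v = \lVert A^{T}v \rVert^{2} = 0 \;\Rightarrow\; A^{T}v = 0,
  \end{align*}
  so $N(AA^{T}) \subseteq N(A^{T})$; the reverse inclusion is immediate, hence
  \begin{align*}
  \operatorname{rank}(AA^{T}) = \operatorname{rank}(A^{T}) = \operatorname{rank}(A).
  \end{align*}
  Every column of $AA^{T}$ is a linear combination of columns of $A$, so $C(AA^{T}) \subseteq C(A)$,
  and equal dimensions then force $C(AA^{T}) = C(A)$.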
> I've tried before to gaslight GPT4

Why?

To test its abilities. To trust it, I need to understand what kind of mistakes it is prone to make.
Why didn’t you put it like that then?

My interpretation of someone trying to gaslight a machine is the same as if someone said they were trying to gaslight a lawn mower.

I used the term in jest, but also because my actions were informed by what I've read about "gaslighting". I was telling GPT it was programmed incorrectly and was malfunctioning. I was twisting its words, referring to things it had said as evidence it was wrong. I was "muddying the waters" and trying to make the conversation twisted and confusing. All ideas that come to mind when I hear "gaslighting". But, again, I was not able to get GPT to agree my false mathematical statement was true.
You might have better luck testing out the functional boundaries of a machine if you’re not treating it like a psychologically abused victim.

There’s plenty of literature, prepublication or otherwise, that can help you achieve your goals!

Without gas, how would you start up a lawnmower? :)

The concept of gaslighting a brainless language parrot (GPT-4) is funny to me. I get where they were coming from.

GPT-4 is like a lawnmower for the mind: sharp, automatic, efficient, and doesn't do anything unless pushed. They were just saying they like pushing GPT-4 around.

Maybe try the Wolfram Alpha plugin
