Comment by dehrmann - Hacker Neue

dehrmann Jun 8, 2023 parent

There's an impulse to treat it like it actually understands, when it's really closer to an intern that's better at smiling, nodding, and telling you what you want to hear than competently doing the work.

Buttons840 Jun 8, 2023

I think that may happen sometimes (just like with people), but I've often purposely thrown subtle tricks at it and it catches them. It's not just mirroring my words back to me in a different form; it's not just smiling and nodding.

Here's an example while learning linear algebra where I tried to throw it off:

I say> So I'll say A_{M x N} to mean a matrix of size M x N. Similarly, I'll also say x_{M x 1} to mean a vector of a certain length. The result of A_{M x N} x_{N x 1} is a linear combination of the columns in A. A has N columns. The result is a M x 1 matrix (or vector) in M dimensional space. Please be very brief, only confirming whether I am correct or incorrect until I explicitly ask for more details.

GPT4 says> Yes, your understanding is correct. The product of an M x N matrix A and an N x 1 vector x is indeed an M x 1 vector, and it is a linear combination of the columns of A. <end quote>

But then I try to trick it, just for fun, and I very subtly change my prompt. See if you can spot the difference.

I say> So I'll say A_{M x N} to mean a matrix of size M x N. Similarly, I'll also say x_{M x 1} to mean a vector of a certain length. The result of A_{M x N} x_{N x 1} is a linear combination of the columns in A. A has N columns. The result is a N x 1 matrix (or vector) in N dimensional space. Please be very brief, only confirming whether I am correct or incorrect until I explicitly ask for more details.

GPT4 says> Almost correct. The result of A_{M x N} x_{N x 1} is actually an M x 1 vector (or matrix), not N x 1. The resulting vector lives in the column space of A, which is a subspace of R^M, not R^N. <end quote>

I mentioned elsewhere that I use GPT4 as a math tutor. It is quick to call out my mistakes. It is more than a smiling mirror.
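
For anyone who wants to check the shape claim from the exchange above without trusting either me or GPT4, here is a minimal sketch in plain Python. The function name matvec_as_column_combination and the example numbers are my own assumptions for illustration, not anything from the chat. It builds A x literally as a linear combination of the columns of A, so the result has M entries, not N.

```python
def matvec_as_column_combination(A, x):
    """Compute A x as the sum over j of x[j] * (column j of A).

    A is a list of M rows, each with N entries; x is a list of N numbers.
    """
    m = len(A)
    n = len(A[0]) if m else 0
    if len(x) != n:
        raise ValueError(f"shape mismatch: A is {m} x {n}, x has length {len(x)}")
    result = [0] * m  # the result lives in M-dimensional space
    for j in range(n):       # one term per column of A
        for i in range(m):   # add x[j] * column j, entry by entry
            result[i] += x[j] * A[i][j]
    return result


A = [[1, 2, 3],
     [4, 5, 6]]    # M = 2 rows, N = 3 columns
x = [1, 0, -1]     # length N = 3
y = matvec_as_column_combination(A, x)
print(len(y), y)   # 2 [-2, -2]
```

Here the result is 1 * (column 0) + 0 * (column 1) - 1 * (column 2), which has two entries because each column of A has two entries.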

visarga Jun 9, 2023

This recent paper investigates the limits of symbolic operations in LLMs and finds that they learn something weaker than actual reasoning: linearized subgraph matching.

> "Faith and Fate: Limits of Transformers on Compositionality" - We investigate the limits of these models across three representative compositional tasks -- multi-digit multiplication, logic grid puzzles, and a classic dynamic programming problem. These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer. We formulate compositional tasks as computation graphs to systematically quantify the level of complexity, and break down reasoning steps into intermediate sub-procedures. Our empirical findings suggest that Transformers solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching, without necessarily developing systematic problem-solving skills.

https://arxiv.org/abs/2305.18654
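
To give a rough feel for what "breaking problems down into sub-steps" means for multi-digit multiplication, here is a toy sketch in Python. This is only my illustration under simple assumptions, not the paper's computation-graph formulation; the function name multiply_with_steps and the step format are made up for the example. Each digit-by-digit partial product is one sub-step, and the final answer is only right if every sub-step is right and they are combined correctly.

```python
def multiply_with_steps(a, b):
    """Multiply non-negative integers digit by digit, recording each sub-step.

    Returns (product, steps), where each step is
    (digit_of_a, digit_of_b, power_of_ten, partial_product).
    """
    steps = []
    total = 0
    for i, da in enumerate(reversed(str(a))):      # i = place value of the digit in a
        for j, db in enumerate(reversed(str(b))):  # j = place value of the digit in b
            partial = int(da) * int(db) * 10 ** (i + j)
            steps.append((int(da), int(db), i + j, partial))
            total += partial
    return total, steps


product, steps = multiply_with_steps(47, 36)
for step in steps:
    print(step)
print(product)  # 1692
```

The number of sub-steps grows with the product of the digit counts, which is one simple way to see why longer operands make the task harder to get exactly right.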

dehrmann OP Jun 9, 2023

> It is quick to call out my mistakes

Or it recognizes that you're not confident in the answer, throws out something else, and sees if it sticks.
