semiquaver
LLMs are not inherently nondeterministic. Batching, temperature, and other factors make them appear so when run by large providers, but a locally run LLM at zero temperature will always produce the same output given the same input.
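A minimal sketch of the point (toy model, hypothetical names, not a real LLM): at temperature zero, decoding reduces to a pure argmax over the logits, so a fixed model and prompt always yield the same tokens, as long as the logits themselves are computed deterministically.

```python
# Toy illustration: greedy (temperature-0) decoding is a pure argmax,
# so identical inputs give identical outputs when the logits function
# is deterministic.

def greedy_decode(logits_fn, prompt, steps):
    """Repeatedly append the highest-scoring token (greedy / temp-0)."""
    tokens = list(prompt)
    for _ in range(steps):
        logits = logits_fn(tokens)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens

def toy_logits(tokens):
    """Stand-in for a model's forward pass; depends only on the last token."""
    last = tokens[-1]
    return [((last * 31 + i * 7) % 97) / 97.0 for i in range(5)]

run_a = greedy_decode(toy_logits, [1, 2], 10)
run_b = greedy_decode(toy_logits, [1, 2], 10)
assert run_a == run_b  # same input, same output, every time
```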

oytis
That's an improvement, but they are still "chaotic" in the sense that small changes in the input can cause unpredictably large changes in the output.
behnamoh
Yes, this paper says exactly what you talked about: https://arxiv.org/abs/2404.01332
lmeyerov
That assumes they were implemented with deterministic operators, which isn't the default assumption when using neural-network libraries on GPUs. Think random seeds, cuBLAS optimizations: you can configure all of these things, but I wouldn't assume it, especially in GPU-optimized OSS.
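The nondeterminism described here ultimately comes down to floating-point addition not being associative: a GPU kernel that accumulates a parallel reduction in a different order across runs can produce slightly different results. A minimal pure-Python illustration of the underlying effect:

```python
# Float addition is not associative, so a sum's result depends on the
# order in which terms are accumulated -- exactly what can vary when a
# GPU schedules a parallel reduction differently between runs.
a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)   # 0.6
assert a != b

# Larger demo: the same multiset of values, summed in two orders.
vals = [1e16, 1.0, -1e16, 1.0] * 1000
left_to_right = 0.0
for v in vals:
    left_to_right += v          # small terms are absorbed and lost
reordered = sum(sorted(vals))   # different order, different rounding
assert left_to_right != reordered
```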
