(Unbelievable that I need to justify this at -4!)
Is this model's performance or accuracy better on FrontierMath or Multi-SWE-bench, given the training in 1,000 languages?
I just read in the Colab release notes that models uploaded to HuggingFace can be opened in Colab with an "Open in Colab" button on HuggingFace.
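If you'd rather do it by hand in a notebook cell, here's a minimal sketch using the transformers pipeline API; the model id "example-org/physics-llm" is a placeholder, not a real checkpoint:

    # Minimal Colab cell: load a HuggingFace-hosted model and generate text.
    # "example-org/physics-llm" is a placeholder model id, not a real checkpoint.
    from transformers import pipeline

    generator = pipeline("text-generation", model="example-org/physics-llm")
    out = generator("State the Einstein field equations.", max_new_tokens=64)
    print(out[0]["generated_text"])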
It's the word "gravity" that triggers them.
LLMs do seem to favor general relativity, but trained on the corpora of the time they probably would've favored classical mechanics (the weak-field correspondence between the two is sketched below).
Not yet unified: quantum gravity, QFT; "A unified model must:" https://www.hackerneue.com/item?id=44289148
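For the GR-vs-classical point above, the standard textbook reason both theories answer "gravity" questions: in the weak-field, slow-motion limit GR reduces to Newtonian mechanics. A sketch (signature (-,+,+,+)), not anything specific to this model:

    % Weak-field expansion of the metric about flat spacetime
    \[ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad |h_{\mu\nu}| \ll 1, \qquad h_{00} = -\frac{2\Phi}{c^2} \]
    % Geodesics reduce to Newton's second law; Einstein's equations to Poisson's
    \[ \frac{d^2 x^i}{dt^2} \approx -\partial_i \Phi, \qquad \nabla^2 \Phi = 4\pi G \rho \]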
I'll be interested to see how this model responds to currently unresolved issues in physics. Does it take an open-world or closed-world stance, and/or offer a conditioned disclaimer that encourages progress?
What are the current benchmarks?
From https://www.hackerneue.com/item?id=42899805 re: "Large Language Models for Mathematicians" (2023):
> Benchmarks for math and physics LLMs: FrontierMath, TheoremQA, Multi SWE-bench: https://www.hackerneue.com/item?id=42097683
Multi-SWE-bench: A Multi-Lingual and Multi-Modal GitHub Issue Resolving Benchmark: https://multi-swe-bench.github.io/
Add'l LLM benchmarks and awesome lists: https://www.hackerneue.com/item?id=44485226
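A sketch of what actually running one of these benchmarks looks like with EleutherAI's lm-evaluation-harness (pip install lm-eval); the model id is a placeholder, and "hendrycks_math" stands in for whatever math task your harness version ships (FrontierMath itself is held out and not a public harness task):

    # Sketch: evaluate a HuggingFace model on a public math benchmark task.
    # Model id is a placeholder; check `lm-eval --tasks list` for task names.
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=example-org/physics-llm",
        tasks=["hendrycks_math"],
        num_fewshot=4,
    )
    print(results["results"])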
Microsoft has a new datacenter design that you don't have to keep adding water to, which spares the aquifers.
How can this LLM be used to solve the energy and sustainability problems that all LLMs exacerbate? Solutions for the Global Goals, hopefully.