- On your site you make the claim that: "Our thesis is that there is 100 years of physics and math research that has gone unnoticed by the CS/ML communities and we intend to rectify that."
Extraordinary claims require extraordinary evidence. Especially considering that a decent fraction of the CS/ML researchers that I know have solid physics and math backgrounds. Just off the top of my head, Marcus Hutter, David MacKay, Bernhard Scholkopf, Alex Smola, Max Welling, Christopher Bishop, etc. are/were prominent researchers with strong math and physics backgrounds. More recently, Jared Kaplan and Dario Amodei at Anthropic also have physics backgrounds, as do plenty of people at DeepMind.
To claim that you have noticed something in "100 years of physics and math research" that all of those people (and more) have missed but you have spotted is pure hubris.
- I also had to convince my parents to buy me books about fractals. My prized possession as a 15 year old was a copy of Mandelbrot's "Fractal Geometry of Nature". A lot of it went over my head but it had some gorgeous colour plates and interesting sections. I still have it at home some 35 years later.
That also inspired me to write IFS code for ferns, Sierpinski gaskets, and Menger sponges in 68k assembler (after realizing AmigaBASIC was too slow).
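For anyone curious, here's roughly what that looks like today — a minimal Python sketch of Barnsley's fern IFS via the "chaos game" (the coefficients are Barnsley's published ones, not whatever I had in my assembler version):

```python
import random

# Barnsley's fern IFS: four affine maps (a, b, c, d, e, f) chosen
# with the given probabilities p, applied repeatedly to a point.
MAPS = [
    (0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01),   # stem
    (0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85),   # successively smaller leaflets
    (0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07),   # largest left-hand leaflet
    (-0.15, 0.28,  0.26, 0.24, 0.0, 0.44, 0.07),   # largest right-hand leaflet
]

def fern(n=100_000):
    x, y = 0.0, 0.0
    points = []
    for _ in range(n):
        a, b, c, d, e, f, _p = random.choices(MAPS, weights=[m[6] for m in MAPS])[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points
```

Plot the (x, y) pairs and the fern appears; the Sierpinski gasket is the same chaos game with three simpler maps.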
- Thanks, I hadn't seen that term before.
Just to be clear, I wasn't claiming that "communicating clearly" is a new idea in software engineering, I'm mainly commenting on how effective embracing it can be.
When doing math, pretty much every term is "load-bearing" in that arguments will make use of specific aspects of a concept and how it relates to other concepts.
If you look at most graduate-level math textbooks or papers, they typically start with a whole bunch of numbered definitions that reference each other, followed by some simple lemmas or propositions that establish simple relationships between them before diving into more complex theorems and proofs.
The best software projects I've seen follow a roughly similar pattern: there are several "core" functions or libraries with a streamlined API, good docs, and solid testing; on top of that there are more complex processes that treat these as black-boxes and can rely on their behavior being well-defined and consistent.
The common thread between math and programming is probably that both lean heavily on abstraction as a core principle.
- A lot of those skills come from thinking about development in a team as a system and asking: where do things frequently go wrong or take too long?
Practice clearly and concisely expressing what you understand the problem to be. This could be a problem with some code, some missing knowledge, or a bad process.
Check to see whether everyone understands and agrees. If not, try to target the root of the misunderstanding and try again. Sometimes you’ll need to write a short document to make things clear. Once there is a shared understanding then people can start talking about solutions. Once everyone agrees on a solution, someone can go implement it.
Like any skill, if you practice this loop often enough and take time to reflect on what worked and what didn’t, you slowly find that you develop a facility for it.
- Really nice presentation of divergences. I made a similar attempt (math + interactive JS visualizations) a long time ago for the Bregman divergences: https://mark.reid.name/blog/meet-the-bregman-divergences.htm...
These visualizations are much nicer than mine though.
Curious fact: the Bregman divergences are a different class of divergences from the f-divergences, and the two classes intersect at the KL divergence. That is, KL is (essentially) the only divergence that is both an f-divergence and a Bregman divergence. This is basically because log turns a ratio into a difference.
- These are definitely thought-provoking questions and there are branches of mathematical philosophy such as constructivism and mathematical intuitionism that explore these.
Even if computation is not directly part of the laws of physics, knowing that humans and our computers are limited to things that are finite and computable might place limits on how we can appreciate how the universe works.
This is kind of a digression but if you (or others) are interested in some examples of things that are right on the edge of these questions you should check out the [busy beaver function](https://www.quantamagazine.org/amateur-mathematicians-find-f...). This tells you the maximum number of steps an n-state Turing machine can take before halting (among those machines that do halt).
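To make that concrete, here's a tiny sketch (Python, just for illustration) simulating the known 2-state, 2-symbol champion machine, which gives BB(2) = 6:

```python
# The 2-state busy beaver champion: among all 2-state, 2-symbol Turing machines
# that halt on a blank tape, this one runs the longest (6 steps), so BB(2) = 6.
# Transitions are (state, symbol read) -> (symbol to write, head move, next state).
RULES = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "HALT"),
}

def run(rules, max_steps=1000):
    tape, head, state, steps = {}, 0, "A", 0
    while state != "HALT" and steps < max_steps:
        write, move, state = rules[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        steps += 1
    return steps, sum(tape.values())

print(run(RULES))   # (6, 4): halts after 6 steps with four 1s written
```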
- Sure, “arbitrary” is a suitable term here too. A random process is just one way to generate an arbitrary function.
My use of “random” here was referring to the coin-flipping process and was more for building intuition than precisely specifying all the other non-expressible functions. I was trying to allude to the fact that these other types of functions don’t have any easily expressible process behind them. When I’ve taught this stuff in the past I’ve found that people latch onto coin flipping more easily than imagining some arbitrary assignment of values.
For what it’s worth, I used to be a researcher in probability and information theory and have published papers on those topics so I am aware of the various technical definitions of randomness (Kolmogorov axioms, algorithmic probability theory, etc.)
I think you’re right about my comment being a little too lengthy for most people to find useful. I started explaining this stuff and got carried away.
- I think you are experiencing the same confusion I felt when I first started thinking about the difference between a program and a function.
The set of all possible mappings from all possible finite strings to booleans is definitely *not* countable.
What I (and the article) mean by a "function" `f : string -> boolean` here is any arbitrary assignment of a single boolean value to every possible string. Let's consider two examples:
1. Some "expressible" function like "f(s) returns True if the length of s is odd, and returns False otherwise".
2. Some "random" function where some magical process has listed out every possible string and then, for each string, flipped a coin and assigned that string True if the coin came up heads, and False if it came up tails and wrote down all the results in a infinite table and called that the function f.
The first type of "expressible" function is the type that we most commonly encounter and that we can implement as programs (i.e., a list of finitely many instructions to go from string to boolean).
Nearly all of the second type of function -- the "random" ones -- cannot be expressed by a program. The only way to capture such a function's behavior is to write down the infinite look-up table that assigns each string a boolean value.
Now you are probably asking, "How do you know that the infinite tables cannot be expressed using some program?" and the answer is because there are too many possible infinite tables.
To give you an intuition for this, consider all the ways we can assign boolean values to the strings "a", "b", and "c": there are 2^3 = 8 of them. For any finite set X of n strings there will be 2^n possible tables that assign a boolean value to each string in X. The critical thing here is that 2^n is strictly larger than n for all n >= 1.
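If it helps, here's that finite counting spelled out as a quick Python sketch (just enumerating all the tables over those three strings):

```python
from itertools import product

strings = ["a", "b", "c"]

# Every assignment of a boolean to each string is one "table"; there are 2**3 = 8.
tables = [dict(zip(strings, bits)) for bits in product([False, True], repeat=len(strings))]

print(len(tables))   # 8
print(tables[5])     # e.g. {'a': True, 'b': False, 'c': True}
```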
This fact -- that there are more tables mapping strings to booleans than there are strings -- still holds even when there are infinitely many strings. What exactly we mean by "more" here is what Cantor and others developed. They said that a set A has more things than a set B if, however you pair things from A with things from B, there will always be things in A left over (i.e., not paired with anything from B).
Cantor's diagonalization argument applied here is the following: let S be the set of all finite strings and F be the set of all functions/tables that assign a boolean to each element in S (this is sometimes written F = 2^S). Now suppose F were countable. By definition, that would mean there is a pairing that assigns each natural number to exactly one function from F with none left over. The set S of finite strings is also countable, so there is also a pairing from natural numbers to all elements of S. This means we can pair each element of S with exactly one element of F by looking up the natural number n assigned to s \in S and pairing s with the element f \in F that was also assigned to n. Crucially, assuming the countability of F means that if you give me a string s then there is always a single f_s that is paired with s. Conversely, if you give me an f \in F there must be exactly one string s_f that is paired with that f.
We are going to construct a new function f' that is not in F. The way we do this is by defining f'(s) = not f_s(s). That is, f'(s) takes the input string s, looks up the function f_s that is paired with s, calculates the value of f_s(s) then flips its output.
Now we can argue that f' cannot be in F since it is different from every function in F. Why? Well, suppose f' were actually some f \in F. Then, since F is countable, we know it must have some paired string s; that is, f' = f_s for some string s. Now if we look at the value of f'(s) it must be the same as f_s(s) since f' = f_s. But also f'(s) = not f_s(s) by the way we defined f', so we get that f_s(s) = not f_s(s), which is a contradiction. Therefore f' cannot be in F and thus our premise that F was countable must be wrong.
Another way to see this is that, by construction, f' differs from every function f in F, specifically on the input string s that is paired with f.
Thus, the set F of all functions from strings to booleans must be uncountable.
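If you prefer code to prose, here's the same diagonal construction run against a toy, finite "enumeration" in Python (the real argument is about infinite sets, of course, but the mechanics are identical):

```python
# A toy version of the diagonal argument: given any pairing of strings with
# functions, build a function f' that disagrees with every paired function
# on its own string, so f' cannot appear anywhere in the pairing.
paired = {
    "a":   lambda s: True,              # f_"a"
    "ab":  lambda s: len(s) % 2 == 0,   # f_"ab"
    "abc": lambda s: s.startswith("a"), # f_"abc"
}

def f_prime(s):
    return not paired[s](s)

for s in paired:
    assert f_prime(s) != paired[s](s)   # f' differs from f_s at s, so f' != f_s
```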
- I had exactly the same reaction, which prompted me to write this comment: https://www.hackerneue.com/item?id=44122045
- This is a really nice explanation of decidability. One extra thing it might be worth mentioning is that there are many more functions `f : string -> boolean` than there are programs that implement those functions.
When I first encountered this topic I had trouble intuitively understanding how an `IS_HALTING` function could fail to exist when it is, after all, just a function that takes in a string (representing a program plus its inputs) and outputs True or False depending on whether that program halts or not.
The argument in the article does a great job of showing that `IS_HALTING` cannot exist because it is in some sense "too powerful" but that means there is a mapping f : strings -> boolean that cannot be represented as a program, which seems weird if you've been programming for ages and every function you encounter is expressed as a program.
The result becomes less weird when you realize that almost all functions from string -> boolean are not expressible as programs. Why? Well, there are countably many programs, since there are only countably many finite-length strings and every program, by definition, is a finite-length string. However, there are uncountably many functions from string -> boolean, since these functions map one-to-one to sets of strings (just let the set be all inputs that map to True) and the set of all sets of strings has uncountable cardinality.
This is essentially due to Cantor's diagonalization argument which shows you cannot put all elements in a set X into a 1-1 correspondence with all the subsets of X, even when X is countably infinite. This fact is at the heart of a lot of these computability results since it shows there is a gap between all functions (= any arbitrary subset of finite strings) and a program (= a finite string).
- The program Racter (which, from what I understand, was a basic MDP) was generating poetry in the 1980s that was published, read, criticized, and presented in museums: https://www.101bananas.com/poems/racter.html
I remember this because one of its poems was used on the t-shirts of the computing social club that I was part of as a postgrad student:
More than iron
More than lead
More than gold I need electricity
I need it more than I need lamb or pork or lettuce or cucumber
I need it for my dreams

- I also wrote a short blog post about this topic back in 2009: https://mark.reid.name/blog/warning-high-dimensions.html
It's great to see these interesting math facts continue to be discussed and presented in new ways.
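For anyone who wants to poke at one of these facts directly, here's a quick Monte Carlo sketch in Python (my assumption here is the classic inscribed-ball example; it's just an illustration, not anything from the post itself) showing how little of a high-dimensional cube lies inside the ball it contains:

```python
import random

def inside_ball_fraction(dim, samples=100_000):
    """Monte Carlo estimate of how much of the cube [-1, 1]^dim
    lies inside the inscribed unit ball."""
    hits = sum(
        1 for _ in range(samples)
        if sum(random.uniform(-1, 1) ** 2 for _ in range(dim)) <= 1.0
    )
    return hits / samples

for d in (2, 3, 5, 10):
    print(d, inside_ball_fraction(d))
# The fraction collapses towards zero as the dimension grows:
# roughly 0.785, 0.524, 0.164, 0.0025 for d = 2, 3, 5, 10.
```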
- If you want to go a more analog route, this series of synth DIY videos by Moritz Klein is really great: https://www.youtube.com/@MoritzKlein0
- This likelihood-ratio approach highlights the fact that the KL divergence is a member of the family of Csiszár f-divergences. These are measures of "distance" between distributions of the form E_Q[ F(p(x)/q(x)) ], where F is any convex function with F(1) = 0. This is a kind of generalization of the log-likelihood ratio, where F "weights" how bad ratios different from 1 are. When F is -log you get the KL divergence.
Another curious fact about the KL divergence is that it is also a Bregman divergence: take a convex function H and define B_H(P, Q) = H(P) - H(Q) - <∇H(Q), P - Q>. These generalize squared Euclidean distance (which you recover with H(P) = ||P||^2), and KL is obtained when H is the negative entropy H(P) = \sum_x p(x) log p(x).
I spent a bunch of time studying divergences between distributions (e.g., see my blog post [1]), in particular these two classes, and the really neat fact about the KL divergence is that it is essentially the only divergence that is both an f-divergence and a Bregman divergence. This is basically due to the property of log that turns products into sums.
[1]: https://mark.reid.name/blog/meet-the-bregman-divergences.htm...
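Here's a small numerical sketch (Python, finite distributions) checking that the two constructions above both recover KL. Note I use F(t) = t log t, which gives KL(P||Q) in the same direction as the Bregman form; F = -log gives KL with the arguments swapped:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.5, 0.3])

# Direct KL(P||Q)
kl = np.sum(p * np.log(p / q))

# As an f-divergence E_Q[F(p/q)] with F(t) = t log t
F = lambda t: t * np.log(t)
f_div = np.sum(q * F(p / q))

# As a Bregman divergence B_H(P, Q) = H(P) - H(Q) - <grad H(Q), P - Q>
# with H the negative entropy H(P) = sum_x p(x) log p(x).
H = lambda r: np.sum(r * np.log(r))
grad_H = lambda r: np.log(r) + 1
bregman = H(p) - H(q) - np.dot(grad_H(q), p - q)

print(kl, f_div, bregman)   # all three agree
```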
- "It is well known that stone can think, because the whole of electronics is based on that fact, but in some universes men spend ages looking for other intelligences in the sky without once looking under their feet. That is because they've got the time-span all wrong. From stone's point of view the universe is hardly created and mountain ranges are bouncing up and down like organ-stops while continents zip backward and forward in general high spirits, crashing into each other from the sheer joy of momentum and getting their rocks off. It is going to be quite some time before stone notices its disfiguring little skin disease and starts to scratch, which is just as well."
-- Terry Pratchett, "Equal Rites"
- It makes me glad to see that Aleph (and ILP more generally) are still being used. It's a really powerful and interpretable way of constructing code from examples.
I made a few contributions to Aleph as part of my PhD on doing transfer learning in ILP and really enjoyed working in Prolog.
- It might be easiest to give a sense of what "unprovable but true" means by way of an imagined example.
Goldbach's conjecture is that "every even number bigger than 2 is the sum of exactly two prime numbers", so 4 = 2 + 2, 6 = 3 + 3, 8 = 5 + 3, etc.
For this statement to be *true* just means that for every even number bigger than 2 there must exist two primes that add up to it. This is a statement about infinitely many integers.
A *proof* of Goldbach's conjecture consists of a finite number of formal reasoning steps that start with some axioms and end up at the statement of the result.
To this day, it seems as though Goldbach's conjecture is true. It holds for every number we've been able to test. However, no one has proved it is true or false yet (or proved that it is unprovable).
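If you want to play with it, a brute-force check for small even numbers is only a few lines (this is just a sketch, and of course checking finitely many cases proves nothing):

```python
def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def goldbach_pair(n):
    """Return a pair of primes summing to the even number n, or None."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None

# No counterexample has ever been found for the numbers we can check.
for n in range(4, 101, 2):
    assert goldbach_pair(n) is not None

print(goldbach_pair(100))   # e.g. (3, 97)
```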
The proof of Gödel's result involves very carefully formalizing what statements and proofs mean so that they can be encoded as statements about arithmetic. He then shows there is a statement with encoding G that says "The statement with encoding G cannot be proved" – if it is true, then it cannot be proved.
It's kind of confusing at first, but there is an easier way to get an intuition why there might be true statements that cannot be proved. Think of each statement about the natural numbers as a subset where each number in the subset makes the statement true. There are uncountably many subsets of the natural numbers (by Cantor's diagonalization argument). Proofs are finite chains of finite statements so there are only countably infinitely many of these. Therefore there must be subsets/statements that are true that do not have a matching proof.
The approach that Gödel's proof takes is not too different from the above argument – it is essentially a diagonalization argument – the complexity is in making the encoding of statements as numbers very precise.
- I learnt the strategy of "satisficing" from Herbert Simon's book "The Sciences of the Artificial". It suggests not bothering to split hairs amongst alternatives once options have met some threshold for acceptance.
- I'm not the parent, but I think they are referring to William Sethares' book "Tuning, Timbre, Spectrum, Scale": https://www.springer.com/gp/book/9781852337971
It's a fascinating take on how those concepts relate to each other. You can find some of the papers that were synthesized into that book on his web page: https://sethares.engr.wisc.edu/pubs.html
I always found this one a little poignant: