
Before we start writing Lean, perhaps we can start with something "dumber" like Rust or any typed program. If you want to write something correct, or you care about correctness, you should not be using dynamic languages. The most useful and used type of test is type checking.

Type errors, especially once you have designed your types to be correct by construction, are extremely, extremely useful for LLMs. Once you have the foundation correct, the model just has to wiggle through that narrow gap until it figures out something that fits.

But from what I have understood and read so far, I am not convinced by OP's "formal verification". A simple litmus test is to take any recent task from your day job and try to describe a formal specification of it. Is it even doable? Reasonable? Is it even there? For me, the most useful kind of verification is verification of the lower-level tools, i.e. data structures, languages, compilers, etc.

For example, the type signature of Vec<T>'s indexing operator (Index<usize>) in Rust returns T. This cannot be true, because it cannot guarantee to return a T for ANY usize. To me, panic is the laziest and worst thing to put in a specification. It means that every single line of Rust code is now able to enter this termination state.


I once attended a talk by someone who is or was big in the node.js world. He opened with the premise, "a static type check is just a stand-in for a unit test."

I wanted to throw a shoe at him. A static type check doesn't stand in for "a" unit test; static typing stands in for an unbounded number of unit tests.

Put another way, the common misconception among users of languages like JavaScript and Python that unit testing is just as good as type checking (plus more flexible) is a confusion between the "exists" and "for all" quantifiers.
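
To make the quantifier point concrete, here is a minimal sketch in Python terms (the function and values are invented for illustration, and assume a checker like mypy):

    from typing import Optional

    def parse_port(text: str) -> Optional[int]:
        # Return the port number, or None if the text is not a valid port.
        if text.isdigit() and 0 < int(text) <= 65535:
            return int(text)
        return None

    # A unit test pins down one point in the input space ("exists"):
    assert parse_port("8080") == 8080

    # The "-> Optional[int]" annotation is a claim the checker verifies on every
    # return path and at every call site ("for all"): no branch can hand back a
    # bare string, and callers that forget the None case get flagged.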

Plus, it is simply more enjoyable to design the types in your program than to write unit tests. The fun factor comes from operating on a higher level of abstraction and engages more of your brain’s puzzle-solving mode than just writing unit tests. Making yourself think about “for all x” rather than a concrete x forces your brain to consider deeply the properties of x being used.
> it is simply more enjoyable to design the types in your program than to write unit tests.

I have tried both and I have no idea what you're talking about.

> Making yourself think about “for all x” rather than a concrete x forces your brain to consider deeply the properties of x being used.

The entire point of dynamic typing is that you can think about interfaces rather than concrete types, which entails deep consideration of the properties of the object (semantics of the provided interface).

That's not the entire point of dynamic typing, because all the interface stuff comes from statically typed languages. Some* dynamic languages borrowed it, but most use "implicit" interfaces - where the interface is whatever kind of works, I guess.
> because all the interface stuff comes from statically typed languages.

No, it doesn't. It comes from theory that came after the languages.

> Some* dynamic languages borrowed it, but most use "implicit" interfaces

An implicit interface is an interface, and is exactly the sort of thing I'm talking about in GP. The point is that you think about the object in terms of its capabilities, rather than some proven-up-front categorization that it fits into. What it does, not what it is.

You can achieve this with structural subtyping, such as Go interfaces and Python protocols. Whether that is desirable is a different question.
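
For example, a rough Python sketch using typing.Protocol (the names are invented): the class never declares that it implements the protocol, the checker matches it by shape.

    from typing import Protocol

    class Quacker(Protocol):
        def quack(self) -> str: ...

    class Duck:
        # Note: no inheritance from Quacker; the match is purely structural.
        def quack(self) -> str:
            return "quack"

    def greet(q: Quacker) -> str:
        return q.quack().upper()

    greet(Duck())  # accepted by the type checker: Duck fits the protocol's shape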
> "a static type check is just a stand-in for a unit test."

This is not an original argument. Rich Hickey made a similar argument in his "Simple Made Easy" talk in 2011, though his focus was on the fact that every bug that exists in a software system has passed unnoticed through both a type checker and a test suite. And even before that, similar ideas of test suites being a suitable replacement for a type checker had percolated through the Python and Ruby communities, too.

I distinctly remember that the "tests make static type checks unnecessary" attitude was in fact so prevalent in the JavaScript community that TypeScript had a really hard time getting adoption in its first 3-4 years; only the introduction of VS Code in 2015 and the subsequent growth of its market share over Atom and Sublime Text got more people exposed to TypeScript and the benefits of a type checker. Overall, it took almost 10 years for TypeScript to become the "default" language for web projects.

Agreed.

Besides, it's not like types don't matter in dynamically typed languages. The (competent) programmer still needs to keep types in their head while programming. "Can this function work with a float, or must I pass an int?" "This function expects an iterable, but what happens if I pass a string?" Etc.

I started my career with JavaScript and Python, but over the years I've come to the conclusion that a language that hides types from programmers and does implicit conversion magic in the background does not deliver a better DX. It might make the language more approachable initially, and the idea of faster prototyping might be appealing, but it very quickly leads to maintenance problems and bugs. Before type hinting tools for Python became popular, I worked on many projects where `TypeError` was the #1 exception in Sentry by a large margin.

Gradual and optional typing is better than nothing, but IME if the language doesn't require it, most programmers are lazy and will do the bare minimum rather than properly add type declarations. Especially with things like TypeScript, which makes many declarations difficult to read, write, and understand.

I think that type inference is a solid middle ground. Types are still statically declared, but the compiler is smart enough to not bother the developer when the type is obvious.

> Before type hinting tools for Python became popular, I worked on many projects where `TypeError` was the #1 exception in Sentry by a large margin.

My experience is radically different. `ValueError` is far more common in my un-annotated Python, and the most common cause of `TypeError` anyway is the wrong order or number of arguments after a refactoring.

Hmm, I could be misremembering whether it was `ValueError` or `TypeError`. This was a few years ago. I know that typing issues were always the most frequent in any Python project I have worked on.

How is your experience different?

Hundreds of unit tests replace a type.

Start using properties and it is in the thousands.

Most code should be typed. Python is great for prototypes, but once the prototype gels, you need types.

I've always hated Python. Could never enjoy it at all. Pretty much the same poor DX as PHP, Javascript, Ruby, etc.

Finally set up neovim with pyright; use types on every single fucking thing, and now I love Python[1].

Can't wait to see TC39 become a reality (learned about it just this past week on HN, actually). Maybe I'll enjoy Javascript too.

--------------------

[1] Within reason; the packaging experience is still extremely poor!

Didn't get the bit about TC39 - it is just a group. It has nothing to do with types in JS.
Sorry, I should have been clearer about which proposal I was talking about:

https://github.com/tc39/proposal-type-annotations

I've never seen types used correctly in Python. A `tuple` is not even a type! It's a type constructor!
I’ve been doing Python and TypeScript professionally, Python for almost two decades and TypeScript for the last 5 years, and I can very confidently say that it doesn’t matter.

Besides, you seem to be confusing Python run-time with Python typecheck-time, a theoretically unfortunate but practically irrelevant distinction. (Unfortunate since Python typecheck is basically a different language than Python execution; irrelevant, because the right subsets of both align well.)

The distinction you are trying to make is nonsensical in Python's object model. Types are inherently callable, and calling them constructs (i.e. instantiates) the type (normally; this can be overridden, by design). There is also no type->kind->category hierarchy; `type` itself is an object, which is its own type.

When you're at a level of theory where terms like "type constructor" are natural, it's unreasonable to expect any of it to be applicable to Python. This is why the Haskell people speak of dynamically-typed languages in the Python mold as "untyped" regardless of their attitude towards implicit casts.

And I love it, and have been using it for decades, and write beautiful things where the annotations hardly ever seem worth the effort — perhaps for documentation, but not for a static checker. Then I look at other, newer Pythonistas trying to figure out how to write complex generic type expressions (and sacrificing backwards compatibility as they keep up with the churn of Python figuring out how to offer useful annotation syntax) and deal with covariance vs contravariance etc. and I just smile.

Most of the types are "not even types". It's why you can pass a type as a value (not a string) to a function.

If the type was a type, you'd not be able to use it as a value.

A unit test is a functional assertion. A type is a semantic construct that can provide that, but it provides a lot more.

As a trivial example, if I create a type alias from “string” to “foobarId,” I now (assuming a compliant language) can prevent code that consumes foobarIds from accidentally consuming a string.
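
In Python terms (names invented for illustration), a plain alias would not give you this, but typing.NewType does, at least at typecheck time:

    from typing import NewType

    FoobarId = NewType("FoobarId", str)

    def load_foobar(foobar_id: FoobarId) -> None: ...

    load_foobar(FoobarId("fb-123"))  # fine
    load_foobar("fb-123")            # flagged by the checker: "str" is not "FoobarId"
                                     # (it still runs, though; NewType is erased at runtime)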

Good that Python supports types then
Python has formalized magic comments.

You can run a third party linter on those comments, but you must hope that they're correct. There are usually some checks for that, but they're only reliable in trivial cases.

This is not static typing any more than "you can use emscripten to transpile JavaScript to C" means that JavaScript is a low level language with native assembly support. It's a huge step forward from "no system at all" and I'm thrilled it exists, but it's hardly the same thing.

They were originally magic comments, unformalized. The feature is known as annotations, and you can basically put anything there:

    def f(a: "something like an int", b: "another int-like thing"): pass

    def g(a: sys.stdin, b: sys.stderr): pass
It's actually remarkable how, with the success of TypeScript, so many other dynamic languages switched to gradual typing.

Erlang and Clojure were the early ones, TypeScript followed, and now Python, Ruby, and even Perl have ways to specify types and type check your programs.

> Good that Python supports types then

"Optional typing" is not the same as "Static typing".

Great, my program will crash, because I forgot to opt-in to typing :-/

Poorly, though, and with lots of edge cases and foot guns to make you miserable once your code calls out to numpy or gets some JSON.
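
A small illustration of the JSON case, assuming a checker like mypy: json.loads is typed as returning Any, so the annotation below is accepted even though it is wrong at runtime.

    import json

    data = json.loads('{"port": "8080"}')  # typeshed says json.loads returns Any
    port: int = data["port"]               # the checker accepts this, but port holds the str "8080"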
Tell me more please: how does one use types in Python? Unfortunately I write Python professionally these days (it is the language that has all the libraries) and hate it with a passion.
You will never get good if you continue to hate it. You don't have to like it, but hating it creates a mental block.
mypy is the simplest/oldest option, look into pyright and ty, too.

install uv and then

    uvx mypy
should do the rest.
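
A tiny, made-up example of the kind of mistake it will then catch:

    # example.py
    def mean(values: list[float]) -> float:
        return sum(values) / len(values)

    mean("not a list")  # mypy: incompatible type "str"; expected "list[float]"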
> I wanted to throw a shoe at him.

You should have!

If you care about the type of a parameter you can just add an assertion in the method /s
Good luck using static typing to model many real-world unit tests for the programming languages people use most. I'll start with an easy example: those records should be sorted by date of birth. We can move on to more complicated scenarios.
> "records should be sorted by date of birth."

What's wrong with C#'s:

    System.Collections.Generic.SortedList<DoBDateTime, PersonRecord>

?
No one claims that types are a stand-in for all unit tests.

They stand in for the banal unit tests.

The comment didn’t claim that types are a stand-in for tests either! IMO, they are orthogonal.
The comment explicitly set out to refute the idea "...that unit testing is just as good as type checking" by describing the former as simply inferior.
No. They refuted the claim that "a static type check is just a stand-in for a unit test". That is a claim that you can just remove your type checks and replace them with unit tests at no loss. The comment stated that removing a type check just so you can replace it with a unit test is inferior. The prior state was already pre-supposed to have a type check/type checkable condition that you could replace.

That is the literal converse of the claim in the response to that comment arguing that the comment stated that all unit tests can be replaced with type checks. Those are not at all the same claim.

To make it even more clear the comment said: I saw a talk that said Type Check -> Unit Test. I said that is silly.

Response said: Unit Test -> Type Check is not reasonable. So clearly your claim that Type Check -> Unit Test is silly is wrong.

> A static type check doesn't stand in for "a" unit test; static typing stands in for an unbounded number of unit tests.

You have conflated "a static type check" with "static typing". Unit tests stand in, in the same way, for an unbounded number of states of real-world input. They're simply being subjected to a trial verification system rather than a proof system. It turns out that writing proofs is not very many people's idea of a good time, even in the programming world. And the concept of "type" that's normally grokked is anemic anyway.

> Put another way...

Rhetoric like this is unconvincing and frankly insulting. You pass off your taste and opinion as fact, while failing to understand opposed arguments.

The author is in the comfortable position of working on a system that does have a formal specification and a formally verified reference implementation. The post is not about how they wish things would work, but how their existing system (Cedar) works.

Regarding your point on Rust, the vast majority of software has nowhere near the level of static guarantees provided by Rust. If you need more, use static memory allocation; that's what people do for safety-critical systems. By the way, it seems that Rust aborts on OOM errors rather than panicking: https://github.com/rust-lang/rust/issues/43596

I think it's possible to write correct systems with dynamic languages, just not the ones we commonly use like Python and JavaScript. I find Clojure, for example, to be a dynamic language that is pretty easy to manage, and I attribute that to its immutable nature and data-centric ethos. I'm sure there are other dynamic languages that would work as well.

Now, I wouldn't necessarily use Clojure on a huge multi-organization codebase (maybe it's fine, this is outside of my experience with it), but it can be the right tool for some jobs.

Common Lisp as well. I can’t explain why, but type errors are just not something I struggle with in Common Lisp! But they are in JS and Python for sure. Maybe someone knows why it feels different?
I think it’s cause there’s less imperative code and side effects to track data transformations through.

Like any random JS/php app is probably a huge pile of loops and if statements. To track what happens to the data, you need to run the whole program in your head. “And now it adds that property to the object in the outer scope, and now that object gets sorted, now it hits the database… ok…”. Whereas in clojure most functions are either a single atomic transformation to a set of data, or a batch of side effects. You still have to run it through your head, but you can do it more piece-by-piece instead of having to understand a 1,000-line method with class states being auto-loaded and mutated all over the place. Also you have a REPL to try stuff out as you go.

Don't get me wrong, I LOVE static types. Statically typed clojure would be the best fckin language ever. But there is definitely a wide gulf between a dynamic language like JS and one like clojure!

> Statically typed clojure

Well, if you also like Common Lisp, there's Coalton, which is Common Lisp with a Haskell-like type system: https://coalton-lang.github.io/

> Like any random JS/php app is probably a huge pile of loops and if statements. To track what happens to the data, you need to run the whole program in your head. “And now it adds that property to the object in the outer scope, and now that object gets sorted, now it hits the database… ok…”. Whereas in clojure most functions are either a single atomic transformation to a set of data, or a batch of side effects. You still have to run it through your head, but you can do it more piece-by-piece instead of having to understand a 1,000-line method with class states being auto-loaded and mutated all over the place. Also you have a REPL to try stuff out as you go.

Nothing really forces you to write imperative code in a large fraction of cases, and typically the state-change operations can be quite localized within the code. And of course JavaScript and Python both also have REPLs.

But nothing forces you to write functional code either. I’ve seen a whooole lot of php and JS, and most of it has been pretty terrible lol. Of course you can write terrible code in any language, but I think the ease of starting with JS/php combined with the lack of built-in opinions makes it easy to build huge piles of spaghetti.

Though these days fresh typescript codebases are usually pretty decent. I love typescript and it’s really nice to work with a well-typed, modern project with proper schema validation and such. Def miss that in clojure.

Also I wouldn’t really compare JS's or Python's REPL to clojure's. Python's is useful, but I pretty much live inside the clojure REPL.

I haven't done much with CL so I can only speculate, but I think stricter FP principles in general work to minimize the downsides of dynamic typing. CL, to my understanding, isn't the most "pure" when it comes to FP, but does a good job at giving the programmer a lot of power to constrain and explore systems.
> Perhaps we can start with something "dumber" like Rust or any typed program. If you want to write something correct, or you care about correctness, you should not be using dynamic languages. The most useful and used type of test is type checking.

Lean or TLA+ are to Rust/Java/Haskell's type systems what algebraic topology and non-linear PDEs are to "one potato, two potatoes". The level of "correctness" achievable with such simple type systems is so negligible in comparison to the things you can express and prove in rich formal mathematics languages that they barely leave an impression (they do make some grunt work easier, but if we're talking about a world where a machine can do the more complicated things, a little more grunt work doesn't matter).

I mean... sure, but I just want the first 80%. We don't have that. Instead, we are building kernels and infrastructure using bash scripts that do who knows what. We need a tool that is solid and rigid enough for LLMs to use to work through all of that.

It should be something that is familiar (so imperative in style, like C), easy to read (perhaps with type inference), and that has a strong, modern type system (just giving me sum types is enough, for god's sake). Perhaps Python with (real) types.

But if LLMs get to the point where they're smart enough to deal with the trickier aspects of programming, what makes you think they need help with the easier parts? Conversely, if they're not smart enough to deal with the trickier parts, why would a little help move the needle much? Despite trying, research has not been able to find a significant general[1] effect of language design on correctness or productivity for human programmers (at least among more-or-less high level languages; I'm not talking Java vs Assembly). We all have our preferences, and we tend to think they're universal, but it's precisely because of this bias that empirical study is needed, and it's not been conclusive.

If there's no big impact on humans, why assume there would be one for LLMs? I'm not saying that LLMs think like humans, but the default hypothesis should be that something doesn't make a big difference if there's no example in which it does. In other words, if something does not have a known effect, we shouldn't assume that it will in this case (I mean, it could, but we'll need to first find good empirical evidence for that).

[1]: Research did find some differences between TypeScript and JavaScript specifically, but that result hasn't generalised.

> To me, panic is the laziest and worst thing to put in a specification.

This is why the "existing programs don't have specs!" hand-wringing is entirely premature. Just about every code base today has error modes the authors think won't happen.

All you have to do is start proving they won't happen. And if you do this, you will begin a long journey that ends up with a formal spec for, at least, a good part of your program.

Proving the panics are dead code is a Socratic method, between you and the proof assistant / type checker, for figuring out what your program is and what you want it to be :).

Orrrr just use an API that returns Option<T>, then bubble it up or manually panic if that's what you want?
Yeah, Rust has been pretty good for formal verification so far. Hoare-style spec contracts, I think, are the way forward, especially since they flow fairly naturally from unit tests. I've been using Hax to pretty good effect so far. I'm generally skeptical that advances in Lean proof solving by dedicated models are that useful for program verification, compared to generalist models, though it could help lower costs a good bit.
I agree. And learning a typed language is significantly easier now that AI can explain everything. The types also help the AI write correct code. A very positive feedback loop.
a rust to js transpiler would be pretty sweet. idk if someone has turned rust into a frontend language yet like typescript.
