I tried to implement this in C++ where, as usual, everything is possible but nothing is practical:
My static typing experience is limited to TypeScript and Rust, so I'm curious what a GADT would look like if they were added to those languages.
enum MyRecord {
ConstructorName(Int, String, Bool)
}
enum TrafficLight {
Red,
Yellow,
Green
}
enum AST {
Integer(Int),
Boolean(Bool),
Addition(Ast, Ast),
Multiplication(Ast, Ast),
Equals(Ast, Ast),
Not(Ast)
}
`Not(Equals(Integer(1), Integer(2))) : AST` enum IntErr {
Error(String),
IntResult(Int)
}
enum StringErr {
Error(String),
StringResult(String)
}
enum WithError<A> {
Error(String),
Result(A)
}
`Result(3) : WithError<Int>`
`Result(True) : WithError<Bool>` enum Weird<A, B> {
IgnoreTheType(Int),
}
`Weird<String, Bool>`
`IgnoreTheType(1) : Weird<String, Bool>`
`IgnoreTheType(2) : Weird<Char, Float>`
`IgnoreTheType(3) : Weird<Bool, Double>``Addition(Boolean(True), Not(Integer(3)))`
enum AST<R> {
Integer(Int),
Boolean(Bool),
Addition(Ast<Int>, Ast<Int>),
Multiplication(Ast<Int>, Ast<Int>),
Equals(Ast<R>, Ast<R>),
Not(Ast<Bool>)
}
Finally, we get our GADT: enum AST<R> {
Integer(Int) : Ast<Int>,
Boolean(Bool) : Ast<Bool>,
Addition(Ast<Int>, Ast<Int>) : Ast<Int>,
Multiplication(Ast<Int>, Ast<Int>) : Ast<Int>,
Equals(Ast<R>, Ast<R>) : Ast<Bool>,
Not(Ast<Bool>) : Ast<Bool>
}A question: how much of this would actually compile in Rust?
- Rust needs indirection for recursive data types (like the AST) using `Box`, references, or another kind of indirection. Otherwise, the size isn't known at compile time (since it's potentially infinite).
- Rust doesn't have GADTs (yet?) so the last AST is purely theoretical.
- `IgnoreTheType` (and `AST<R>`) would require explicit use of `PhantomData` for variants without `R` https://doc.rust-lang.org/std/marker/struct.PhantomData.html
On Rust's playground: https://play.rust-lang.org/?version=stable&mode=debug&editio...
(Also, I had not known about Rust Playground, so thanks for introducing me to that too.)
So, a bit of Haskell first, starting with algebraic data types. (Skip this if you already know about ADTs.) If you want to define a data type in most languages, there are basically two things you can do. Either you can define a record, with multiple values wrapped up as a single type:
Or, you can define an enumeration, where a type has a limited set of allowed values: Haskell calls the former ‘product types’ and the latter ‘sum types’.The thing is, there’s nothing stopping you from combining the two: making an enumeration where each value of the enumeration is a record. This is called an algebraic data type (because it combines sums and products, geddit?) For instance, this lets me make an AST for a simple language:
So now I can make values like `Not (Equals (Integer 1) (Integer 2)) :: AST`. (In Haskell, read ‘::’ as ‘has-type’.)Often, you want to make data types polymorphic over the record field type. For instance, I might have a type expressing either an integer or an error:
And I might have expressing either a string or an error: And eventually I might get tired of writing out all these types. In that case I can add a type variable representing any type: So now I can have `Result 3 :: WithError Int`, `Result True :: WithError Bool`, and so on. (This is basically the same as generics or templates in languages which have them.)The thing is, no-one says a type parameter needs to correspond to anything concrete. There’s nothing in Haskell which stops me from doing this:
This is a bit weird. If you have a value of type `Weird String Bool`, say, that value won’t actually have a string or a boolean within it. And `IgnoreTheType 1 :: Weird String Bool`, `IgnoreTheType 2 :: Weird Char Float`, `IgnoreTheType 3 :: Weird Bool Double` are all values of different types. You can basically change the type at will without affecting the value. Variables such as `a` and `b` are called ‘phantom type variables’.Now let’s back up a bit and have another look at that `AST` data type I defined earlier. It’s not particularly satisfying: you can easily create meaningless terms like `Addition (Boolean True) (Not (Integer 3))`. As Haskellers, we want to avoid this as much as possible. So let’s add a phantom type parameter to express the type returned by each `AST` constructor:
This says that multiplication, for instance, must take two `AST`s returning an `Int`, as encoded in the type parameter. But wait — how do we restrict the type parameter? We need to be able to say that something constructed with `Multiplication` must have type `AST Int`, but something constructed with `Not` must have type `AST Bool`. As it happens, vanilla Haskell has no way of doing this. There is no way of explicitly controlling the value of `r` based on which constructor is used.You might have guessed by now that this is exactly what generalised algebraic data types let us do. And indeed, `AST` can be easily implemented as a GADT:
And now, if we try to construct an incorrect AST, the compiler will give us back an error, because the types don’t match: