It looks like the code doesn't always check whether the expected errors in the test suite match the returned errors - which is rather important to ensure one isn't just incidentally getting the expected output.
So while JustHTML looks sort of right, it'll actually do things like emit errors on perfectly valid HTML.
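For anyone curious what that missing check amounts to, here's a minimal sketch assuming the html5lib-tests tokenizer fixture format (each case carries an optional "errors" list of {code, line, col} objects); the tokenize call and the error object's attributes are hypothetical stand-ins, not JustHTML's actual API:

    # Hypothetical harness helper -- parser.tokenize and the error
    # attributes are illustrative, not JustHTML's real interface.
    def check_case(parser, case):
        tokens, errors = parser.tokenize(case["input"])

        # Comparing the tokens alone is the easy half.
        assert tokens == case["output"]

        # The part that's easy to skip: reported errors must match the
        # expected ones, otherwise a parser that flags perfectly valid
        # input can still "pass" the fixture.
        expected = {(e["code"], e["line"], e["col"])
                    for e in case.get("errors", [])}
        actual = {(e.code, e.line, e.col) for e in errors}
        assert actual == expected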
Plus, the test suite isn't actually comprehensive, so if one only writes code to pass the tests, it can fail in the real world in cases where parsers that were actually written against the spec wouldn't have trouble.
For instance, html5lib-tests only exercises a small number of meta charsets, and as a result JustHTML can't handle a whole slew of valid HTML5 character encodings like windows-1250 or koi8-r - encodings that parsers like html5lib will happily handle. There's even a unit test added by the AI that ensures koi8-r doesn't work, for some reason.
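To make that concrete, here's a minimal check of the behaviour the spec asks for (it defers to the WHATWG Encoding Standard's label registry rather than a fixed allow-list); this assumes html5lib is installed and says nothing about JustHTML's internals:

    # Sketch: a conformant parser should honour encoding labels the
    # fixtures never exercise, e.g. koi8-r or windows-1250.
    import html5lib

    # "Привет" (Cyrillic) encoded as koi8-r, declared via <meta charset>.
    raw = '<meta charset="koi8-r"><p>Привет</p>'.encode("koi8-r")

    doc = html5lib.parse(raw, namespaceHTMLElements=False)
    assert doc.find(".//p").text == "Привет"  # label picked up by the prescan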
Behind that is a smaller number of larger integration tests, and the even longer-running regression tests that are run on every release but not on every commit.
Yes. They came from the existing project being ported, which was also AI-written.
Those human-written tests are why your browser properly renders all manner of messy HTML.
> They had 9000+ tests.
They were most probably also written by AI; there's no other (human) way. The way I see it, we're stacking turtles upon turtles and hoping that everything will somehow stick together.