ProblemFactory
XML parsers are necessarily over-complicated for structured data, because XML is a text markup language, not a nested data structure language.

<address>123 Hello World Road, <postcode>12345</postcode>, CA</address>

is perfectly sensible XML. The address is not a tree structure or a key-value dictionary - it is free text with optional markup for some words.

You can use XML to represent nested data structures with lists and dictionaries, but the parsers and their public APIs must still handle the freeform text case as well.
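To make that concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `flatten` helper is a hypothetical name made up for this illustration, not part of any library. It shows that the parser has to expose the free text before and after the child element (`text` and `tail`) in addition to the element tree itself:

```python
import xml.etree.ElementTree as ET

doc = "<address>123 Hello World Road, <postcode>12345</postcode>, CA</address>"
root = ET.fromstring(doc)


def flatten(elem):
    """Return (enclosing tag, text) pieces in document order.

    Hypothetical helper for illustration: it walks the mixed content
    that a pure tree-or-dictionary model would have no slot for.
    """
    pieces = []
    if elem.text:
        # Text between this element's start tag and its first child.
        pieces.append((elem.tag, elem.text))
    for child in elem:
        pieces.extend(flatten(child))
        if child.tail:
            # Text after the child's end tag still belongs to the parent.
            pieces.append((elem.tag, child.tail))
    return pieces


for tag, text in flatten(root):
    print(f"{tag}: {text!r}")
```

The `address` element ends up with two separate text runs around a single child, which is why the API needs both `text` and `tail` even if your own data never uses mixed content.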


jstimpfle
Yep, the application to text documents is valid in my eyes as well, although there are lighter-weight and/or more extensible approaches, like TeX. (Update, clarification: I mean just the markup syntax, not the computational model.)
anonymouz
> TeX

Dear god no. I use and love (La)TeX daily to write documents. But as a markup format for data that's supposed to be processed in any way, other than being fed to a TeX engine, it's absolutely terrible. You can't even really parse a TeX document; with all the macros it really is more a program than a document. XML is far from perfect, but it works well as a markup and data exchange format that is well-specified.

_delirium
I like TeX for producing documents. But I'd take XML over TeX if I had to parse the markup myself, outside of the TeX toolchain. Any nontrivial TeX document is built out of a pile of macros, so you need to implement a TeX-compatible macro expander to parse it. And at least with XML there are solid libraries, while the state of TeX-parsing libraries outside of TeX itself is pretty poor. I think Haskell is the only language with a reasonably good implementation, thanks to the efforts of the pandoc folks.
jstimpfle
It doesn't need to be TeX-compatible. My point was just that the syntax is lighter weight and might be preferable for some applications.
RTF uses essentially the same syntax, minus the option to define your own markup. It is only barely human-readable when produced by a word processor, but then generated TeX is awful as well.
Someone
Lighter-weight? TeX is Turing-complete. You can’t even know whether interpreting it will ever finish, and writing a parser that produces good error messages on invalid input is difficult.
edejong
From someone who has written TeX macros before: you are probably mistaking the ‘clean’ environment of LaTeX for the core TeX language. The former is reasonable, if very limited; the latter is die-hard “you thought you knew how to program, but this proves you wrong” material.

XML over TeX any time and LISP-like over XML (with structural macros)

kazinator
I would make a subtle adjustment there: "you thought Knuth knew how to design, this proves you wrong". \makeatspecial here we go\oh\boy ...
edejong
Ah, yes, agreed! That was my first thought as well when I went into the deep flaming pit of TeX macros. Then again, I guess he didn't have much reference material back then, or did he?
jstimpfle
FWIW, I remember a quote about Knuth and TeX, "He tried really not to make it a programming language, but ultimately he failed". I don't remember who said that.
You're probably thinking of Knuth himself. He's mentioned several times how he never intended to make a programming language, and how puzzled he is that people write programs in TeX macros.

E.g.:

> In some sense I put in many of the programming features kicking and screaming [...] Every system I looked at had its own universal Turing machine built into it somehow, and everybody’s was a little different from everybody else’s. So I thought, “Well, I’m not going to design a programming language; I wanted to have just a typesetting language.” Little by little, I needed more features and so the programming constructs grew.[...] as a programmer, I was tired of having to learn ten different almost-the-same programming languages for every system I looked at; I was going to try to avoid that.

etc. (https://www.ntg.nl/maps/16/15.pdf)

and:

> I was really thinking of TeX as something that the more programming it had in it, the less it was doing its real mission of typesetting. When I put in the calculation of prime numbers into the TeX manual I was not thinking of this as the way to use TeX. I was thinking, "Oh, by the way, look at this: dogs can stand on their hind legs and TeX can calculate prime numbers."

(Coders at Work interview)

In fact, if you use TeX the way Knuth intended and uses it himself, macros and programming play a minimal role. It is LaTeX that, in pursuit of a better document interface for the user, ends up with horrifically complex macros: Mittelbach mentions that nine out of ten "dirty tricks" mentioned by Knuth in the TeXbook are actually used in the source code of LaTeX!