
HTML pretended to be a subset of SGML but never really was, and the illusion soon faded: HTML was strictly pragmatic and ran in resource-constrained environments (the desktop), while SGML was academic, largely theoretical, and ran on servers, analyzing text.

XML, on the other hand, was more of a back-formation – a generalization of HTML; it was not, as I understand it, directly related to SGML in any way. The existence of XML was a reaction to SGML being impractical, so it would be strange if XML directly derived from SGML.


tannhaeuser
> XML [...] was not [...] directly related to SGML in any way

That's incorrect. XML is by definition a proper subset of WebSGML, the SGML revision specified in ISO 8879:1986 Annex K. These two specifications were published around the same time and authored by the same people.

In a nutshell, XML added DTD-less SGML (i.e. every document can be parsed without markup declarations, unlike e.g. HTML, which has `img` and other empty elements the parser needs to know about) and XML-style empty elements. The features removed from SGML to become XML were tag inference/omission (as used in HTML), short references (for things such as Wiki syntax, CSV, and even JSON parsing), uses of marked sections other than `CDATA`, more complex use cases for notations, and link process declarations ("stylesheets"), plus a couple of others.
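To make the empty-element point concrete, here is a minimal sketch using Python's standard-library XML parser, which is handed no DTD here. The helper name `parses_without_dtd` and the two sample strings are made up for illustration; this only shows how a DTD-less XML parser treats the two spellings, not how an SGML parser with HTML's DTD would behave.

```python
import xml.etree.ElementTree as ET


def parses_without_dtd(markup: str) -> bool:
    """Return True if the stdlib XML parser accepts `markup` with no DTD supplied."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XML-style empty element: the trailing "/>" tells the parser the element is
# empty, so no markup declarations are needed.
xml_style = '<p>logo: <img src="logo.png"/></p>'

# HTML-style empty element: only a DTD could say that `img` has no end tag,
# so a DTD-less parser treats `</p>` as a mismatched close for `img`.
html_style = '<p>logo: <img src="logo.png"></p>'

print(parses_without_dtd(xml_style))   # True
print(parses_without_dtd(html_style))  # False
```

The second string is fine as HTML precisely because an HTML parser already knows `img` is declared empty, which is the knowledge XML's `/>` syntax makes unnecessary.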

XML was intended as a subset of SGML that can be meaningfully parsed without knowing the DTD of the document in question, which involves removing a lot of weird SGML features and constraining others. Formally, XML is not an SGML subset, as there are some unimportant and some quite critical incompatible details.
