
That's because the specification instead defines how HTML5 documents should be parsed. In theory, someone could provide an XML schema that defines the rules for the XML serialization of HTML5, but parsing HTML5 as HTML isn't something that (so far as I know) can be expressed in a standard machine-readable schema format.

The downside is that you don't get "drop this file into an SGML parser and it works". But since that never worked with real-world SGML-based HTML anyway, the upside is a parsing mechanism that actually handles the sorts of things you'll encounter in the wild.
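
To make the contrast concrete, here's a rough sketch using only the Python standard library. Caveat: `html.parser` is a lenient tokenizer, not an implementation of the HTML5 parsing algorithm, so it only stands in for "a parser that tolerates tag soup". The snippet and the helper names (`SNIPPET`, `TagCollector`, `xml_accepts`, `collect_start_tags`) are made up for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical "in the wild" markup: unclosed <p> and <b>, mismatched end tag.
SNIPPET = "<p>first<p>second <b>bold</p>"


class TagCollector(HTMLParser):
    """Records start tags as the lenient parser encounters them."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def xml_accepts(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def collect_start_tags(markup: str) -> list:
    """Feed markup to the lenient parser and return the start tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


if __name__ == "__main__":
    print("strict XML parser accepts it:", xml_accepts(SNIPPET))
    print("lenient parser saw start tags:", collect_start_tags(SNIPPET))
```

The strict parser rejects the snippet outright, while the lenient one keeps going and reports what it found. A real HTML5 parser goes further than this sketch: the spec also says exactly what tree to build from such input (for example, implicitly closing the first `<p>`), which is the part a schema can't describe.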


