
I like HTML. Wait, what are we talking about?


Tangent you got me thinking about:

HTML clicked for me one day when I mentally decoupled the hypertext from the actual browser rendering. So many of us think HTML and imagine the point is to render a webpage. But HTML describes the semantics, topology, and content of a document. It’s 100% valid to “render” HTML in some other format like a PDF or an mp3.
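To make the "render it as something else" idea concrete, here is a minimal sketch that renders a tiny subset of HTML as plain text using only Python's standard-library `html.parser`. The `PREFIXES` mapping, the `PlainTextRenderer` class, and `render_plain_text` are made-up names for illustration, and it only handles a few non-nested block elements; a real converter would need far more care.

```python
from html.parser import HTMLParser

# Hypothetical mapping from element name to a plain-text prefix.
# The element says "what it is"; this table decides how to present it.
PREFIXES = {"h1": "# ", "h2": "## ", "p": "", "li": "- "}


class PlainTextRenderer(HTMLParser):
    """Collects the text of a few block elements and emits one line per element."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._current = None  # block element being collected, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in PREFIXES:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        # Inline elements such as <em> are ignored; only their text is kept.
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.lines.append(PREFIXES[tag] + text)
            self._current = None


def render_plain_text(html: str) -> str:
    renderer = PlainTextRenderer()
    renderer.feed(html)
    renderer.close()
    return "\n".join(renderer.lines)


doc = "<h1>Title</h1><p>Some <em>inline</em> text.</p><ul><li>One</li><li>Two</li></ul>"
print(render_plain_text(doc))
```

Swapping the `PREFIXES` table for, say, speech-synthesis cues would give a different "rendering" of the same document without touching the markup.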


I'm hoping we see a move to allow the rendering of the webpage to be entirely up to the users. Just provide the data, and let me decide how I want to interact with it. But that would ruin SEO and Ads, so we're gonna get in a buncha legal battles about web scrapers instead.


“Reader Mode” is a successful example. I’m actually shocked it exists because of how it impedes the things you mention.


But reader mode is mostly a bunch of heuristics with tons of ad-hoc special cases and hacks, instead of relying on documents being well-structured. So in many ways it is the opposite of a successful example.

https://github.com/mozilla/readability/blob/main/Readability...
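For a feel of what "a bunch of heuristics" means here, below is a toy sketch of the general style: score container elements by how much paragraph text they hold, and penalize ones whose `id` or `class` looks like page chrome. This is not Readability's actual algorithm; the `UNLIKELY` pattern, the penalty value, and all names are assumptions made up for illustration.

```python
import re
from html.parser import HTMLParser

CONTAINERS = {"div", "article", "section", "main"}
# Hypothetical pattern and penalty; real tools keep much longer, hand-tuned lists.
UNLIKELY = re.compile(r"sidebar|footer|nav|promo|advert", re.I)
PENALTY = 25


class CandidateScorer(HTMLParser):
    """Scores each container by the length of <p> text directly attributed to it."""

    def __init__(self):
        super().__init__()
        self.scores = {}   # container label -> score
        self._open = []    # stack of labels for currently open containers
        self._in_p = 0
        self._count = 0

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINERS:
            attrs = dict(attrs)
            hint = f"{attrs.get('id') or ''} {attrs.get('class') or ''}"
            label = attrs.get("id") or f"{tag}-{self._count}"
            self._count += 1
            self.scores[label] = -PENALTY if UNLIKELY.search(hint) else 0
            self._open.append(label)
        elif tag == "p":
            self._in_p += 1

    def handle_endtag(self, tag):
        # Assumes reasonably balanced markup; broken nesting would confuse the stack.
        if tag in CONTAINERS and self._open:
            self._open.pop()
        elif tag == "p" and self._in_p:
            self._in_p -= 1

    def handle_data(self, data):
        if self._in_p and self._open:
            self.scores[self._open[-1]] += len(data.strip())


def best_candidate(html: str):
    scorer = CandidateScorer()
    scorer.feed(html)
    scorer.close()
    return max(scorer.scores, key=scorer.scores.get) if scorer.scores else None


page = (
    '<div id="nav" class="nav"><p>Home | About | Contact | Subscribe now</p></div>'
    '<div id="story"><p>A long paragraph of article text goes here.</p>'
    "<p>Another paragraph.</p></div>"
    '<div id="promo-box"><p>Buy now!</p></div>'
)
print(best_candidate(page))
```

Note that nothing in it relies on `<article>` or `<main>` actually meaning anything, which is rather the point of the complaint.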


Oh true. Which kind of demonstrates the penalty for abusing HTML so much that it’s no longer semantically reliable.


How long can it be called abuse if that is how HTML has been used for almost the entirety of its lifetime?


By then AI will have disrupted the ad-revenue model so fingers crossed we get the clean data!


Is it kind of a compromise then to "tag" HTML with classes for CSS?

CSS doing the "rendering," like laying out mobile-responsive versus desktop.

I wonder how we would separate out explicit class names from HTML, unless the tags themselves are <custom-names />. (Micro frontends & web components?)

Then it sort of works out nicely, I think.


HTML is the semantics, CSS is the styling, but you need both. Which is why browsers come with default CSS (which you can unset) for everything. You get the element tag to say "what it is", and you get the CSS classes to say "what visual rules to apply".


This is mostly true, but the asterisks cause a little chaos.

> HTML is the semantics ... the element tag to say "what it is"

Maybe this is best framed as a perspective thing.

"Semantic HTML" is about HTML authors using HTML elements in a way that is consistent with the definitions laid out in the specs. These definitions try to specify element semantics because user agents want to be able to do less-dumb things (things that don't work as well if HTML authors are constantly abusing tags for some presentational effect even though the semantics are weird or wrong).

The main consequence of this is that tag semantics (from the UA's perspective) won't always square with what the author assumes they mean unless the author goes and studies the spec. For example, it's probably not hard to find cases where the <address> tag is used for the obvious thing from the author's perspective: marking up addresses. The spec, however, explicitly contradicts this surface-level reading: https://html.spec.whatwg.org/multipage/sections.html#the-add... (i.e., it can be "correct" for pages to contain a mix of addresses that do and don't have the address tag.)
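As I read the spec, <address> is contact information for its nearest <article> or <body> ancestor, not a generic wrapper for postal addresses. A rough sketch of that scoping rule, using Python's standard-library `html.parser` (the `AddressScopes` class and `address_scopes` helper are hypothetical names, and the markup is a made-up example):

```python
from html.parser import HTMLParser


class AddressScopes(HTMLParser):
    """Records, for each <address>, whether it applies to an <article> or the whole document."""

    def __init__(self):
        super().__init__()
        self.scopes = []
        self._articles = 0  # depth of currently open <article> elements

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._articles += 1
        elif tag == "address":
            self.scopes.append("article" if self._articles else "document")

    def handle_endtag(self, tag):
        if tag == "article" and self._articles:
            self._articles -= 1


def address_scopes(html: str):
    parser = AddressScopes()
    parser.feed(html)
    parser.close()
    return parser.scopes


html = (
    "<body><article>"
    '<address>By <a href="mailto:a@example.com">A. Author</a></address>'
    "<p>Ship to: 1 Example St.</p>"  # a postal address, deliberately not in <address>
    "</article>"
    "<footer><address>Contact the site owner</address></footer></body>"
)
print(address_scopes(html))
```

The shipping address in the <p> is the kind of address that can correctly go without the tag.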


We also have a lot of tooling that invites semantic abuse for presentational effect (e.g., using Markdown blockquotes as notes, and even the fancy behavior browsers attach to the <details> element).


Your comment is funny in a way; recall the web before CSS?

1990s web, with flashing tags and just infant monkeys trying to cobble together a webpage?

Your comment brings so many images to mind.


Now HTML¹ is an output target for Flash-like games, tunneled video chat, and the flashpoint of global communities versus corporate priorities.

We can still do flashing tags and cobble together webpages; all we need is a text editor.

That's one allure of programming: we can (re)invent primitives of everything, for better or worse.

¹ With CSS and JavaScript


That web was so much easier to scrape, though.


It’s a purity question. You can assign any attributes you want to an element. And some of them are formalized in various ways.



