
> There should be no ability to "verify" a browser, and anyone should be able to emulate any browser.

Hard disagree. The AI industry has absolutely shredded the various anti-scraping and anti-botting social contracts that were in place prior to the covid pandemic. Like it's now common knowledge that robots.txt isn't a hard requirement and can be avoided entirely, for example. They have absolutely turned the open web into a dark forest.

Having a browser session able to be verified as untampered and/or "trusted" is probably going to be a thing going forward. Sucks a ton, but we all did this to ourselves.



> it's now common knowledge that robots.txt isn't a hard requirement and can be avoided entirely, for example

Was it ever not? It's a text file, not law.
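
To make that concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the `ExampleBot` name are made up for illustration, and the file is parsed from a string so nothing touches the network. The point is that the check only happens if the crawler's author chooses to call it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this example),
# parsed locally so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the example rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler asks before fetching; an impolite one never calls this.
print(allowed("ExampleBot", "https://example.com/private/page"))
print(allowed("ExampleBot", "https://example.com/public/page"))
```

Nothing on the server side enforces the answer. A client that skips `allowed()` can request `/private/page` like any other URL.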

> They have absolutely turned the open web into a dark forest.

Only if you have an ideological problem with people you don't like using the things you publish on the open web.

I'd say the web can be very open even without being copyleft. It makes some business models non-viable, but it doesn't prevent anyone from publishing what they want.

On the other hand, I don't think I would call something that preserves copyright at the cost of only admitting "approved/certified non-LLM scrapers" via attestation or similar "the open web".

> Having a browser session able to be verified as untampered and/or "trusted" is probably going to be a thing going forward. Sucks a ton, but we all did this to ourselves.

Who did what to whom?


Protocols like HTTP or formats like HTML were initially made to be machine-readable. You humans make your site machine-readable, publish on the internet and then get unhappy when machines start actually reading it.

Anyway, just put up a captcha or require a cryptocurrency payment if you are unhappy with bots, but a few people unhappy about scraping matter less than a billion people unhappy about having their activity tracked.


You're looking at that pre-covid time through rose-tinted glasses. Half the reason sites like Reddit or Twitter offered free/open APIs was to ensure that the bots were being as efficient as possible rather than hammering the sites (the other half was altruistic, but that goodwill is a very small line item to an MBA). Scrapers got so much better at just going to what's presented to humans because these kinds of APIs are no longer common, so they had to. So now the lazy option is to skip checking whether a site offers an API at all, rather than checking and then saving time and maintenance effort by coding against it.


Browser verification doesn't stop bots; it will just funnel even more money towards click farms, which run unmodified devices on racks.


> we all did this to ourselves

Who is "we"?


We already live in that world: Google and Apple cooperate with vendors like Cloudflare to build, essentially, the PAT / WEI implementation that they wanted.


Another reason to criminally prosecute the AI industry.



