Hacker News | zeagle's comments

I mean, why wouldn’t they? All their IP was scraped for AI training, and they bore the hosting cost of that scraping themselves. It further undermines their business models, as people ask the AI models the questions instead of reading primary sources. Plus it doesn’t seem likely they’ll ever be compensated for that loss, given the economy is all in on AI. At least search engines would link back.

Those countermeasures don't really have an effect in terms of scraping. Anyone skilled can overcome any protection within a week or two. By officially blocking IA, IA can't archive those websites in a legal way, while all major AI companies use copyrighted content without permission.

For sure. There are many billions of dollars and brilliant engineers propping up AI, so they will win any cat-and-mouse game of blocking. It would be ideal if sites gave their data to IA and IA protected it from exactly what you describe. But as someone who intentionally uses AI tools almost daily (mainly OpenEvidence), IMO blame the abuser, not the victim, that it has come to this.

I'm not blaming the victim, but don't play the 'look what you made me do' game. Making content accessible to anyone (even behind a paywall) is a risk they need to take nevertheless. It's impossible to know upfront whether the content is used for consumption or to create derived products (e.g. writing an article in NYT style). If this were a printed newspaper, it would be equivalent to scanning the paper and then training AI on the scans. You can't prevent scanning, as the process is based on exactly the same phenomenon that lets your eyes see, i.e. information being sent and received. The game was lost before it even started.

That is a good question. However, copyright exists (for a limited time) to allow for them to be compensated. AI doesn't change that. It feels like blocking AI-use is a ploy to extract additional revenue. If their content is regurgitated within copyright terms, yes, they should be compensated.

The problem is that producing a mix of personalized content that doesn't appear (at least on its face) to violate copyright still completely destroys their business model. So either copyright law needs to be updated or their business model does.

Either way I'm fairly certain that blocking AI agent access isn't a viable long term solution.


> Either way I'm fairly certain that blocking AI agent access isn't a viable long term solution.

Great point. If my personal AI assistant cannot find your product/website/content, it effectively may no longer exist! For me. Ain't nobody got the time to go searching that stuff up and sifting through the AI slop. The pendulum may even swing the other way and the publishers may need to start paying me (or whoever my gatekeeper is) for access to my space...


That’s a perspective I hadn’t considered. Although the whole thing stinks of middlemen extracting all the profit between producers and consumers (e.g. the ag sector), and the laws won’t catch up or even force integration. Thanks!

There is definitely a middleman question.

The bigger question is business model vs value-add. Copyright law draws a very direct line from value-add to compensation - if you created something new (or even derivative), copyright attaches to allow for compensation, if people find it valuable.

Business models are a different animal: they can range from value-add services and products to rent-seeking to monopolies, extracting value from both producers and consumers.

While copyright law makes no mention of business models, I don't know whether that is a historical artifact since copyright is presumably older, or a philosophical exclusion because society owes no business model a right to exist. I would suggest the existence of monopoly-busting government agencies argues that societies do not owe business models a right of existence. Fair compensation for the advancement of arts and sciences is clearly a public good, though.

Tying it back to the AI-in-the-middle question, it's yet another platform in a series of these between producers and consumers, and doesn't override copyright. Regurgitating a copyrighted work (article, art, whatever) should absolutely attract compensation; should summarizing content attract compensation? Should it be considered any different from a friend (or executive assistant) describing the content? And if the producers' business model involves extracting value from a transaction on any basis other than adding value to the consumer, does society owe that business model any right to exist?


Until recently I had a VPS (4 GB RAM, 80 GB SSD + 2 TB HDD) running Debian in a Montreal data centre, with a real-use 700 Mbit pipe to my city, from a budget provider for the equivalent of $80 USD/year. When fio speeds were slow they moved me to a less crowded server. I gave it up as I don't need it, and moved my personal sites back to NFS for peanuts a year and services to my NAS. The pricing, offsite storage for my backups, Canadian sovereignty, and lack of the perceived complexity of a big provider were all attractive. I'm a physician with a tech hobby, and my last serious tech work was in the LAMP days with Perl and PHP. Trying to think of learning about AWS and screwing up usage based billing was daunting!


Yeah, don't try AWS. I tried it once and now I'm stuck with $0 bill emails coming each month that I can't stop.


A few months ago I was going through my secondary email and noticed I was getting a $0.01 monthly bill from AWS.

Having not used AWS for years, I logged in to check it out, navigated through the Kafkaesque maze of their services until I found what I was looking for:

A lone S3 storage bucket, with one file, "Squirrel.jpg". A 200kB picture of a squirrel that I uploaded 8 years ago and can't remember why.


> I was getting a $0.01 monthly bill from AWS.

I wonder what the cost to AWS was for keeping track of that and running your CC. There's no way they made money off you; that 12 cents/year cost them *at least* 12 cents to collect every year.


That's funny. I kept getting a -$100 bill from a credit card for a few months after closing it. Eventually called them and suggested they can send me a cheque instead of a bill next time for similar reasons...


IIRC the CC they had on hand had long expired and they never actually managed to charge me for these minuscule amounts, which is why I didn't notice it for so long.


My vps provider bills in $5 blocks


That should be below the threshold for AWS’s free tier. I have more than that in S3 and I’m not being charged a cent.


AWS did some weird security thing and it invalidated my 2FA. I can't log in to my account to update my expired card.

I have $6 in charges and so now my account is locked. Lol. Fuck off AWS.


> Trying to think of learning about AWS and screwing up usage based billing was daunting!

One of the hard rules we learned pre-pandemic was that services attached to usage based billing should really exit on error. It's a lesson I'm keeping in mind working with agents and routing (and the main reason I'm local-first).
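As a minimal sketch of that "exit on error" rule: the loop below calls a usage-billed API, lets the first exception propagate instead of retrying, and also stops once a spending cap is crossed. The names `run_metered`, `call_paid_api`, `BudgetExceeded` and the cents-based cap are hypothetical, made up for illustration rather than taken from any real billing API.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the (hypothetical) spending cap has been crossed."""


def run_metered(jobs, call_paid_api, max_spend_cents):
    """Run jobs against a usage-billed API, failing fast.

    call_paid_api is an assumed callable returning (cost_cents, result).
    """
    spent_cents = 0
    results = []
    for job in jobs:
        # Any exception from call_paid_api propagates immediately. There is
        # no retry loop, so a persistent failure cannot keep generating
        # billable calls while nobody is watching.
        cost_cents, result = call_paid_api(job)
        spent_cents += cost_cents
        if spent_cents > max_spend_cents:
            raise BudgetExceeded(
                f"spent {spent_cents} cents, cap is {max_spend_cents}"
            )
        results.append(result)
    return results, spent_cents
```

The design choice is that a crashed worker costs nothing further, whereas a worker stuck in a retry loop keeps billing until someone notices.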


Canadian here, could you share the name of the provider? I'd love to move to something more local and just need a basic small vps for a simple apache host. I know of a couple providers but never talked to anyone actually using one.


I did a detailed review of a few Canadian VPS providers last year.

https://lukecyca.com/2025/canadian-vps-review.html

Last year, I moved from DigitalOcean to FullHost (their Vancouver datacentre) for hosting a small SaaS and a bunch of personal projects. It's cheaper and FAR better performance.


Thanks! I'll check it out!


It was ServaRICA, as someone else suggested. It was a Black Friday hybrid VPS deal from a few years ago; it looks like they still have comparable stuff on their site. At that cost I would generally assume anything important needs to be duplicated in case the company folds or there's a fire, unless you pay them for such a service. (I don't have any vested interest in suggesting them.)


Thanks! Nothing important, personal site with the source stored in a git repo replicated to a few places, so them folding would just be a minor inconvenience.


They're probably talking about ServaRICA. They post deals on LowEndTalk.


Thank you!


That sounds a lot like a newspaper subscription. I subscribe to my local (physical) paper once a week for this reason.


Modern-day patronage is kind of different from a subscription. It's a lot like a "pay what you want" subscription model, but people seem a lot more generous when you express it as a "donation with early access to premium articles" rather than payment for goods and services.


That's really fair. I think of my donations as support, and they're usually higher than what I would want to pay for a subscription!


Yeah, as long as you remove the "for-profit" part, it's essentially that. Once it's a for-profit business, it perverts the incentives, and it'll be a race to the bottom or a race to see which subscribers can survive the highest prices, which is exactly what we wanna avoid :)


Non-profits don't really stop any of that. Plenty of non-profits are after perverse incentives to gather as much money as they can to just pay higher ups more money, and use the non-profit status to pay employees less.


Maybe there's a third way. What about a company owned by a "perpetual purpose trust", i.e. a trust with a defined purpose that is legally binding? It's the only shareholder, so there is no extracting value, and all profits have to be used in compliance with the trust's bylaws. Patagonia (US company) is one example of this; its profits are legally bound to go toward environmental causes.

Bosch and Zeiss in Germany are comparable - they are Verantwortungseigentum (Steward-Ownership).


This is the business model of The Guardian:

https://en.wikipedia.org/wiki/Scott_Trust_Limited


That sounds kind of like a B-Corp, innit?


That's a third-party certification that can be allowed to lapse, not a legal or legally enforceable status.

https://www.bcorporation.net/en-us/certification/


> Plenty of non-profits are after perverse incentives to gather as much money as they can to just pay higher ups more money

Where is this specifically, in the US? Usually the laws of the country prevent this, since they're you know... Non-profits... But wouldn't surprise me there are a few leftover countries who refuse to join the modern world.


The US has this problem. There aren't really rules against paying executives as much as you want, or having bonus structures based on fundraising, as long as the board okays it and considers it as contributing to the mission. It is non-profit because it doesn't pay out profits to investors. This is a large way corruption happens in the US, i.e. a lot of those "X politician foundations" pay modest amounts of money to some cause, but a large percentage of the donations go to the executive as a salary for running the corp, and the executive is the politician. It's a big shell game.


Yeah, seemingly a local problem rather than a problem with non-profits, unfortunately :/ Hope things get better over there over time!


What country do you live in and can you link to the laws regulating nonprofit employee pay so that we can compare and use them as a model?


You just find the optimal point for the most people if it's for profit.


I think that only holds if company ownership is not close with company leadership. Is a "subscriber owned" newspaper model possible? Like how co-op stores are at least nominally owned by their customers.

I could also imagine a system in which a local newspaper was actually run as a public utility by an independent corporation, but explicitly chartered and subsidized by a town/city/county.


I doubt that's true in practice, although I know many capitalists know that to be true in theory.


For someone who runs a small personal website and uses LE to secure this + some web exposed services, could you explain how this is different/better than acme-dns-certbot?


Let's Encrypt is a single point of failure.

WebPKI also suffers from an inability to properly do delegation. It's not possible for me to create an intermediate certificate valid only for *.mycompany.com

If I want to use WebPKI, I have to either expose every host inside my company to everyone (via Certificate Transparency logs) or use a wildcard certificate. And wildcard certs allow attackers to impersonate anything within my domain, if they get access to just one host.

X.509 technically supports name constraints ( https://www.rfc-editor.org/rfc/rfc5280#section-4.2.1.10 ), but its implementation was inconsistent. In particular, some implementations did not apply it to the Common Name. Fortunately, Common Name is on the path to deprecation.
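To make the wildcard concern concrete, here is a simplified sketch of left-most-label wildcard matching. It is an illustration under stated assumptions, not a complete implementation of the hostname-matching rules real TLS clients apply, and the hostnames are hypothetical.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Simplified check: does a certificate name such as '*.mycompany.com'
    cover the given hostname? The '*' stands in for exactly one label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        # The wildcard replaces one non-empty left-most label; the rest
        # of the name must match exactly.
        return h_labels[0] != "" and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels
```

Under this sketch, a key stolen from one host holding `*.mycompany.com` is valid for `payroll.mycompany.com`, `vpn.mycompany.com`, and every other single-label name in the domain, which is the impersonation risk described above.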


CT logs do allow enumeration, but avoiding that is just security through obscurity. WebPKI is intended for publicly-accessible hosts, so hopefully you already have some kind of firewall in place to protect them! If you want to avoid enumeration of internal-only hosts: just use your own self-signed root cert. CT logs are a crucial part of protecting against rogue CAs, so don't expect that to go away any time soon.

With ACME most of the delegation issues have pretty much been solved. Publicly-accessible hosts can easily get a cert - if and only if the domain resolves to that host. Want even stricter enforcement? Nobody's stopping you from writing an ACME proxy which only forwards requests from known-good hosts to LE & friends.
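A rough sketch of the allowlist decision such a proxy might make before forwarding anything upstream. The network range, the domain suffix, and the `should_forward` function are all hypothetical, and a real proxy would also need to authenticate clients and speak the ACME protocol itself.

```python
import ipaddress

# Hypothetical policy: only build hosts on this internal range may request
# certificates, and only for names under this suffix.
ALLOWED_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]
ALLOWED_SUFFIX = ".mycompany.com"


def should_forward(client_ip: str, requested_names: list[str]) -> bool:
    """Return True only for known-good clients asking for in-policy names."""
    addr = ipaddress.ip_address(client_ip)
    if not any(addr in net for net in ALLOWED_NETWORKS):
        return False
    # Reject empty requests and any name outside the permitted suffix.
    return bool(requested_names) and all(
        name.lower().endswith(ALLOWED_SUFFIX) for name in requested_names
    )
```

For example, `should_forward("10.20.3.4", ["build01.mycompany.com"])` passes the policy, while the same request from `192.0.2.7` is refused before it reaches the CA.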


> CT logs do allow enumeration, but avoiding that is just security through obscurity.

Well, yes. There are also other issues, like rate limits. Some companies have hundreds of thousands of hosts (some virtual) and requesting certificates for all of them might be problematic.

> If you want to avoid enumeration of internal-only hosts: just use your own self-signed root cert.

This becomes increasingly problematic, as browsers start relying on DoH/DoT, or making it more difficult to enroll custom root certs.

> Nobody's stopping you from writing an ACME proxy which only forwards requests from known-good hosts to LE & friends.

I actually tried that. LE validates challenges from multiple vantage points, so you need to open your internal DNS resolvers/HTTPS to basically all the world. Or play with the horror of split-horizon DNS.


Neat! I travel for work a fair bit and saw a local craft sale in a mostly fly in northern community with this. The lady mentioned using super glue and then transferred into silver and gold jewelry. Success rate did not sound high for individual flakes but I guess the winter is long… try again.


Kudos to that! I kid you not: yesterday I used Bing to search for “CRA my business account” (Canadian IRS equivalent) to set up some payments and the first result below Copilot was a phishing site with a cloned UI! Makes me thankful for services like yours (and angry about the other things).


I feel like about the only things not worth pirating these days due to enshittification are games and podcasts. Steam still makes it easy, questions about licensing aside.


I wish there were one-click or downloadable large mod packs to modernize the game. I find that for this, New Vegas, and Oblivion I spend two evenings getting everything to play nice, or give up, then run out of steam and don't actually play anything.


There are. These days there are OpenMW-specific mod packs. You run one command to download everything, another command to install and configure it. The instructions are really good, it's hard to mess up.

See here: https://modding-openmw.com/lists/


Thanks again. That is much easier than in the past. Fired it up yesterday, tried downloading 30+ mods manually, got a premium Nexus Mods account to automate and speed up the downloads, and it just works. Damn, there is a lot of content and upgrades, judging from a few hours in. Wish I could be a teenager spending many hours a week on it again!


Yeah. Especially Tamriel Rebuilt adds a boatload of content. They're still working on it, but a good chunk of the mainland is explorable and full of quests. :)


That's helpful. I'll take a look. Thank you!


Immich is a night and day improvement for photos vs nextcloud. You could roll it in addition if you wanted to try.


I went from cloud to local SMB shares to Nextcloud to Seafile. Really happy with the latter. Works, no bloat, versioning and some file sharing. The pro version is free with 3 or fewer users. I use the CLI client to mount the libraries into folders and share that with SMB + subst X: into the root directory on laptops for family. Borgbackup of that offsite for backup.

