> New approaches are needed, like more dynamic approaches using behavioural analysis
Does this set off alarm bells for anyone else? Of course the best way to know if a visitor is a human or a bot is to deeply analyze their behavior. But that's at odds with the right of us humans not to be analyzed by every website we visit. What happens if we do reach a standoff where the bots become good enough at mimicking human behavior that the only way to tell us apart is unacceptable and illegal behavioral analysis?
I recently wondered whether the good reviews on WWW shopping sites, are actually written by the bots. The market for astroturfing is so competitive that the paid reviewers probably learned a long time ago that you need to leave quality reviews to get repeat customers.
They also 'care' more than actual customers in many cases. Real customer -> "Stop sending me review reminders. It was a comb. Block." Bot -> "Dutifully review all kinds of products. 500 words on the life changing experience of hair brushing with this comb. A+ reviewer."
I find it difficult to believe that the bot networks would not have just immediately rolled every single generative AI advance into their networks (write convincing reviews, generate convincing product examples without buying, beat captchas more reliably, automated screen clicking, human eye scan impersonation). Need to be better than every other group doing paid reviews. Need to be better than actual humans. They might write critical reviews.
Also, a lot of sites are already doing some behavioral analysis. Popup every time you consider clicking 'leave' on websites lately? "Before you go..."
> I recently wondered whether the good reviews on WWW shopping sites, are actually written by the bots.
Probably, but if these are just like, Amazon reviews/etc, they likely violate FTC regulations. Enforcement is lacking, but I'd still be very hesitant to break the law.
Maybe the ultimate solution is to make people pay. As in microtransactions
Human or bot is not really the problem; spam is the problem, and bots make spam so cheap that admins can't deal with it. So bots are banned. Human spammers can still get in, and you can pay people to solve captchas, but humans are more expensive, so there are fewer of them and moderators can deal with them.
If we had people (or bots) pay a few cents to access a service, it could be enough to keep spam to a manageable level.
The problem is, people don't like to pay, and unlike with phone numbers, the web doesn't have a good microtransaction architecture so behavioral analysis it is.
Any payment system will be used to track and unmask users. Then they'll double dip selling your even better identified user data to whoever wants it. Probably while still showing you ads.
You can slow down bots with proof of work. A crypto-miner seems like the only possible payment method that would resist tracking; I think Brave tried something similar. Not sure I like that idea!
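For anyone who hasn't seen it, here is a rough hashcash-style sketch of what "proof of work" could look like for a web request. This is only an illustration, not how Brave or any real site does it: the `challenge` string format, the `difficulty_bits` parameter, and both function names are made up for the example. The server hands out a random challenge, the client burns CPU finding a nonce, and the server checks it with a single hash.

```python
import hashlib
from itertools import count


def _meets_target(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """True if SHA-256(challenge:nonce) has at least difficulty_bits leading zero bits."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    # Treat the 256-bit digest as an integer; leading zero bits mean a small value.
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force the smallest nonce that meets the target (expensive)."""
    for nonce in count():
        if _meets_target(challenge, nonce, difficulty_bits):
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash, so checking is cheap."""
    return _meets_target(challenge, nonce, difficulty_bits)


if __name__ == "__main__":
    # Hypothetical challenge; a real server would issue a fresh random one per request.
    nonce = solve("example-challenge", 12)
    print(nonce, verify("example-challenge", nonce, 12))
```

Each extra bit of difficulty doubles the expected work for the client while the server's check stays at one hash. Note that this taxes everyone equally, humans and bots alike, which is the objection raised in the replies.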
I think “slowing down the bots with proof of work” is essentially what captchas were, back when they were easy for us and hard for computers.
But now, when I see a captcha, I hit back, press unsubscribe, and find a new vendor. The work is harder for me than it is for a computer and so I won’t do it.
Accordingly, we can see that proof of work is the opposite of a solution.
I’ve been rereading your post here and I think I’m coming around. Captcha is not proof that I did the work; it’s an inference that because I was able to do the work, it must have been easy for me (because I’m a human).
I let my cryptoskepticism run a little too freely here. Thanks for making me think harder.
If the point is to stop spam it doesn't need to be implemented like that. One way to do it could be, pay $5 to create an account. Your money will be returned to you as you post on the site. If someone determines that you're spamming, any money that hasn't been returned to you is lost.
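A minimal sketch of that deposit scheme, just to make the bookkeeping concrete. The $5 figure comes from the comment above; the per-post refund of 25 cents and all the names (`Account`, `record_post`, `flag_as_spammer`) are assumptions for illustration.

```python
from dataclasses import dataclass

REFUND_PER_POST_CENTS = 25  # assumed value; the comment does not specify a rate


@dataclass
class Account:
    deposit_cents: int = 500   # the $5 signup deposit still held by the site
    refunded_cents: int = 0    # amount already returned to the user
    banned: bool = False


def record_post(account: Account) -> int:
    """Refund a slice of the deposit for a post; returns the cents refunded."""
    if account.banned:
        return 0
    refund = min(REFUND_PER_POST_CENTS, account.deposit_cents)
    account.deposit_cents -= refund
    account.refunded_cents += refund
    return refund


def flag_as_spammer(account: Account) -> int:
    """Ban the account and forfeit whatever has not been refunded yet."""
    forfeited = account.deposit_cents
    account.deposit_cents = 0
    account.banned = True
    return forfeited


if __name__ == "__main__":
    acct = Account()
    for _ in range(4):
        record_post(acct)
    print(acct.refunded_cents, flag_as_spammer(acct))
```

Under these assumed numbers, a regular user gets the whole deposit back after 20 posts, while a spammer caught after four posts loses the remaining $4.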
The idea is not to make people pay 5 cents to see a website. Instead, replace captchas with a toll roughly equal to what it would cost to have a human solve the captcha for you.
There are already 'solve captcha as a service' sites all over, with highly developed APIs for their use. Lots of people use them for sneaker bots and ticket bots etc.
How do you combine microtransactions with the need to be indexed by search engines?
Microtransactions solve the issue of bad bots, and possibly website monetization. But then do you want to give a free pass to search engine crawlers? The big ones will be strong enough to refuse to crawl your site if you don't. The small ones will be financially unable to crawl if you don't. If you allow them all, you're back to step 1. If you allow only one or a few, you basically freeze search engine innovation.
Spammers will have more disposable funds than me and more utility from making payments to spread their message than I will. Essentially this is more likely to exclude poor people while waving spammers on in through the ticket gate.
Not to mention credit card fees making sub $1 payments a no-go and crypto being its own barrel of nightmares.
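To put rough numbers on the card fee point: the sketch below assumes a fee of 30 cents plus 2.9% per transaction. That schedule is an illustrative assumption, not a quote from any particular processor, and `fee_share` is a made-up helper name.

```python
def fee_share(amount_cents: float,
              fixed_fee_cents: float = 30.0,   # assumed flat fee per transaction
              percent_fee: float = 2.9) -> float:  # assumed percentage fee
    """Return the processing fee as a fraction of the payment amount."""
    fee = fixed_fee_cents + amount_cents * percent_fee / 100
    return fee / amount_cents


if __name__ == "__main__":
    for amount in (5, 50, 100, 1000):
        print(f"{amount} cents -> fee is {fee_share(amount):.0%} of the payment")
```

With those assumed fees, the flat part dominates anything under a dollar: the fee on a 5 cent toll would be several times the toll itself.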
There are the recently announced hardware root of trust solutions[1] that aren’t behavior driven.
But they just trade one privacy issue for potentially another, depending on your view
It does seem sadly unavoidable. Perhaps the internet has to go full circle and we need real identities if we want to ensure we’re not talking to machines?
It is not only a privacy issue; "certified by entrenched gatekeeper mega-corporation" is a nightmare for user freedom. The first to be impacted are the minority who root their devices and compile their own software, but in the long run a monopoly at the gate has detrimental effects, like the ease of implementing surveillance, censorship, and DRM, that will apply to everyone.
Yeah, at second glance it actually (as proposed - huge caveat) might even be better for privacy.
I can’t begin to theorize how the future will play out if you need a PAT to access most web destinations. Do Cloudflare or Apple engineers use Linux machines ever? Surely they do, and either know this is bad or have some plan to make it work?
As am I, which is why it's not that big of a deal to take this stance.
> most major web players already 100% know who you are
I have no doubt about this, but there's also a whole internet full of others who I want to remain pseudonymous with. I've been using a handful of online identities for over 30 years now, and have never tied them to my real world identity.
The reasons for avoiding that hold more true now than ever before. Having to tie my online identities to my actual identity is unthinkable.
Then we require a phone number for everything (it's not easy to make unlimited new phone numbers) and use OIDC to authenticate to one of a couple providers. You won't be able to do anything on the internet without logging in first, but the login is safe at the identity provider.
If you think about it this is no different than showing your ID to get into a bar.
I trust any bar more about privacy than anyone on the internet. Their incentive to maximize consent to stalk me and my behavior is close to nonexistent.
Phone numbers are a cheap and reusable resource. Pushing the problem onto another site with OIDC doesn't help either, if their CAPTCHAs have the same limitation.
"Real" cellular phone numbers are a very finite pool and require nontrivial amounts of money, a physical phone (with a burned-in hardware identifier), an in-person interaction, and government ID that is validated against a state database.
You'd think so, but no.
I signed up for T-Mobile service early this year with no ID, and paid cash.
The store is so eager to complete the transaction that it keeps a government ID document in a drawer and the sales people whip it out whenever anyone looks queasy about providing information.
I didn't resist giving my information. All I did was pause because I wasn't sure if I brought my ID with me. Even that little hesitation was enough for the clerk to say, "Don't worry about it. I got you covered" and he pulled out the ID.
So I have a T-Mobile account that I can pay for with cash and no ID on file, and someone else's address.
Now, if a government was really interested in me, it could probably pull the security camera video or follow the signal around or whatever. But it turns out that KYC is easily bypassed when the incentives are right.
Some places might require all that, but a lot don't require an in-person visit or an ID card. And there are also SMS verification services that charge you a few cents per verification, and they use "real" numbers.
In the US, you can walk in a T-Mobile store and get a SIM card for cash, no questions asked.
Also, once you are sitting on a few dozen phone numbers, you can use them again and again to spam or abuse different services (possibly for sale). It's not like CAPTCHA solutions that you have to do every time.
> What happens if we do reach a standoff where the bots become good enough at mimicking human behavior that the only way to tell us apart is unacceptable and illegal behavioral analysis?
Sophisticated bots are already good enough at this that a variety of behavioral-based bot analysis tools exist and are in semi-widespread use. They're not illegal.
Thankfully, soon we will have the web integrity API to verify that a visitor is human.
Apple devices already support something like this when connecting to websites behind Cloudflare and Fastly, and as Cloudflare explains, this "vastly improves privacy by validating without fingerprinting"[1].