Hacker News

> New approaches are needed, like more dynamic approaches using behavioural analysis

Does this set off alarm bells for anyone else? Of course the best way to know if a visitor is a human or a bot is to deeply analyze their behavior. But that's at odds with the right of us humans not to be analyzed by every website we visit. What happens if we do reach a standoff where the bots become good enough at mimicking human behavior that the only way to tell us apart is unacceptable and illegal behavioral analysis?



I recently wondered whether the good reviews on WWW shopping sites are actually written by bots. The market for astroturfing is so competitive that the paid reviewers probably learned a long time ago that you need to leave quality reviews to get repeat customers.

They also 'care' more than actual customers in many cases. Real customer -> "Stop sending me review reminders. It was a comb. Block." Bot -> "Dutifully review all kinds of products. 500 words on the life changing experience of hair brushing with this comb. A+ reviewer."

I find it difficult to believe that the bot networks would not have just immediately rolled every single generative AI advance into their networks (write convincing reviews, generate convincing product examples without buying, beat captchas more reliably, automated screen clicking, human eye scan impersonation). Need to be better than every other group doing paid reviews. Need to be better than actual humans. They might write critical reviews.

Also, lots of sites are already doing some behavioral analysis. Seen a popup every time you consider clicking 'leave' on a website lately? "Before you go..."


> I recently wondered whether the good reviews on WWW shopping sites are actually written by bots.

Probably, but if these are just like, Amazon reviews/etc, they likely violate FTC regulations. Enforcement is lacking, but I'd still be very hesitant to break the law.


Maybe the ultimate solution is to make people pay. As in microtransactions

Human or bot is not really the problem; spam is the problem, and bots make spam so cheap that admins can't deal with it. So, bots are banned. Human spammers can still get in, and you can pay people to solve captchas, but humans are more expensive, so there are fewer of them and moderators can deal with them.

If we had people (or bots) pay a few cents to access a service, it could be enough to keep spam to a manageable level.

The problem is, people don't like to pay, and unlike with phone numbers, the web doesn't have a good microtransaction architecture so behavioral analysis it is.


Any payment system will be used to track and unmask users. Then they'll double dip selling your even better identified user data to whoever wants it. Probably while still showing you ads.

You can slow down bots with proof of work. A crypto miner seems like the only possible payment method that would resist tracking; I think Brave tried something similar. Not sure I like that idea!


I think “slowing down the bots with proof of work” is essentially what captchas were, back when they were easy for us and hard for computers.

But now, when I see a captcha, I hit back, press unsubscribe, and find a new vendor. The work is harder for me than it is for a computer and so I won’t do it.

Accordingly, we can see that proof of work is the opposite of a solution.


Well... no. Proof of work doesn't mean you the human have to do work. Bitcoin is a proof of work system but you don't calculate the numbers by hand.

The question there is if an acceptable amount of work (e.g. cpu/gpu work) on a mobile phone is a large enough deterrent for a bot.
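To make the distinction concrete, here is a minimal hashcash-style sketch of that kind of proof of work. All names and difficulty numbers are illustrative assumptions, not any real deployment: the server hands out a random challenge, the client's CPU grinds for a nonce, and the server verifies the answer with a single hash.

```python
import hashlib
import secrets

DIFFICULTY_BITS = 20  # assumed tuning knob: higher = more client CPU time

def solve(challenge: bytes, bits: int = DIFFICULTY_BITS) -> int:
    """Brute-force a nonce so that sha256(challenge || nonce) has `bits` leading zero bits."""
    target = 1 << (256 - bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify(challenge: bytes, nonce: int, bits: int = DIFFICULTY_BITS) -> bool:
    """Verification is one hash, so it's essentially free for the server."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - bits))

challenge = secrets.token_bytes(16)
nonce = solve(challenge, bits=12)  # low difficulty so the demo finishes quickly
assert verify(challenge, nonce, bits=12)
```

The asymmetry is the point: solving costs on average 2^bits hashes, verifying costs one. Whether an amount of work that a phone can tolerate is still a meaningful deterrent for a GPU farm is exactly the open question above.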


I’ve been rereading your post here and I think I’m coming around. Captcha is not proof that I did the work; it’s an inference that because I was able to do the work, it must have been easy for me (because I’m a human).

I let my cryptoskepticism run a little too freely here. Thanks for making me think harder.


Yeah, I don't think anyone is going to pay 5 cents to see a website. That number is likely to get big real fast if you like news aggregation sites.


If the point is to stop spam it doesn't need to be implemented like that. One way to do it could be, pay $5 to create an account. Your money will be returned to you as you post on the site. If someone determines that you're spamming, any money that hasn't been returned to you is lost.
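The refundable-deposit scheme described above can be sketched in a few lines. The numbers and class names are hypothetical, chosen only to illustrate the incentive structure:

```python
class DepositAccount:
    """Hypothetical sketch: deposit is refunded per accepted post, forfeited on spam."""
    DEPOSIT = 500         # cents paid up front to create the account
    REFUND_PER_POST = 25  # cents returned for each post that isn't flagged

    def __init__(self):
        self.held = self.DEPOSIT
        self.banned = False

    def record_post(self) -> int:
        """Return the cents refunded for this post."""
        if self.banned:
            raise PermissionError("account banned")
        refund = min(self.REFUND_PER_POST, self.held)
        self.held -= refund
        return refund

    def flag_as_spam(self) -> int:
        """Ban the account; whatever hasn't been refunded yet is forfeited."""
        self.banned = True
        forfeited, self.held = self.held, 0
        return forfeited
```

A legitimate user eventually gets the whole deposit back; a spammer banned early loses most of it, which is what makes spam at scale expensive.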


The idea is not to make people pay 5 cents to see a website, but to replace captchas with a toll of roughly what it would cost to have a human solve the captcha for you.


If you just open HN once per day and open all promising links in a new window you're already out a lot of cents.

This will only work with mechanisms that let you pay like 0.05 cents. Should be enough to deter bots that practically run for free these days.

Too bad any intermediary will want 0.30 dollars per transaction for the 0.05 cents :)


I mean... you just batch those cents and have cash out limits. Preload accounts, etc.

* note that the word just does an obscene amount of lifting in that sentence.
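A toy sketch of what that batching might look like, under the (loud) assumption that sub-cent charges are accrued off-ledger and only settled past a threshold. Real systems would use integer micro-units rather than floats, and everything here is illustrative:

```python
from collections import defaultdict

SETTLE_THRESHOLD = 100  # cents; only settle once fixed fees are a small fraction

class MicroLedger:
    """Hypothetical sketch: accrue sub-cent charges, settle in batches."""
    def __init__(self):
        self.pending = defaultdict(float)  # payee -> accrued cents

    def charge(self, payee: str, cents: float) -> None:
        self.pending[payee] += cents

    def settle(self) -> dict:
        """Pay out everyone over the threshold; keep the rest pending."""
        payouts = {}
        for payee, amount in list(self.pending.items()):
            if amount >= SETTLE_THRESHOLD:
                payouts[payee] = amount
                self.pending[payee] = 0.0
        return payouts
```

The intermediary's $0.30 fee is only paid once per settlement instead of once per 0.05-cent view, which is the whole trick, and also where the word "just" starts lifting.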


There are already 'solve captcha as a service' sites all over, with highly developed APIs for their use. Lots of people use them for sneaker bots and ticket bots etc.

This is a losing idea out of the gate.


No fundamental reason a micro transaction has to be more than a tiny fraction of a penny.


How do you combine microtransactions with the need to be indexed by search engines?

Microtransactions solve the issue of bad bots, and possibly website monetization. But then do you want to give a free pass to search engine crawlers? The big ones will be strong enough to refuse to crawl your site if you don't. The small ones will be financially unable to crawl if you don't. If you allow them all, you're back to step 1. If you allow only one or a few, you basically freeze search engine innovation.


Isn't the big issue not bots reading but bots generating? You could allow bots in a read only fashion.


Spammers will have more disposable funds than me and more utility from making payments to spread their message than I will. Essentially this is more likely to exclude poor people while waving spammers on in through the ticket gate.

Not to mention credit card fees making sub-$1 payments a no-go, and crypto being its own barrel of nightmares.


There are the recently announced hardware root of trust solutions[1] that aren’t behavior driven.

But they just trade one privacy issue for potentially another, depending on your view

It does seem sadly unavoidable. Perhaps the internet has to go full circle and we need real identities if we want to ensure we’re not talking to machines?

[1] https://blog.cloudflare.com/eliminating-captchas-on-iphones-...


It is not only a privacy issue: "certified by entrenched gatekeeper mega-corporation" is a nightmare for user freedom. The first to be impacted are the minority who root their devices and compile their own software, but in the long run there are detrimental effects of a monopoly at the gate, like the ease of implementing surveillance, censorship, and DRM, which will apply to everyone.


Yeah, at second glance it actually (as proposed - huge caveat) might even be better for privacy.

I can’t begin to theorize how the future will play out if you need a PAT to access most web destinations. Do Cloudflare or Apple engineers ever use Linux machines? Surely they do, and they either know this is bad or have some plan to make it work?


> Perhaps the internet has to go full circle and we need real identities if we want to ensure we’re not talking to machines?

If that's the case, then I'll be done using the web entirely.


Yeah, I think I might be too. I mean, I’m headed that way to a degree anyhow.

Although arguably most major web players already 100% know who you are - just maybe not your name.


> I’m headed that way to a degree anyhow

As am I, which is why it's not that big of a deal to take this stance.

> most major web players already 100% know who you are

I have no doubt about this, but there's also a whole internet full of others who I want to remain pseudonymous with. I've been using a handful of online identities for over 30 years now, and have never tied them to my real world identity.

The reasons for avoiding that hold more true now than ever before. Having to tie my online identities to my actual identity is unthinkable.


George Hotz is correct in that we all need our own AI. This is the only way to give us a fighting chance against bad actors.


It's the new Gun.

You don't need one, but if you don't have one, you're at a disadvantage against those who do.


How would that help?


He explains that it's like having a bodyguard. It will protect you from spam, psyops, scams, etc.


They are working on that: See the Web Integrity API. One of the goals was to separate the humans from the robots.


Then we require a phone number for everything (it's not easy to make unlimited new phone numbers) and use OIDC to authenticate to one of a couple providers. You won't be able to do anything on the internet without logging in first, but the login is safe at the identity provider.

If you think about it this is no different than showing your ID to get into a bar.


The problem is that this time, it's not a bar, it's just any store (or really, anywhere) you go.


and they're recording your ID number and sharing that with other stores to track your purchases.


I trust any bar more about privacy than anyone on the internet. Their incentive to maximize consent to stalk me and my behavior is close to nonexistent.


Phone numbers are a cheap and reusable resource. Pushing the problem on another site with OIDC doesn't help either, if their CAPTCHAs have the same limitation.


"Real" cellular phone numbers are a very finite pool and require nontrivial amounts of money, a physical phone (with a burned-in hardware identifier), an in-person interaction, and government ID that is validated against a state database.


> "Real" cellular phone numbers are a very finite pool and require nontrivial amounts of money, a physical phone (with a burned-in hardware identifier), an in-person interaction, and government ID that is validated against a state database.

You'd think so, but no.

I signed up for T-Mobile service early this year with no ID, and paid cash.

The store is so eager to complete the transaction that it keeps a government ID document in a drawer and the sales people whip it out whenever anyone looks queasy about providing information.

I didn't resist giving my information. All I did was pause because I wasn't sure if I brought my ID with me. Even that little hesitation was enough for the clerk to say, "Don't worry about it. I got you covered" and he pulled out the ID.

So I have a T-Mobile account that I can pay for with cash and no ID on file, and someone else's address.

Now, if a government was really interested in me, it could probably pull the security camera video or follow the signal around or whatever. But it turns out that KYC is easily bypassed when the incentives are right.


Some places might require all that but a lot don't need in-person, id card. And there's also sms verification services that charge you a few cents per verification, and they use "real" numbers.


In the US, you can walk in a T-Mobile store and get a SIM card for cash, no questions asked.

Also, once you are sitting on a few dozen phone numbers, you can use them again and again to spam or abuse different services (possibly for sale). It's not like CAPTCHA solutions that you have to do every time.


> Then we require a phone number for everything (it's not easy to make unlimited new phone numbers)

There are services that let you verify for as low as $0.03/activation, and their stock of numbers is massive and diverse, so that's not a solution.


It is though. It's how every major company validates new customers without captcha.


> What happens if we do reach a standoff where the bots become good enough at mimicking human behavior that the only way to tell us apart is unacceptable and illegal behavioral analysis?

Sophisticated bots are already good enough at this that a variety of behavioral-based bot analysis tools exist and are in semi-widespread use. They're not illegal.


Can you give any examples of these tools?



It's also short-sighted because any behavior analysis that is stored in a database somewhere could be used to train a new AI model.



Thankfully, soon we will have the web integrity API to verify that a visitor is human.

Apple devices already support something like this when connecting to websites behind cloudflare and fastly, and as cloudflare explains this "vastly improves privacy by validating without fingerprinting"[1].

https://blog.cloudflare.com/eliminating-captchas-on-iphones-...


Please tell me you are not honestly cheering that atrocity on.



