A distributed peer to peer list of bad actor IP addresses and phone numbers (sentrypeer.org)
106 points by Brajeshwar on Feb 5, 2022 | 85 comments


To give a good example of why something like this matters...

I spent a bit more than a year as the most senior dev on the team that owned the sign up page for a cloud computing company. We faced bot attacks constantly. They had a variety of reasons to attack, but International revenue sharing fraud (IRSF) was a major one. In short, bots would convince our sign up flow to make verification phone calls to numbers that charge a lot of money but don't actually exist. (Note: whatever "why don't you just" you're about to reply with, we had an entire team of people doing nothing but trying to stop this for months and years- we tried that and it didn't succeed, or there were business reasons it was not feasible.)

I switched companies. In my new role, I happened to be on a team in the same org as their sign up page. Chatting with them, I learned that they were not facing similar attacks, but identical ones. The countries involved, the patterns of the attack, everything- this was the same attacker, going up against two completely unrelated companies.

I don't know if this system is the right implementation- I need to do a deeper dive on this- but I do know that something like this is needed so that companies can coordinate their defenses.


I've worked on a large federated IP reputation system. It sounds like a great idea, but does not work in practice.

The main problem is that it's just too hard to map different kinds of abusive/non-abusive actions to a shared scale. The model here seems to be basically a boolean OR: if anyone flagged an IP as bad, it's marked as bad in the data set. Let's say that you're running a shop and detect a bunch of fraudulent transactions for expensive items, and add the IPs to the list. What can somebody else using this list for say spam-filtering comments to a web forum do with your data points in isolation? Not much, because the action you're protecting is rare, high impact, and unlikely to have FPs so the threshold where you'd mark the IP as abusive would be very different than for the spam-filter use case.

At a minimum you'd need all the clients of the IP reputation signal to also share the positive reputation signals, not just the negative, and to export some form of volumes or ratios of good and bad traffic. But it'll still be really hard to combine those reports into a single verdict that's generally useful.
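
A minimal sketch of what sharing both positive and negative signals could look like, assuming each participant reports per-IP counts of good and bad events for its own use case. The `Report` fields, the category names, and the `bad_ratio_by_category` helper are all hypothetical and do not describe any real reputation system. The sketch only computes per-category ratios; merging them into one verdict is still the hard part.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    reporter: str  # hypothetical organization name
    category: str  # hypothetical use case, e.g. "card_fraud" or "comment_spam"
    ip: str
    good: int      # events from this IP the reporter considered fine
    bad: int       # events from this IP the reporter considered abusive


def bad_ratio_by_category(reports, ip):
    """Return {category: bad / (good + bad)} for one IP, keeping use cases separate."""
    totals = defaultdict(lambda: [0, 0])  # category -> [good, bad]
    for r in reports:
        if r.ip == ip:
            totals[r.category][0] += r.good
            totals[r.category][1] += r.bad
    return {
        cat: bad / (good + bad)
        for cat, (good, bad) in totals.items()
        if good + bad > 0
    }
```

Keeping the categories apart means a shop's three fraud flags and a forum's one-percent spam rate are never forced onto the same scale.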

A secondary problem is that different kinds of abuse don't even correlate particularly well. Let's say that you've got an IP address sending SMTP spam; the odds are that this IP will not be doing ssh credential stuffing, credit card fraud, mass-scraping, traffic pumping, warez distribution, or DDOS attacks.

The classic SMTP IP blocklists work only because everyone using them is using them for the same purpose, and is as such on a shared scale, and there is a high likelihood of the same abusive actor attacking multiple different organizations. Your example would fit that as well, and doing reputation for that one domain would actually be tractable, unlike federated cross-domain IP reputation.


> whatever "why don't you just" you're about to reply with,

I love that you added this. So many armchair experts with brilliant solutions to problems they know nothing about.


It is arrogant to assume that if you haven't found a solution then nobody will. (Trust experts when they say something can be done; ignore them if they say it can't be done, and just do it.)


Nobody is assuming that. The arrogance is when people with no knowledge of the people, systems, or problem at hand jump in as though they’re experts. It happens all the time here (hell I’m guilty myself)


It is also arrogant to assume you can find a solution in a few minutes after reading a HN comment when an entire team couldn’t think of one in years.


> an entire team couldn’t think of one in years

There's literally no other solution for customer verification other than an automated phone call into third world countries with corrupt telcos?

We clearly aren't talking about the engineering team here.


There are clearly other constraints involved that we don’t know.


'I could fix it over a weekend' is pretty much the motto of this page.


These two smug all-knowing above comments aside, how do you actually get charged money for dialling numbers that don't exist?


They do usually exist. They answer immediately and then play the "it is ringing"-sound forever.

(Some SIP providers allow you to disallow expensive numbers to be called and/or state the minute costs before the call starts (giving you time to hang up.))


> They answer immediately and then play the "it is ringing"-sound forever.

Except that clearly is a number which exists? One that answers. GP isn't fighting 1980s phreakers with a blue box; they are using Twilio, SignalWire, etc. to make automated verification calls/SMS for a very large company, as far as I understood it. The call being answered is baked entirely into your programmatic logic.

Unless I'm missing some niche telco semantics or lingo here? Or they are rolling their own?

To embrace the "don't question it, you don't understand" narrative, I'm actually quite stunned someone would let this go on for such a long time. If this is corruption, crime or some other sort of telecom fraud aided and abetted by foreign governments/companies, why not just turn off the tap and ban the country code if it was such a cost sink? Why not use some of the many alternative (and significantly more secure) verification methods for your customers?


Countries are demanding access to private citizen encrypted messages,

but can't legislate away the basic 1980s phone scam.

Wonder why their priorities are so skewed.


Corrupt carriers in other countries splitting the money, etc.


So stop dialling out to these few places? Otherwise it's a cost of doing business if you insist on keeping at it knowing full well what happens.


This is what most people do, at my employment we restricted it to US/Canada with a reach out for other countries.
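
A rough sketch of that kind of destination restriction, assuming numbers arrive in something close to E.164 form. The function names and the prefix lists are hypothetical. One caveat: the `+1` country code covers the whole North American Numbering Plan, not just the US and Canada, so a bare `+1` check also lets through destinations such as `+1 876` (Jamaica); the blocked list below is only an illustrative example of carving those out.

```python
def normalize_e164(number: str) -> str:
    """Strip spaces, dashes and brackets, keeping digits behind a leading '+'."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return "+" + digits


def is_allowed_destination(
    number: str,
    allowed_prefixes=("+1",),      # assumption: only NANP destinations are dialled
    blocked_prefixes=("+1876",),   # assumption: example NANP area code outside US/Canada
) -> bool:
    """Return True only if the number matches an allowed prefix and no blocked one."""
    n = normalize_e164(number)
    if any(n.startswith(p) for p in blocked_prefixes):
        return False
    return any(n.startswith(p) for p in allowed_prefixes)
```

Anything that fails the check would go to the manual "reach out" path rather than being dialled automatically.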


The project page?


The HN page.


:-D


Oh good, mob justice on the networks. Surely this IP system itself can never be manipulated to cut off people by mistake or for revenge.

Humans always investigate thoroughly before serving themselves.

None of these IPs could be zombies; here come internet Supermen to save us!


Yeah, this is to gather the raw data and learn probe and attack patterns. How you use the API is your choice really.


Since you say you tried hard to stop this, I guess you are implying that it's not possible to tell in advance what it costs to call a given number? How do telcos get away with that?


IRSF is a fascinating topic.

As it was explained to me (anyone feel free to correct me if I'm wrong), a shady/criminal organization takes control of a piece of the phone network. They pick a range of phone numbers for a country that is usually expensive to call, maybe a range of numbers that doesn't actually exist in that country, and they use the phone network to "advertise" a slightly cheaper route to connect phone calls to those numbers. Instead of it costing $1, it costs $0.80.

When someone makes a call to those numbers, they charge the cheaper rate, and then don't actually connect them to a real phone. Every time a call is made, they just get money.

The "S" is for "shared". They offer other organizations "for every call you generate to this range of numbers, we give you a cut". Let's call it 20 cents.

Now picture how many organizations are vulnerable to this. Every "call me back when a customer service rep is available". Every "call me to let me type in a code to verify I'm human". Every "leave a number our sales team can reach you at". They're all being attacked by these sorts of things constantly.

The cost to get a bot to trick the system into making a phone call is much smaller than the revenue generated. As long as that is true, the attacker wins.
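
To put rough numbers on that, here is a back-of-the-envelope sketch. The 20 cent payout comes from the example above; the per-call bot cost and the call volume are made-up figures purely for illustration. Amounts are in whole cents to avoid floating-point rounding.

```python
def attacker_profit_cents(calls: int, payout_cents: int, bot_cost_cents: int) -> int:
    """Net profit for the attacker: every triggered call earns the payout minus the bot cost."""
    return calls * (payout_cents - bot_cost_cents)


# Hypothetical scenario: 10,000 triggered calls, 20 cents paid out per call,
# 1 cent of bot and proxy cost per call.
profit = attacker_profit_cents(calls=10_000, payout_cents=20, bot_cost_cents=1)
```

As long as the payout per call stays above the cost of triggering one, the result is positive and scales linearly with volume.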


For consumers at least, it's the law in the EU. Every phone network has either a pricing document or a webpage where you can type a number and it tells you the price, and if you were charged something that isn't that price, I guess you could demand a refund, and escalate to a court if the refund wasn't forthcoming.

Note that these documents frequently list some numbers at ridiculously high prices (eg. €25 per minute), presumably to deter such fraud.


I think the problem with that is that these schemes rely on the phone network being ridiculously insecure. The scammers send a surprise bill to your telco, then they pass it along to you.

It wouldn't surprise me if the list only applies to consumer plans, or (as you say) high-balls stuff in certain countries.


I wonder what would happen if the telco said "we won't pay the bill, since we believe it is fraudulent". They won't "lose much" if some fraudulent telco from the third world cuts them off - the numbers don't work anyway. And what can the scammers do? Try to sue a US telecom? If they reveal their true names, then the embassy knows where to send the "operators" who will solve the case for good.


There is normally a long chain of intermediate telco operators between the fraudsters and the US operator. Some of those intermediate operators are large and have a lot of legit calls. Each operator adds their profit margin on and passes the bill to the next. So no operator has any incentive to police this stuff, because they all profit from it.


It does make me wonder why they don't just switch to pre-paid sim cards (or the contractually equivalent thing that doesn't involve cell phones).

Good luck collecting your scam cash from my personal line! The phone company doesn't have a mechanism to pass the cost on to me, so I'm guessing the telcos already figured out how to block this at the network level.

Has anyone tried calling some of the bad actor numbers from a burner phone?


There are definitely two kinds of bad actors when it comes to scraping/scripting things.

The automated drive-by ones that try to spam posts in every form or probe for WordPress exploits or what have you.

The ones where someone is specifically taking time out of their day to tailor their attack to your specific page.

The former is easy to deal with; the latter is far harder to sort out.


This problem would not exist if the price of a call were known before making the call. But we live in a world where "it's cheap, just use it, we will send the exact bill later" is the norm.


A calling prefix for "place call if it's free" would be great. Or even "place call if it's at or below this standard fare".


I hope to expand the protocol list once I can figure out Network Discovery on the WAN - https://github.com/zeromq/zyre/issues/701


It's like the electricity companies who can't preemptively check usage to cut before it explodes. Can't anything be done at the phone provider level to just... cut before the cost is incurred?

I mean if credit cards can do it...


Why are these systems making verification phone calls?


As they normally have a deal with the numbers they can successfully call for outpayments from the Telco.


two factor authentication


What happens if an IP, rightfully included in the list, gets assigned to me or, worse, to the server I just rented?

Or if someone added my phone number as a prank, a revenge or an attack on me?


Don't worry, the lost revenue won't be material to earnings.


classy


Additionally some VPNs route through legitimate devices.

The only way I’ve seen this somewhat work is to have a complex system that pulls apart the connection info, and then you use a combination of data science + threat intel + good ol’ reversing to decide whether something is malicious or not. Then you need multiple teams to run and tune these functions as attacks change.


Would you be using that IP to make SIP calls or route SIP traffic? The bad actors will only be valid for 7 days.
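
A minimal sketch of that kind of expiry, assuming each entry carries a last-seen timestamp. The `prune_expired` helper and the dictionary layout are hypothetical and not taken from the project's code; only the 7-day window comes from the comment above.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=7)  # window mentioned above; the data layout is an assumption


def prune_expired(last_seen: dict, now: datetime) -> dict:
    """Keep only entries whose last-seen timestamp is within MAX_AGE of `now`."""
    return {ip: seen for ip, seen in last_seen.items() if now - seen <= MAX_AGE}


now = datetime(2022, 2, 5, tzinfo=timezone.utc)
entries = {
    "203.0.113.7": now - timedelta(days=2),    # recent, kept
    "198.51.100.23": now - timedelta(days=9),  # stale, dropped
}
fresh = prune_expired(entries, now)
```

With an expiry like this, a reassigned IP drops off the list on its own after a week without new probes.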


What happens if a bad guy uploads your ip?


Exactly, sometimes these systems are purposely used to deny legitimate users. Sort of like OS account lockouts (too many failed attempts). What if their goal is to lock the users out?


What happens when someone uploads 4 billion IPs


I reckon he'll get blocked from contributing to the block-list


There's some room in the missing ~295 million IPs


What happens is that you find out why this is a terrible idea.


And someone gets sued for tortious interference and defamation.


They'd need to originate SIP traffic from your IP or try to spoof your IP. If they are doing that, they may be targeting different things not SIP related.


Could it be built so that if an org uploads an IP, only they can block it? Other companies can still see the IPs, but not be able to download them until a certain number of companies have uploaded it. Sort of like blockchain confirmations. Maybe IPs from countries with higher numbers of attacks would require fewer confirmations. The bad guy can't be in all the participating orgs. Another alternative is to allow all IPs to be downloaded, with data about the number of confirmations, so companies could decide on their own depending on the risk they're willing to take (might not work if most go for zero risk).
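
A small sketch of the confirmation idea, assuming a central tally of which organizations reported which IP. The class name, method names, and the default threshold of 3 are all hypothetical. The per-country thresholds are left out for brevity.

```python
from collections import defaultdict


class ConfirmedBlocklist:
    """An IP is visible to its own reporters at once, and to everyone else
    only after enough distinct organizations have reported it."""

    def __init__(self, min_confirmations: int = 3):
        self.min_confirmations = min_confirmations
        self._reporters = defaultdict(set)  # ip -> set of reporting orgs

    def report(self, org: str, ip: str) -> None:
        # A set means repeated reports from one org count only once.
        self._reporters[ip].add(org)

    def confirmations(self, ip: str) -> int:
        return len(self._reporters.get(ip, ()))

    def visible_to(self, org: str) -> set:
        """IPs this org may act on: its own reports plus fully confirmed ones."""
        return {
            ip
            for ip, orgs in self._reporters.items()
            if org in orgs or len(orgs) >= self.min_confirmations
        }
```

The alternative from the comment, publishing everything with its confirmation count, would amount to exposing `confirmations()` and letting each consumer pick its own threshold.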


Off-Topic question: Is there an existing static file that contains all the attacking IPs from a 3 or 6 month period that spawned the creation of this system? I would like to compare it to what is in firehol's repo. [1] I am not trying to downplay anything; rather, I am hoping to see the delta of the outliers. In the past I logged the people querying my DNS servers for SIP names but have since switched to NSD, which does not have query logging.

[Edit] I am running tcpdump right now and it didn't even take 15 seconds to start seeing queries for sip SRV records.

[1] - https://github.com/firehol/blocklist-ipsets.git


How are these verified?


2fa with a text to your cellphone


"Please use your phone to verify that you actually are a bad actor"


Really cool project. To implement it though, one really should be careful to not just load the IPs willy-nilly into one’s ACLs…

Run some tests on the list before using it, to make sure your own assets aren’t mentioned in there. If you don’t, you end up creating a DoS vulnerability in your system :D
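
A minimal sketch of such a pre-flight check using Python's standard `ipaddress` module. The function name and the idea of passing your own allocations as a list of CIDR strings are assumptions; the addresses in any real run would come from your own inventory.

```python
import ipaddress


def entries_overlapping_own_networks(blocklist, own_networks):
    """Return blocklist entries (single IPs or CIDR ranges) that overlap our own allocations."""
    own = [ipaddress.ip_network(n) for n in own_networks]
    hits = []
    for entry in blocklist:
        # strict=False accepts a host address with a prefix, e.g. "203.0.113.5/24".
        net = ipaddress.ip_network(entry, strict=False)
        # Compare only networks of the same IP version.
        if any(net.version == o.version and net.overlaps(o) for o in own):
            hits.append(entry)
    return hits
```

If this returns anything, those entries deserve a look before the list goes anywhere near an ACL.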


Yes, you'd want to compare against your own IP allocations as a network operator. The real value is the test phone numbers being tried and the probe patterns, so you can get an early warning of a customer's or user's system being compromised before anything happens.


Any relation with ShouldIAnswer (https://www.shouldianswer.com/), which is also a peer reviewed/maintained DB of bad phone numbers?

ShouldIAnswer is probably more oriented toward end users, but maybe both projects can still benefit from each other?


Thanks. I'll take a look. No relation. My goal is to put the raw data in the hands of technical end users or network operators.


How are the caller IDs (phone numbers) vetted to verify that they're actually used in robocall scams instead of legitimate EBR calls that some grumpy customer didn't appreciate?


In the world of IPv6, how useful is keeping a list of IPs any more?

I suppose if you see a lot of requests coming out of a network you can just block IPs based on the network portion of the address.
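
A sketch of blocking on the network portion, using the standard `ipaddress` module. The /64 prefix length and the threshold of 3 are assumptions chosen for illustration, and the function name is hypothetical; providers hand out different prefix sizes, so the right length depends on the network.

```python
import ipaddress
from collections import Counter


def busy_ipv6_prefixes(addresses, prefix_len=64, threshold=3):
    """Group IPv6 source addresses by prefix and return prefixes seen at least `threshold` times."""
    counts = Counter()
    for a in addresses:
        ip = ipaddress.ip_address(a)
        if ip.version != 6:
            continue  # IPv4 addresses are ignored in this sketch
        # strict=False masks off the host bits, leaving just the network.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[net] += 1
    return [net for net, n in counts.items() if n >= threshold]
```

A blocklist entry then becomes a prefix rather than a single address, which keeps the list small even when an attacker rotates through addresses inside one allocation.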


ipv6 block-lists are not a thing, hence all big mail-senders use v4 only


Note that the checkboxes in the feature list (along with the source code) say that the P2P sharing/replication of the bad actors database is not implemented yet.


I'm just working on that at the moment. LAN things are OK, but still researching WAN network discovery. Some chat here https://github.com/zeromq/zyre/issues/701#issuecomment-94780...


Having a good way to let legitimate users appeal the blocklist seems to be the hard part and why Spamhaus for example has been successful.


Who watches the watchmen?


i often wonder if we could have actually built robust and reliable systems with the time it took to curate all those block-lists


Can this be legal, if we have GDPR? A regulator can argue that IP addresses and phone numbers are hackers' personal data, and it's illegal to share them.


IP addresses can only be personal data if associated with a person. No sane court would rule against you for having random IPs in a blocklist.txt file.


> No sane court would rule against you for having random IPs in a blocklist.txt file.

challenge accepted


Where can I send my GDPR request if I find out that my IP address or phone number is in there? Both are personally identifiable information.


You would have to ask the companies that you called how they shared & used your information. You probably can't object to the data collection as it's based on legitimate interest (fraud detection) and not consent.


Why would your phone number be in the list?


IIRC GDPR only applies if you are operating a business or organization.

Depending on how this project is structured, it wouldn’t have to process your GDPR request.

Edit: I was only partially correct. According to Article 2(2)(c), the regulation (GDPR) does not apply to the processing of personal data “by a natural person in the course of a purely personal or household activity”.

So if this project can be viewed as a personal project (which I suppose it could, in theory…), then it wouldn’t have to comply.


I am 90% sure that this is wrong.


No, GDPR applies to everyone. The government is proactively only enforcing GDPR vs organizations, private persons need to sue to get GDPR enforcement against each other. Also - should companies use this list, their usage would of course need to be GDPR compliant.


No, it does not. Article 2 point 2 covers the exact scenarios in which it does not.


It indeed doesn't cover data processed by "a natural person in the course of a purely personal or household activity". But arguably, as soon as you upload stuff to github.com you already fall outside of that narrow definition.


You can definitely be fined as an individual for GDPR non-compliance.

https://www.enforcementtracker.com/ETid-1022


I see how what I wrote can be read as stating that an individual can’t be fined.

Please note that that is not what I meant. I’ve updated my comment with a bit more info.


very very cool I love it. Highly needed.


From the Github page, "I started this because I wanted to do C network programming"

I think this is a poor choice (from a security perspective). It should be written in Go or Rust. C programs (exposed to the network) are dangerous even when written by experienced developers.

Really, in 2022, everything Internet facing should be written in a memory safe language, running as a normal user (no root) and have a strong MAC policy applied. Anything else is too risky.


Damn, do I have to shut down my WireGuard VPN now? Memsafety is not everything. IMHO, everyone should write security relevant code in Ada SPARK. There are reasons not to do it, I guess. At least now one can rewrite it in Rust and post it on HN...


Does Wireguard not do a privilege downgrade? That seems important. I know it needs some additional privileges, and therefore (right now?) will not run in a container (which is annoying), but after it has set up an interface, why doesn't it back off its privileges? I really want to try Wireguard but kept getting hung up on stuff like this.


Neither Rust nor Go has a formally verified compiler, so they cannot be used for security critical programming.


I had a lot of analysis paralysis over just this...in the end I chose what I chose for the reasons listed.



