To give a good example of why something like this matters...
I spent a bit more than a year as the most senior dev on the team that owned the sign up page for a cloud computing company. We faced bot attacks constantly. They had a variety of reasons to attack, but International revenue sharing fraud (IRSF) was a major one. In short, bots would convince our sign up flow to make verification phone calls to numbers that charge a lot of money but don't actually exist. (Note: whatever "why don't you just" you're about to reply with, we had an entire team of people doing nothing but trying to stop this for months and years- we tried that and it didn't succeed, or there were business reasons it was not feasible.)
I switched companies. In my new role, I happened to be on a team in the same org as their sign up page. Chatting with them, I learned that they were not facing similar attacks, but identical ones. The countries involved, the patterns of the attack, everything- this was the same attacker, going up against two completely unrelated companies.
I don't know if this system is the right implementation- I need to do a deeper dive on this- but I do know that something like this is needed so that companies can coordinate their defenses.
I've worked on a large federated IP reputation system. It sounds like a great idea, but does not work in practice.
The main problem is that it's just too hard to map different kinds of abusive/non-abusive actions to a shared scale. The model here seems to be basically a boolean OR: if anyone flagged an IP as bad, it's marked as bad in the data set. Let's say that you're running a shop and detect a bunch of fraudulent transactions for expensive items, and add the IPs to the list. What can somebody else using this list for, say, spam-filtering comments to a web forum do with your data points in isolation? Not much, because the action you're protecting is rare, high impact, and unlikely to have false positives, so the threshold where you'd mark the IP as abusive would be very different than for the spam-filter use case.
At a minimum you'd need all the clients of the IP reputation signal to also share the positive reputation signals, not just the negative, and to export some form of volumes or ratios of good and bad traffic. But it'll still be really hard to combine those reports into a single verdict that's generally useful.
A secondary problem is that different kinds of abuse don't even correlate particularly well. Let's say that you've got an IP address sending SMTP spam; the odds are that this IP will not be doing ssh credential stuffing, credit card fraud, mass-scraping, traffic pumping, warez distribution, or DDOS attacks.
The classic SMTP IP blocklists work only because everyone using them is using them for the same purpose, and is as such on a shared scale, and there is a high likelihood of the same abusive actor attacking multiple different organizations. Your example would fit that as well, and doing reputation for that one domain would actually be tractable, unlike federated cross-domain IP reputation.
It is arrogant to assume that if you haven't found a solution then nobody will. (Trust experts when they say something can be done, ignore them if they say it can't be done, and just do it.)
Nobody is assuming that. The arrogance is when people with no knowledge of the people, systems, or problem at hand jump in as though they’re experts. It happens all the time here (hell I’m guilty myself)
They do usually exist. They answer immediately and then play the "it is ringing" sound forever.
(Some SIP providers allow you to disallow expensive numbers to be called and/or state the minute costs before the call starts (giving you time to hang up.))
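A rough sketch of what such a pre-call check could look like on the caller's side. The rate table, the prefixes, and the `max_rate` cap are all hypothetical placeholders; real values would have to come from your provider's rate sheet, and this is not any particular provider's API.

```python
from decimal import Decimal

# Hypothetical per-minute rates keyed by dialing prefix (placeholder values).
RATES = {
    "+1": Decimal("0.01"),
    "+44": Decimal("0.02"),
    "+4470": Decimal("0.60"),
}


def rate_for(number, rates=RATES):
    """Return the rate of the longest matching prefix, or None if unknown."""
    for length in range(len(number), 0, -1):
        rate = rates.get(number[:length])
        if rate is not None:
            return rate
    return None


def may_call(number, max_rate=Decimal("0.10")):
    """Allow the call only if the rate is known and at or below the cap."""
    rate = rate_for(number)
    return rate is not None and rate <= max_rate


print(may_call("+442071234567"))  # True: matches "+44", under the cap
print(may_call("+447012345678"))  # False: matches the pricier "+4470" prefix
print(may_call("+999123"))        # False: unknown prefix, refuse by default
```

Refusing unknown prefixes by default is a design choice in this sketch. It only helps if the rate you look up is the rate you are actually billed, which is the part the rest of this thread is arguing about.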
> They answer immediately and then play the "it is ringing" sound forever.
Except that clearly is a number which exists? One that answers. GP isn't fighting 1980s phreakers with a blue box; they are using Twilio, SignalWire, etc. to make automated verification calls/SMS for a very large company, as far as I understood it. The call being answered is baked entirely into your programmatic logic.
Unless I'm missing some niche telco semantics or lingo here? Or they are rolling their own?
To embrace the "don't question it, you don't understand" narrative: I'm actually quite stunned someone would let this go on for such a long time. If this is corruption, crime, or some other sort of telecom fraud aided and abetted by foreign governments/companies, why not just turn off the tap and ban the country code if it was such a cost sink? Why not use some of the many alternative (and significantly more secure) verification methods for your customers?
Since you say you tried hard to stop this, I guess you are implying that it's not possible to tell in advance what it costs to call a given number? How do telcos get away with that?
As it was explained to me (anyone feel free to correct me if I'm wrong), a shady/criminal organization takes control of a piece of the phone network. They pick a range of phone numbers for a country that is usually expensive to call, maybe a range of numbers that doesn't actually exist in that country, and they use the phone network to "advertise" a slightly cheaper route to connect phone calls to those numbers. Instead of it costing $1, it costs $0.80.
When someone makes a call to those numbers, they charge the cheaper rate, and then don't actually connect them to a real phone. Every time a call is made, they just get money.
The "S" is for "shared". They offer other organizations "for every call you generate to this range of numbers, we give you a cut". Let's call it 20 cents.
Now picture how many organizations are vulnerable to this. Every "call me back when a customer service rep is available". Every "call me to let me type in a code to verify I'm human". Every "leave a number our sales team can reach you at". They're all being attacked by these sorts of things constantly.
The cost to get a bot to trick the system into making a phone call is much smaller than the revenue generated. As long as that is true, the attacker wins.
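As back-of-the-envelope arithmetic, using the $0.80 per call and 20-cent cut from above. The bot cost per triggered call and the call volume are pure assumptions on my part, picked only to show the shape of the math.

```python
from decimal import Decimal


def attacker_profit(calls, cut_per_call, cost_per_call):
    """Attacker's net: revenue-share cut minus what each triggered call costs them."""
    return calls * (cut_per_call - cost_per_call)


def victim_cost(calls, price_per_call):
    """What the company making the verification calls ends up paying."""
    return calls * price_per_call


calls = 10_000                     # assumed volume
cut = Decimal("0.20")              # the "20 cents" from the comment above
price = Decimal("0.80")            # the "cheaper route" price from the comment above
bot_cost = Decimal("0.02")         # assumption: cost to trick the flag once

print(attacker_profit(calls, cut, bot_cost))  # 1800.00
print(victim_cost(calls, price))              # 8000.00
```

The attack stays profitable for any bot cost below the cut, so defenses that only make each attempt a bit more expensive have to push that cost past 20 cents to matter.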
For consumers at least, it's the law in the EU. Every phone network has either a pricing document or a webpage where you can type a number and it tells you the price, and if you were charged something that isn't that price, I guess you could demand a refund, and escalate to a court if the refund wasn't forthcoming.
Note that these documents frequently list some numbers at ridiculously high prices (eg. €25 per minute), presumably to deter such fraud.
I think the problem with that is that these schemes rely on the phone network being ridiculously insecure. The scammers send a surprise bill to your telco, then they pass it along to you.
It wouldn't surprise me if the list only applies to consumer plans, or (as you say) high-balls stuff in certain countries.
I wonder what would happen if the telco said "we won't pay the bill, since we believe it is fraudulent".
They won't "lose much" if some fraudulent telco from a third-world country cuts them off - the numbers don't work anyway.
And what can the scammers do? Try to sue a US telecom? If they reveal their true names, then the embassy knows where to send the "operators" who will solve the case for good.
There is normally a long chain of intermediate telco operators between the fraudsters and the US operator. Some of those intermediate operators are large and have a lot of legit calls. Each operator adds their profit margin on and passes the bill to the next. So no operator has any incentive to police this stuff, because they all profit from it.
It does make me wonder why they don't just switch to pre-paid sim cards (or the contractually equivalent thing that doesn't involve cell phones).
Good luck collecting your scam cash from my personal line! The phone company doesn't have a mechanism to pass the cost on to me, so I'm guessing the telcos have already figured out how to block this at the network level.
Has anyone tried calling some of the bad actor numbers from a burner phone?
This problem would not exist if the price of a call were known before making the call. But we live in a world where "it's cheap, just use it, we will send the exact bill later" is the norm.
It's like the electricity companies who can't preemptively check usage and cut off before it explodes. Can't anything be done at the phone provider level to just... cut off before the cost is incurred?
Additionally some VPNs route through legitimate devices.
The only way I’ve seen this somewhat work is to have a complex system that pulls apart the connection info, and then you use a combination of data science + threat intel + good ol’ reversing to decide whether something is malicious or not. Then you need multiple teams to run and tune these functions as attacks change.
Exactly, sometimes these systems are purposely used to deny legitimate users. Sort of like OS account lockouts (too many failed attempts). What if their goal is to lock the users out?
They'd need to originate SIP traffic from your IP or try to spoof your IP. If they are doing that, they may be targeting different things not SIP related.
Could it be built so that if an org uploads an IP, only they can block it? Other companies can still see the IPs, but not download them until a certain number of companies have uploaded them. Sort of like blockchain confirmations. Maybe IPs from countries with higher numbers of attacks would require fewer confirmations. The bad guy can't be in all the participating orgs. Another alternative is to allow all IPs to be downloaded, with data about the number of confirmations, so companies could decide on their own depending on the risk they're willing to take (might not work if most go for zero risk).
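A minimal sketch of that confirmation idea, just to show the logic. The `SharedBlocklist` class, the org names, and the default threshold of 3 are all hypothetical; nothing here reflects how the actual project stores or shares data.

```python
from collections import defaultdict


class SharedBlocklist:
    """An IP is blocked for its reporters at once, for others after N confirmations."""

    def __init__(self, min_confirmations=3):
        self.min_confirmations = min_confirmations
        self._reporters = defaultdict(set)  # ip -> set of org names that reported it

    def report(self, org, ip):
        # A set means one org reporting the same IP twice still counts once.
        self._reporters[ip].add(org)

    def confirmations(self, ip):
        return len(self._reporters.get(ip, ()))

    def blocked_for(self, org, ip):
        reporters = self._reporters.get(ip, set())
        return org in reporters or len(reporters) >= self.min_confirmations


bl = SharedBlocklist(min_confirmations=3)
bl.report("org-a", "203.0.113.9")
bl.report("org-a", "203.0.113.9")   # duplicate report, no extra confirmation
bl.report("org-b", "203.0.113.9")
print(bl.blocked_for("org-a", "203.0.113.9"))  # True: org-a reported it
print(bl.blocked_for("org-z", "203.0.113.9"))  # False: only 2 confirmations
bl.report("org-c", "203.0.113.9")
print(bl.blocked_for("org-z", "203.0.113.9"))  # True: threshold reached
```

The second alternative (let everyone download everything, with counts) is just exposing `confirmations()` and letting each consumer pick their own threshold. The per-country variant would make `min_confirmations` a lookup instead of a constant.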
Off-topic question: Is there an existing static file that contains all the attacking IPs from a 3 or 6 month period that spawned the creation of this system? I would like to compare it to what is in firehol's repo. [1] I am not trying to downplay anything, rather hoping to see the delta of the outliers. In the past I logged the people querying my DNS servers for SIP names but have since switched to NSD, which does not have query logging.
[Edit] I am running tcpdump right now and it didn't even take 15 seconds to start seeing queries for sip SRV records.
Really cool project. To implement it though, one really should be careful not to just load the IPs willy-nilly into one's ACLs…
Run some tests on the list before using it, to make sure your own assets aren't mentioned in there. If you don't, you end up creating a DoS vulnerability in your system :D
Yes, you'd want to compare against your own IP allocations as a network operator. The real value is the test phone numbers being tried and probe patterns, so you can get an early warning of a customer's or user's system being compromised before anything happens.
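Something like the following would do as a sanity check before loading a list into an ACL. It only uses Python's standard `ipaddress` module; the function name and the sample addresses (documentation ranges) are made up for the example.

```python
import ipaddress


def split_blocklist(entries, own_networks):
    """Separate blocklist entries that overlap our own allocations from the rest."""
    own = [ipaddress.ip_network(n) for n in own_networks]
    safe, conflicts = [], []
    for entry in entries:
        # strict=False accepts both single addresses and CIDR ranges with host bits set.
        net = ipaddress.ip_network(entry, strict=False)
        if any(net.version == o.version and net.overlaps(o) for o in own):
            conflicts.append(entry)
        else:
            safe.append(entry)
    return safe, conflicts


safe, conflicts = split_blocklist(
    ["198.51.100.7", "192.0.2.15", "2001:db8::1"],
    own_networks=["192.0.2.0/24"],
)
print(safe)       # ['198.51.100.7', '2001:db8::1']
print(conflicts)  # ['192.0.2.15']
```

Anything that lands in `conflicts` deserves a human look rather than a silent drop: it is either someone trying to get you to block yourself, or one of your own machines really is compromised.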
How are the caller IDs (phone numbers) vetted to verify that they're actually used in robocall scams instead of legitimate EBR calls that some grumpy customer didn't appreciate?
Note that the checkboxes in the feature list (along with the source code) say that the P2P sharing/replication of the bad actors database is not implemented yet.
Can this be legal, if we have GDPR? A regulator can argue that IP addresses and phone numbers are hackers' personal data, and it's illegal to share them.
IP addresses can only be personal data if associated with a person. No sane court would rule against you for having random IPs in a blocklist.txt file.
You would have to ask the companies that you called how they shared and used your information. You probably can't object to the data collection, as it's based on legitimate interest (fraud detection) and not consent.
IIRC GDPR only applies if you are operating a business or organization.
Depending on how this project is structured, it wouldn’t have to process your GDPR request.
Edit: I was only partially correct. According to Article 2(2)(c), the regulation (GDPR) does not apply to the processing of personal data “by a natural person in the course of a purely personal or household activity”.
So if this project can be viewed as a personal project (which I suppose it could, in theory…), then it wouldn’t have to comply.
No, GDPR applies to everyone. The government is proactively only enforcing GDPR vs organizations, private persons need to sue to get GDPR enforcement against each other.
Also - should companies use this list, their usage would of course need to be GDPR compliant.
It indeed doesn't cover data processed by "a natural person in the course of a purely personal or household activity". But arguably, as soon as you upload stuff to github.com you already fall outside of that narrow definition.
From the Github page, "I started this because I wanted to do C network programming"
I think this is a poor choice (from a security perspective). It should be written in Go or Rust. C programs (exposed to the network) are dangerous even when written by experienced developers.
Really, in 2022, everything Internet facing should be written in a memory safe language, running as a normal user (no root) and have a strong MAC policy applied. Anything else is too risky.
Damn, do I have to shut down my WireGuard VPN now? Memory safety is not everything. IMHO, everyone should write security-relevant code in Ada SPARK. There are reasons not to do it, I guess. At least now one can rewrite it in Rust and post it on HN...
Does Wireguard not do a privilege downgrade? That seems important. I know it needs some additional privileges, and therefore (right now?) will not run in a container (which is annoying), but after it has set up an interface, why doesn't it back off its privileges? I really want to try Wireguard but kept getting hung up on stuff like this.