Ask HN: How do large sites manage outbound emails/queueing
45 points by maverhick on Aug 18, 2010 | 23 comments
How do large sites like fb/myspace/amazon/<insert large email sender> manage to send millions of emails a day? What kind of infrastructure is used? How are the IPs warmed, and how are the messages queued? How do they manage connections?

Any interesting reads on this topic?



The guys at MailChimp released a nice guide with an overview of what you need to know when running a high capacity email infrastructure: http://resources.mailchimp.com/email-delivery-for-it-profess...


Wow - reading through the Mail Deliverability one and this is really great info. Surprised how much MailChimp spills the beans here and gives lots of insider secrets of their business. Thanks for the link!


I know some guys who used to send 100 M emails every weekend with a qmail cluster. It was sorta opt-in, sorta. Let's just say that they're not doing it anymore.

I don't think there's a big technical issue in scaling SMTP sends; you can have N machines doing it in parallel and it will work (almost) N times faster than one machine. There IS the question of minimizing your cost per unit email, and the real spammers address that by building SMTP senders that don't comply with the standard.
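A minimal sketch of the "N machines in parallel" idea: split the recipient list into N shards and hand one shard to each sender. The function name and the round-robin split are assumptions made for illustration, not anything the poster described.

```python
def partition(recipients, n_workers):
    """Split a recipient list into n_workers roughly equal shards (round-robin)."""
    shards = [[] for _ in range(n_workers)]
    for i, addr in enumerate(recipients):
        # Recipient i goes to worker i mod n_workers, so shard sizes differ by at most one.
        shards[i % n_workers].append(addr)
    return shards


# Hypothetical usage: ten recipients spread over three sending machines.
recipients = [f"user{i}@example.com" for i in range(10)]
shards = partition(recipients, 3)
```

Each shard can then be sent independently, which is why throughput grows almost linearly with the number of machines.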

What does bug me about email is deliverability. It's pretty much impossible to send e-mail to AOL members, for instance, unless you (1) pay ice to AOL, or (2) are big enough that AOL is afraid you'll sue them. AOL has shrunk a lot lately, so this isn't as big a concern as it was years ago.

It's not hard to get burned by other organizations as well. For instance, I sent an email shot on behalf of a campus organization at a university from off campus... an opt-in list of 2500 subscribers, really just chicken feed, but only 1999 went through -- after 2000 connection attempts, they firewalled my IP address, and for all I know that address is still firewalled today. I knew the guy who runs email for that school by name, and he refused to talk about the whole affair... That's what you're really up against.


I was the operations manager for Smartgroups.com, an egroups (yahoogroups) clone.

We used two small Sun E250s to send upwards of a million emails a day using qmail and a tweaked version of the Solaris TCP settings to close connections as quickly as possible. These two servers also handled the bounces as well as the usual spam attempts. If I was doing something similar again I'd still use the Solaris/qmail combo as it was very reliable.

Our biggest problem was with people marking us as spammers even though they had to sign up to groups to use the service, but that's a constant headache in this game. We also had some shell scripts that culled SMTP connections (either incoming or outgoing) that had been held open too long, as that helped us deal with DoS attacks.

The site also throttled SMTP traffic to give HTTP traffic priority if required, as that's OK for SMTP with its store-and-forward architecture but not for HTTP.


I'd second that. Things may have changed recently, but historically, Linux hasn't been a great platform for mail servers because of the file system semantics. (Linux systems can get very ~weird~ under heavy email loads, kinda the way that Windows systems get ~weird~ in everyday desktop operations.)

Solaris is very good, and the x-BSDs often do better if you're a free software fanatic.


I don't understand. Can you elaborate on what you mean by 'weird' and why the BSD's do better?


It has to do with the way fsync works and metadata updates on directories. Linux does a lot more work updating directories, which hurts performance on mail servers in general but really hurts with qmail.

Practically you might find your CPU utilization is 2%, but the load average is 50 and you do 'ls' on a directory that has 20 files in it and it takes 30 seconds.


Huh, thanks! Now I finally know why the hell that happened.


I helped design and set up 2 large SMTP infrastructures for two different European mobile carriers. Both carriers had more than 1 million customers and processed close to 100 million messages a day. It was about 7 years ago but I doubt things have changed that much.

We made heavy use of Mirapoint. But we also made use of qmail and postfix in different places. At the high end of scaling, SMTP routing starts becoming similar to routing IP traffic. You're just not going to beat a dedicated device like Mirapoint with some Unix box you make from scratch. You need dedicated and differentiated mail routers for inbound, outbound and mail storage. You monitor their usage and when they get overloaded you just add more. Since everything is load balanced you can scale close to linearly. We used a SAN for storage. Local storage just doesn't work.

Mail hits an inbound SMTP router and that router does an LDAP lookup to find which storage box actually handles that account. Then it forwards it. That's all it does and it does it fast. Each storage box handles mail for the accounts it stores and IMAP/POP3 access for those accounts as well. The outbound routers just take mail from the storage routers and spool it for outbound.
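The inbound routing step described above can be sketched roughly as follows. This is an illustration only: a plain dictionary stands in for the LDAP directory, and the addresses, host names and function name are all hypothetical.

```python
# Hypothetical directory mapping an account to its storage box.
# In the setup described above this lookup was done against LDAP.
DIRECTORY = {
    "alice@example.com": "store-01",
    "bob@example.com": "store-02",
}


def route_inbound(recipient, directory=DIRECTORY):
    """Return the storage host for a recipient, or None if the account is unknown."""
    # Assumption: addresses are matched case-insensitively, as most providers do.
    return directory.get(recipient.lower())
```

The router does nothing but this lookup and a forward, which is what keeps it fast and easy to replicate behind a load balancer.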

I could go on but I think you get the idea. The main idea is to put everything you can behind a load balancer so that you can scale it as linearly as possible.


Local storage just doesn't work.

Can you explain why?


You want to give each user a fixed amount of storage. But most users don't use anywhere near their total. So you oversubscribe to save money by pooling everyone's storage in one massive SAN.

I imagine Gmail does something similar. I bet if every Gmail user suddenly started using 100% of their email storage Google couldn't cope. But Google knows this won't happen.

Oversubscription is how you make money in the end. Getting better at figuring out exactly how much you need when you need it directly affects your margins.
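To put rough numbers on the oversubscription argument, here is a back-of-the-envelope sketch. The user count, quota, 10% average utilization and 20% headroom are made-up assumptions, not figures from the carriers mentioned above.

```python
def provisioned_storage_gb(users, quota_gb, avg_utilization, headroom=0.2):
    """Storage to buy when users on average fill only a fraction of their quota."""
    expected = users * quota_gb * avg_utilization  # what users actually store
    return expected * (1 + headroom)               # plus a safety margin


# Assumed figures: 1,000,000 users, 1 GB quota each, 10% average use.
promised_gb = 1_000_000 * 1
needed_gb = provisioned_storage_gb(1_000_000, 1, 0.10)
# Under these assumptions you promise 1,000,000 GB but buy only about 120,000 GB.
```

The gap between what is promised and what is bought is the saving, and it only works when the storage is pooled rather than split into fixed per-server disks.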


I suspect it's two things: (1) the need for shared resources (users shouldn't have to connect to server71231.example.com, they should be connecting to mail.example.com), and (2) performance (via fiber link or something).


I'm starting my new job as Lead Architect at Experian Cheetahmail at the end of the month; I understand their system can send billions of emails in a month. I'm looking forward to finding out how they do that. It sounds very impressive, even before dealing with spam filtering.


I worked for a company that sends billions of messages per month. Their largest list is over 1,500,000 recipients, and delivery is tracked. Since the problem is trivially parallel, I just wrote a tiny jruby app that drove postfix, and multiplied the rig as needed.


We use a similar approach, though not sending emails at your scale. The issue then becomes that of deliverability. Emails will be sent, but will not reach


Maybe you should ask how http://sendgrid.com/ do it. Probably like terra_t said, with tons of SMTP senders in parallel.


Adding: Just noticed that some of the email servers are responding back saying they will accept a max of 10 emails per connection.

I am sure every email receiving server has its own darn rules, so how do large cos manage sending emails to all such services without getting blocked, while adhering to the random rules of each receiving server?
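One simple way to respect a "max 10 emails per connection" rule is to cut the outgoing queue for a server into batches and open a fresh connection per batch. A minimal sketch, with a hypothetical helper name and the limit of 10 taken from the comment above:

```python
def batches(messages, max_per_connection=10):
    """Yield lists of at most max_per_connection messages, one list per connection."""
    for start in range(0, len(messages), max_per_connection):
        yield messages[start:start + max_per_connection]


# Hypothetical queue of 25 messages bound for the same receiving server.
queue = [f"message-{i}" for i in range(25)]
chunks = list(batches(queue))
```

The sender would then open one SMTP connection per chunk, send its messages, and close the connection before moving on. The real limit differs per receiving server, so it would have to be configured per destination.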


I'm surprised this is the first time anyone has mentioned this in the thread. Throttling is incredibly important. As Joe Blow web app, you can't just connect to Google/Yahoo/etc and start cramming 10,000 emails a second down their throat. They'll drop the connection, and at some point, you won't be able to connect at all.

There's a size range between the little guy, who has no need for large, scalable email delivery, and the huge players who don't get shut out because of their name. For these people, it involves a lot of footwork. You have to identify where the bulk of your email is headed and establish relationships with those entities. This means talking to them and understanding their limits.

Or you simply outsource part of the job (e.g., Return Path) or the whole thing (e.g., SendGrid).
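The throttling described in this comment is commonly implemented as a token bucket per destination domain. The sketch below is one possible version, not how any particular provider or sender does it. The class name, rates and capacities are assumptions.

```python
import time


class TokenBucket:
    """Allow at most `rate` sends per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def try_send(self, now=None):
        """Consume one token and return True if a send is allowed right now."""
        now = time.monotonic() if now is None else now
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per destination domain; these limits are made-up placeholders.
buckets = {"example.com": TokenBucket(rate=5, capacity=10)}
```

When `try_send` returns False the message stays in the queue and is retried later, which SMTP's store-and-forward model tolerates well.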


Are you talking about the actual SMTP server part, or how they would pass on those messages to the SMTP servers?


I, too, would be interested to know whether sites of that size tend to build their own email dbs in-house or find that anything available off-the-shelf is adequate (or make their own modifications to the latter).


I don't know what the specific implementation of each company is, but having a separate program for email delivery makes the most sense compared to using qmail or any other off-the-shelf MTA. There is most likely an MTA that is used to receive mail, but a separate program can be used to deliver mail from the same machine. There is a JavaMail API, for example, which has been open sourced and can be used to deliver email from any Java program. Threading is an important issue here, since it is possible to have as many open connections as you want to deliver mail, but you would not want all of your open threads to be attempting to send emails to one domain all at once. Reputation determines how many emails get accepted or are bounced.
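The point about not letting every thread hit one domain at once can be sketched with a per-domain semaphore. This is written in Python rather than with the JavaMail API mentioned above. The limit of 2, the function names and the `send` callback are assumptions for illustration.

```python
import threading

MAX_PER_DOMAIN = 2  # assumed cap, not taken from any provider's policy

_lock = threading.Lock()
_slots = {}


def domain_slot(address):
    """Return the semaphore that caps concurrent deliveries to the address's domain."""
    domain = address.rsplit("@", 1)[-1].lower()
    with _lock:
        # Create the semaphore lazily, the first time a domain is seen.
        if domain not in _slots:
            _slots[domain] = threading.BoundedSemaphore(MAX_PER_DOMAIN)
        return _slots[domain]


def deliver(address, send):
    """Run send(address) while holding one of the domain's delivery slots."""
    with domain_slot(address):
        return send(address)
```

Worker threads that call `deliver` block only when their own destination domain is already at its cap, so a slow or strict domain does not stall mail to everyone else.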


They use companies like Silverpop


Maybe they use the technology behind SilverPop, which is PowerMTA.



