Great! Now if only they would allow HEAD requests and support any of the various headers that would let me get a 304.
The reason I ask is that every time Safari crashes, if I have 50 HN tabs open (as I often do by Friday [1]), I get IP banned from HN: Safari does a GET request on each page, but it can't pass any of the headers necessary to get back a 304, because HN doesn't support them.
[1] The way I consume HN is I load up HN once or twice a day, open up all the interesting links and their comment pages in new tabs, and then go back to work. Then when I have some downtime (most of which is on the weekend) I read through all the open tabs.
> The way I consume HN is I load up HN once or twice a day, open up all the interesting links and their comment pages in new tabs, and then go back to work.
By that time most pages are probably updated, so you wouldn’t get any 304s anyway.
Since a recent update, Firefox's default behavior is to reload restored tabs only on demand. This avoids that issue.
> By that time most pages are probably updated, so you wouldn’t get any 304s anyway.
Actually, quite the opposite. HN comment pages seem to peter out at about 12 hours in most cases, and after that you'll rarely see a new comment, so really the content stops changing after about a day.
The behaviour has been built in and accessible since Firefox 4 (initially through about:config, later through the Preferences), but has only been enabled by default later on. I've personally been using it since it appeared, and it's great if you literally have hundreds of tabs open, or use tabs as temporary bookmarks.
That behavior shouldn't get you banned, but some of your requests will get dropped if you make too many too quickly.
You will see 304s for static resources, like images and CSS. Efficiently handling HEAD requests for dynamically generated pages is actually painful, and many frameworks support it by handling the request like a GET and just omitting the body. In our case, most pages contain fnids and will never be the same twice anyway.
Well, it drops every request I make for about 12 hours, so that's equivalent to a ban. :)
And I'm aware of how painful it is to handle HEAD requests for a dynamic page, having had to deal with it on reddit. There are some easy hacks to make it work pretty well. One is to just keep a cache of the last update time of each page, and then use that cache when you get a HEAD request.
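A minimal sketch of that last-update cache, assuming an in-memory dict and hypothetical names (`last_modified`, `touch`, `respond`); this is not reddit's or HN's actual code. The point is that a HEAD or conditional GET can be answered from the cache alone, without rendering the page:

```python
from email.utils import formatdate, parsedate_to_datetime

# Hypothetical in-memory cache: page id -> Unix time of the last change.
last_modified = {}

def touch(page_id, now):
    """Record that a page changed (for example, a new comment was posted)."""
    last_modified[page_id] = int(now)

def respond(method, page_id, if_modified_since=None,
            render=lambda pid: "<html>...</html>"):
    """Return (status, headers, body); render only when a body is needed."""
    changed_at = last_modified.get(page_id)
    if changed_at is None:
        return 404, {}, ""
    headers = {"Last-Modified": formatdate(changed_at, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # Unparseable date: fall through to a full response.
        if since is not None and changed_at <= since:
            return 304, headers, ""
    if method == "HEAD":
        return 200, headers, ""  # Headers only; the page is never rendered.
    return 200, headers, render(page_id)
```

A browser restoring 50 tabs would then send `If-Modified-Since` with the stamp it saw last time and get back 50 cheap 304s for pages that have gone quiet.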
I'm not sure what an fnid is, but if you have something on the page that changes so quickly that the hash changes when the content doesn't, then I'd contend you're doing something wrong.
Their specific implementation of continuations means that each time a page is generated, each user-specific/authenticated link off of the page (Flag, Voting, Delete) is evaluated as a possible continuation of the user's state and the user-and-action specific hash "fnid" (Function ID) is cached and interpolated into the HTML.
Hence, the hash absolutely changes for every logged-in user every page load.
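A rough sketch of the idea, not HN's actual implementation (which is written in Arc); the names `continuations`, `register`, `render_item`, and `follow` are hypothetical, and discarding an fnid after one use is an assumption made for illustration. Each render stores a closure server-side under a fresh random id and writes that id into the link, which is why two renders of the same page never match:

```python
import secrets

# Hypothetical server-side table: fnid -> closure to run when the link is followed.
continuations = {}

def register(action):
    """Store a closure and return the random id that will appear in the HTML."""
    fnid = secrets.token_urlsafe(8)
    continuations[fnid] = action
    return fnid

def render_item(user, item_id, votes):
    """Render an upvote link whose target is a closure over this user and item."""
    def upvote():
        votes[item_id] = votes.get(item_id, 0) + 1
        return f"{user} upvoted {item_id}"
    fnid = register(upvote)
    return f'<a href="/x?fnid={fnid}">upvote</a>'

def follow(fnid):
    """Run the stored continuation once; ids that are gone produce an error."""
    action = continuations.pop(fnid, None)
    if action is None:
        return "Unknown or expired link."
    return action()
```

Because the table lives in server memory, every rendered page costs state, which is the scaling difficulty mentioned in this thread.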
Lots of older sites still use continuations to great effect. Continuations were popular in the WebObjects days, but since they require lots of user state information to be stored server-side, they're comparatively difficult to scale.
Continuations are a really cool pattern to study, since most hacking in HTTP-backed web services comes from the inherently stateless nature of the protocol, and continuation-based programming helps prevent the inverted control flow that page-based programming lends itself to.
Actually, it does get one banned. It happened to me once, and it was the most nerve-wracking experience trying to figure out what exactly had happened and why I was separated from my favorite site.
I came up with the following simple :-) hack: When there is a need to open a bunch of HN links (either saved tab sessions or, more importantly, when restoring a crashed browser session), I disconnect the network connection. Once all the tabs are launched, I turn the network connection back on and refresh each tab when I get around to reading it.
No. I come here because it's an entertaining distraction, but there are few articles I feel I have to read. So I normally visit every few hours, open a bunch of interesting articles and comment threads on the homepage, start reading one or two, realise I have better things to do than read said articles, close half of them without looking at them, and read/skim the one or two that remain, if they look especially interesting. I also often use the "mark all as read" feature in my RSS reader to get rid of a backlog of articles.
It seems much more sane, even when my machines have 8GB+ of RAM, to simply have a "read later" bookmark folder on your bookmark bar^. I have my bookmark bar shown with single-icon hot shortcuts, a blank folder with all my bookmarks, and a simple "revisit" folder with articles, YouTube videos, whatever to read during downtime. (I guess quite simply I don't trust my browser or session recovery enough to do that.)
Plus, Chrome Sync means I get those free on my mobile too. Nice for when I'm stuck in a waiting room or something.
Also, isn't there a whole class of "read it later" type services? Seems a bit much overhead for me.
^ tip, you can drag the padlock/favicon next to the URL right onto/into your "revisit" bookmark folder too, in case you didn't know
I use those "Read Later" services, but I only put something in one once I've read the first paragraph and am sure it's actually something I want to read. Preceding that, I open everything in a bunch of tabs, in the same way you'd put a pile of resumes in an inbox. If my browser crashes, I ideally want them discarded, not saved.
An alternate mechanism for the same thing would be to make an HN extension that works more like StumbleUpon: have the browser history function as the "tabs", and just advance through the things you're reviewing (seeing both the HN thread and the original article at once) by clicking either "Save" or "Discard".
If you ask spdycheck.org to check a site that doesn't support spdy, and customise that site's Server header to include arbitrary html, spdycheck.org will include that header, verbatim and unescaped, in the page it presents to you (or anyone else who checks that site).
I don't know that it's a security flaw in this context, but it's sloppy.
Ahhh. Thanks. All info from the web server, such as the protocols returned in the NPN extension and the Server header were passing through an HTML Encode function. Except there was one case where, if the site didn't support SPDY, and SPDY check could not determine the type of web server, the Server header output was not getting HTML encoded.
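For illustration only, here is the shape of that fix, assuming a hypothetical `render_server_row` helper (SPDYCheck's real code is not shown in this thread). Every branch, including the "unknown server" fallback, goes through the same encoder before the value reaches the page:

```python
import html

def render_server_row(server_header):
    """Build the result row; the header value is attacker-controlled input."""
    # Hypothetical fallback label for the case where no Server header is sent.
    value = server_header if server_header else "unknown"
    # Escape on every path, so no special case can skip the encoding.
    return "<td>Server: " + html.escape(value, quote=True) + "</td>"

print(render_server_row("<script>alert(1)</script>"))
```

Escaping at the single point where the string is written into HTML, rather than per branch, is what prevents one code path from being missed.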
Not likely Thomas, but still, would look kind of silly for someone who spoke at BlackHat for years about JavaScript and XSS like I did to release a free tool with a DOM-based XSS... :-)
That's cool. Why isn't HN available over IPv6 though? Softlayer (the hosting provider HN appears to use) has supported it natively since 2011.
It is pretty scummy that they charge for IPv6 allocations if you want more than the one address they give you, though. I've never seen anyone else doing that. Some providers even give you a /56 or /48 free of charge, but Softlayer charges $4 a month for a /64...
There are still free IPv6 tunnel brokers, and pg can probably afford an extra $4 a month. The fact is that no website will enable IPv6 unless they have spare time and want a mini-project.
It is of course much, much preferred to have a native address. But if you're lazy you can enable IPv6 using 6to4 encapsulation and hope for the best:
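# Load IPv6 support and derive the 6to4 address from the first IPv4 address on eth0.
modprobe ipv6
MYV4=$(ip addr show dev eth0 | grep 'inet ' | awk '{print $2}' | cut -d / -f 1 | head -n 1)
MYV6=$(printf "2002:%.2x%.2x:%.2x%.2x::1\n" $(echo "$MYV4" | tr . ' '))
# Create the 6to4 tunnel and route global IPv6 traffic through the anycast relay.
ip tunnel add 6to4-ipv6 mode sit remote any local "$MYV4" ttl 255
ip link set 6to4-ipv6 up
ip -6 addr add "$MYV6/16" dev 6to4-ipv6
ip -6 route add 2000::/3 via ::192.88.99.1 dev 6to4-ipv6 metric 1
# Add an IPv6-reachable resolver and publish AAAA records (use your own zone file path).
echo 'nameserver 2001:470:20::2' >> /etc/resolv.conf
echo "@ IN AAAA $MYV6" >> /your/zone/file.conf
echo "ns1 IN AAAA $MYV6" >> /your/zone/file.conf
echo "www IN AAAA $MYV6" >> /your/zone/file.conf
# Let encapsulated IPv6 (IP protocol 41) in through the IPv4 firewall.
iptables -A INPUT -i eth0 -p ipv6 -j ACCEPT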
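The printf step in the script above is the only non-obvious part: a 6to4 address is the 2002::/16 prefix followed by the 32 bits of the host's public IPv4 address. A small sketch of the same derivation, where the function name `sixto4_address` is mine and the `::1` interface part simply mirrors the script:

```python
import ipaddress

def sixto4_address(ipv4):
    """Embed a public IPv4 address in the 6to4 prefix: 2002:AABB:CCDD::1."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    # 2002 fills the top 16 bits, the IPv4 address the next 32 bits.
    prefix = (0x2002 << 112) | (v4 << 80)
    return ipaddress.IPv6Address(prefix | 1)

# 192.0.2.1 is a documentation address, used here only as an example.
print(sixto4_address("192.0.2.1"))  # 2002:c000:201::1
```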
You do get one free v6 address, if I'm reading their product page correctly.
It doesn't really need to be a mini-project in your spare time - on some hosts I use, setting up native IPv6 took a few clicks and less than five minutes...
I added IPv6 to an AWS machine using http://tunnelbroker.net - setting it up in Debian literally took five minutes and then adding an AAAA record took another two or three...
It takes more time than that if you've never done it before. You have to set up DNS, set up the host's networking and firewall, configure and restart your network service, test the service (all this assuming you don't need a tunnel broker).
I don't think it's difficult to do at all, but the onus on the admin to figure it out (and the fact that IPv4 works just fine) means nobody ends up doing it.
IPv6 is for when you can't get an IPv4 address anymore. To the best of my knowledge, there aren't currently users with only IPv6 addresses, so there's zero reason for us to support it.
In the general case, why should any IPv4 website bother with IPv6? Is there any benefit?
To set an example and encourage IPv6 adoption. Many people think IPv6 is going to fizzle, and as justification they point to the currently low adoption rate of IPv6. I'm very afraid this will become a self-fulfilling prophecy and we will be stuck with a future where home users are forced to use carrier-grade NAT and even web site operators may have a hard time getting new IP addresses. I want to avoid that future if at all possible, so I do everything I can to encourage IPv6 adoption today. There may be no short-term benefit but helping IPv6 adoption has substantial long-term benefits.
- Carrier-grade NAT forces users to share IPv4 addresses, making it difficult to ban offenders without collateral damage. This problem is only going to get worse.
- NATs are full of state, and state is messy. Making your site route around them may improve performance.
- This is one of the few areas where software people can directly make the world a better place, by preventing "our" Internet from regressing into something that more resembles the telco network.
- Lack of IPv6 support strains the credibility of companies that purport to be on the forefront of technology. Google, Facebook, and Wikipedia (and Bing?!) figured this out; why can't Hacker News?
- It gives your site a green 6 in IPvFoo, instead of a red 4. Six is 2 better than four, and green is 100THz better than red.
I remember at the last Google I/O the Chrome team said they were working on a more granular permissions system and recognized how scary "Access your data on all websites" was. Not sure when that's supposed to debut though.
If you hover, it explains that it's green when SPDY is used for the "top-level" document and grey when it's only used for "sub-documents" which I guess means included resources. If there isn't any SPDY on the page, the icon goes away.
Has anyone built an NSURLProtocol subclass for SPDY? I did a quick GitHub search and found nothing. It seems to me that one big use case SPDY could dramatically improve is all these native mobile apps that open SSL connections to their APIs.
Why do you assert it has no benefit for mobile APIs? Here are a few that I can think of off the top of my head:
* SPDY Multiplexing is superior to HTTP pipelining. Pipelining requires in-order responses, which leads to head of line blocking.
* SPDY header compression is a win for mobile, since mobile uplink bandwidth is often a bottleneck. Request header compression allows for fitting more requests into fewer packets.
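To make the header compression point concrete, here is a small sketch using a single zlib stream shared across requests, which is roughly how SPDY treats headers on one connection. The request text and host name are made up, and SPDY's preset dictionary and binary framing are left out:

```python
import zlib

# Made-up request headers; a mobile API client tends to resend the same ones.
request = (
    b"GET /api/items HTTP/1.1\r\n"
    b"Host: api.example.com\r\n"
    b"User-Agent: ExampleApp/1.0 (hypothetical client)\r\n"
    b"Accept: application/json\r\n"
    b"\r\n"
)

# One compressor for the whole connection, so later requests can refer
# back to bytes already sent in earlier ones.
stream = zlib.compressobj()

def send(headers):
    """Compress one header block and flush so it could go out immediately."""
    return stream.compress(headers) + stream.flush(zlib.Z_SYNC_FLUSH)

first = send(request)
second = send(request)
print(len(request), len(first), len(second))
```

The second copy compresses to a small fraction of the first because it is almost entirely a back-reference, which is where the uplink saving on repeated requests comes from.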
Because SPDY is designed for quickly loading websites with lots of resources (the average page has 50-100 components). Just because it is new and fancy and from Google doesn't mean you should use it for other things that transport over HTTP/S.
If you are going to dedicate resources to switching to SPDY, you should instead investigate protobuf.
* Almost all APIs are transactional. Your API should be returning all the data you need to render a single screen in your mobile app, or it is inefficient.
* You should be stripping all request headers except for User-Agent, and the server should be responding based on the known capabilities of the app version.
* Pings are not the correct way to deal with hangs. I don't want to spill any secret sauce here, but it should be obvious to anyone with low level TCP experience.
* Pipelining also uses a single connection. Opportunistic FINs are not unique to SPDY.
I'm not going to argue whether or not other libraries/protocols (e.g. your example of protobufs) may provide higher value. I was simply questioning your assertion "SPDY has no benefit for mobile APIs."
That said, I've got some comments on your new points:
* "Almost all APIs are transactional. Your API should be returning all the data you need to render a single screen in your mobile app, or it is inefficient." - Addressing this completely would take a while, and there's no good point in our discussing this exhaustively. I'll simply note that while an API response may return all data necessary to render a single screen in the app, it does seem nice to allow for prioritized out-of-order responses like SPDY does. There does not seem to be a good reason to have head-of-line blocking in the responses, since the app should be able to render incrementally.
* "Pings are not the correct way to deal with hangs. I don't want to spill any secret sauce here, but it should be obvious to anyone with low level TCP experience." - I think you must be misunderstanding this, or it must not be obvious to Google's TCP team, since they agree with the usage of SPDY PINGs. Perhaps you think SPDY PINGs are a keep-alive mechanism? Note that the article I linked to identifies them as a liveness detection mechanism.
* "Pipelining also uses a single connection. Opportunistic FINs are not unique to SPDY." - I don't know why you bring up pipelining again when I've pointed out that multiplexing is superior. Why put up with response head of line blocking? And I don't know what you're exactly referring to with opportunistic FINs, perhaps you can explain in further detail?
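A toy model of the head-of-line blocking argument, under the simplifying assumption that the server works on all requests concurrently and bandwidth is not the limit (the function names are mine). With pipelining a response cannot be delivered before the one requested ahead of it; with multiplexing each response is delivered as soon as it is ready:

```python
from itertools import accumulate

def pipelined_delivery(ready_times):
    """In-order delivery: each response waits for every earlier one."""
    # The delivery time is the running maximum of the ready times.
    return list(accumulate(ready_times, max))

def multiplexed_delivery(ready_times):
    """Out-of-order delivery: each response goes out when it is ready."""
    return list(ready_times)

# Seconds until each of three responses is ready; the first one is slow.
ready = [5.0, 1.0, 2.0]
print(pipelined_delivery(ready))    # [5.0, 5.0, 5.0]
print(multiplexed_delivery(ready))  # [5.0, 1.0, 2.0]
```

One slow response at the front of the pipeline delays everything behind it, while the multiplexed client can start rendering after one second.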
And I'll throw in extra data points. Despite the fact that you don't feel like it's useful for mobile APIs, other non-Google parties clearly do:
Great, but I'm interested in why it still loads pages only marginally faster than PayPal. It seems to have kept the same page load performance despite all the news about better servers etc. I know we don't want this place to become Reddit[1], but is there also a built-in delay implemented?
[1] I can only imagine this is why the pagination still appears completely broken (unknown or expired link).
From the looks of it, real world SPDY performance increases are non-existent for most websites due to how they are organized and served.
You have to really change how your website gets served (at the cost to non-SPDY users) to get an increase in performance. Which usually ends up not even being 25%.
From what I've been able to gather, especially with few to no reports of any real-world benefits (I've only seen criticism of "bad" tests from SPDY supporters, but at the same time no one has posted a "good" test), I have to say SPDY is turning out to be hot air.
http://www.chromium.org/spdy/spdy-whitepaper has data on improvements in lab tests.
http://googlecode.blogspot.com/2012/01/making-web-speedier-a... has a blurb where Google announces that they've made search (already highly optimized) faster with SPDY. This is a fascinating result, because this result was obtained after Google Search switched to using HTTPS, which typically makes websites slower, but in Google's case, made it faster because of SPDY.
Note that Twitter and Facebook have also adopted SPDY. I'd be rather skeptical that Google, Twitter, and Facebook would all switch to SPDY if there weren't real world benefits.
When you enable SPDY for your average website, you'll get some real world results.
When you talk about Google, or Facebook, or Twitter using it to squeeze a few extra percentage points out of their latency or bandwidth or load time, in their very highly specialized and optimized and conditional and resourceful environment, that's about as non-real-world as it gets for the rest of us.
The fact that SPDY adoption has mostly failed for the rest of the internet says more about it than any white paper or lab result can.
I can say you're going to be wrong ;-) Lots of sites are adopting it through services like CloudFlare and large shared ISPs will likely turn it on as a feature (or for all users) once the Apache/Nginx pagespeed plugins stabilize and SSL grows cheaper thanks to IPv6 addresses. The main problem is that SPDY adoption basically goes hand-in-hand with SSL adoption, and SSL hasn't taken off though it should. You wouldn't say SSL has failed, would you? ;-)
news.ycombinator.com Does Not Support SPDY
SPDY Protocol Not Enabled!
Seriously?
This SSL/TLS server is using the NPN Extension to tell browsers it supports alternative protocols, but SPDY is not a protocol it supports.
The server is not making SPDY an option. Since all the pieces are in place, hopefully it will be easy to enable SPDY support with this server.
We've read quite a few posts suggesting that SPDY doesn't help much, but that's simply not true, at least in our case. If you're delivering assets over SSL, then SPDY is definitely a plus.
Maybe... What are you doing with your SPDY server? I had no problems with parallel GET requests, but there was no way to do parallel XHR file uploads, which work pretty well without SPDY.