
In my experience language shootouts in general are often terrible by design -- with these being just a prime example.

A lot of the problem is in the questions they're designed to answer versus the questions people use them to answer.

For example, if I'm comparing Python and C, I typically want to know "how much slower would my program be in Python?", not "how much slower is my program in Python if I spent so much time hyper-optimizing it that I might as well have written it in C?"

But the test cases usually try to answer the latter, not the former.



It might be more reasonable than it seems at first glance. It's true that it's good to know how fast typical code runs, but there's another important question: when I run into performance problems and need to optimize a bottleneck, how fast can I make it before I have to resort to non-portable code or C extensions that complicate my build process?


Anybody writing, for example, Python code to solve this sort of problem in the real world would instantly reach for numpy, which, while not part of the core language distribution, is pretty close to being a standard library for most Python programmers. I'm sure several of the other languages have similar libraries that are being ignored in these benchmarks. Without taking things like that into account, these results don't say much that is useful about real-world performance.
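To make that concrete, here is a rough sketch of the gap between a hand-written loop and the same computation pushed into C-implemented code. It uses only the standard library so it runs anywhere; `sum`/`map`/`operator.mul` are a stand-in for what numpy would do in practice, and the function names `dot_loop` and `dot_builtin` are made up for illustration. The timings depend on your machine and interpreter, so treat whatever it prints as an illustration, not a benchmark.

```python
import operator
import timeit


def dot_loop(a, b):
    """Dot product as a plain interpreted loop (hypothetical 'typical' code)."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_builtin(a, b):
    """Same dot product, but the iteration runs inside C-implemented built-ins."""
    return sum(map(operator.mul, a, b), 0.0)


if __name__ == "__main__":
    n = 100_000
    a = [float(i) for i in range(n)]
    b = [float(n - i) for i in range(n)]
    # Time both versions on the same input; absolute numbers will vary.
    for fn in (dot_loop, dot_builtin):
        seconds = timeit.timeit(lambda: fn(a, b), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s for 20 runs")
```

With numpy installed, the equivalent would presumably be a single `numpy.dot` call on two arrays, which is the kind of library use the benchmarks leave out.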



Cool, I missed that


Languages/implementations also vary in how much overhead switching to C costs you, especially inside loops, which e.g. the pidigits benchmark does sort of measure.


> In my experience language shootouts in general are often terrible by design

My experience is that this is the sort of thing where everyone rags on it, but no one actually attempts to provide something "better".


Because there's always a "for what" in there. As an analogy, consider MMA. On the surface, it seems to answer the question "what's the best martial art?" But really, it only answers the question "what is the best martial art for fighting a single opponent in an octagon-shaped ring, in front of an audience, aiming for a submission?" And the answer is of course Brazilian Jiu-Jitsu, exactly the answer the founders of UFC wanted...

Former cop Rory Miller writes about this in his book: the police experimented with BJJ and found it useless. Why? Because in BJJ you pin your opponent on his back because it makes a better show for the audience, but as a cop you always pin your opponent on his front so you can handcuff him!


Ok, sure, but I'd rather see a rough attempt at getting some numbers than just throwing your hands up in the air and saying "gee, that's a hard problem".

Also, I think that we all know enough about programming and languages and their many uses that we can talk directly about it, rather than about an analogy.


Why does somebody saying "I wouldn't use this." have to provide an alternative? If such a statement is backed by reasons I find it interesting to read. They're saving me trouble trying it out, just like any other review.


Why? Because people want to know how fast languages are, and it's not a stupid question if you are considering it in a wider context.

And they're going to do benchmarks.

So you can either complain that they're not good, or you can try and improve them.


"So you can either complain that they're not good, or you can try and improve them."

Yes, that sentence is literally correct. But it sounds like it's saying one option is not useful. And I still haven't heard a single reason why reviews of benchmarks are bad.

Responding to a criticism with "those who can't do criticize" is super, super boring. It's been done to death. You're just tarring all criticism with an overly broad brush. If it's bad criticism, why is it worth responding to? And if it's plausible criticism, why aren't you focusing on the actual details?


Bitching about language benchmarks has been done to death too.


Who are you saying is bitching? OP or DarkShikari or me?


Maybe there's a space for "if you spent an average amount of time optimizing" :) For example my PyPy optimizations took maybe 4 hours total, with 0 time spent looking at assembly.


> the questions they're designed to answer

You'd think there'd be some kind of statement about that?

http://shootout.alioth.debian.org/help.php#why

> the questions people use them to answer

You'd think there'd be some kind of advice about that?

http://shootout.alioth.debian.org/dont-jump-to-conclusions.p...

> I typically want to know "how much slower would my program be in Python?"

And we should all know the answer - It depends on how you wrote your program in C and it depends on how you write your program in Python.


The question marks seem to suggest that you're poking a hole in grandparent's argument, but it seems like you're both in agreement that the shootout is misused. What am I missing?


"terrible by design"

Something can be well designed and yet still be misused.


'Terrible by design' isn't the same as 'terrible design'. I think he was saying it's deliberately bad for some use case.



