From Python to NumPy (2017) (labri.fr)
145 points by rahimnathwani on June 10, 2022 | hide | past | favorite | 73 comments


This book was published in 2017. The author, Nicolas Rougier, also has a few other online books, on data visualization[0] and the reproducibility problem in research[1].

[0] https://github.com/rougier/scientific-visualization-book [1] https://rr-france.github.io/bookrr


I think the readability issue of vectorized numpy is often ignored (while readability is one of the main strengths of Python, in my opinion). I could not resist writing the example of section 2.2 in Julia:

     using Random, BenchmarkTools
     seq = rand(0:2,10_000); sub = rand(0:2,4);
     # translation of the "readable" but slow python code 
     function_1(seq, sub) = [i for i in 1:(length(seq)-length(sub)) if view(seq,i:i+length(sub)-1) == sub];
     @btime function_1(seq, sub);
     #  93.858 μs (5 allocations: 1.98 KiB)
Which is more than twice as fast as the vectorized (and quite unreadable) Python:

     %timeit function_2(seq, sub) 
     215 µs ± 1.94 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Having a JIT (like Julia, Numba, ...) has real potential to improve code readability and reduce the need for vectorized numpy operations. In some cases, it can even be faster, due to better memory utilization and to computing more directly what you need.
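For reference, the two Python versions being compared look roughly like this (a sketch of the subsequence search from that section, not necessarily the book's exact code):

```python
import numpy as np

def naive_find(seq, sub):
    # Readable but slow: check every window with a Python-level loop.
    n, m = len(seq), len(sub)
    return [i for i in range(n - m + 1) if np.array_equal(seq[i:i + m], sub)]

def vectorized_find(seq, sub):
    # Vectorized but harder to read: build an (n-m+1, m) index matrix
    # via broadcasting, gather all windows at once, compare row-wise.
    n, m = len(seq), len(sub)
    windows = np.asarray(seq)[np.arange(n - m + 1)[:, None] + np.arange(m)]
    return np.flatnonzero((windows == np.asarray(sub)).all(axis=1))
```

The index-matrix trick is fast, but nothing about `np.arange(n - m + 1)[:, None] + np.arange(m)` tells a reader "find all occurrences of sub in seq" — which is the readability cost being discussed.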


Blame Python for being off in the weeds adding syntax for software engineering (e.g., the walrus operator) rather than for data analysis.


I think it's helpful to keep in mind that Python is general purpose and used in many domains in addition to data analysis (no matter which side of the walrus operator you are on)


I think it was. It isn't now. It should develop its syntax in response to its actual users now, not merely the subset it has been serving for 20 years.


Yes. I'm not even sure what Python is used for these days, other than ML/data science.


> Yes. I'm not even sure what Python is used for these days, other than ML/data science.

Well, if you look at the things it has actively maintained libraries and active communities for, it's used for a lot besides ML/DS.


what does that even mean lol


I'm totally a fan of educational work aimed at a subset of programmers, and I myself went from Python to NumPy. This would have been great if it had been around when I was working with SciPy stuff.

That said, I think NumPy to Python, or really any big package or framework to Python (e.g., Django to Python), is the normal progression these days. People seem to learn the language through a package and end up with some misleading inferences.

Again, I don't intend to diminish the article, I just see the opposite progression so much I figured I'd comment on it.


> That said, I think a NumPy to Python, or really any big package or framework to Python (e.g. Django to Python) is the normal progression these days.

It has always been common, ever since Python first had popular domain-focused libraries that could be used without deep knowledge of the rest of the language; but once you get there, you also tend to move out to other specialized packages.


I feel like I never use numpy directly anymore. All my data is in Pandas, and I mostly import numpy for the odd `df.transform(np.somefunction)`.

I remember it being quite fun to build these vectorized numpy functions (in the "crosswords and puzzles are fun" kind of way).

Figuring out how to make a transformation in Pandas is always frustrating, when it isn't straightforward.
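The pattern described above looks roughly like this (using `np.sqrt` as a stand-in for the odd `np.somefunction`):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, 4.0, 9.0]})

# numpy only appears as the function handed to pandas;
# the data itself stays in the DataFrame throughout.
out = df.transform(np.sqrt)
```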


That's funny, I went the opposite direction. I only use pandas if I need to group by and aggregate, or to use in plotting functions. When doing array operations, I pretty much always use numpy, figuring out how to broadcast and vectorize operations is one of the most fun parts of my job in my opinion.


I have had the same experience and instead of Pandas have been using numpy-groupies to handle aggregate/groupby operations. It's quite performant and feels a bit cleaner to use than importing pandas for a couple operations.

https://github.com/ml31415/numpy-groupies
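For the common group-by-sum case, the pattern numpy-groupies generalizes can be sketched with plain NumPy via `np.bincount`; the library extends this idea to many aggregation functions (min, max, mean, etc.):

```python
import numpy as np

# Group-by-sum without pandas: np.bincount with weights sums the
# values that share each integer group label.
group_idx = np.array([0, 1, 0, 2, 1])
values = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

sums = np.bincount(group_idx, weights=values)  # one entry per group
```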


I use numpy sporadically and always forget things about it. I always liked this decidedly shorter presentation of the library to reacquaint myself with the essential basic concepts:

https://betterprogramming.pub/numpy-illustrated-the-visual-g...


Discussed at the time: From Python to Numpy - https://news.ycombinator.com/item?id=13355034 - Jan 2017 (48 comments)


Thanks. I clicked on 'past' immediately after posting, but all I saw were Algolia search results for the keywords in the title. If I'd seen it had been posted before, I probably would have deleted my post. ISTR 'past' used to show matches for the URL, and not the title. Perhaps I'm mistaken.


I wouldn't treat these as warnings about posting (you're allowed to repost after some time). These are more about sharing relevant older conversations for the interested.


The problem goes deeper than that, because NumPy is bloated too (compare https://numpy.org/doc/stable/reference/generated/numpy.mean.... versus https://numpy.org/doc/stable/reference/generated/numpy.nanme...).
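The pairing in question: `np.mean` lets NaN propagate, so NumPy ships a separate NaN-ignoring twin rather than, say, a single keyword argument:

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])

np.mean(a)     # nan -- NaN propagates through the ordinary reduction
np.nanmean(a)  # 2.0 -- the parallel nan-aware function ignores it
```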

Perhaps even further, the shift now is for "Python" to be written in neither vanilla Python nor NumPy directly, but at times in PyTorch or TensorFlow where each library provides its own workarounds.


[flagged]


I'm half amused and half concerned for your mental health.

There are too many places to start a counter argument. It could start with what MLMs actually are, it could be that many businesses don't care about the talks and use scipy packages, it could be the cause and effect confusion of scipy packages existing and academic users using them--which relates to the Ruby part.

But seriously, I hope I'm wrong, but your post comes off as a bit paranoid and crazy.


His MLM analogy does not fit, but I think the point is that Python got where it is because of luck. If the scientific computing and research community (and plenty of physicists) had known a better language, they would not be using Python now. Maybe it helps that Python was installed by default on many Linux systems, and Ruby was not.

Much of the success of Python these days is related to non-CompSci users. And no offense, but many of them are focused entirely on results, with very little concern for code quality, maintainability, or other software engineering concerns.

So now we have startups galore who want to be "AI" or "data science", and their PHBs gravitate to Python because Python is the machine learning language. Or so they've heard.

Ironically, and perhaps one thing the parent poster was alluding to, most AI/ML/data science companies actually end up doing very little of those and instead just doing normal software development. Maybe they have a small sprinkle of data analysis.


The “luck” Python ran into was that some people came together to write NumPy. The alternatives at the time were Matlab and R. Matlab is not a great general programming language and is expensive. R has esoteric syntax and also isn’t a great general purpose language. This shift didn’t happen so long ago that it can’t be remembered. It certainly didn’t happen because it was installed by default by some Linux distros.

For better or worse (and honestly, probably often for better), people writing scientific Python care about solving a particular numerical problem. Code quality often matters less than just getting the right answer. Knowledge of a language or code quality is rarely the hard or important part.

Scientists generally spend time learning math and software engineers spend time learning good design practices. It shouldn’t be a great surprise what each group becomes good at. Time is finite.


Yes, it was some luck ... and a lot of hard work. But it wasn't only NumPy that helped.

I was around in the 1990s when Python started making waves in the scientific computing and research community. Python wasn't installed by default (we mostly used IRIX back then).

Python was excellent for "steering" low-level computations, to use the buzzword at the time.

That is, if you had a large body of existing C, Fortran, whatever code, you could turn them into Python extensions, let Python glue the components together, and use a Python program/script to organize the high-level data- and control-flow.

Swig was a popular tool to help automate that process. Swig 1.0 had support for Tcl, Perl, Guile and Python, back in 1996.

I used Swig to interface to an existing cheminformatics toolkit, and published my experience at https://www.drdobbs.com/cpp/making-c-extensions-more-pythoni... . Python made it very easy to convert the implicit object model of the underlying C toolkit into an explicit OO model in Python. Python's reference-counted garbage collection was also a good fit to the C library.

Together these helped simplify scientific research work quite a bit.

And that's without NumPy.

Even now I do a lot of non-numerical scientific software development, and rarely have need for NumPy. While I often use Python to steer components written in C or C++.
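The "steering" pattern still works today without Swig: the standard-library ctypes module provides the same glue from pure Python. A minimal sketch, calling the system C math library (library lookup is platform-dependent; this assumes a Unix-like system):

```python
import ctypes
import ctypes.util

# Load an existing C library and declare one function's signature,
# then drive it from Python -- glue code with no compilation step.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))  # 1.0
```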


This is the only reasonable comment in this whole thread. Python's growth was basically organic; NumPy/SciPy/etc. are later extensions of that organic adoption.


> Knowledge of a language or code quality is rarely the hard or important part.

That is true.

The problem is that Python code of questionable quality (but real utility) is now turned into production code, and it is done so by building more Python around it. So the insignificant implementation detail that was the language choice ends up driving the entire project.

Also, a PhD data scientist/mathematician can get hired for a big salary at a company (especially fintech), and their code will drive critical business decisions. A regular software engineer is mandated to build a structure around that code to make it production worthy. In my experience there is no traction to say, "Great, you've learned how to answer this tough question; now let's implement your solution in a real language."


Data scientists are hired for a certain set of skills, and software engineers are hired for a different set. That’s the reality of specialization. I don’t often see data scientists complaining that software engineers are bad at statistics. I’m not sure why so many software engineers feel the need to complain that data scientists are bad at software engineering.

At my job I work both sides of the fence. Sometimes I write research code in Python/Julia/Matlab. Sometimes I write production code in C++/CUDA. The choice of production language is primarily due to performance needs rather than code quality, though.

I’ll repeat myself but slightly differently: the main reason Python is used over, say, Ruby is because Ruby lacks a comparatively good library for working with arrays. It subsequently didn’t develop all the ecosystem for ML and scientific computing (SciPy, Matplotlib, sklearn, skimage, Torch, etc). It’s basically never going to catch up. And when it comes down to it, Ruby is basically as ill-suited as Python, Clojure, Elixir, and most other languages when it comes to doing something like automatic differentiation on a GPU. And again, even so, Python is still the best choice among those because so many resources have been poured into PyTorch, Matplotlib, NumPy, etc.


I'm not attacking data scientists. I have no objection to how they solve their problems.

My point is that their choice of Python (because of the libraries they use, or the training they got in school?) bleeds out and causes Python to be used in places where it doesn't fit so well.

If the libraries you mention were written for Ruby (and they absolutely could have been... probably with fewer defects and fewer lines of code), then we wouldn't be having this discussion.

There is no use of any of those libraries that someone could demonstrate which could not be equally or better demonstrated in one of the other languages you mention. Therefore, the only argument for using Python is that it's what those libraries are written in.

In fact, with a little effort an API could be written and wrapped around those libraries to make them available to other languages, decoupling the bigger projects from the library-specific logic. This is how it should be done at any reasonable company anyway. The data science wizards build their specific systems in whatever language they need, and that gets built into a minimal service. But things often start highly coupled, meaning the special code in Python gets business logic and persistence stuff built around it also in Python.


> My point is that their choice of Python (because of the libraries they use, or the training they got in school?) bleeds out and causes Python to be used in places where it doesn't fit so well.

They use those libraries because there aren’t alternatives in the languages you prefer. Millions of man hours went into developing the Python ecosystem.

> If the libraries you mention were written for Ruby (and they absolutely could have been... probably with fewer defects and fewer lines of code), then we wouldn't be having this discussion.

This probably isn’t true because the core of NumPy is C. A theoretical NumRs would be the same. The issue with both languages is that they’re slow, so you have to resort to C wrappers. NumRs might be incrementally better at best. It wouldn’t be a revolution.

> There is no use of any of those libraries that someone could demonstrate which could not be equally or better demonstrated in one of the other languages you mention. Therefore, the only argument for using Python is that it's what those libraries are written in.

But no one wrote these libraries in Ruby. Someone had to do it, and no one did. Someone did it in Python. It would be impossible to justify recreating NumPy in Ruby today. If I want to do some simple image processing, Fourier transforms, simple optimization, and plotting in Python it’s trivial. If I want to do it in Ruby it is a ton of work.

> In fact, with a little effort an API could be written and wrapped around those libraries to make them available to other languages, decoupling the bigger projects from the library-specific logic. This is how it should be done at any reasonable company anyway. The data science wizards build their specific systems in whatever language they need, and that gets built into a minimal service. But things often start highly coupled, meaning the special code in Python gets business logic and persistence stuff built around it also in Python.

This isn’t a company scale problem. The Python numerical computing stack has developed over 20 years with enormous investment from companies and universities. Code could maybe be shared via an API, but someone would have to do it. It would have to be done very carefully to guarantee performance. If it were easy you’d expect to see significant code reuse between, for example, Ruby and Java web frameworks. I guess you could say this API does exist, and it’s called C FFI interfaces. Sane companies do use this, but it still sucks pretty bad.

Python’s lead isn’t unassailable, but any language that is just another C wrapper isn’t going to unseat it. Whatever does it will need to be compiled. I’d bet on it being dynamic with type inference, too, because static type checking almost never helps with correctness issues in numerical code while adding significant cognitive overhead. If it isn’t Julia, Julia is at least the blueprint.


To me, this seems like the wrong way of doing it.

I'm a scientist, and I code in Python, but my code is never directly incorporated into the production code base. (Similar story for the portions of my work involving electronics, mechanics, etc). A decent generalization is that once I solve a problem, and demonstrate a result using Python, the actual amount of code that needs to be understood and translated into the production language is usually a few tens of lines at most.

On the production side, the choice of language is just a tiny part of the entire process. They have frameworks, revision control systems, coding standards, review processes, the whole nine yards. Just using a "better" language isn't even an important step towards writing production code. If I wrote in the software department's favored language, they would still have to turn it into production code within their procedures.

The fact that a few lines of Python represents hundreds of lines of production code makes it more productive for me to think my way through problems and express the principles underlying the solution to the coders. I see no reason why the tool that's used for discovering a solution has to be the same tool as the one used for putting the solution into production. Laboratory chemists don't use industrial manufacturing processes when trying to make a milligram of a new compound.


If you run a company, then you might make better choices (or at least allow for proper discovery of what approach to take to building a large system).

But there seems to be an accepted rule that "if you're doing data science, you are a Python shop". And if you are trying to pitch a data science company to investors, they know Python for data science.

New companies start all the time using a technology because "that's what you use", without knowing why or why not. In fact, (and not that I idolize him, which I really do not), Paul Graham's http://www.paulgraham.com/avg.html essay talks about this. Most companies follow whatever is the accepted practice for doing things. It may "work", but it definitely may not be optimal. And the companies that choose a different path based on more consideration or more awareness may have an easier time.

I mean after all, there was a time when everyone wrote C (because C was so much better than COBOL or whatever). And believe it or not, there was a time when it was impossible to convince a company to try Java. Or as the old saying went, "nobody got fired for buying IBM". After all, it was the safe path. Substitute Oracle, or C, or C++, or Java, or Python, or PHP, or Windows (there was a time when people didn't trust Linux and couldn't imagine why you would ever try to run your business software on Linux machines).

Your reasoning makes sense, but in practice in the software world, it is not as well planned or thought out as you might expect.


For machine learning companies, Julia is probably the dark horse language that could give you a competitive advantage. Like always, the problem with using something more niche is the ecosystem strength. You need to train people more, build more tooling yourself. Still, there is performance to be had if you work for it.


[flagged]


Please post a concrete example of one that you have run into.



Can you share one explicit example that is still there today? IIRC the things from that thread were all solved.


>If the scientific computing and research community (and plenty of physicists) had known a better language

The scientific computing and research community heavily used C/C++ for years/decades after the decline of Fortran.

Python came and solved their headaches by abstracting them away.


I agree with everything you say (in fact another post I made on this page deals with "it should be NumPy to Python" because so many people learn the language through some popular library)

I agree with your points, but the one counterpoint I will make is that Python was created as a simple scripting language that could utilize lower-level code. Basically, it's a convenient wrapper for faster C code. So the fact that Python is populated with scientific libraries that are optimized in C is not purely luck. That said, like a lot of things in life, it's the combination of viability and luck that leads to success.


to re-state the obvious: as a low-level C programmer by training, Python is exactly "a simple scripting language that could utilize lower-level (C-linked) code." I wrote language glue for C libraries long ago; Python fit well, and that is what I use it for to this day, decades later.

maybe people without the C background do not see that at all?


Maybe they do not. It's ironic that many people criticize the speed of python when it (or at least CPython) was originally made to run highly optimized C-linking code.

But I also take the Cython route when it's just one to a few functions that need optimization/parallelization... (Did you know you can instantly compile Cython in Jupyter to use with Python?). So I avoid writing anything else unless it's really necessary. (avoided* I haven't done that in a while)
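The Jupyter workflow mentioned above is just two cell magics (a sketch; it assumes IPython/Jupyter with the Cython package installed):

```
%load_ext cython
```

```
%%cython
# This cell is compiled to C by Cython on the fly (notebook-only magic).
def csum(long n):
    cdef long i, total = 0
    for i in range(n):
        total += i
    return total
```

After the second cell runs, `csum` is a compiled function callable from ordinary Python cells.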


The speed or lack thereof of Python is not the problem. The problem is language (mis)features.

But interestingly, Go is a very simple, straightforward language. It's got many of the same goals that Python had, and it meets those goals quite well. And it is blazingly fast. And it compiles so fast that it might as well be interpreted. And it is statically typed, something that is all the rage again these days and which Python pretends to offer but is really just noise and decoration.

I would go so far as to claim that anything built in Python can be built better in one of many other languages, even if it uses NumPy or some other Python-specific library. The way I would meet that challenge would be to isolate the use of the Python library behind a minimal interface, and then call that Python program from another, better language. Then I get the utility of the Python library from a language that scales better conceptually and performance-wise.


Go is quite a bit cleaner than Python and its concurrency/parallelism primitives can be well suited to scientific workloads.

You may want to have a look at Gonum (https://www.gonum.org), and the Go HEP package developed by CERN (https://go-hep.org).

I was also surprised to see DSP and pretty sophisticated packages, although I never used them: https://awesome-go.com/science-and-data-analysis

And of course Go has Jupyter integration, it's almost like running a script thanks to its fast compilation time.


My third language was C, after Pascal and Modula 2. And I've built a good bit of production C long ago. (And beside the point, it was not difficult...? Memory management and pointer math is not rocket science. But I digress.)

It seems like the people promoting Python for its ability to interact with C (to outsource time/memory-constrained functions) are not familiar with the other uses of Python -- namely, building large systems in an object-oriented fashion.

A 10 or 100 or even 1000 line "script" in Python is just fine. Read it like a story, let it do the one job it needs to do, and go home.

But scale up to 10000 or 100000 lines, and you will certainly be forced to go OOP. After all, most Python modules use OO classes in cases where a module of functions would suffice... Now you are in the realm where Python is comparatively very poorly suited for the job.

It's a frog-in-the-pot boiling situation. It started out innocently enough with the frog choosing the nicest small body of water nearby, but now the frog is in pain and dying and unable to understand how they got into that situation.


How is proselytizing a free tool an MLM scheme?


I'm guessing the parent poster really meant something like "hyped snake oil".

There is absolutely nothing about Python which makes it especially suited to scientific programming or any other kind of programming. In fact, it is poorly designed for most software engineering goals, short of just simple one-file, few line scripts. But those could just as well have been written in many other languages.

Python just happened to be installed by default on most Linux systems, and it was more powerful than *sh. Just maybe one could argue that it was more accessible than perl because it had fewer $#!@$ symbols. If Python had not tried to offer OOP features (entirely built with duct tape, hotglue, and post-it notes) maybe Python projects wouldn't be such a mess. Simple procedural code would be better than the @decorator littered mess which is required to make Python have OO features approximating (but not fully supporting) other languages.


> There is absolutely nothing about Python which makes it especially suited to scientific programming or any other kind of programming.

The amount of anger people have that things that, in fact, determine suitability for various applications are not the things which ought to do so in their own mental models of the world is... interesting.


If I hand you a dull butter knife and some loose, rusty pliers and ask you to fix a car engine, you should be angry. Of course, if all you've ever known is sub-par tools, then you won't see a problem.

This is a pointless conversation. You cannot know what you don't know. https://en.wiktionary.org/wiki/Blub_paradox


> Of course, if all you've ever known is sub-par tools, then you won't see a problem.

Yeah, sure. What makes you think that I haven't used a wide variety of programming languages covering every paradigm from unstructured imperative to procedural to functional (pure and impure) to OOP to relational to logic to...?

When I am amused at the anger people have about their idealized mental model of what should matter not being what actual does matter most to practical utility, I’m not speaking from ignorance of the different features of programming languages and the theoretical and practical benefits they bring. I am speaking from four decades of programming and understanding all that, as well as the human and social side of the activity.


> There is absolutely nothing about Python which makes it especially suited to scientific programming or any other kind of programming.

Compared to Java and JS, there is. The lack of operator overloading would make numpy matrix manipulations absolutely painful in those languages. It's just chance that numpy wasn't built on top of Ruby.
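A small illustration of the point: NumPy arrays hook Python's operator overloads, so array math reads like the math it encodes.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

c = 2 * a + b  # elementwise, via the __mul__/__add__ overloads
d = a @ b      # matrix product, via __matmul__ (PEP 465)

# Without operator overloading this reads like a.mul(2).add(b) --
# workable, but painful once expressions get long.
```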


Referring to Python hype as "snake oil" is hilarious; I recommend anyone arguing the underlying thought use it.


I use Python all day every day. It's horrible compared to many other languages I have used.

There are many versions of hype promoted for Python. Coal powered steam locomotives were great in their day, before there were better alternatives. But we don't irrationally hold the mentality that there is nothing better and why would you need anything else anyway?

Tell me any library which can only be built for Python?

Tell me any feature of Python which does not exist in one or more other languages?

Then I'll tell you at least some features which exist in several other languages but which do not in Python.

Python is inferior, and frankly anyone who is infatuated with it is ignorant of the other, better options.


> Tell me any library which can only be built for Python?

Are you really making the "every programming language is Turing-computable" argument?

All your talk of features is missing an important point - Python is one of the few languages designed with insight from how to develop a programming language for non-programmers. In Python's case, experience drawn from ABC.

These are people for whom "public class java { public static void" is a barrier to entry.

For whom diagnosing segfaults is arcane knowledge.

And for whom hygienic macros sounds like a bathroom product.

I like Python because it's a language that my colleagues - most of whom are not software developers but who do some programming for their research - can productively use. Even with all the complexity of modern Python.

So no, there are zero features in Python which don't exist in other programming languages.

And it doesn't matter.

What better options do you suggest to a grad student in chemistry, biology, or physics?


You are just arguing that a simple language is good for beginners, and I agree with that.

We want a beginner language which has easy (and consistent) syntax, no requirement to specify datatypes (implicit), and can be interpreted (no compiling required).

In its most basic use, namely just writing procedural code with basic built-in data structures, Python is pretty ok. But it has some warts and gotchas, not the least of which is mutable default parameters in functions.
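The gotcha in question: a default argument is evaluated once, at function definition time, so a mutable default is shared across calls.

```python
def append_bad(item, acc=[]):
    # The [] above is created once, when the function is defined,
    # so every call without an explicit acc mutates the same list.
    acc.append(item)
    return acc

def append_ok(item, acc=None):
    # The conventional workaround: use None as a sentinel and
    # create a fresh list on each call.
    if acc is None:
        acc = []
    acc.append(item)
    return acc
```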

Now maybe the ultimate beginner doesn't use defaults for parameters, so it's not a problem (yet) for them. Same goes for the OO features. While the beginner may not write OO Python, using modules written by other people probably will require them to step into the Python version of OO. Now it gets messier.

Ruby is superior in the simple beginner case and the complex (OOP) case. It is superior and more consistent for functional cases as well, despite it not being designed as a functional language (it still has mutability risks all over the place as does Python, except for that default parameter razorblade).

The only beginner-unfriendly thing I can think of for Ruby is the optional parenthesis on function calls. I still think optional parens is a bad design choice, and I encourage people to always use them. It's usually a style choice (but not always, because sometimes it matters for how code is parsed), and I think it's better to know as a reader that foo.bar() is a function being called. foo.bar is unclear.

One small but significant feature Ruby provides is the ability to write more expressive function names with ? and !. A common pattern is to use foo? to indicate that the function will return a boolean. The alternative, without ?, is is_foo(). Likewise, ! usually (at least for standard libraries) indicates mutation.

Back to the point though, Python used simply in isolation is not terrible for a beginner. But inevitably outside modules will get used, and the misfeatures of Python will appear more frequently. And then a further eventuality is that this new programmer will now be a "python programmer", choosing Python for future bigger work because that's what they know. Why not start with a language that is also beginner friendly but also better for large, long-lived projects?


> You are just arguing that a simple language is good for beginners, and I agree with that.

No, I am not.

As you must surely know, "simple" is not a well-defined concept. Forth is a simpler language than Python. Yet I am certainly not saying it's as good a language for beginners.

I'm saying that languages can be treated as a user interface, and improved based on feedback with the target audience. Python - unlike most other languages - incorporated insights from developing the ABC language as a teaching language for non-programmers.

Your "We want a beginner language" description is too broad as it also describes ABC, without including discussion of why Python succeeded where ABC did not.

> Ruby is superior in the simple beginner case

If you are basing things on your intuition and personal experience, then my intuition and experience says otherwise.

That's why these arguments are rather pointless.

> One small but significant feature Ruby provides

Yes, I get it - you like Ruby. But shrug some people like Fords while others prefer a Chevy.

And still my point is that Python explicitly incorporates design elements based on experience from teaching non-programmers to program. If it's part of Ruby, it's much less of an emphasis.

That's nothing at all to do with a language feature, which was your focus earlier.


Anyone who is "infatuated" with ANY tool is likely ignorant of alternatives.

That said, I think python is a very convenient wrapper for lower language utilizing scipy libraries. I also think it is great for quick prototyping. (It's been a couple years, but I even prototyped PySide Qt stuff in ipython notebooks, now Jupyter notebooks, back in the day).

Now, simply the popularity (perhaps created by luck), led to a huge ecosystem of scipy tools for the science community. That in-and-of-itself snowballed into a reason to use it. Could that have happened to a different language? Yes. but that doesn't invalidate that it DID happen to Python.

Perhaps your day job forces you to use the Python hammer for a screw problem; I'm sure that problem frustrates a lot of us working people, but I don't think it's inferior for the specific uses I outlined.


I guess that's the problem. Python is just fine for many narrow tasks. But the people who use it, or the people who manage teams who use it, don't know how important it is to consider proper tooling when it comes time to turn the small thing into a big thing.

There's nothing wrong with using Python and its great libraries for some purposes. But none of those cases mean that Python is a good language. And frankly, the better the library is, the less Python language features you would use directly. The powerful libraries hide the failures of the parent language.

But as soon as you need to build something around your solution, unless you have another library which does the new thing you need (and encapsulates your solution neatly), then you will end up using real Python, on a larger scale. Then you will start disliking it.


Numpy is a great thing, and also a terrible thing (because it is built on Python).

Python is a terrible language compared to many others, not the least of which is Ruby (and I would include Java, Clojure, Elixir, and even C++ and probably C# and almost certainly F# if I knew them well).

To make matters worse, there's a common mentality amongst Pythonistas of being aggressively complacent. A typical response to a question about a missing capability is, "Why would you ever need that?"

I have a lengthy collection of "why Python sucks" notes. However, I would argue that anyone who disagrees that Python is bad (excepting the practical capabilities afforded by libraries like Numpy, which could just as well have been built for other languages) has simply not spent enough time with other, better languages.

Python is Blub, from the Paul Graham essay Beating the Averages. http://www.paulgraham.com/avg.html


Sorry, that discussion is so overrated and elitist. I'll never understand people complaining about a technology/language/tool that made it possible for dozens of people to join the once-restricted small club which was computer science, and which is now making it possible for a lot of people to get a job with an easy-to-learn and less verbose language. One can argue about speed/memory/better languages out there for data science (hello Julia), but that's just how the world out there works. Python has a great/dope community that I didn't find when I worked with C back in the day, for example.


> Python has a great/dope community

Python's community is one of the greatest and worst parts of the language. Python might not be the best language for anything, but it's the second best language for everything.

I've dabbled in many programming languages, a lot of which are backed by a single large company or consultancy:

- Java (Oracle, at least initially)

- TypeScript, C# (Microsoft)

- Golang, Dart (Google)

- Elixir (Nubank)

- Clojure (Cognitect)

- Kotlin (JetBrains)

- Objective-C, Swift (Apple)

- Julia (Julia Computing)

Then, there are languages like Python, Ruby, JavaScript, PHP, Rust, etc. that are more diversified and distributed in terms of core contributors, maintainers, and evangelists. Meanwhile, there's a long list of other lesser known languages that have failed to grow a community as large as Python's.

As I get older and older, I have realized that our programming languages are not static entities, but they change and evolve over time. They are influenced by users, companies, and researchers. I hope that one day, there's a humanities subject that studies programming languages just like we study natural languages like in linguistics. The closest thing right now seems to be history of science, but it's quite a broad field.


> Python's community is one of the greatest and worst parts of the language.

If you come from a Ruby background, and you are now doing Python, you may find yourself on Stackoverflow looking up how to do X idiomatically in Python (something you did all the time in Ruby). Or you may ask colleagues who know Python but do not know Ruby.

A common answer from the Pythonista community is, "Why would you ever want to do that?" (or Why would you need that?)

Ternary operators? Who needs those. Ok sure, here, but let's make it awkward. Where did Python get the model for "x = 1 if y else 2"? Ironically, unlike Ruby, where you can say "x = 1 if y", in Python you cannot do that. Nor can you say "x = 1 if y else return". That's not allowed.

Python has finally, as of 3.10, added a match statement (structural pattern matching), its answer to the switch statement that many other languages have had for ages. You should see the contortions people have gone through to approximate a switch statement in Python... https://stackoverflow.com/questions/60208/replacements-for-s...

The Python community may have some good, helpful, and friendly people; but most programming communities have that. That is not a plus for any language, because it should be a given.


I'm not a Pythonista by any means, but these complaints are banal. Ternary operators? Plenty of newer popular languages don't have a ternary operator (Go, Rust, Elixir, Kotlin, Nim).

My list of python gripes are long, but language constructs are not on the list.


The point is about inconsistency.

You can say "x = 1 if expr else 2", but you cannot say "x = 1 if expr". You have to write it on two lines:

    if expr:
        x = 1

You cannot say "x = 1 if expr else return".

[1] is a list with one item, 1.

{1} is a set with one item, 1.

(1) is a number, 1, with pointless parentheses. Perhaps you meant (1,), a tuple containing only one value.

Or how about mutable default values in function parameters? That makes a lot of sense...

I don't need to keep offering examples. You can search the internet and find many places where people list the warts, inconsistencies, and misfeatures of Python.
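A minimal sketch of two of the gotchas listed above (illustrative code, not from the comment):

```python
# Parentheses alone don't make a tuple; the trailing comma does.
assert type([1]) is list    # one-item list
assert type({1}) is set     # one-item set
assert type((1)) is int     # just a parenthesized number
assert type((1,)) is tuple  # one-item tuple

# The mutable-default footgun: the default list is created once,
# at function definition time, and shared across every call.
def append_item(item, bucket=[]):
    bucket.append(item)
    return bucket

# The conventional workaround: a None sentinel.
def append_item_safe(item, bucket=None):
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_item(1)` and then `append_item(2)` returns `[1, 2]` from the second call, because the previous call's state leaks through the shared default; the `None`-sentinel version starts fresh each time.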


> You can say "x = 1 if expr else 2", but you cannot say "x = 1 if expr". You do a two line

This one liner works:

    if expr: x = 1
> Or how about mutable default values in function paramters? That makes a lot of sense...

This can be helpful under certain circumstances (e.g. memoization) [1]. The linked article also points out that it can be useful for rebinding global names in optimized code.

[0] https://stackoverflow.com/a/1145781

[1] https://web.archive.org/web/20200221224620id_/http://effbot....
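The memoization idiom from [1] can be sketched like this (the Fibonacci example is my own illustration, not taken from the linked article):

```python
# Memoization via a mutable default: the cache dict is created once,
# at definition time, and persists across calls to the function.
def fib(n, _cache={}):
    if n not in _cache:
        _cache[n] = n if n < 2 else fib(n - 1) + fib(n - 2)
    return _cache[n]
```

Repeated calls reuse the same `_cache`, so the naive exponential recursion becomes linear. (Today most people would reach for `functools.lru_cache` instead, which makes the caching explicit.)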


> if expr: x = 1

But PEP8 and now the all-too-popular Black formatter disallow this. I think it's fine.

The memoization/optimization point is exactly the opposite of what should be a default use case in a beginner-friendly language. And I highly doubt this mutable-default design was intentional for those purposes. They are footguns.


These are pointless nitpicks. Every language has things like this. In 4 years of using Python in anger I’ve never written a line of code that could be improved by the features you list.


> These are pointless nitpicks. Every language has things like this. In 4 years of using Python in anger I’ve never written a line of code that could be improved by the features you list.

This is exactly the kind of response one often hears from the Python community. Translated it means, "I'm ok with it this way, so who cares if it could be better?"


Java is one of the few languages that actually has a specification and multiple, very different implementations. So I don't believe it is fair to call it out as being backed by a single company (even though in practice Oracle is indeed the one doing the most work on the platform, and they are doing spectacular work at that). Even if Oracle were to disappear tomorrow, Java has enough usage by many huge companies that the platform wouldn't even skip a beat.


Use any language you like to build a proof of concept.

But once you need to take that POC and build a real product which needs to be supported, extended, and maintained, then you must choose tools which fit that long-term need.

Many of us first experienced school in kindergarten. Kindergarten was an important step in our education and socialization. But we didn't refuse to move on to the next level, no matter how much fun it was or how nice our teachers were. Maybe some people felt unsure about leaving their comfort zone, but there's no progress without risk and effort.

Is a university student elitist because they know something a kindergarten kid doesn't? Should they stop trying to promote continued growth through the educational levels?

If you were to use production Python on any decent sized project, and you had also used Ruby (just one example... some other languages can be even more succinct), then you would hate Python.


The issue with a lot of the gripes with Python is that there's no other language that combines mature linear algebra libraries, scientific computing libraries, and industry standard computer vision libraries with easy to use basic tools like csv reading, dataframes, json parsing, etc., while also providing easy syntax and fast iteration time. I work in C++ and Python, and things that are completely trivial in Python have a lead time measured in days, if not weeks when moving to C++. This is before even getting into the weeds of writing bug-free C++ code; you need to be familiar with the nitty-gritty details of the library you're working with, because things you might take for granted may not work the way you think.

Sure, maybe we should be rewriting POC code in a language that's more fit-for-purpose if we're talking about a large project, but for that you need mature libraries, or a team with some fairly serious chops (i.e. decent math background, senior coders, expensive to hire, and hard to replace). There's a lot to be said for a language that makes it easy to be productive on technically demanding problems, while also making it easy to find new talent. As someone who's written code relying on linear algebra and CV in Rust and C#, I'd honestly pick Python every time for those types of jobs. It's not perfect, but that's not the benchmark.


I would not advocate for rewriting everything in non-Python. It is what it is, and it excels at tasks related to the important libraries discussed in this topic.

Just as Python wraps the libraries which do the heavy lifting in C, I would try to keep the Python wrapper as minimal as possible and stick some interface in front of that. Instead what happens often is people choose to either build more and more around this important stuff, staying with Python, or they say, "We're better off just supporting one language in-house, and Python is required already; so we are a Python shop."


It truly does depend on what you’re building, in which organization and with whom you’re doing it.

If you’re building a production system with a small, stable team, and Python is a good fit for the work, then there is no reason not to use it.

If you’re doing programming in the large with a large and constantly changing team, then you’d be wise to pick a statically typed language that many people know (basically C# or Java)

Your education example is off base. Python is not a more basic form of programming that one passes through on the way to enlightenment. It’s hand screwdrivers vs hammer drills or diesel pickups vs electric sedans. Different tools for different contexts.


It's a joy to work with, it's perfect for glue work, it reads like pseudo-code which makes it accessible for the non-programmers and children, it has a bustling community which means it's easy to find solutions should one encounter problems, it has a flourishing library, it's a great gateway language. For these reasons and more I posit that Python was perhaps one of the great milestones in the history of c̶o̶m̶p̶u̶t̶e̶r̶s̶ mankind, and Guido probably deserves a Turing award.


Paul Graham's essay is pretty bad; I really don't see any reason to cite it, as he himself is pretty much at the "wrong" end of the blubness spectrum. Lisps are not "God's language": modern PL research has plenty of areas not applicable to them in the general case (and no, just because you have macros and can in theory implement everything doesn't make it still the same language). It's almost like languages have different tradeoffs, and there is no one axis to place them on.


It's quite funny how you and some others here are saying Python got to its place by pure luck and by being installed by default on Linux systems.

Then you turn around and mention Ruby. I recall there was a lot of hype around it for quite a while, largely due to Ruby on Rails (certainly out-hyping Python), and its popularity in terms of usage was very similar to Python's. However, that did not last, so why is that? Are you saying Python "hyping" and default installation only happened after ~2011 (when Ruby really started to decline relative to Python)?

I would argue the success of Python is largely due to people building packages like numpy and scipy in Python because they thought it was the best tool for the job. Then came the ML wave, and Python, thanks to that ecosystem, was ideally placed as a language that was easy to use for ML researchers. That you even mention C++ in that comparison shows that you do not understand why people use Python. I can tell you, if I had to get my graduate students to use C++ for their analysis and lab-automation tasks, we'd still be graphing by hand.


Luck no, path-dependency yes.

Numpy is fantastic and so is scipy, but there's no inherent reason why they couldn't have been "numruby" or "numlua".

With that said, I agree with you that your parent's thesis isn't really operative, Python and Ruby are almost exactly the same technology, therefore the decision on which one to use depends on the available ecosystem. Everything else is effectively rounding error.


>excepting the practical capabilities afforded by libraries like Numpy which could just as well have been build for other languages

So it's bad, except for all of the practical things you can do with it. And other languages could have had something like Numpy ... but didn't, for some reason. Not really seeing why this makes Python bad.



