Unless you are doing machine learning or using numpy, I do not recommend anyone use Python for anything performance sensitive. The problem is not just the GIL. Because multithreading is not common in Python, it's really hard to know whether an external library is thread-safe. Python also supports async, but a lot of libraries have no asyncio compatibility, so you end up mixing threads with asyncio, which leads to a big mess.
> I do not recommend anyone use python for anything performance sensitive
My default philosophy is to use python _until_ you find something that is performance sensitive, and then make a C/C++ extension for the slow bits. Pybind works great for a hybrid Python/C++ codebase (https://pybind11.readthedocs.io/en/stable/).
Then you can develop and prototype much quicker w/ Python but re-write the slow parts in C++.
Definitely more of a judgement call when threading and function call overhead enter the equation, but I've found this hybrid "99% of the time Python, 1% C++ when needed" setup works great. And it's typically easier for me to eventually port mature code to C++/Go/etc once I've fleshed it out in Python and hit all the design snags.
If you've never used Pybind before these pybind tests[1] and this repo[2] have good examples you can crib to get started (in addition to the docs). Once you handle passing/returning/creating the main data types (list, tuple, dict, set, numpy array) the first time, then it's mostly smooth sailing.
Pybind offers a lot of functionality, but the core "good parts" I've found useful are (a) use a numpy array in Python and pass it to a C++ method to work on, (b) pass your Python data structure to pybind and then do work on it in C++ (some copy overhead), and (c) make a class/struct in C++ and expose it to Python (no copying overhead, and you can create nice cache-aware structs, etc.).
You're still kinda stuck with the concurrency of the Python code itself, though. It sure would be nice to be able to just throw cores at problems for a while.
Sure, and if you can't get stuff in and out of Python objects with concurrency, it doesn't help you much a lot of the time. Plus, again, computing is cheap: it'd be nice to use all my cores before I spend a lot of effort optimizing and rewriting things in native code.
If your data is fragmented across a bunch of small containers/classes, passing it around will be expensive whichever method you use (either passing to C++, or just in terms of cache efficiency).
If you just pass an array of data back and forth it's cheap.
> If you just pass an array of data back and forth it's cheap.
Yes, and numpy is great, and all. Python works great as glue to marshal things to and from native code and do inexpensive (but possibly complicated) bits of control logic.
But if I'm trying to deal with large numbers of client requests, say... the lack of concurrency in python itself really hurts. Sure, I can punt almost everything to native code, but what's the point in having Python at all, then?
Not all problems have state that can be shared well across multiprocessing or completely externalized to large lumps that travel to native code in a few calls -- I'd actually say these are special-case exceptions rather than the rule.
> Note that having a solution setup where the end result is "a ton of small, individual API calls" could possibly indicate a bad system architecture.
Or just a lot of clients with a fair bit of shared state which is best kept resident, which is a pretty common use case.
It's a bummer to write Python code that works well, and then maxes out at 130% CPU load when you grow your usage... and to have no obvious path to scale upwards despite having 32 hardware threads available. Then you can rewrite some of the more expensive things in native code to squeeze out a little more performance, or add indirection to store the data somewhere else so multiprocessing works.
Other languages that have more finely grained locks scale 3-4x higher with minimal thought, and much, much higher with a bit of thought about how to handle locking and data model.
> At that point you'd look to Go or another language
Well, yah... this is us complaining about Python's concurrency problems.
I think the question is how much more cost does it take to move the code from python to C/C++/Rust/whatever? That's a human problem until ChatGPT can solve that problem for you.
And if you are using for example Numpy, you aren’t using Python for anything performance sensitive of course, because Numpy is almost certainly calling the system’s tuned BLAS implementation. Which should handle the parallelism I guess. If anything I’d expect parallel Python calls to Numpy to result in oversubscription…
I don't think numpy functions are necessarily multithreaded, and many are probably inherently sequential by nature, so there are definitely cases where multiprocessing can speed up the overall program.
Someone once said that python + numpy is probably going to be faster than writing it using basic C++, since numpy is using highly tuned libraries underneath.
I don't know for certain this is the case, but I'd like to see some benchmarks about it.
You would almost never use raw C++ when working with linear algebra stuff. You use a library like Eigen that interfaces with BLAS, LAPACK, etc., so you definitely get all the advantages of those highly tuned libraries, plus the speed of C++ and potential flexibility of not having to make multiple array copies and so on.
They aren’t necessarily threaded, but if you care about Numpy performance on an Intel chip at least you are already using MKL for Numpy’s BLAS, and MKL’s gemm is threaded.
Multiplying a large enough matrix in Python using MKL for Numpy, I can watch the cpu usage go to 400% in top. You may need to run it in a loop or make the matrices quite large, a surprisingly large amount of computation has to happen before it’ll show up in top.
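A minimal sketch of the experiment described above, assuming NumPy is linked against a threaded BLAS (MKL or OpenBLAS); the matrix size and the `OMP_NUM_THREADS` cap are illustrative values, not anything from the original comment, and the env var only takes effect if set before NumPy is imported:

```python
# Sketch: trigger BLAS threading with a large matmul. Thread count can be
# capped via OMP_NUM_THREADS / MKL_NUM_THREADS, set BEFORE importing numpy.
import os
os.environ.setdefault("OMP_NUM_THREADS", "4")  # illustrative cap

import time
import numpy as np

n = 1500  # large enough that the gemm dominates interpreter overhead
a = np.random.rand(n, n)
b = np.random.rand(n, n)

start = time.perf_counter()
c = a @ b  # dispatched to the BLAS gemm; may fan out across cores
elapsed = time.perf_counter() - start

print(f"{n}x{n} matmul took {elapsed:.3f}s")
```

Watching `top` while this runs in a loop is the easiest way to confirm whether your NumPy build actually threads the multiply.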
I was working on a backup system, the usual. Walk a directory tree, track new/changed files, then queue them for sha256 and encryption, then upload to a server with gRPC.
I hit the GIL and switched to multiprocessing, which helped. It was still about half as fast as I expected. I switched to Go, used channels, and got the performance I expected. I was still debating until I got deeper into Python's crypto package. I ended up really happy with Go.
Yes. I've used multiple threads in Python. It doesn't work well. Some packages, including cPickle, don't work right with multiple threads because they have static variables internally. It can work; I've had multi-threaded Python code running for years. But it's not a good approach for new work.
Python does things the CPython naive interpreter can do easily, such as letting anything modify anything else. Any code anywhere can go find something far away in another thread in another module and mess with it.
Everything is a dict, so that works. This makes other things hard. Pre-compilation is hard. Optimizing is hard. JIT is hard. Threading is hard. You can't nail down stuff that probably won't change, but might.
Possible. I had a fair amount of CPU work going on, trying to keep 8 cores busy with SHA256(plaintext), encrypt, and SHA256(encrypted_blob). I was just trying to be straightforward: keep files queued so that when a core went idle it could grab another.
The Go version was similarly very straightforward: walktree -> CheckIfNewOrchanged -> channel -> sha256/encrypt/sha256. Channels made it really easy/clear and performed quite well. I was getting near-linear scaling: CPU time consumed was 8x the wall clock, and speed increased 7.9x or so. With Python I was getting significantly less performance per core and worse scaling.
With 8 cores, 10 Gbit/sec is 1.25 GByte/sec, or about 160 MB/sec per core, which is ok, not great; it depends on how much computation you are doing. My goal is keeping 100 Gbit saturated, but I am adding cores as well.
I do hope to compare Go vs Rust as well.
Channels are just threadsafe queues with language support and an N:M threading model.
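To make that comparison concrete, here is a rough Python analogue of a channel using `queue.Queue` and a small thread pool; the sentinel-`None` shutdown convention is just one common idiom, not the only way to close the "channel":

```python
# Sketch: queue.Queue + threads as a poor man's channel.
import queue
import threading

tasks: queue.Queue = queue.Queue()
results: queue.Queue = queue.Queue()

def worker() -> None:
    while True:
        item = tasks.get()
        if item is None:  # sentinel: the "channel" is closed
            break
        results.put(item * item)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one sentinel per worker
for t in threads:
    t.join()

total = sum(results.get() for _ in range(10))
print(total)
```

The missing piece relative to Go is the N:M scheduler: these are OS threads serialized by the GIL for pure-Python work, so this pattern only pays off when the work releases the GIL (I/O, hashing, numpy).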
Don't get me wrong; I agree it's easier to build performant applications in go, and to get the performance I want, I have to set my AWS boto3 S3 settings to have massive queues.
Leaving aside for a moment the unexpected threaded-fork issue mentioned, I'm thinking that putting all the data to be shared among the processes into Redis would at least be a lot better than the clumsiness of pickling and unpickling.
I found it curious the author mentioned using multiprocessing with pickle but not Pipe(). Pickling streams entire objects, while Pipe() can be used to send data between processes directly. Maybe the latter is faster, especially if the data are short strings and the like?
I think they mentioned pickling because that's what the multiprocessing queue uses by default.
I think Pipe isn't necessarily a drop-in replacement, depending on the complexity of the object you want to share, but I have found it significantly faster for simple things.
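A quick sketch of the distinction being made here: `Pipe()` connections have both a pickling path (`send`/`recv`, for arbitrary objects) and a raw-bytes path (`send_bytes`/`recv_bytes`) that skips serialization entirely. Both ends are used in one process below purely for illustration:

```python
# Sketch: multiprocessing.Pipe's two transfer modes.
from multiprocessing import Pipe

parent, child = Pipe()  # duplex connection pair

# Object path: the dict is pickled on send and unpickled on recv.
parent.send({"id": 1, "msg": "hello"})
obj = child.recv()

# Raw path: bytes go through as-is, no pickle overhead.
parent.send_bytes(b"hello")
raw = child.recv_bytes()

print(obj, raw)
```

For short strings or pre-encoded payloads the raw path is where the speedup over a `multiprocessing.Queue` comes from; for complex objects you end up pickling either way.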
Yup, if you need to serialize a complex object manually before piping it, then you end up paying the price again (like with the multiprocessing Queue).
I'm guessing that's part of the reason the article didn't mention it (it looks like they're talking about a Pandas DataFrame which I would say is non-trivial--compared to a primitive type)
I'd think Pipe + Parquet should beat filesystem though. That really depends on storage I guess
IIRC I played a bit with msgpack and orjson to see if there was anything to gain over pickle, but I don't think it made much difference. You'd probably need to deal with structs.
Looking at the CPython source (3.10): on Windows you always get a named pipe. On other platforms you get an OS pipe when duplex=False, otherwise a socket (socket.socketpair).
This has the advantage of allowing work to be done inside a nested function, letting large initial datasets be shared rather than passed through pickle.
See the article for an explanation of why this is a bad idea. Note that on macOS, Python disabled fork()-based multiprocessing as the default long ago (Python 3.8?) because it's so broken, and it will stop being the default for multiprocessing on Linux in 3.14 (deprecated since 3.12).
Yes, and that's a good thing. Let numpy do its threading unless you know exactly why you don't want it to parallelize your matrix multiplications and what have you. Numpy releases the GIL, so that works exactly as it should.
Threads aren't bad: threads that cause resource contention with the GIL are (possibly) bad. That is almost always done explicitly by the developer and almost never without your knowledge.
Unfortunately NumPy's thread pool is only for BLAS, the underlying library for numpy.linalg functions, mostly. Other operations are single-threaded. So you need your own thread pool (or process pool) if you want to parallelize anything else.
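Since NumPy releases the GIL inside most large array operations, that "own thread pool" can be an ordinary `ThreadPoolExecutor` rather than a process pool. A sketch, with illustrative sizes; note the caveat that tiny per-call workloads still serialize on interpreter overhead:

```python
# Sketch: thread pool over non-BLAS numpy work (element-wise ops run in C
# with the GIL released, so threads can overlap).
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def row_work(row: np.ndarray) -> float:
    # Stand-in per-row computation; large enough chunks are what make
    # threading pay off here.
    return float(np.sqrt(row).sum())

data = np.arange(1_000_000, dtype=np.float64).reshape(100, 10_000)

with ThreadPoolExecutor(max_workers=4) as pool:
    sums = list(pool.map(row_work, data))

print(len(sums))
```

Unlike a process pool, nothing is pickled: each thread reads the shared array in place.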
Yes, I know. But your statement makes it sound like numpy using threads is somehow bad or undesired (it can be; but if it is, you'll know how to tell numpy not to thread.)
Replaced print() with sys.stderr.write() to make it compatible with Python 2.
The program still deadlocks on Python 3 but works perfectly on Python 2. Does anyone know what changed between implementations that could be triggering this issue?
I don't believe OP's deadlock is related to the rare event of a dead thread holding a lock. The problem is that any calls to multiprocessing objects, if not guarded by if __name__ == '__main__', will result in an infinite-loop / fork-bomb kind of situation.
For some reason, on OP's computer that probably makes the program appear to hang, but really it will just crash after a while, once it exhausts some resource.
Well, it's hard for me to tell from this stack trace what's happening here. I don't recognize PyObject_VectorcallMethod. Is this a new thing? Last time I wrote any C code for Python I don't recall seeing / using this.
Anyway, another common pitfall in multiprocessing is attempting to serialize multithreading/multiprocessing primitives such as locks, variables, or mutexes. My memory may fail me, but I think it may result in deadlock too. I think the multiprocessing code tries to guard against it, but there are some weird rules for when it's OK for serialized objects to contain those primitives (initialization in __init__ is fine, but not so much otherwise, or something like that), and the check isn't very good / just a heuristic... But really, I don't remember this part well.
2. Part of the multiprocessing code (perhaps not present in Python 2) also grabs this lock.
3. If you fork at the right moment (which is quite likely with the loop) the lock is held by a thread that is now dead, and so now you're waiting for a lock to release that will never be released.
It's interesting that you decided to repeat this diatribe for the millionth time expecting... what effect exactly?
Stroustrup licked every boot and inserted himself into every possible committee to promote his language until the network effect picked up. And later he post-rationalized about his language's popularity, attributing it to qualities the language never had.
And you proceeded to conclude, based on the diatribe you repeated, that nobody should want good things, because everyone has to be satisfied with popular things, because you chose to eat garbage and will be too jealous to see others getting better treats?
----
Yes. Python is garbage, and programmers should be actively discouraged from using it. Python is today in the hands of people both incapable and unwilling to improve the language in the aspects that matter, and that's why programmers need to try to defeat the network effect it created instead of encouraging more complacency. It is pretty much in the same situation as C++, so, unintentionally, you sort of guessed the direction. Both languages started small and rode the popularity wave without actually filling the gaps in the original design with worthwhile content. Kind of like a TV show that keeps capitalizing on its pilot while never creating a more engaging experience.
Exactly. That's why OP suggested Erlang. I switched from Python to Erlang and never had issues with GIL or scaling or operability. Elixir shares the same BEAM VM so that's another option as well.
I hate this quote. People still use things like floppy drives and fax machines in certain applications not because those devices are better but because of institutional inertia and bureaucracy. And the stuff we use isn't always the best thing out there, better products can still lose. Just look at Oracle.
> And the stuff we use isn't always the best thing out there, better products can still lose.
Better products will lose most of the time, unless they're significantly better as a product, to the point where they can hurt and displace the incumbents. It's what people found out with Plan 9 vs UNIX.
I think you wanted to write "niche programming language". I don't know what "niche programming" is. So, I'll answer the question as if it was about the language.
1. In my experience (I have both had to learn a language for a job and had to teach a language to newcomers), for an experienced programmer, learning a new language well enough to be productive takes a couple of months. In the kind of projects I work with, it's typical that a programmer won't be productive for several months anyway, due to having to learn the project's structure.
2. I would prefer to work with people willing to learn something new, or those who already expanded their horizons enough to have experience with better-than-average language. It's a natural filter against people I don't like to work with.
3. Infrastructure created around languages is a double-edged sword. On one hand you get free stuff in the form of community-provided libraries; on the other hand, the quality varies a lot and you often have to make uneasy compromises, held hostage by third-party bad programming practices. In particular, when it comes to Python, since I've often been responsible for auditing dependencies used in our projects, without a trace of remorse I can confidently tell you: all Python packages are of poor quality. You are forced to choose the best of the worst, and it hurts a lot to take on another dependency. Erlang has less infrastructure, and in many cases you'll be the master of your own libraries. It takes longer to build, but it has the same effect as living in your own house vs renting.
4. I don't like working in huge programming shops where it's important to take into account the dynamics of the job market. You may hope to have a company with five or ten good programmers. There's no hope of having a company with a thousand good programmers. In the latter scenario, you want to rely on simplified processes and practices to produce something of quality. In the former case you have a chance of just being good. It's similar to SOF army units. You cannot have the entire army be SOF, but SOF will have very different arms, tactics, etc.
----
Why not Go?
I'm not necessarily opposed to it. But my point was to show something that has existed for a long time, and Go isn't a good example of that. Conceptually, Erlang is in some ways closer to Python (being based on a VM rather than compiling to different targets depending on platform). Today, Python drifts more and more towards becoming Java, so it also becomes more similar to Go, but in its origins it favored a short, interactive development cycle, which is also how Erlang is.
So, I was just looking for an example where a programmer would be able to keep similar workflow, but get a net benefit of using a different environment.
> Erlang has less infrastructure, and in many cases you'll be the master of your own libraries
Which, funny enough, are also going to be of "poor quality" when judged by external observer.
At least with Python packages their pitfalls are known and documented. But I get it, NIH can be very enjoyable.
> In the kind of projects I work with, it's typical that a programmer won't be productive for several months anyways due to having to learn about the project's structure.
I can't help but notice that you're shifting the blame to the type of project to protect your precious tech stack.
I actually read the article. And it's the thousand-and-first such article about the same problem. And the "solution" the article suggests is a non-solution. It's a ridiculous crutch. But people who are used to Python are accustomed to living with crutches and patches where none are really necessary.
Python's multiprocessing doesn't hold a candle to Erlang's processes. It's a subprocess.Popen('python') with extra steps and lots of pitfalls. Erlang went through several iterations of process schedulers, several iterations of interfacing with foreign code from Erlang processes... Python developers don't even know yet they will have to solve these problems once multiprocessing becomes more mature (but it never will, so, who cares?)
Yes, who cares. You apparently. But mostly from an identity politics perspective stomping on Python developers.
Yet you know nothing about the particular context in which people would use Python multiprocessing. And I bet that, although not ideal from a purity standpoint, it is likely to be "good enough" for a lot of use-cases.
You seem so bitter and superior about those Python developers, yet here you are, complaining like a bitter old person and adding nothing of value to the discussion. I really feel sorry for you.
Python, notwithstanding certain limitations, will remain a mainstay language beyond 2030. It is a practical option for writing glue code and scripts, and almost all of data engineering and related fields depend on Python.
Java is still a mainstay language, despite its popularity slowly dropping for more than a decade.
Python will be around for many years; however, now that we have hit the end of Moore's law, the easiest way to speed up code is to multi-thread.
I code exclusively in Python, so I can't judge if languages like Julia are serious contenders. I also agree that building up the same ecosystem as in Python will take some time, and therefore I do not predict a rapid decline. However, many of the things I use Python for can be easily done in another scripting language.
if other languages don't have equivalents for the devex/productivity enhancements of:
* Django
* FastAPI/Flask
* Numpy/Pandas
* PySpark
* Jupyter Notebooks (gross I know but this is what "data analysts" and "ML/data engineers" use at many places)
Then Python will stick around for forever.
I would love it if the Go community would quit the "you don't need a framework, DUH, it's GO" attitude. Make a Django/Rails for Go and there would be 10x the Go jobs.
Ironically, you mention PySpark. It is a Python wrapper around a Java/Scala code base. Jupyter Notebook supports many different scripting languages, such as Julia and R.
I agree that Numpy/Pandas will introduce more migration friction, and that is why I mentioned a slow death, similar to the trajectory Java is currently on. Java is still in the top three. However, its popularity has dropped in recent years. It is worth noting that both Cobol and Fortran are still in the top 30 programming languages.
it is funny that PySpark is just a Java/Scala wrapper, but the combo of PySpark + NumPy + Pandas + Jupyter notebooks means that Python is option 1, 2, and 3 for any company that's just like "hmm, let's start doing some data analysis/ML/whatever"!
Also R is what's most often taught in school in my experience and boy does Python feel like a breath of fresh air when you've been trained in R. When you're coming out of college trained in data analysis but not software engineering per se, you've got no idea about the larger world of what other languages could offer.
I've gotten into the "debate." People always suggest using anything but python but then provide no alternatives to the frameworks mentioned above. You _could_ certainly write it in Go. I write a lot of stuff in Go and I love it. I just don't want to recreate Django for the 100th time though, so I don't and I just use Python.
By all means, if someone wants to write these frameworks, people will use them.
agreed. I write Go, Rust, and Python (now only when I have to), but if I had to stand up a small business with a CRUD app in a couple of weeks I'd go with Django, no doubt - mainly for the no-worries auth and user management within the same monolith as everything else. I don't care about the ORM, I can quickly map out a relational schema. It's the user stuff that's killer.
With more time, yeah, I might choose Go or Rust and set up a couple of different nodes: an API gateway, a user-auth node like KeyCloak, Ory, or SuperTokens, then a Go backend.
But it would be so nice to have that all ready in one.
I do enjoy building things but I learn more and more how important it is not to focus on things that are already "solved" if they aren't part of your core business.
What's with the scare quotes on "data analysts"? Just because you read on HN about how some such tool is bad, doesn't actually make it bad. Hating on notebooks is like hating on Excel: it's trendy in some circles, but generally only espoused by the ignorant.
I've held jobs where we just productionalized data analyst code from notebooks into data pipelines.
There's absolutely nothing wrong with notebooks, but they are a serious PITA to take from being basically the musings of a data intern to a production pipeline.
I used scare quotes because unfortunately in many places there are essentially zero qualifications to start running around doing those jobs.
All of this exists in, basically, any mainstream language in some capacity. A lot of them offer superior alternatives.
There's nothing about Python's quality that's worth keeping. The reason for Python's popularity is its popularity. The reason why it won't die is inertia.
But if, say, someone creates a "killer app" that runs on hardware different enough from anything we have today, and that someone hates Python (smartphones are the most recent example of such a shift), then there'd be a chance to dislodge it. But I struggle to see how Python would "organically" die.
This is just false (spoken from the perspective of an infra/systems/automation person who has been in the field for many years). But this is very typical of religious people: to claim anything that suits their agenda without even trying to verify their claims, not even bothered by whether the claim makes sense.
First of all, there's no such thing as the "development speed of a language," just like there isn't a development speed of ice cream. It's a kind of Jabberwocky: it feels like English, but it doesn't really mean anything.
Development speed differs by the kind of project (e.g. web site vs filesystem), quality requirements, size of the team working on the project, expertise of the team working on the project... Needless to say, there are areas where Python is entirely inapplicable, so development speed wouldn't even be a factor. But even in areas where it's commonly used, there are often languages that will compete on this metric, and there are certainly teams using other languages that will beat teams using Python.
But overall, Python projects tend to be easy to start and hard to develop further and refine. Python projects tend to fare worse in large teams. Python also doesn't attract high-caliber programmers, while it is often the first language a programmer learns, so it tends to be populated by mediocre-to-bad programmers (a similar problem used to exist in Java before it was replaced by Python in intro CS courses).
Finally, a huge portion of development speed rests on the company's infrastructure: how quickly and reliably developers can test their code plays a tremendous role in productivity. Ironically, Python tooling is so bad that sometimes it's faster to compile a C++ program of equivalent size than to install a bunch of Python packages implementing the same thing (god forbid you are using Anaconda, which in the worst case can take hours or days to install a handful of packages).