You may know this, but since you always mention Nim & Julia together, it might confuse passers-by: Nim does not, in fact, need LLVM (though there is a hobby side project targeting it). Mainline Nim compiles directly to C (or C++ or JavaScript), and people even use it on embedded systems.
What seems to attract scientists is the REPL and/or notebook UI style/focus of Matlab/Mathematica/Python/Julia/R/... As projects migrate from exploratory to production, optimizing for interactivity becomes a burden -- whether that is Julia's time-to-first-plot, or dynamic typing causing performance and stability/correctness problems in Python code, or even just the need for more careful unit tests. They are just very different mindsets - "show me an answer pronto" vs. "more care".
"Gradually typed" systems like Cython or Common Lisp's `declare` can sometimes ease the transition, but often it's a lot of work to move code from everything-is-a-generic-object to articulated types, and often exploratory code written by scientists is...really rough proof of concept stuff.
The time to first plot in Julia is drastically lower now. And even before, it was a cost you paid only once per session, thanks to the JIT.
Julia is the first language I find truly pleasant to use in this domain. I am more than happy to pay a small initial JIT overhead in exchange for code that looks like Ruby but runs 1/2 the speed of decent C++.
Plus, lots of libraries are really high quality and composable. Python has exceptionally good libraries, but they tend to be big monoliths. This makes me feel Julia or something like Julia will win in the long run.
Sorry I meant 1/2 the speed or 2x the time, edited :)
Consider that a BLAS written in pure Julia has very decent performance. If you are into numerical computing, you will quickly understand how crazy that is.
Carefully written Julia tends to be surprisingly fast. Excessive allocation tends to be a bigger performance problem than raw compute speed -- though of course excessive allocations eventually drag down speed as well. There are some idiomatic ways to avoid this.
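The Julia idioms alluded to include in-place functions like `mul!`, views, and preallocated output arrays. As a toy sketch of the underlying principle in Python (the example functions and data here are invented, not from any library): reuse one caller-provided buffer instead of allocating a fresh container on every call.

```python
# "Excessive allocations" pattern: a new list is built on every call.
def chunk_sums_alloc(chunks):
    return [sum(c) for c in chunks]

# Preallocation pattern: the caller supplies a buffer once and the
# function writes into it, so repeated calls allocate nothing new.
def chunk_sums_inplace(chunks, out):
    for i, c in enumerate(chunks):
        out[i] = sum(c)
    return out

chunks = [[1, 2], [3, 4], [5, 6]]
buf = [0] * len(chunks)  # allocated once, reused across calls
print(chunk_sums_alloc(chunks))         # [3, 7, 11]
print(chunk_sums_inplace(chunks, buf))  # [3, 7, 11]
```

In a garbage-collected runtime like Julia's, the in-place style matters most inside hot loops, where per-iteration allocation pressure triggers GC pauses.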
Having taught a number of scientists, both pre- and post-grad, I agree with your take on notebooks/REPLs. Data scientists are not generalist programmers; in some cases they are hardly more advanced than plain end-users of operating systems. They shy away from the terminal, and they have fuzzy mental models of how the machine operates.
Being a generalist programmer who sometimes deploys the work that data scientists craft, I'd really like an environment for this that can compile to a static binary.
Having to compile a whole machine with all the right versions of shared libraries is a terrible experience.
That's a good point about Nim. Nim has a nice set of compilation targets, which I tend to forget.
You might be right about the REPL aspect of things. On the other hand, R took off with a pretty minimal REPL, and my first memories of Python didn't involve a REPL. I think a REPL becomes less relevant as run times get longer, and it seems like most languages with significant numerical use eventually get a REPL/notebook-style environment even if one wasn't there initially.
R had a REPL from day one (or at least near it) because the S it was copying did. You could save your "workspace" or "session" and so on. That it was spartan compared to Jupyter -- or that Jupyter might be spartan compared to MathWorks' GUI for Matlab -- doesn't alter the "waiting/Attention Deficit Disorder (ADD)" aspects.
When you are being exploratory, even waiting half a second to a few seconds for a build is enough time for many brains to forget aspects of, or drift from, why they pressed ENTER. When you are being careful, it is an acceptable cost for longer-term correctness/stability/performance/readability by others. It's the transition from "write once, never think about it again" to "write for posterity, including maybe just oneself" -- between "one-liners" and "formatted code". There are many ways to express it, but it underwrites most of the important "contextual optimizations" for users of all these software ecosystems - not just "speed/memory" optimization, but what they have to type/enter/do. It's only technical debt if you keep using it, and often you don't know if/when that might happen. Otherwise it's more like "free money".
These mental modes are different enough that linked articles elsewhere here talk about Type A vs. Type B data science. The very same person can be in either mode and context switch, but as with anything, some people are better at, or prefer, one mode over the other. The population at large is bimodal enough (pun intended) that "hiring" often has no role for someone who can both do high-level/science-y stuff and their own low-level support code. I once mentioned this to Travis Oliphant at a lunch and his response was "Yeah..Two different skill sets". It's just rare to find a person in the valley between the two modes (or with coverage of both, or able to switch "more easily", or at all). This is only one of many such valleys, but it's the relevant one for this thread. People in general gravitate toward the modes and their exemplars, and that accounts for a big portion of "oversimplification in the wild".
This separation is new-ish. At the dawn of computing, from the 1950s through the '70s when FORTRAN ruled, to do scientific programming you had to learn to context switch or just live in the low-level work mode. Then computers got a million times faster, and it became easier to have specialized roles, exploit more talent, and build up ecosystems around that specialization.
FWIW, there was no single cause for Python adoption. I watched it languish through all of the 90s, largely viewed as too risky/illegitimate. Then in the early noughties a bunch of things happened all at once: Google blessing it right as Google itself took off; numpy/f2py/Pyrex/Cython (uniting rather than dividing, unlike the soon-after py2/py3 split); a critical mass of libs - not only scipy, but Mercurial etc.; latter-day deep learning toolkits like TensorFlow/PyTorch and the surrounding neural-net hype; and, compared to Matlab etc., generally low cost and simplicity of integration (command, string, file, network, etc. handling as well as graphics output) - right up until dependency graphs "got hard" (which they are now), driving Docker as a near necessity. These all kind of fed off each other in spite of many deep problems/shortcuts in CPython's design that will cause trouble forever. So, today Python is a mess and getting worse, which is why libs will stay monoliths, as the easiest human way to fight the chaos energy.
Nim is not perfect, either. For a practicing scientist, there is probably not yet enough "this is already done for me, with usage on StackOverflow as a one-liner", but the science ecosystem is growing [1], and you can call in/out of Python/R. I mean, research statisticians will still tell you that you need R, since there is not enough in even Python... All software sucks. Some does suck less, though. I think Nim sucks less, but you should form your own opinions. [2]