This seems very much like an outsider's critique. It's super interesting, but if the goal is actually improving R, I'm almost sure a better approach would be to lend a hand with development, fix some bugs, gain respect, and push for incremental - or even less-than-incremental - changes. The R Core team is definitely trying to encourage new developers to join in, so the opportunity is there.
> This seems very much like an outsider's critique
Spot on! This same sentiment was expressed very well in a paper evaluating the R language [1]:
"This rather unlikely linguistic cocktail would probably never have been prepared by computer scientists, yet the language has become surprisingly popular."
It would be nice to see a post about R that analyses _why_ it's so hugely popular with data scientists. It's easy to write "R doesn't do what we computer scientists think languages should do, so it's no good". It's harder to analyse what R gets right (for its domain) that other languages get wrong. Personally, I think it's not just that R has the best data-handling libraries (ggplot2, plyr, data.table), it's its "unlikely linguistic cocktail" that is perfectly suited for data exploration.
I think that maybe we hear the views of software engineers who get handed a messy R script and are asked to make it run in production, or make it run on big datasets, and so they only ever see the downsides of R. R wasn't designed to make life easy for production! It's designed to make it easy to explore datasets, which often means one-off code, 99% of which you run and then delete because your hypothesis about the data was wrong.
I wholly disagree with this sentiment. I have seen, time and again, awful language design choices. Granted, they might have been the best option at the time, or maybe no one could think of a better method.
But things change with time. And quite often, the response to suggestions ends up being "it's too late to change it", "it's good enough", or any number of comments to minimize the obvious negatives. Too often people get stuck at "local maxima", thinking their way is best, until years later when time has proven them wrong.
I didn't say the guy was wrong. I said he wasn't going to change anything by standing outside and complaining. I think your argument supports that idea. Maybe in an ideal world people would be very open to drive-by critique. In reality, not so.
From experience, I find the better option is to just move on.
You can only shout at a wall for so long before you realize that you're wasting your time. So you make your concerns heard (which this guy did), and you move on. If people want to take the advice to heart, great, but it's not likely. Plenty of other languages to use, and plenty of other software to use.
I genuinely really like it - strict left-to-right or strict right-to-left evaluation is so much more predictable, at least for arithmetic-style expressions (I might still prefer specific precedences for =/==/etc.).
However, at this point something PEMDAS-like seems to be substantially easier to understand for most people, since it's AFAICT the common rule taught in (high school) mathematics these days.
I've just started reading about APL, so maybe I'm wrong, but I think there is one caveat: you need to know which operators are monadic and which are dyadic, because whether op is monadic or dyadic changes how the expression groups.
As someone who occasionally writes parsers for real languages, and as someone who was really into R in university, I am happy that I stepped back on this one. ;-)
R's syntax belongs in the same category as Ruby and JavaScript:
Too much freedom of expression makes the meaning of a program highly dependent on its execution. It is hard to say concise things about a program without running it.
It is the murky side of (untyped) Lisp, if you ask me.
Freedom of expression helps you think, though. Lisp helps the thoughts along; other languages obstruct them (that certainly includes Python).
For a scientific language like R this quality is important.
Perhaps the ideal data science language would be a Lisp/R with excellent embeddability, like Lua's, for the scientific parts. People could then choose their favorite language for shoveling data around.
Because with original thinking there is more than one way to do it. Pythonic conformity might make long-term maintenance easier, but its rigidity exacts a cost on expressing new thoughts. R is basically the epitome of Greenspun's 10th rule: it's really their implementation of a Common Lisp that looks like C, with metaprogramming and conditions and restarts and all. They tried standardizing on Lisp first (XLisp-Stat), but S from Bell Labs was too popular. In short, today R is a lisp with access to modern numeric libraries.
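The conditions-and-restarts point can be sketched in a few lines of base R (the function names here are my own invention, not from any package): a low-level routine offers a restart, and a handler further up the stack selects it without unwinding, very much in the Common Lisp style.

```r
# Low-level parser offers a "use_default" restart when it can't parse.
parse_num <- function(x) {
  withRestarts({
    v <- suppressWarnings(as.numeric(x))
    if (is.na(v)) stop("not a number: ", x)
    v
  },
  use_default = function() 0)
}

# Higher-level caller decides the policy: on error, pick the restart.
# withCallingHandlers (not tryCatch) is needed, because the handler
# runs before the stack unwinds, while the restart is still available.
with_default <- function(x) {
  withCallingHandlers(
    parse_num(x),
    error = function(e) invokeRestart("use_default")
  )
}

with_default("3.14")  # 3.14
with_default("oops")  # 0 -- the handler chose the restart
```

The separation is the point: the code that detects the problem and the code that decides what to do about it live at different stack levels, which ordinary try/catch cannot express.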
I'm constantly discovering oddities about the R language. Since I use it interactively, it's extremely rare that such oddities cause any problems. Here's an example I found yesterday (lines 1, 2, 3, and 4 make sense, 5 is interesting, and 6 is perplexing!):
That's history coming back to bite it though. There was a period where the symbols True and False existed in Python but the bool class did not. True was _literally_ 1, and False was 0. Because of this, for backward compat, bool is a subclass of int.
As someone who actually likes R I think this makes sense. Line 5 is checking for equality, whereas in R as.* functions actually convert types or structures. The documentation on as.logical is also pretty clear on what would happen.
> Change values of a Raster* object to logical or integer values. With as.logical, zero becomes FALSE, all other values become TRUE. With as.integer values are truncated.
Yeah. There's a difference between "This language isn't being internally consistent" and "Due to my experience using language X, I find this confusing". As you say, this is well documented behavior, albeit perhaps unexpected to a novice R user.
as.logical(2) being TRUE is perplexing only if you interpreted

    2 == TRUE

as

    as.logical(2) == TRUE

rather than as

    2 == as.integer(TRUE)
[Edit: "If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw."]
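That coercion rule is easy to check at the console (base R only):

```r
2 == TRUE              # TRUE is coerced to integer 1, so 2 == 1 -> FALSE
as.logical(2)          # any nonzero number -> TRUE
as.logical(2) == TRUE  # hence TRUE == TRUE -> TRUE

# Coercion runs toward the higher-precedence type, so comparing a
# number with a character string coerces the number to character:
"2" == 2               # 2 becomes "2" -> TRUE
```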
I like this syntax better than the new lambda syntax, but I think it's good that a proper lambda syntax exists now. Not all higher-order functions accept formulas (they have to wrap their function arguments with rlang::as_function or equivalent), and there are probably some obscure cases where the distinction between a formula and a function matters.
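A minimal base-R illustration of that distinction (I'm leaving the tidyverse out here; this assumes R >= 4.1 for the `\(x)` lambda): a one-sided formula is just a `formula` object, not a function, so base higher-order functions won't call it.

```r
# The \(x) lambda is a real function, so any higher-order function
# accepts it:
sapply(1:3, \(x) x^2)  # 1 4 9

# A one-sided formula is only data until something like
# rlang::as_function converts it:
class(~ .x^2)          # "formula"
is.function(~ .x^2)    # FALSE
```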