As the article points out, the first big programming languages (FORTRAN, COBOL, LISP, and Algol) each had their own way of doing variable assignment. FORTRAN used "=" and Algol used ":="; the other two used commands or functions to set or bind the values of variables.
In the mid-'60s, when I started programming, most programs had to be keypunched. (There was paper tape entry and Dartmouth's new timesharing system using BASIC, but these weren't in very widespread use.) Until 1964, the keypunch machine in use (the IBM 026) had a very limited character set. This is the reason that FORTRAN, COBOL, and LISP programs were written in upper case. Even the fastest "super" computer of the time, the CDC 6600, was limited to 6-bit characters and didn't have lower case letters.
Naturally, symbols like "←" or "⇐" weren't available, but even the characters ":" and "<" were not on the 026 keypunch keyboard and so ":=" or "<-" were much harder to punch on a card.
These early hardware limitations influenced the design of the early languages, and the early programmers all became accustomed to using "=" (or rarely ":=" or SET) as assignment even though "⇐" might have been more logical. The designers of subsequent programming languages were themselves programmers of earlier languages so most simply continued the tradition.
> Naturally, symbols like "←" or "⇐" weren't available, but even the characters ":" and "<" were not on the 026 keypunch keyboard and so ":=" or "<-" were much harder to punch on a card.
This explains why early ALGOL implementations (e.g. Burroughs, the only dwarf who went all in on the language) and dialects (ALGO, JOVIAL, MAD, NELIAC) actually used ‘=’ for assignment. This was perfectly legal according to the language definition, which distinguished between the ‘reference language’, which used ‘:=’, and its ‘hardware representation’, which could be anything. (Naturally there was no portability.)
I feel like this is the best answer. People have to work with what they have. Much in the same way that SNES games look the way they do, even though the designers back then were just as smart and talented as the ones today. That's simply what they could do with the medium at the time.
The limitations define the style. And then dogma perpetuates it, as the new generation continues to live with the results of once-great but now outdated thinking.
*The limitations define the style. And then dogma perpetuates it, as the new generation continues to live with the once-great but now outdated results of someone else's thinking.
Another interesting way to think about it is as "match". That is, try to match the stuff on the right with the stuff on the left.
Take Erlang for example:
1> X = 1.
1
2> X = 2.
** exception error: no match of right hand side value 2
Notice variables are immutable (not just the values themselves). Once X becomes 1, it can only match with 1 after that. You might think this is silly or annoying: why not just allow reassignment instead of having to sprinkle X1, X2 everywhere? But it turns out it can be nice because it makes state updates very explicit. In complicated applications that helps you understand what is happening. And it behaves like you'd expect in math, in the sense that X = X + 1 doesn't make sense here either:
1> X = 1.
1
2> X = X + 1.
** exception error: no match of right hand side value 2
3>
It does pattern matching very well too, that is, it matches based on the shape of data:
1> {X,Y} = {1,2}.
{1,2}
2> X.
1
3> Y.
2
4>
In other languages we might say we have assignment and destructuring, but here it is rather simple: it's just pattern matching.
I'm sure I've mentioned it here before, but in my favorite Erlang talk I design the language with the equal sign as my core construct.
Since the equal sign is an assertion of truth, and since assertions cause a crash on failure, you can pretty much derive the rest of the language/BEAM characteristics from that.
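To make that concrete, here is a small sketch of the idiom (the file path is just an example I made up): treating "=" as an assertion means the code states what the result must look like, and anything else crashes the process.

    %% Assert the expected shape of each result; any mismatch raises badmatch
    %% and crashes the process, which is the "assertion of truth" above.
    ok = file:write_file("/tmp/demo.txt", <<"hello">>),
    {ok, Data} = file:read_file("/tmp/demo.txt"),
    <<"hello">> = Data.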
Thanks for sharing. Enjoyed watching (from the link below). Well done!
I like the '=' as an assertion idea. That is, it's like running code with assertions turned on. Never thought about it that way but it makes good sense.
1> {X, X} = {1, 1}.
{1, 1}.
2> {Y, Y} = {1, 2}.
** exception error: no match of right hand side value {1,2}
If you're wondering what the use of that is, here's an Erlang "drop all occurrences of an element X from a list" function. (Prerequisite knowledge for this: Erlang functions have several clause-heads; which one is executed on each call depends on which one is able to successfully bind to—i.e. = with—the arguments.)
In other words—first, set up an accumulator. Then, go through the list, and if X can bind in both positions, skip the element; otherwise (i.e. if Y != X) then shift Y into the accumulator. At the end, reverse the accumulator (because you were shifting.)
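Something like the following fits that description (a sketch only; the name drop/2 and the drop/3 accumulator helper are illustrative, not necessarily the parent's exact code):

    drop(X, List) ->
        drop(X, List, []).

    %% Repeating X in the second clause head is the "binds in both positions"
    %% trick: that clause only matches when the list element equals X.
    drop(_, [], Acc)       -> lists:reverse(Acc);
    drop(X, [X|Rest], Acc) -> drop(X, Rest, Acc);
    drop(X, [Y|Rest], Acc) -> drop(X, Rest, [Y|Acc]).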
I only had to make a few syntactic changes to the original program to obtain a Prolog predicate from it. The most significant change is that I use an additional argument to hold the original function's return value. An important consequence of the resulting relational nature is that this can answer quite general queries.
For example, we can ask: What are possible solutions Ls that arise when we drop X from the single-element list [A] ?
?- drop(X, [A], Ls).
X = A, Ls = [] ;
dif(X, A), Ls = [A].
We see that there are two possible answers: One where X is A, and hence the result is the empty list []. And the other where X is different from A, as indicated by dif(X, A).
Even more generally, we can systematically enumerate all conceivable solutions for all list lengths:
?- length(Ls0, _), drop(X, Ls0, Ls).
Ls0 = Ls, Ls = [] ;
Ls0 = [X],
Ls = [] ;
Ls0 = Ls, Ls = [_586],
dif(X, _586) ;
Ls0 = [X, X],
Ls = [] ;
Ls0 = [X, _598],
Ls = [_598],
dif(X, _598) ;
etc.
This uses iterative deepening to generate all possible answers.
In the answers where the original list is left unchanged (Ls0 = Ls), the dif/2 constraints mean that X must be different from each of its elements.
More generally, which elements can be dropped at all from the list [a,b,c]:
?- drop(X, [a,b,c], _).
X = a ;
X = b ;
X = c ;
dif(X, c),
dif(X, b),
dif(X, a).
Interestingly, the predicate definition does not even use "=" in its clauses. In Prolog, (=)/2 is a built-in predicate that means unification. In the predicate's definition, we use implicit unification (in the clause heads) instead of explicit unification.
I have mixed feelings on this. On one hand I did want this notation while learning haskell. Partly because there are a bunch of places in haskell's type system where repeating a variable implicitly gives equality constraints.
On the other hand something like
drop _ [] = []
drop x (y:xs)
  | x == y    = drop x xs
  | otherwise = y : drop x xs
is often more readable at a glance to me since I don't have to parse variable names to see control flow structures.
Also, in statically typed languages, what happens if equality isn't defined for a value?
Though it's true that compilers usually turn assignments into SSA, that's not really the same as what parent was referring to. Static single assignment has the same semantics as normal C-style assignments. The parent was referring to assignments that have (Erlang's) pattern match semantics.
Also, there's a simple counterexample. You can have static single assignment that has dynamic multiple assignment (e.g. within a loop construct). Under the pattern-matching semantics, evaluating the assignment (or rather the pattern match) a second time could fail, whereas an assignment always succeeds.
I've used Erlang but not Elixir. Having an additional optional "pin" operator sounds like a bad idea, just piling on complexity. You can't mess up with the Erlang syntax.
>"In other languages we might say we have assignment and destructuring ..."
I'm not familiar with this term "destructuring." Can you elaborate on the concept? Might you have an example of a destructure operation and a language where it's used?
Basically it means that you can assign several variables at once, by assigning a complex value to an appropriately structured literal with variables as placeholders.
In pseudo-JavaScript, for example, you might take a structured response you got from some API and pull several of its fields into variables in a single statement.
The other responses captured the essence of what it is, but I thought it might be interesting to briefly discuss how Erlang uses it. As someone who hasn't been able to use Erlang in his day job for a year now, I sorely miss the language.
In nearly every other language, when you call a function, the top of the function typically involves a fair bit of analysis of the arguments to decide what needs to be done.
(That's being generous: typically the entire body of the function revolves around making decisions about what to do, such that you don't know what's going to come out of the function until you've traversed multiple branch options.)
In well-written Erlang, the function itself branches at the top, taking advantage of pattern matching, destructuring, and function heads.
Let's say you're passing around arbitrary data in the form of a tuple that you need to display. Or not, depending on the value of some debug flag.
In Python you might see code like this, assuming that an integer is stored in your program like ('int', 3) and a string might have some arbitrary label, so it's ('str', 'label', 'real string'):
if (debug):
    display(x)

def display(x):
    if (x[0] == 'int'):
        display_int(x[1])
    elif (x[0] == 'str'):
        display_string(x[1], x[2])
In Erlang, you can branch and destructure at the function head. You don't check the debug flag in the calling code, just pass it along for the function to decide what to do with it; if the flag is false, the function just returns false. (Variables in Erlang are capitalized; the lower-case "strings" in this code are atoms, also called symbols in some languages.)
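A sketch of how the Erlang version might look (my reconstruction, not necessarily the parent's exact code; display_int and display_string mirror the Python helpers above):

    %% First head: debugging is off, so do nothing and return false.
    %% Later heads: ignore the flag (anything other than false fell through
    %% the first clause) and destructure the tuple directly in the head.
    display(false, _)             -> false;
    display(_, {int, N})          -> display_int(N);
    display(_, {str, Label, Str}) -> display_string(Label, Str).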
Since you can choose the key data elements in your function arguments and branch accordingly in the function head, it becomes immediately obvious looking at the code how many different ways the code can evaluate, and the destructuring in the function heads means you don't have to waste code chunking the data apart before working with it.
(Updated: forgot to include the debug flag in the 2nd and 3rd function heads. In all 3 heads, the _ pseudo-variable basically says "I don't care what value we have here". Another option would be to use _Debug in the 2nd and 3rd heads to indicate to the reader what the flag is, or use true since we're expecting that value.)
(Incidentally, Python used to allow tuples to be destructured in the function head, but they dropped it in Python 3 because no one knew about it or used it. A sad day.)
I think they are forgetting that a lot of languages take cues from old math textbooks, where you would see function definitions written as "f(x) = nx + b" or "y = nx + b" and constant assignment written with "c = <number>" notation.
If you want to calculate the output of a function into a data table (which is what early computers were often doing), iterating a variable x over f(x) = nx + b (with n and b being fixed constants) is exactly how you would do it on paper, so it's probably a case of computers emulating applied math rather than simply being pure theory machines.
That is, explicitly, the origin of ‘=’ for assignment in Fortran. The section describing assignment is titled ‘Arithmetic Formulas’:
A. An arithmetic formula is a variable (subscripted, or not), followed by an equals sign, followed by an expression.
B. It should be noted that the equals sign in an arithmetic formula has the significance of “replace”. In effect, therefore, the meaning of an arithmetic formula is as follows: Evaluate the expression on the right and substitute this value as the value of the variable on the left.
This works for initialization, but not for reassignment. Most notably, it is absurd for statements like x = x + 1. Meanwhile, such self-referential updates are very common in imperative programming, so anyone designing the language would come across this.
You have something similar in mathematics. Recurrence relations. Since we are dealing with a sort of time difference inside a computer, something like "x = x + 1" can be interpreted as "the value of x at the next time unit is the value of x at the previous time unit plus one". That is "x[n+1] = x[n] + 1" and this is a recurrence relation. My guess is that early programmers were deeply aware of this time difference, so "x = x + 1" made perfect sense.
The way these recurrence relations are taught at university is to distinguish the new value from the old value by using a ' pronounced prime.
Here, you would write x' = x + 1 to give the recurrence relation x[n+1] = x[n] + 1. Or, more generally x' = f(x) for x[n+1] = f(x[n]).
The reason for this is because in mathematics x = x + 1 is absurd (ignoring modulo arithmetic).
What you have there is a sequence. Recursively defined or not, taking the indices out of a sequence takes away the only thing that makes it a sequence (the mapping from the natural numbers) and would be a horrible abuse of notation.
I think language designers were certainly consciously aware that they were designing something well defined, and chose to use this kind of syntax because it's simpler, rather than an implicit mapping to the natural numbers via something like CPU cycles.
There is a really simple mapping between recurrent sequences and functions. Given a function $f: A -> A$, that is, a function whose output comes from the same set as its input, we then have the sequence $x_{n+1} = f(x_n)$. Often, when dealing with incremental algorithms, the notation x' = f(x) is used. Here x' (pronounced "x prime") stands for "the next value of x". It's a nice balance between the correctness of using indices and the conciseness of leaving them out.
Going to a higher level, the sequence x' = f(x) is essentially trying to find a fixed point of the function f. To look at this in an actual for or while loop, you need to consider the stopping condition of the loop as part of the function.
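As a sketch of that idea in Erlang (keeping to the thread's running example; fix/2 is an illustrative name), the "loop" keeps applying f until the new value matches the old one, and that match is the stopping condition:

    fix(F, X) ->
        case F(X) of
            X    -> X;            %% F(X) matched the current X: fixed point
            Next -> fix(F, Next)  %% otherwise iterate with the new value
        end.

    %% e.g. fix(fun(N) -> min(N + 1, 10) end, 0) evaluates to 10.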
Yes, I remember noticing how similar BASIC was to what you'd see in a math textbook. You'd expect to see a variable defined like "let x equal (whatever)", and BASIC just mimicked that: "LET X = 9".
Here's what Niklaus Wirth (Pascal, Modula-2, Oberon) said about using the equal sign for assignment:
> A notorious example for a bad idea was the choice of the equal sign to denote assignment. It goes back to Fortran in 1957 and has blindly been copied by armies of language designers. Why is it a bad idea? Because it overthrows a century old tradition to let “=” denote a comparison for equality, a predicate which is either true or false. But Fortran made it to mean assignment, the enforcing of equality. In this case, the operands are on unequal footing: The left operand (a variable) is to be made equal to the right operand (an expression). x = y does not mean the same thing as y = x. Algol corrected this mistake by the simple solution: Let assignment be denoted by “:=”.
> Perhaps this may appear as nitpicking to programmers who got used to the equal sign meaning assignment. But mixing up assignment and comparison is a truly bad idea, because it requires that another symbol be used for what traditionally was expressed by the equal sign. Comparison for equality became denoted by the two characters “==” (first in C). This is a consequence of the ugly kind, and it gave rise to similar bad ideas using “++”, “--“, “&&” etc.
From Good ideas, through the Looking Glass by N. Wirth:
At one point in my career, I was afflicted with a Wirth-designed language. His opinion on what constitutes good language design does not carry much weight with me. In particular, I can tell you that the extra typing of Pascal over C really does matter over a couple of years. And that := for assignment is a major pain when your left pinky finger is out of action for weeks, and you still need to hit shift for every assignment.
But then we get to this line:
> Because it overthrows a century old tradition to let “=” denote a comparison for equality
In what sense is "=" for comparison a century old? Hasn't it been used for "things that are equal" for multiple centuries? If so, and if that's distinct from "for comparison", then isn't for comparison also a new, non-standard use?
Does anyone understand what he's on about in this quote?
> it overthrows a century old tradition to let “=” denote a comparison for equality
It also, in science and engineering, denotes a method for obtaining the lhs when you have values for the terms on the rhs. That is the point of solving y = ax² + bx + c for x when y = 0:
x = (-b ± √(b² − 4ac)) / (2a). Early languages like Fortran explicitly followed that usage.
And if the goal of mathematicians and programmers were the same, this might be a compelling argument. In no sense do I see my programs as giant algorithms, they're machines, and as such require different syntax to construct.
> But mixing up assignment and comparison is a truly bad idea
Then := is also a bad decision. You should use the fully historically supported form of 'let <variable> = <value>'. Both := and == suffer from the fact that if you omit a single character, you change assignment into comparison or vice versa.
It strikes me as a nitpicky argument for its own sake.
You're missing the point - languages like Pascal have a := assignment operator because the = operator is for comparison only. If you accidentally drop the :, then you have a simple Boolean expression. If this Boolean expression is used as a statement, by itself, then the result is a compiler error.
The '=' sign is used because after that statement the variable will indeed have equality with the r-value.
Coincidentally, in hardware description languages like Verilog there is another assignment operator, '<=', which means that the variable won't take the new value until the next clock cycle (or, more precisely, the next evaluation of the process block). '=' also exists and has the same meaning as in traditional languages.
I'm not sure what you find disquieting about the idea. "More frequently used operators should be less verbose" seems reasonable to me. What I find a bit strange about Go is that if they have that mindset then it's still a fairly verbose language, syntactically -- at least compared to some other modern languages. It seems to hang on to the familiarity of C syntax with a few optimisations. Which, of course, is not necessarily a bad thing at all. It's just not doing everything it can to reduce verbosity.
> "More frequently used operators should be less verbose" seems reasonable to me.
Syntax that looks pretty in isolation is a poor guideline for programming language design. It ignores the problems of writing and debugging that code. I don't know how many bugs went unnoticed because of the "if (a = b)" mistake, but they certainly weren't few. Sure, nowadays the compiler warns you about that, but that took surprisingly long to be implemented.
The argument for saving a few keystrokes for potential longer debugging sessions is not a good argument. Code is much more often read than it is written.
But I didn't criticize Go's verbosity (after all, they "fixed" the assignment thing) but rather the mindset of doing or not doing things for specific reasons that look quite backwards today (or, to be frank, just plain stupid), which the community then fights for vigorously. There are examples of that in the design of the language, but the most egregious example is the package manager. This went from "we don't need this" to "do it in this problematic way" to "let's do it like everybody else" to now "no, we are special, we need to do it completely differently".
IMO, Go would have been a fantastic language to have in the 90s. But looking at it from today's perspective, it looks outdated in many places. But compared to C which is a language of the 70s it is still great and therefore I understand its appeal for programmers who haven't found another language to replace C with (going from my own experience, most programmers have replaced C with multiple languages instead of just one).
Self and Newspeak sort of do it this way, adding implicit self to Smalltalk's keyword syntax and accessor convention.
These 'accessors' also work on local variables, there is no special syntax for assignment. So when you write x it means send the message x, which starts its lookup in the local environment and works itself outwards. When you write x:2 it sends the message x: with argument 2, also starting in the local environment.
The accessors are automatically generated in pairs for slots. It sorta works, but it seems a bit too convoluted just to say "hey, we can do everything with just messaging".
Yes yes, it's a set-word! type in the "do" dialect. That's not how it's learned when first starting, though, and not usually the thing you're thinking about when using it moment-to-moment.
also, there's no hand balance on either of the composite symbols. := is my right pinky, <- is different fingers, but bottom and top row. neither exactly flows from the fingertips.
I actually rationalized the '=' symbol in assignment into the following statement, "Let 'left hand side' be equal to 'right hand side'". Using this wording resolves some dissonance around its overloaded usage.
Depends on the dialect. It was mandatory on the Sinclair BASICs, for example (ZX81/ZX Spectrum), but optional on the Amstrad (Locomotive/Mallard) and BBC BASICs. A quick search suggests it was optional as early as "MicroSoft"'s Altair BASIC: http://swtpc.com/mholley/Altair/BasicLanguage.pdf
As someone who learned first on a ZX81, I'd always assumed that "LET X=5" was the canonical form, and that "X=5" for assignment was just a shortening for convenience.
Likewise - but of course in Sinclair BASIC it was mandatory because every command had to begin with a keyword - a fact enforced by the fact that the cursor started out as a flashing [K], and every key on the keyboard was mapped to directly enter a keyword in that mode. LET was on the 'L' key. You literally couldn't type "a = 1" into a Spectrum/ZX81 - you couldn't type an 'a' unless you were at an [L] cursor prompt, which would only appear after you had chosen a keyword. Typing an 'a' at a [K] cursor would get you the keyword NEW.
I wonder if you, like me, were also completely thrown when you first encountered a programming language where you didn't also have to enter line numbers...
> I wonder if you, like me, were also completely thrown when you first encountered a programming language where you didn't also have to enter line numbers...
Haha, yes, totally! For me this was AMOS on the Amiga (a Basic variant). I was so thrown I started by using them anyway as they were supported by the interpreter - it just treated them as labels.
I soon stopped, though, when I discovered it didn't reorder based on number, so you ended up with line numbers appearing out of order.
I guess they had a Basic that only allowed expressions with a single operator or a single function call (some early systems with very limited memory did that) to the right of =, and didn’t have the memory for an infix parser.
Depended on the dialect. Whatever we had on my first PC as a kid, it was mandatory. Later, in HS, when we used BASIC for a first CS course, it was optional.
I just intuitively understood that symbols can have different meanings in different contexts, so never felt any dissonance or had any trouble understanding it. Like "read" has different pronunciations, or "plane" has different meanings, "=" is just multiple things under different contexts.
I agree in that this doesn't really take conscious thought anymore. However, the article had the following:
> How can a = a + 1? That’s like saying 1 = 2.
It looks like a lot of us don't have (never had?) trouble accepting a statement like `a = a + 1` and, upon stepping back, thought maybe it had to do with our internal dialog (which is now probably intuition as we've been processing statements like this for so long).
I've been programming professionally for 25 years using mostly C inspired languages, and I will still regularly write "if(foo = bar)" on a daily basis and not notice until there's an error. It's easily the most common syntax error I write, followed closely by using commas in my for-loop as apparently it looks like a function to my fingers: "for(x=0, x<10, x++)"
There is a Ruby idiom[0] that I find quite useful (esp. combined with a linter such as Rubocop), leveraging optional parens as a marker of intentional assignment:
# bad
if v = array.grep(/foo/)
  do_something(v)
  # some code
end

# good
if (v = array.grep(/foo/))
  do_something(v)
  # some code
end
My first programming teacher insisted we call the single = sign the 'gets' operator. So
int x = 10
int y = x
would read in English as x gets 10, y gets x.
Shortly after that class I took a break from any coding. When I went to another school I saw sophomore- and junior-level programmers still struggling with this. I retained the habit of using 'gets' and never had a problem with this.
I've heard the same but with 'becomes' rather than 'gets'.
I think becomes works better for people with a mathematical background, where the concept of variables is totally natural. For people without this, 'gets' might be better because it helps anthropomorphize the variable. The coder 'gives' X a value to hold.
People with a mathematical background should be thoroughly familiar with words having overloaded and jargoned up meanings. Fiber Bundle, group, set, relation, etc... They all mean something different when you're not talking about mathematics, and '=' is no different.
Do you mouth the name of operators when you read code?
Like most people I do make the words sound in my head when reading text but only the variable names make sounds when reading code.
Something I've noticed is that I don't tend to subvocalise when reading code, but if I do, I generally read "x = 10" as "x e. 10" rather than "x equals 10" (so basically just the first syllable). I will read "if x = 10" as "if x equals 10".
Not sure why/how I picked it up, but similar principle, I suppose.
I don't understand the LISP line at all. I would have said "LET", "SET", and "EQUAL" (since it's talking about numbers). Is there something I'm missing?
EDIT: It's since been changed to "let", "set", "equal" (but still lower-case).
I don't understand it either. I suspect the author does not know Lisp, and badly misunderstood an explanation from someone who does.
I think that Lisp doesn't even fit well into that table, since putting LET in the first column would implicitly bring in declaration as well, much of the time SET in the second column would be misleading since mutation is so often done implicitly by a looping construct like DO, DOLIST, DOTIMES or whatever (no SET in sight), and there are all sorts of equality tests one could use.
Edit: ok, I see the author has both updated it, and is specifically talking about Lisp 1.5 which does not have the looping constructs. On the other hand, it at least has a bunch of alternatives to SET that probably should be listed as well, like RPLACA, RPLACD, ATTRIB, and so on. I think that even in Lisp 1.5 variables could be declared in different ways to using LET (like rebinding a variable passed in to a function, so the assignment was in invocation).
It's not a perfect correspondence, but I think the point is simply to show that back in the 1950's, there was no consensus yet on how assignment ought to look.
It just happens that there was no consensus yet on how it should work, yet, either!
Then again, neither of the last 2 (modern) programming languages I've used have had FORTRAN/ALGOL-style assignment semantics, so I think the jury's still out.
You'll have to pry my BCPL heffalumps out of my cold, dead hands.
The operator " = >" (pronounced "heffalump") is
convenient for referencing structures that are
accessed indirectly. The expression
a=>s.x is equivalent to the expression (@a)»s.x.
During the public comment period for the original ANSI C standard, we had at least one request to add ":=" as an alternate assignment operator. We declined, but I personally would have supported that. The use of = and == is one of my least favorite bits of C syntax.
Another question is why the assignment is from right to left when the natural direction would rather be left to right, like 2+2 => x to store the value 4 in x.
I was told once by a math professor that it is a habit inherited because we use Arabic numbers/maths which were really meant to be read from right to left. Don’t know if the theory has any merit.
In conventional mathematical notation, as used in science and engineering, formulas usually have the form ν = ε, which is both a statement of equality and a method for calculating ν from the terms in ε. For example, V = I R. If you have V and I and want R, you start by rewriting this as R = V / I. Computer languages explicitly followed this.
> English has Subject-verb-object word order, not object-verb-subject
Imperative programming corresponds to the imperative mood in English, where English normally has verb-object word order with the subject (the entity being commanded) omitted. The subject of the command to set the value of x to the result of the addition of two and two is the computer/runtime running the code, not the variable x, which is the direct object.
English has SVO order for declarative sentences, which correspond to declarative programming, which tends to feature definition or binding rather than mutating assignment.
>A common FP critique of imperative programming goes like this: “How can a = a + 1? That’s like saying 1 = 2. Mutable assignment makes no sense.” This is a notation mismatch: “equals” should mean “equality”, when it really means “assign”.
I sort of disagree with this. Many functional languages pull heavily from lambda calculus and other forms of mathematics. In math, "a = a + 1" isn't the same as "1 = 2". The issue isn't equality, it's that you're trying to rebind a bound variable, which isn't possible.
In other words, rebinding a bound variable is not the same as "1 = 2".
> In math, "a = a + 1" isn't the same as "1 = 2". The issue isn't equality, it's that you're trying to rebind a bound variable, which isn't possible.
"=" means equality in math; a = a + 1 is the same as 1 = 2 because if you subtract a from both sides and add 1 to both sides you get 1 = 2.
Lambda calculus has the concept of binding variables, but it doesn't use "=" for that, it uses application of lambda forms. It's the same idea that's applied in some variants of Lisp, where LET is a macro such that (let ((x 1) (y 2)) ...) expands to ((lambda (x y) ...) 1 2).
The way it plays out is that rebinding is perfectly fine, because it's not really any different from binding in the first place.
Rebinding is essential for recursion, which is a concept in mathematics. Given a fib(x) function defined in terms of itself, when we evaluate fib(10), the parameter x is simultaneously bound to a number of different arguments through the recursion. We just understand those to be different x's in different instances of the function's scope.
Rebinding is also needed for simple composition of operations. Given some f(x), the formula f(3) + f(4) involves x being simultaneously bound to 3 and 4 in different instances of the scope inside f.
Yeah, I was pretty surprised when I found out that assignment and equality can be totally syntactically separate and unambiguous with minimal language-design effort. Just get rid of the silly idea of allowing expressions as statements.
Although even then it's nice to use different symbols because they are different meanings. I don't like it when a word has different meanings depending on context.
And I think it would make sense to have a syntax rule that strictly enforces parentheses around truthy expressions used in such contexts.
Right, but then you're using the parentheses to distinguish the assignment from the comparison, it's not just a matter of not allowing expressions as statements.
I think GP might have meant that to go the other way around, i.e. don't allow statements to also be expressions, and more specifically, make assignment a statement.
But I agree with your conclusion, that statements should not also be expressions, so a = b = c shouldn't work (at least not the way we're used to, that the b = c is an assignment "statement" that also produces an expression value).
But in the end all this does is allow "a = b = c" to be an unambiguous statement meaning, in more familiar notation, "a = b == c". Not exactly clear! So even though a language could use the same symbol for both assignment and equality-check, I don't recommend it! Although I still like to keep statements and expressions as strictly non-interchangeable constructs.
Just noting that I didn't mean to express any particular preference for syntax, just addressing whether it's technically possible. Personally I'm fine with = vs == being prevalent, but might prefer something like := and = respectively.
No, Python assignments are not expressions. "a = b = c" is just an assignment statement with multiple parts, it's not equivalent to "a = (b = c)", which in fact will throw a SyntaxError.
Implementing this typically means making substantial changes to a language that doesn't already have it, or implementing a new language. In such a language, having a special multiple assignment syntax based on trains of = and variable names would obviously be impractical.
It is extremely useful for all statements to be able to be used as expressions, to chain and nest values. It's purely syntactic clumsiness on the part of those language families that assignment and comparison are so close to the notion of "equals".
Using = for everything still has the problem that "x is always and forever this value" looks exactly the same as "y is this value for now but will be a different value in the future" or "give z, which previously had a different value, this new value". These are three different things and should have different syntaxes; = is appropriate for the first (and using it for that is not incompatible with using it for equality comparison) but not for the second or third.
I always liked DHH's take on these sorts of arguments (paraphrasing): who the hell cares? Once you know the purpose of the '=' how often do you make mistakes reading or writing code?
Whereas Java is all about protecting developers from themselves, Ruby (for example) lets you get away without variable type declarations because, at the end of the day, how often do you not know whether a particular variable is a string or an integer? Enough to justify enforcing the declaration of everything at the outset, or receiving errors throughout your code?
That's not to say that those enforcements never make sense. I think that structured languages like Java are better for larger teams, where enforcing standards is more important than on a smaller team. But there are other ways to do that, and a lot of the time the rules make development a horrible experience.
My issue with the larger teams vs. smaller teams argument is that the price of success is finding yourself with the wrong tools because successful projects usually grow large teams. The counterargument is that the price of choosing the right tools for a large team may be failure when those tools don't let you execute quickly enough. I don't really buy this trade-off. Maybe it is true for the specific language of Java for a small team (I'm not so sure it is) or Ruby for a large team (I have lived this and found it to be anecdotally true, so I do buy this one), but I don't believe you have to make this choice; I believe there can be languages that are productive for small teams while avoiding foot-guns for large teams.
> how often do you not know whether a particular variable is a string or an integer?
In someone else's code? 100% of the time. I was once a diehard Ruby believer, and I still use it for small programs. But having had the singular displeasure of working on a huge Ruby codebase (pre-JVM Twitter), I have learned, in the hardest possible way, that untyped languages are absolute nightmares when more than one person is involved.
As someone who does Rails development as his day job, man do I hate convention over configuration. The argument, "Once you know that X, it's very easy to understand" sounds great, but there are a lot of Xs! This is fine if Rails (or whatever) is your life and you are going to be Rails-boi or Rails-grl until the industry moves on and you pick up your next career at MacD's.
From a language, library, or framework I want to be in and out as fast as possible. I want to have unambiguous usage that is easily discoverable. I want to avoid having to load a million things into my memory -- not least because I work with a huge pile of legacy code that all uses different technology. Rails is not the worst system I've ever used for this, but it's edging up there. Ruby, as a language, I don't mind at all, but the coding convention of people who primarily use Rails is something that I think can be improved dramatically.
> Once you know the purpose of the '=' how often do you make mistakes reading or writing code?
A nontrivial proportion of people who start trying to learn to program never get past that point, so it's worth taking them into account.
> at the end of the day, how often do you not know whether a particular variable is a string or an integer?
Strawman. Types are not about the difference between a string and an integer, they're about the difference between a user id and an order id, or a non-empty list and a possibly-empty list, or...
>"Strawman. Types are not about the difference between a string and an integer, they're about the difference between a user id and an order id, or a non-empty list and a possibly-empty list, or..."
Can you explain - how is the difference between a string and an integer different from the difference between a user id and an order id? You're calling the OP's comment a straw man, but I'm not understanding how your examples are not the same thing, given a type system that understood all four of those.
Many Ruby users don't realise that the type system can understand all these differences. How often do you not know whether a particular variable is a string or an integer? Not very often. How often do you not know whether a particular variable is a user id or an order id? Much more often. So if you think the only kind of difference a type system can tell you about is the difference between a string and an integer, you vastly underestimate the number of bugs that a type system could help you avoid.
> because at the end of the day, how often do you not know whether a particular variable is a string or an integer?
I'm refactoring some data-pasta to declare types more clearly, because we just have dicts of lists of whatever; so, often enough that a random reader chimed in after 20 minutes.
Simple, programming is like using Old Speech, the language of Dragons. We cannot lie in that language and therefore when we make a statement it becomes true. (In case I'm being too obtuse.. Earthsea)
But really, this is such a pedantic discussion, trying really hard to not get sucked in.
In R, it is actually distinguished that way, example:
a <- a + 1
a + 1 -> a # also works, but REALLY bad practice
But, so does
a = a + 1
Granted, there are a bunch of R haters (especially among people with formal CS educations), but I think this convention makes a lot of sense. While most will disagree about '<-', I like it from a code-reading standpoint in that you know it is an assignment right away. Coming from a mathematical perspective before learning to code, this makes a lot more sense in the 'assignment' fashion.
In case you are wondering, the difference between <- and = in R is in scoping. For example, in the following function call:
foo(x = 'value')
x is declared in the scope of the function, whereas:
foo(x <- 'value')
x is now declared in the user environment. Granted, that is not good practice, but that is why there is a difference.
The first time I ever saw that you could do that was in a blog post about putting the assignment at the end of a series of pipes. I thought it was pretty neat (and actually used it a couple of times), then had problems when I couldn't figure out why my code was messing up.
for example, this assigns a ggplot to 'plot':
df %>%
  na.omit() %>%
  ggplot(aes(x=x, y=y)) +
  geom_line() -> plot
That is really confusing, in that the way most people would read it is that it is something to be plotted. However, the assignment does occur and is masked. Having 'plot <- df %>%' as the first line makes it clear that a new object is being created.
We actually had to modify our style guide to prevent the '->'
but Wikipedia says:
"The reason for all this being unknown. [footnote] Although Dennis Ritchie has suggested that this may have had to do with "economy of typing" as updates of variables may be more frequent than comparisons in certain types of programs"
while the OP says:
"As Thompson put it:
Since assignment is about twice as frequent as equality testing in typical programs, it’s appropriate that the operator be half as long."
Prolog is a good example of a language where "=" doesn't mean assignment. In Prolog, "=" means unification: You can read X = Y (which is =(X, Y) in functional notation) as: "True iff X and Y are unifiable". You can think of unification as a generalization of pattern matching. For example, f(X, b) is unifiable with f(a, Y) by the substitution {X → a, Y → b}. I think this aspect would be a valuable addition to the article.
Fun fact: If (=)/2 were not available as a built-in predicate in Prolog, you could define it by a single fact:
X = X.
Thus, you do not even need this predicate as a predefined part of the language.
BASIC also came out with the "=" character for assignment in 1964, 5 years before B. I think it also contributed to the adoption, considering how popular it was in the '80s.
I always read it with an implied "let" in front, so it has always sounded completely natural. It's also ergonomic to type and easy to read. And while you can confuse := and = on account of ":" being a small glyph even in monospace fonts (you can mistake it for empty space if you're just skimming), == and = are harder to confuse, since one is twice as long as the other.
I like = for assignment, but I don't like it for configuration. HCL and TOML chose it, to their detriment, I think.
When it reads as "let x equal y" it makes sense. I think it makes good sense for imperative programming. It makes slightly less sense for functional programming. For declarative configuration, I think a colon makes much more sense.
The way I explain this to beginner programmers is that there are three types of equals in math and programming:
The math "=" means "these are equal"
The programming "=" means "make these equal"
The programming "==" means "are these equal?"
I value clarity and explicitness a lot. And I'm someone who generally likes to think twice about the proper naming of a variable to make it easier for future readers of the code.
At the same time when reading
> How can a = a + 1? That’s like saying 1 = 2.
I think:
Well, depends how you interpret it.
If the translation is "we state that, from now on, a is the previous value of a plus 1", it's totally OK.
Not a big difference from saying
a = b + c
So I'm not sure the usefulness of changing all languages to := is high enough that it's worth thinking about changing it in current languages - and even when inventing a new language, I'm not sure it's helpful.
I also wonder whether reassignment, aside from counters that need to change with each iteration or occurrence of the event being counted, is something that should generally be avoided where possible. OK, maybe in general anything where data is transformed over iterations...
Because programming languages are typically designed for the tiny group of existing programmers rather than the much larger group of future programmers. The same reason unnecessary tokens exist, and 'drop' means delete in databases. It's entirely cultural.
Technically, DROP is used because DELETE is taken by something that deletes rows and using DELETE for tables and rows is a bit scary so you get DROP. I suspect quite a lot of naming happens that way.
I'd have thought, not being a programmer, that drop was used because whilst the access and relations with the drop-ed table are updated the table isn't necessarily deleted.
Drop updates the logical structure without necessarily acting on the physical storage in a way commensurate with "deletion". You can thus drop a million-tuple table in a ms (less, I expect) whilst deletion would take far, far longer (of the order of a million times longer).
DROP <object-type> <object-name> is sort of like a macro for, loosely
DELETE FROM <catalog-relvar-for-object-type> WHERE name = <object-name>
Except that real RDBMSs don't usually have DDL that is really equivalent to DML against system tables, and particularly (especially historically, but still in many DBs today), DDL has a different relationship to transaction processing than DML, so it's a very good thing for clarity and developer intuition not to overload DML keywords for DDL operations, despite the loose similarity between CREATE/DROP in DDL and INSERT/DELETE in DML.
>Because programming languages are typically designed for the tiny group of existing programmers...
So this is a dilemma I have while working on a new language. I'd like to go with `:=`, but `=` is absurdly popular, and I'm trying to keep the language as approachable as possible.
I don't think the clarity of `:=` is so compelling that it outweighs the `ew, why are there colons in there` reaction that I think most novice coders would have.
(Remove redundant commentary about database stuffs.)
The key isn't whether you use := or =, it's whether you allow assignment in expressions.
My advice: don't allow assignment in expressions. To me, it's like the case-sensitive issue: the language designers think it's a useful feature, but it actually works against most developers.
I definitely agree that assignments should be statement level operations.
I don't think case-folding identifiers is helpful. The language has decreed fooBar is the same as foobar, and that handles the error where you spelled the same idea two different ways, but it fails silently on the error where you spelled two different things a similar way. Worse, there are some people who are very sensitive to case and will be confused, while others will happily type their entire code in all caps.
I think a linter is the best way to catch these issues, and those subjective rules are precisely the sort of thing that need to develop more rapidly than the core parser.
Yes, but again, the issue is whether most developers will be hindered or helped by case-sensitivity in a language. Based upon my experience, identifier case-sensitivity is simply making things harder than they need to be on the developer.
Conceptually, what is the difference between these two identifiers:
myObjectInstance
MyObjectInstance
?
And the key here is the reason for the difference: if it's a typo, then a case-insensitive language design will allow it and no-harm, no-foul. If it's not a typo, then who wants to work on a codebase littered with identifiers whose only difference is case ? :-)
In Haskell, one is a variable, the other is a type, and that's enforced by the language. It's the same, albeit by convention, in Java. There are a lot of cases where you want to describe a type and a thing, so apple = new Apple() is pretty reasonable.
When I think of case-insensitive languages, I'm thinking of Basic, LISP, SQL, and those don't have a lot of type declarations.
And consider two counter-examples:
my_instance vs myinstance
things vs THINGS
The first shows case-folding is only a partial answer to ambiguous identifiers. The second shows that differences in case can be very obvious to the reader.
Those are motivators to me for pushing this off to the linter: there are a lot of subjective judgements in what should and shouldn't be the same, and having the language keep its rules for identifiers as simple as possible seems like a good separation of concerns.
My final concern is metaprogramming and interoperability. In SQL, for instance, there are bizarre rules to work around case-insensitive identifiers. If another system asks you for "myObjectInstance" and "MyObjectInstance", it has to know your case folding rules to know those two identifiers are the same.
> If it's not a typo, then who wants to work on a codebase littered with identifiers whose only difference is case ? :-)
Ever worked on a Python project that interacts with Javascript, so it's snake and camel case?
I generally agree, I'd just prefer a gofmt-style utility that would just automatically resolve those and tidy everything up. I completely agree that just chucking error messages is a poor answer.
Finally, here's a challenge, if identifiers are going to be folded by the compiler: what locale should be used? In particular, how do you handle I, İ, i and ı?
No, in my example they're both references to an object instance - they're simply identifiers. Languages that are case-insensitive tend to force one to use identifiers that are also descriptive as to their usage, which is very helpful when reading code as you can tell a type from a variable from a....
Re: languages: Pascal/Object Pascal is case-insensitive, and is statically-typed.
Re: SQL: all implementations that I'm aware of use case-insensitive identifiers for all of the reasons that I've outlined. Any that don't are problematic, at best.
Re: locales: the way that this is typically handled is by a) restricting the allowed characters to the English alphabet (older), or b) by using Unicode (UTF-16) encoding for source files (newer).
Granted, they had some bad keyboards; there's an oft-reproduced photo of Thompson & Ritchie at Teletype 33s, but those had '63 ASCII, including ←.
Probably more important is that the Unix/C developers came from Multics, written in PL/I, which (following Fortran) used ‘=’ for assignment. And Fortran was still important; Unix had a Fortran compiler at least as far back as Second Edition, when C was being born. Kernighan & Plauger's Elements of Programming Style used Fortran and PL/I for its examples, and Software Tools used Ratfor. ‘=’ is simply what they were used to.
Rutishauser might have gotten it from Konrad Zuse's Plankalkül, which used the right double arrow ⇒, which looks a little like =.
Plankalkül had the order and the assignment reversed relative to C, so to increment a variable Z1, you'd write:
  | Z  +  1  ⇒  Z
V | 1           1
S | 1·n   1·n   1·n
(The first line is the 'main line'; Z stands for Zwischenwert, i.e. "intermediate value". The second line is the 'value line', which contains the indices of the variables -- to store Z1 + 1 in a new variable Z2, you'd replace the second 1 with 2. The third line contains Struktur-Indizes.)
Fun fact, R also has -> (assign to the RHS instead of LHS), and "super-assignment" <<- and ->>, and also can use = sometimes too (and I think has different semantics). Yay R.
> also can use = sometimes too (and I think has different semantics)
That’s a common misconception (even the official documentation is misleading).
In reality, assignment `<-` and assignment `=` have the exact same semantics (except for precedence, and unless you redefine them, which you can). The confusion comes from the fact that the `=` symbol is syntactically overloaded: it is also used for named argument passing (`foo(x = 1)`). This leads people to falsely claim that, if you wanted to perform assignment in a function call (which occasionally makes sense in R), then you’d have to use `<-`. But this is false. You merely need to disambiguate the usage, e.g. with extra parentheses: `foo((x = 1))` works.
FWIW, that's what Ross Ihaka does as well (or at least did in a presentation that I watched - and pointed out that you can use = for assignment). Personally, I think = for assignment is a disaster, and wish I could use <- and -> in every language.
I've used R since it was in beta form, and one of the most interesting things to me (aside from the evolution of type hierarchy) has been changes in use of the assignment operator.
When I started, = was the norm for assignment, with == for evaluating equality. Later I started noticing that people were recommending <- based on scope concerns, but it was kind of subjective preference. Now I see articles like this saying that the "preferred use" is <-, and some people don't even know about =.
I agree = versus == can lead to tricky errors in code, but I still prefer = for various reasons in general in languages (although I get the scope arguments).
The reason is that = is shorter, and assignment is a definitional equality, which, to my impression, is the whole point of programming most of the time. As others have suggested here, between = and ==, = is by far the more commonly used operator, so to me it makes sense to use = for succinctness.
In math, there is an "equal by definition" operator, with three lines, so I could see that, but keyboards don't have that, so it's more steps. := is also a kind of standard "by definition" usage as discussed in the article, but again, it's more steps. I'd still prefer that over <- in R.
Interesting. I started using it in 2000 and was working directly with members of R-core at the time. My experience was that the only time I ever saw = used was by DTL.
It really wasn’t until Google published their style guide and then later when Hadley published his that I started seeing = in heavy usage.
To your point about = being shorter I wonder if the difference was that a lot of the folks I interacted with were using Emacs with ESS which would interpolate _ to <- so they wouldn’t have noticed? Just a theory.
Either way, I was taught by some of the R creators that = was evil and to be avoided. It wasn't until I stopped using R regularly that I switched; it became too mentally taxing to change assignment operators when I bounced between languages.
I teach R, and most of the students have never programmed before, at least not anything other than having been exposed to a configuration file or writing a little html.
x = x+1 is really confusing because of the way kids learn math. The notation x <- x+1 at least conveys the idea of taking a value and storing it somewhere.
In the original Smalltalk implementation, the character corresponding to _ in ASCII was a left facing arrow, which Smalltalk used for assignment at the time.
"In the original Parc Place image, the glyph of the underscore character (_) appeared as a left-facing arrow (like in the 1963 version of the ASCII code). Smalltalk originally accepted this left-arrow as the only assignment operator. Some modern code still contains what appear to be underscores acting as assignments, hearkening back to this original usage. Most modern Smalltalk implementations accept either the underscore or the colon-equals syntax."
The point is that ':=' evolved in the Algol family to solve the ambiguity inherent in '='. C is simply from a less thoughtful and primitive language family, and modern languages still seem to be copying C's horrible syntax. It's also probably why technical papers are written in Algol-like pseudocode using the '<-' symbol in LaTeX for assignment.
The syntax of C is a thing of beauty. The decision to use = for assignment and == for equality was natural, since assignment is more common than equality testing; similarly, articles are usually short words in natural languages. Algol 68 and Pascal got this wrong, and where are they now?
Technical papers are written using '<-' in pseudocode because the papers likely use '=' elsewhere to assert equality in a mathematical statement and they want to avoid confusion.
Yes, although my intention was to further support their last statement which was left open to a degree.
There is a slight distinction. If you are creating a programming language, you can come up with whatever syntax you'd like for assignment and the user has to learn it. Some choices are better than others if you want your language to be used, but there is a specification for the language that you have written down, either as a human readable document or as the compiler/interpreter. With pseudocode in technical documents, the author typically lacks the space, time, and interest to generate such a specification and leans on mathematical notation to keep things precise.
We read it out loud as "becomes the same as", this being what was actually happening. I've carried the habit over to C despite the bare =, it helps me reason and seems to aid in preventing the =/== error.
My CS professor (~ola) taught it as "gets", and to this day when I talk out code people get confused. I'll say, "x gets 3" and they look at me funny. But I find it helpful for differentiating between assignment and equality.
Plankalkül had the advantage of not having to run on a real machine.
As far as := is concerned having to use the shift key for an assignment is less than ideal but there really aren't any better options on a modern keyboard. I think C's use of = is a bit braindamaged but it is nice and quick to type.
It didn't actually contribute much to the evolution of other languages though, right? Had anyone even heard of it when Fortran/ALGOL/etc were being designed?
Personally, I don't see what the big fuss is about. If your assignment and comparison operators are the same, the only thing you lose is expressions like x = (y == 2); (since there would be no way to disambiguate the inner comparison from an assignment).
Now, assignments are done more frequently than comparisons, which is why it makes sense for assignment to be the simpler/shorter one.
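To make that concrete, here is a minimal C sketch of the kind of expression that would be lost (the variable names are just for illustration):

    int y = 2;
    int x = (y == 2);   /* x becomes 1: the comparison's result is assigned */
    /* If '=' served for both assignment and comparison, 'x = (y = 2)' could
       only be read as two assignments, so this kind of expression goes away. */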
> And most BASICs, though some (most?) required the keyword LET to introduce an assignment, as well.
LET was virtually always optional; an expression by itself on a line (like 'X <> 10') is a syntax error anyway, so removing LET doesn't introduce any ambiguity.
> Looking at this as a whole, = was never “the natural choice” the assignment operator. Pretty much everybody used := for assignment instead, possibly because = was so associated with equality. Nowadays most languages use = entirely because C uses it, and we can trace C using it to CPL being such a clusterfuck.
And I dispute the "pretty much everybody used" part. In fact, I think the article contradicts that opinion. Algol used it, and so did Pascal. But Pascal was only a year old [Edit: at the time the choice was made for C] - it [Pascal] hadn't set the world on fire yet (to the degree that it ever did). That leaves Algol using ":=", and FORTRAN, COBOL, and Lisp not using it. That's not "pretty much everybody used".
Yeah, it's probably more accurate to say that "everybody from the Algol line used := for assignment", with C being the first language of that line to stop.
Too late to edit, but I've gone and done it: scanned through Sammet's Programming Languages: History and Fundamentals for languages with identifiable assignment statements. I skip assembly-style, English-like (COBOL etc.), and functional-style (LISP etc.) syntax, and for summary purposes ignore accompanying keywords like BASIC's LET.
Yeah but this introduced an easy path for errors. You could mean to write `==` and instead write `=` and it'll work fine in C in practically all locations (even as conditions of if-statements). Using `:=` could have likely alleviated this error to a noticeable degree, if not entirely.
Chefs would rather use a sharp knife than a blunt one, even if they cut themselves from time to time. If safety is a priority, there are specially designed languages like Ada that use :=.
I was puzzled by this decision of theirs, though. They are so hell-bent on keeping the language as simple as possible, with a minimum of keywords, yet apparently can't be arsed to just write var a = 5 like normal people, preferring to invent a whole new operator for it. Operators often demand even more of a reason to exist than keywords do.
Then you also get the operator inconsistency between "assigning to a new variable" and "assigning to an existing variable". And is it really so important to tell the two apart in the operator, of all things?
When writing Go, I have been way more inconvenienced by its error handling mechanics than having to tell that I want a new variable using a few more characters.
A language could reasonably use plain = to let the programmer state that two things are equal. In a debug build, this could be compiled into code that validates the condition. A debug build would abort if the condition is not met. For a performance-optimized build, the compiler could use the information for optimization. It could be like the __assume keyword that Microsoft supports.
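As a rough C sketch of that idea (the STATE_EQUAL macro is invented here, not a real library feature; __assume is MSVC-specific and __builtin_assume is clang-specific):

    #include <assert.h>

    /* Hypothetical macro: let the programmer state that two things are equal. */
    #ifndef NDEBUG
    #  define STATE_EQUAL(a, b) assert((a) == (b))           /* debug: validate and abort on failure */
    #elif defined(_MSC_VER)
    #  define STATE_EQUAL(a, b) __assume((a) == (b))         /* release: hint for the optimizer */
    #elif defined(__clang__)
    #  define STATE_EQUAL(a, b) __builtin_assume((a) == (b))
    #else
    #  define STATE_EQUAL(a, b) ((void)0)                    /* otherwise: no-op */
    #endif

    int scale(int x, int unit)
    {
        STATE_EQUAL(x % unit, 0);   /* claim: x is a whole number of units */
        return x / unit;
    }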
It's ...odd... but I've always liked the simple elegance of (Meditech's) MAGIC languages. Strict left-to-right, even in assignment and in math. Instead of "A gets 1" (A = 1), it's "1 goes to A" (1^A).
When I was in 10th grade, I found a Pascal book (I live in Vietnam, so there was no internet at the time), and I had no computer either. I thought it was a math book, since I was aware of Pascal (the mathematician).
I was super confused when I saw:
i := i + 1
I'm not sure how I got used to it, but when I discovered Erlang I was so much happier, since we no longer have that there.
IMHO, it was a mistake for some languages to make "=" mean assignment. Algol and Pascal got this right and used ":=" to mean assignment. Even the use of "variable" is wrong when talking about mutable memory. "Assignable" would be better.
= has contextual meaning in math, why not programming? It's not like we can fool ourselves into thinking that tests for equality don't also have wildly different meaning depending on context. No operator has a universal meaning, nor should they.
Most equivalence relations eventually drop any special notation in favor of =
In general, = is not rigorous. It's used as a stand-in for the word "is".
Another example is when someone writes x = 1, ..., N, meant to imply x is changing, not that x is equal to a tuple. Also the index in summation notation, which really is an assignment as in programming. I could go on.
How about Σ Sigma notation for summation? It's often written with something like i=1 at the bottom and 1000 at the top to denote the summation from 1 to 1000. The Wikipedia article at https://en.wikipedia.org/wiki/Summation has some examples.
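To tie the two notations together, here is the sum from 1 to 1000 written as a plain C loop, where the "i = 1" under the sigma really does behave like the loop's assignment:

    /* Sigma notation: sum of i, for i = 1 up to 1000. */
    int sum = 0;
    for (int i = 1; i <= 1000; i++) {
        sum += i;   /* i is (re)assigned each value in turn */
    }
    /* sum is now 500500, i.e. 1000 * 1001 / 2 */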
And as a teacher of programming to adult students, this causes a small amount of wasted time with the overwhelming majority of students, because they know what "=" means, even the non-mathematical ones. Usually not a huge obstacle, but a perceptible one.
And once they learn that it's a mutating assignment, some (small) fraction of students continue to forget, for an infuriatingly long time, which direction the mutation goes.
Coincidentally, I am just now reading "Programming in Ada 2012", and found this in the introduction chapter:
> C history ... The essence of BCPL was the array and pointer model which abandoned any hope of strong typing and (with hindsight) a proper mathematical model of the mapping of the program onto a computing engine. Even the use of := for assignment was lost in this evolution which reverted to the confusing use of = as in Fortran. Having hijacked = for assignment, C uses == for equality thereby conflicting with several hundred years of mathematical usage. About the only feature of the elegant CPL remaining in C is the unfortunate braces {} and the associated compound statement structure which was abandoned by many other languages in favour of the more reliable bracketed form originally proposed by Algol 68. It is again tragic to observe that Java has used the familiar but awful C style.
Implementing a cryptosystem is one of the first things for 10-year-olds with no prior programming experience???? That seems intense compared to Scratch, etc.
I don’t mean so much that it’s super difficult, I’m just imagining that 10-year-olds would rather be making pictures move around on the screen over doing a math thing that they might not yet understand the importance of. I could be wrong, it’s been quite a while since I was 10 :) I was trying to imagine 10-year-old me reading that section about RSA in the thing they linked to.
This might not be fact, but I like to think that the colon in := might have been dropped just like diacritic symbols were dropped from the English alphabet. I am not a native English speaker, so I had similar issues learning the English script. It was particularly hard to grasp things like why 'do' and 'go' are pronounced differently. But over time I realised that eliminating extra characters reduces a lot of unnecessary complexity and makes your writing faster.
As someone that's done a bit of Delphi I think he's missed the crucial issue at play: If you use := as assignment your code just looks like a vertical cascade of penises.
At one level, the syntactic difference between = and := and == is moot. At another, since it expresses intentionality (semantic intent), it's quite important.
"takes the value", "is equal" and "becomes equal to" are not saying the same things.
What matters is inside the parser for the language, and expressions (as in spoken or written words) between people about the language. Confusion abounds in the latter, but a well designed language doesn't have moments of confusion in the former case.
Some of the choices incur backtracking cost parsing the input. Some don't but incur more keyboard presses per unit of code expressed.
Assuming you are OK with mutation, it's just syntax. = means assignment, so you only have to type one character; == means equals, so you have to type two. It could be the other way around with := and =, but it's really a non-issue. There's all kinds of weird syntax out there.
In primary school, the focus in expressions involving = is reduction more than equality:
2 + 2 = 4
<n terms> = <constant>
I don't recall teachers showing equality through stupid-but-important identities like:
2 + 2 = 2 + 2
2 + 2 = 2 + 1 + 1
All of these are equal, and that matters later on when expressions with variables aren't reducible to a constant but only to another expression that is equal to the first. IMO it would help a lot for people to practice simplifying a problem into a known identity or an interesting form (for instance, the 'trick' of adding and subtracting one to get closer to a binomial formula).
The arrow always reflects the direction of flow. The difference is just whether you like to have the destination first or last. Basically, are you Motorola or Intel?
It isn't mathematically unsound. It is just a terminal symbol in a grammar that follows a different (arbitrary) convention than mathematical notation does.
And the errors that result from typing "=" when you needed "==" have nothing to do with mathematics. It is a language ergonomics issue resulting from two operators being legal in the same context, with one being similar to (a prefix of) the other.
There are hundreds of years of tradition in mathematics to support making up your own definitions of things in whatever way is convenient for the domain you are dealing with. There is no reason to expect programming languages to deviate from that tradition.
Agreed, fix the tools, not the person. Really, any source you have control over should have all error checking possible turned on and breaking immediately, _especially_ with C derivatives.
-Wall can almost always catch the problem as it will insist on redundant parens surrounding an assignment. So you can only screw it up if you typed extra parens for an equality expression.
Variable on the left is conventional; it's how people sound it out in their head, so it's just easier to process mentally when the expression becomes more complex.
Yoda syntax catches that single error at the expense of making all equality expressions harder to write, read and maintain.
And, besides, if you're going to use Yoda syntax, you're going to want your linter to enforce it... at which point you may as well turn on -Wall anyway.
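For anyone unfamiliar with the term, a quick C illustration of Yoda syntax, i.e. putting the constant on the left so the accidental "=" becomes a hard compile error:

    int x = 3;
    if (5 == x) {       /* Yoda form: reads as "if 5 equals x" */
        /* ... */
    }
    /* The slip 'if (5 = x)' fails to compile, since 5 is not assignable,
       whereas 'if (x = 5)' compiles and silently assigns. */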
I will let clang's amazingly detailed error messages speak for themselves:
ben@burnination ~ $ cat test.c
int main() {
    int a = 3;
    int b = 2;
    if (a=b) {
        return 1;
    } else {
        return 0;
    }
}
ben@burnination ~ $ clang -Wall test.c
test.c:4:10: warning: using the result of an assignment as a condition without parentheses [-Wparentheses]
    if (a=b) {
        ~^~
test.c:4:10: note: place parentheses around the assignment to silence this warning
    if (a=b) {
         ^
        (  )
test.c:4:10: note: use '==' to turn this assignment into an equality comparison
    if (a=b) {
         ^
         ==
1 warning generated.
The concept of assignment only makes sense in a context with a time axis. Computation has a time axis (location of the instruction pointer) but math does not.
Nope. Logical axioms are required logically, but not associated with time. Comprehension is associated with time, so I can understand how you might arrive where you did.
Sorry but I don't understand how you're using the word 'comprehension' here. I agree that logical axioms aren't associated with time (this is what I mean when I say that math doesn't require an idea of time).
It should be noted that in mathematics equivalence and equality are not the same thing.
Letting ~ represent an equivalence relation on a set X:
Equivalence relations have specific properties:
x ~ x holds for all x in X
x ~ y => y ~ x for all x, y in X
x ~ y /\ y ~ z => x ~ z, for all x, y, z in X
However, equivalent elements do not have to be equal, merely equivalent by whatever our relation requires (though they may also be equal).
A common example that I would expect most CS students to have encountered is the idea of "congruent modulo n".
We would say that `x ~ y (mod n)` iff `(x mod n) = (y mod n)`.
So over the whole of the integers we can see that all odd numbers are equivalent/congruent to one another mod 2. But we would not say that they are equal to each other.
On the other hand, we would say that 1/4 is equal to 2/8 (or `1/4 = 2/8`).
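A small C check of that congruence example, just to tie it back to code (restricted to non-negative integers so that % behaves as plain modular reduction):

    #include <stdio.h>

    /* Congruent modulo n: equivalent without necessarily being equal. */
    int congruent_mod(int x, int y, int n) {
        return (x % n) == (y % n);
    }

    int main(void) {
        printf("%d\n", congruent_mod(3, 7, 2));  /* 1: 3 and 7 are equivalent mod 2 */
        printf("%d\n", 3 == 7);                  /* 0: but they are not equal       */
        return 0;
    }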
> In appropriate deference to the manifold ways an object can be presented to us, objects need only be given up to unique isomorphism, this being an enlightened view of what it means for one thing to be equal to some other thing.
I sympathize with this view; I did some simple programming in C-like languages before I learned algebra, and it's sometimes hard to not see = this way.
But if = in math notation denoted "assignment", what could ≠ mean?
Maybe lhs = rhs is the set of assignments implied by a postcondition. So x = 5 implies a single assignment, but x = ±5 could mean two possible assignments. Similarly, x ≠ 5 would mean all the possible assignments except x = 5.
Equality is symmetric. If lhs = rhs then rhs = lhs. x ≠ 5 means exactly that, x is anything that is not 5. So if you are talking about the natural numbers, x could be any element of the set {0, 1, 2, 3, 4, 6, 7, ...}. If x = ±5 then x could be any element of the set {-5, 5}.
No it is not. Have you ever studied logic? Equality is a relation. Assignment does not make much sense if you don't elaborate what you're trying to say. For example, we can assign 1 := S(0), 2 := S(S(0))... in the metatheory so that we can meaningfully use 1, 2 ... in the theory. It is also sometimes denoted with =_{def} and some books use = for this type of assignment and \dot{=} (= with dot on top) for the equality relation inside the theory. Either way, you're not right.
It's not an assignment, it's an expression of a relationship, and you can rewrite it in many equivalent ways. You can't coherently rewrite an assignment that way.
About once every year I have a brain fart on a sleepy morning and confuse myself by using = for comparisons and/or == for assignment... shortly followed by "Nnarhh".
[edit]
:P LOL at downvotes, HN is no place for silly anecdotes, "this is serious things we dooo".
> There are better places if you just want to talk about yourself.
Hmm, I don't agree with what you are implying defines acceptable comment content on HN. Many, if not a majority, of highly rated comments on HN are highly personal and necessarily talk about their authors; it's part of what makes those comments interesting. If we all commented like Wikipedia articles it would be a dull place indeed. My comment may be flawed in the eyes of other HN readers, but not for the reason you have highlighted.
In my case it is both personal and relevant but, to be honest, not interesting or insightful, which is probably why someone mean enough gave me a downvote... but then afterwards I broke the "first rule of fight club", which on HN is equivalent to throwing yourself to the wolves. I accept that, and also can't help it; it's just my nature, I don't self-censor for points.