Is there an excuse for excessively short variable names?

irahul · on Nov 21, 2012

The selected answer from the OP has a reasonable explanation. The code is written by someone who knows the domain, and is using variable names which are commonly used to describe equations in textbooks. I think if the programmer used the short, canonical names but made a top level comment explaining what he is implementing and where to refer for the equation and theory, that should be good enough.

Generally, if the variable is short lived, I prefer short variable names.

    persons.each do |p| 
      p.activate
      p.send_notification_mail
    end

In fact, I think renaming p to person would convolute the code. persons and person look the same if you aren't looking carefully. Iterators are definitely one case where short variable names are not only ok, but desired.

Another scenario is where the convention is to use a particular name for a particular type. As another commenter pointed out, golang conventionally names a ResponseWriter w and a *Request req. As a general case, if writers are conventionally w, it should be all right to name them w. This holds true for Go more than it does for Ruby. Because Go is statically typed, you can make sure that w is indeed an io.Writer, and that's all the information you need.

Another tip from golang: if you have namespaces/modules, you need not repeat the same information twice. file.Writer (not Go; just an example) is better than file.FileWriter.

gus_massa · on Nov 21, 2012

An important thing to teach students is that when they are solving a (physics) problem with pen and paper and create a new variable, they must explain clearly what the variable is. So in this case I agree that it's ok to have short variable names, but with a comment that explains the meaning of each one.

Jabbles · on Nov 21, 2012

A loaded question - "excessively" implies you think it's wrong. Of course you can obscure meaning by using non-descriptive names, but I use a lot of one-letter identifiers if the context is enough to easily interpret them.

For example, all http handlers in Go are of the form:

    type HandlerFunc func(ResponseWriter, *Request)

And when you code such a function the arguments are normally named "w" and "req", ("writer", "request"). To give them longer names would be wasteful, as their types already tell you all that you need to know. (I think the reason you don't just call it "r" is to prevent confusion with a "reader"/"response".) Furthermore, this idiom is so widely applied that it reduces your cognitive load considerably (naming things is hard!). When the type already expresses the purpose, don't bother thinking of a good name.
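
As a rough illustration of the same idea outside Go, here is a Python sketch in which the type annotations carry the meaning and the parameter names stay short. The Request class and the handle function are hypothetical stand-ins made up for this example, not part of any real framework.

```python
import io
from dataclasses import dataclass
from typing import TextIO


@dataclass
class Request:
    """Hypothetical stand-in for an HTTP request (illustration only)."""
    method: str
    path: str


def handle(w: TextIO, req: Request) -> None:
    # The annotations already say "a writer" and "a request",
    # so the short names w and req lose no information.
    w.write(f"{req.method} {req.path}\n")


out = io.StringIO()
handle(out, Request(method="GET", path="/index"))
print(out.getvalue(), end="")
```

Without the annotations, w and req would lean entirely on convention, which is the weaker position a dynamically typed language is in.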

loup-vaillant · on Nov 21, 2012

I saw a similar philosophy at work in a dynamically typed language (Lua):

    function foo(String, Float)
      -- Some code
    end

"String" and "Float" aren't type names. "string" and "float" are. But they are just as effective at signalling intent, when the name of the type is enough.

mistercow · on Nov 21, 2012

It's also pretty easy to have a one-letter convention for type-obvious situations. For example, if I name a variable s, f, i, or n, then I always know that it's a string, float, integer or integer, respectively. Beyond type information, I also always know that "n" is going to convey the notion of a count.

Obviously again this is only reasonable if you only need to know type information, but if you're writing a function like

    repeatString = (s, n) -> (s for i in [0...n]).join ''

then it's completely pointless to give those more verbose names. You wouldn't actually be conveying additional information until you got to an utterly ridiculous point like

    repeatString = (stringThatWillBeRepeated, numberOfTimesToRepeatTheString) -> (stringThatWillBeRepeated for i in [0...numberOfTimesToRepeatTheString]).join ''
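
For comparison, a Python rendering of the same function, assuming the one-letter convention described above (s for a string, n for a count). The name repeat_string is chosen for this sketch only.

```python
def repeat_string(s: str, n: int) -> str:
    # s is the string, n is the count; the signature says the rest.
    return "".join(s for _ in range(n))


print(repeat_string("ab", 3))  # prints "ababab"
```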

wingo · on Nov 21, 2012

An interesting perspective here:

"MCNP is written in the style of Dr. Thomas N. K. Godfrey, the principal MCNP programmer from 1975-1989 ... All variables local to a routine are no more than two characters in length, and all COMMON variables are between three and six characters in length ... The principal characteristic of Tom Godfrey's style is its terseness. Everything is accomplished in as few lines of code as possible. Thus MCNP does more than some other codes that are more than ten times larger. It was Godfrey's philosophy that anyone can understand code at the highest level by making a flow chart and anyone can understand code at the lowest level (one FORTRAN line); it is the intermediate level that is most difficult. Consequently, by using a terse programming style, subroutines could fit within a few pages and be most easily understood. Tom Godfrey's style is clearly counter to modern computer science programming philosophies, but it has served MCNP well and is preserved to provide stylistic consistency throughout."

This is from MCNP4c chapter 2, section B, quoted here: http://wingolog.org/archives/2005/04/09/101. Alas, the link in that article is dead.

FWIW, I neither agree nor disagree with Godfrey's style.

jrajav · on Nov 21, 2012

Here's the whole book that that article quotes: http://public.gettysburg.edu/~bcrawfor/physics/nnscat/C700.P... (~10MB)

And an earlier edition with the same quote: http://mightylib.mit.edu/Student%20Materials/books/mcnp4b.pd... (~5MB)

haberman · on Nov 21, 2012

Short variable names are like pronouns. Their brevity is justified by the fact that their scope is very local and the meaning is well-established by the context.

You wouldn't write "George Walker Bush was the 43rd president of the United States. George Walker Bush served two terms along with George Walker Bush's vice-president, Dick Cheney." Rather we can, after the meaning has been established, refer to him by shorter words like "him," "his," or just "Bush."

This is somewhat easier in strongly-typed languages where the type often gives a strong clue about the initial meaning of the variable. For example, "MutexLock l" seems like a perfectly reasonable variable declaration to me.

csense · on Nov 21, 2012

> "MutexLock l" seems like a perfectly reasonable variable declaration to me

Not to me. I use short variable names a lot, but I never, EVER name a variable "l" -- for the good and simple reason that "l" = chr(0x6c) looks too much like "1" = chr(0x31).

I would call this particular variable MutexLock lock. If I must name a variable "l", I'll spell out the letter name: MutexLock ell.

If I ever make my own language, maybe I'll make "l" a keyword that causes a compile error if it's ever used, to discourage others from this folly.

herge · on Nov 21, 2012

Also, one character variable names are much harder to search for in code review tools and source diffs. I saw code that always used ii, aa, etc. to make searching easier.

haberman · on Nov 21, 2012

Do you avoid using the number 1 for the same reason, falling back on 2/2? "l" and "1" don't look alike in any reasonable programming font.

csense · on Nov 21, 2012

> "l" and "1" don't look alike

They look quite similar in this form submission field. (I'm using Chromium on Ubuntu.)

> Do you avoid using the number 1 for the same reason

Of course not. If you never use ell's, then anything that looks like it might be an ell is actually a one. The fact that these two characters look similar doesn't mean you can't use either of them; rather, you have to pick which one you'll allow yourself to use, and don't use the other one.

Note that I only forgo using the token "l", not the character. So with my convention, it's fine to call a variable "lock", or to use required names like the keyword "yield" or the "__call__" class method.

haberman · on Nov 21, 2012

> Of course not. If you never use ell's, then anything that looks like it might be an ell is actually a one.

Code is not just for your own eyes, it is for the eyes of future readers and maintainers. Though you might be aware of your "no bare l" rule, your readers are not.

Besides the fact that "1" and "l" are intentionally made to look different in fonts used for programming, syntax highlighting usually makes numbers stand out from everything else. In my editor, "1" is red and "l" is black.

_csoz · on Nov 21, 2012

Agree, one-letter variables are only justified in Fortran.

Millennium · on Nov 21, 2012

Variable names should be clear. Sometimes tradition makes short names clear: for example, "i" and "j" are traditional names for loop indices being iterated over. Sometimes, especially in scientific functions, there are standard notations that do the same thing; the most famous is probably either "F = ma" or "e = mc^2".

When this happens, the traditional/standard short names should be used, because the weight of tradition and standards can actually make these names clearer than anything longer and "more descriptive." But there are many cases where there is no tradition or standard to clarify a variable name, and when that happens, you shouldn't be afraid to use something longer.

epo · on Nov 21, 2012

Long variable names are a crutch espoused by those of mediocre ability. Variable names should be explanatory to those who could be expected to comprehend the code, length has nothing to do with it. The domain of this code is physics, if you don't understand what the code is supposed to be doing you shouldn't be messing with it.

That said, a comment at the top explaining the algorithm used and a reference to the literature is all that is really required. A comment explaining what the variables are could be provided if you really must (i.e. you work with those who prize dogma over clarity).

WhaleBiologist · on Nov 21, 2012

I don't think short variable names are bad per se. To me their context in the surrounding code is more important.

I've seen some beautifully long, descriptive variable names, but the clarity their names afford counts for nothing if they reside in a huge recursive voodoo function with multiple execution paths and excessive use of loops. Nothing can save you then :)

tzs · on Nov 21, 2012

"You can know the name of a bird in all the languages of the world, but when you're finished, you'll know absolutely nothing whatever about the bird... So let's look at the bird and see what it's doing — that's what counts. I learned very early the difference between knowing the name of something and knowing something" -- Richard Feynman

A name serves to remind you what the variable represents, and serves to represent the variable in expressions.

Reminding you of what it represents generally works best with long names--with the major exception that names that follow domain conventions are often better than longer names that do not.

Representing the variable in expressions, on the other hand, often improves with shorter names because longer names make it harder to see the relationships in the expression.

It would be a big improvement, in my opinion, if variable names did not have to be so linear in most languages. I want subscripts and superscripts. That allows effectively longer names without taking up too much extra space and making it harder to see the relationships in an expression.

hazov · on Nov 21, 2012

Laziness is a really good one for me. But I'm not a developer; I'm actually a mathematician/statistician now working for a bank. My code is generally a couple of functions, almost none of them too long, that I wrote some time ago and glued together using a huge amount of shell and glue scripts. If I need something new, I do some quick and dirty coding to work around my problem.

It's not exactly ideal, but it works for me. When other people need the code, I change all the variable names.

I did my PhD in Applied Mathematics (CFD, google it), and there almost every code has very short variable names, such as h, k, n, and things like that. It's a common thing in almost every numerical code that I saw out there; if you know something about the theory you will be able to find out what these names are. Sometimes they are two or three letters. Nothing like Sedgewick's algorithms book and its single-letter variable names.

EDIT: Of course, CERN probably works different around their ROOT framework, but I never worked with something of comparable size.

boothead · on Nov 21, 2012

I think there is, but only if there's a recognised convention in both the context and the short name. Examples:

    for n in [some list of numbers]

    def my_view(req):

    Monad m => m a

    (x, y) = coords

I think this is a case where, with enough experience, you know whether it's a good idea when you see it [1]

[1] http://en.wikipedia.org/wiki/I_know_it_when_I_see_it

csense · on Nov 21, 2012

For me, local variable names are short -- usually a single letter. Names that will face external code -- parameters, function names, class members -- should be longer. At least a single word, but often multipleCamelCase or multiple_underscore_delimited words.

Mathematical functions are a special case, in terms of variable naming. Equations usually use short variable names. For something like the parent's situation, I would use the short names (actually shorter names -- prefixing every double with the letter "d" is stupid, as others have already noted, and very confusing since to a mathematically knowledgeable reader dR would denote a delta-radius, i.e. a change in radius).

Also, realize that in mathematics specific letters are often used for specific purposes. I usually use r for radius, p,q for prime numbers, u,v for vectors or lists, s for a set, s,t for time, x,y,z for coordinates, theta,phi for angles, f,g,h for functions, h,i,j,k for indices/loop counters, m,n for counts/lengths, k for a material constant (e.g. spring constant). Since most mathematics textbooks and papers use similar conventions, mathematically trained readers usually have no trouble. Those who have a programming background but little mathematical training may struggle, since as far as I know these conventions aren't documented anywhere -- they're just something you sort of pick up over time as you read mathematics.

But, if it is not obvious, I would write in a function-level or module-level doc comment what each of the variables stands for, even more descriptively than the author. I would include a description of the geometric object the code is dealing with. Perhaps even an ASCII art diagram.

As others have noted, it would definitely also be good to describe what equation or algorithm is being implemented -- not necessarily a full description or proof of correctness, but at least a URL describing the situation more fully (I often cite Wikipedia articles), a Google-able name, and (if appropriate) a literature reference to a paper or textbook. (My preference is to include a citation to something online without a paywall, to make it as easy as possible to get at the information later, but less readily available resources like textbooks or papers can be very helpful as well -- they are sometimes the best or even the only resource that describes a particular algorithm.)

The bottom line is to use whatever's most consistent with existing work, and communicate most effectively to others -- collaborators, future maintainers, and most importantly your future self -- what the code does.
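
A minimal Python sketch of the doc-comment approach described above: short textbook names in the code, with the docstring spelling out what each one stands for and which equation is being implemented. The function name and the choice of formula are invented for this example.

```python
import math


def projectile_range(v: float, theta: float, g: float = 9.81) -> float:
    """Horizontal range of a projectile launched from level ground.

    Implements R = v^2 * sin(2*theta) / g (no air resistance).
    Search term: "range of a projectile".

    v     -- launch speed in m/s
    theta -- launch angle above the horizontal, in radians
    g     -- gravitational acceleration in m/s^2
    """
    return v ** 2 * math.sin(2 * theta) / g


print(projectile_range(10.0, math.pi / 4, g=10.0))
```

The body stays as close to the textbook formula as the language allows, and the reader who doesn't recognise v, theta, or g has one place to look.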

johnchristopher · on Nov 21, 2012

I am pretty sure there was a time when the number of letters in a variable name had a significant impact on memory consumption. Hence the old 8 letters cap. edit: more like a C language-specific thing http://www2.its.strath.ac.uk/courses/c/subsection3_6_2.html "Some old implementations of C only use the first 8 characters of a variable name."

In this specific example I hold the position that the problem doesn't come from small variable names but from a lack of good documentation. Short variable names aren't a problem per se.

For instance, most articles I read tend to define acronyms one time at the beginning of the corpus and then only use acronyms for a given idea/concept/reference.

Writing good documentation is hard.

csense · on Nov 21, 2012

If a function is more than a page long, refactoring should be considered.

It sounds like perhaps the solution to the parent's readability problem might be replacing the function in question with two or more functions: one function to load the variables from the table into a class instance, and another to take that class instance and run the algorithm.

It's almost always good to keep code that interfaces with the outside world -- reading information from a database, sending triangle meshes to the GPU, whatever -- separate from pure mathematical code. It sounds like the parent's codebase doesn't follow this philosophy.

Depending on its complexity, the algorithm might need to itself be refactored into multiple functions.
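
A minimal sketch of that split, in Python. The row layout, the Sphere class, and both function names are hypothetical; the parent's actual table and algorithm are not shown in this thread.

```python
import math
from dataclasses import dataclass


@dataclass
class Sphere:
    """Hypothetical parameter bundle (illustration only)."""
    r: float  # radius


def load_sphere(row: dict) -> Sphere:
    # Boundary code: turns an untyped table row into a typed value.
    return Sphere(r=float(row["radius"]))


def volume(s: Sphere) -> float:
    # Pure math: V = 4/3 * pi * r^3, no I/O and no table access.
    return 4.0 / 3.0 * math.pi * s.r ** 3


row = {"radius": "2.0"}  # stands in for a database row
print(volume(load_sphere(row)))
```

The long, descriptive key ("radius") lives at the boundary; the short name r only appears inside the math, where it matches the formula.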

lnanek2 · on Nov 21, 2012

I do support names as comments to a degree, but for math stuff, I'll often use shorter names. If you have a bit of math with a bunch of variables used in one line, with long descriptive names, it's going to take up multiple lines and be tough to read and understand.

I have a 10-line method for calculating distance from coordinates in one program, for example. Each statement fits on one line and is pretty easy to understand because you can see all the operations. It would be 30 lines with longer variable names and tougher to read and understand, with some statements split across multiple lines.
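
To make the trade-off concrete, here is the same Euclidean distance written twice in Python. Both functions are invented for this sketch; they are not the commenter's method.

```python
import math


def dist(x1: float, y1: float, x2: float, y2: float) -> float:
    # Short names: the whole formula is visible at a glance.
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx * dx + dy * dy)


def distance_between_points(
    first_point_x_coordinate: float,
    first_point_y_coordinate: float,
    second_point_x_coordinate: float,
    second_point_y_coordinate: float,
) -> float:
    # Same computation; the expressions now have to wrap across lines.
    horizontal_difference = (
        second_point_x_coordinate - first_point_x_coordinate
    )
    vertical_difference = (
        second_point_y_coordinate - first_point_y_coordinate
    )
    return math.sqrt(
        horizontal_difference * horizontal_difference
        + vertical_difference * vertical_difference
    )


print(dist(0, 0, 3, 4), distance_between_points(0, 0, 3, 4))
```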

engtech · on Nov 21, 2012

Short variable names are fine if the variable exists for a short time (eg: if the variable lives more than 5 lines of scope then use a longer variable name).

But don't use single-character variable names. We've all gotten in the habit of using i,j,k as loop variables, but ii,jj,kk are absolutely a better choice for the simple reason that you can use any text editor's search function to find "ii" without any false matches, but "i" will have you searching all over the place.

TL;DR - using "ii" instead of "i" makes it easier to maintain code.
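
The search problem is easy to demonstrate. The Python sketch below counts plain-substring hits for i and ii in a small invented snippet, and also shows a word-boundary regex, which is one alternative (not mentioned in the comment) for finding a single-letter name.

```python
import re

# Two versions of the same invented loop, differing only in the index name.
with_i = "for i in range(limit):\n    print(items[i])\n"
with_ii = "for ii in range(limit):\n    print(items[ii])\n"

# Plain substring search: "i" also matches inside in, limit, print, items.
hits_i = with_i.count("i")
hits_ii = with_ii.count("ii")

# Whole-word search finds only the variable itself.
word_hits_i = len(re.findall(r"\bi\b", with_i))

print(hits_i, hits_ii, word_hits_i)
```

Editors with a "match whole word" option get the same effect as the regex, so the doubled-letter trick matters most in tools that only offer substring search.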

actsasbuffoon · on Nov 21, 2012

Do you think it's likely that you're going to want to search for those short-lived variables? I think short variable names being less likely to create false-positives when searching is a feature, not a bug.

jiggy2011 · on Nov 21, 2012

A nice editor/IDE feature might be to be able to use 2 names for a variable.

Short name for ease of typing and understanding when you are writing the code and a longer one for when you come back to it 6 months later or give it to someone else to understand.

Of course you would need to deal with collisions in a sensible way.

Something I have observed is that Java programmers seem to like longer variable names (sometimes hideously long) whereas C programmers often seem to use one, two or three letter names. Python/ruby seem to be somewhere in the middle.

smoyer · on Nov 21, 2012

When we wrote assembly language in fixed columns in an 80 character wide green-screen, it was useful to be able to fit the operands/jump labels into two columns (16 characters), so we tended to keep labels below 12 characters. We tried to be as expressive as we could with the names, but there were obvious limits.

As an aside (and from memory) we used the following tab stops: 0 tabs for labels, 2 tabs to the instruction, 3 tabs to the operands and 5 tabs to the EOL comments. All tabs were 8 characters.

ericHosick · on Nov 21, 2012

Don't want to sound mean, but is there an excuse for excessively long names?

http://javadoc.bugaco.com/com/sun/java/swing/plaf/nimbus/Int...

(link from a recent HN post) (Edit: Just wanted to point out that the concept goes both ways).

brazzy · on Nov 21, 2012

The only excuse for such a monstrosity would be that it's machine-generated and follows some scheme that prevents name collisions.

If you look at the other class names in that package, it seems pretty clear that the name was in fact the result of a bug that duplicated prefixes. It almost certainly should have been InternalFrameTitlePaneMaximizeButtonPainter, which is admittedly pushing it but can be justified.

ericHosick · on Nov 21, 2012

I would argue that the reason why we end up with really long or short names is of no importance: they exist.

The only important factor of variables names is that they help with "documenting" the source code.

I just wanted to point out that it goes both ways.

zxcdw · on Nov 21, 2012

Indeed. However, I personally do see that variable names/naming is really something where some people tend to over-engineer things. Say iterating over an array. Why on earth would anyone bother with using "index" instead of "i"? What does it add, other than unnecessary bytes? Another example would be using "Input" and "Output" as argument names instead of simply "In" and "Out". Unnecessary bytes, repetition, noise for no stronger signal. No new information, just noise.

ericHosick · on Nov 21, 2012

True that.

Now where is that quote (just for fun):

"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton

mromanuk · on Nov 21, 2012

here you have one: tired fingers. myReallyLongAnTemporaryVariable could be just t :)

Regarding the article, shortening and using a simple name without superfluous information is the best approach:

dR = radius: this doesn't make any sense; radius is better. But in for(i;i<something;i++) {}, everybody knows that i is an iterator.

Context is really important to name your variables.

aroberge · on Nov 21, 2012

Physicist here. In a scientific program I would most likely not use radius. I definitely would not use dR (type should be clear, no need to use Hungarian notation which I find to be an abomination) and would most likely use either R or r, the choice depending on what symbol was used in the equation(s) written in standard form. I probably would include in a comment the relevant equations using TeX notation and a reference to a scientific article or book from where it was taken.

mercurial · on Nov 21, 2012

> here you have one: tired fingers. myReallyLongAnTemporaryVariable could be just t :)

With most editors/IDEs, you'll type the variable in full once and autocomplete it the rest of the time, so not much of an excuse.

agentultra · on Nov 21, 2012

Sure, in certain situations like inside of a LAMBDA one might use a LABELS or LETREC (depending on whether you're in Lisp/Scheme land) form to close over the arguments that don't change between recursions... it's typical to use a short name since no one else will ever see it outside of the function definition.

tych0 · on Nov 21, 2012

I think "A variable name's length should be directly proportional to the size of its scope" is a healthy mantra. As other comments have pointed out: in anonymous functions and other obvious contexts, short variable names are often more clear and much less annoying.
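
A small Python sketch of that mantra, with names invented for the example: the module-wide constant gets a descriptive name, while the variable that lives for a single expression gets one letter.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # module-wide scope: descriptive name


def days_to_seconds(days):
    # d exists only inside this one expression, so one letter is enough.
    return [d * SECONDS_PER_DAY for d in days]


print(days_to_seconds([1, 2]))
```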

cema · on Nov 21, 2012

Several cases when it makes sense. One is mentioned in the article, copied from a math formula. Another frequent case, in clojure and C#, for example, is a placeholder.

    Expr.SortBy(x => x.Id)

Still, I prefer to comment non-obvious cases.

lhnz · on Nov 21, 2012

An 80 character artificial limit in some coding standards often causes developers to have to use difficult to understand variable names. The 80 character limit is apparently a readability thing which I find quite ironic.

sixothree · on Nov 21, 2012

I really hate dealing with short variable names. In a recent project I came across a variable named sfile. In some cases it meant the filename, in others it meant the file contents. I wanted to kill.

theorique · on Nov 21, 2012

It's a rotten idea to overload or reuse a variable name, unless it's obvious from context (e.g. i is the loop index for a variety of different loops).
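
One way to avoid the sfile situation is to give each meaning its own name, even if both names are short. A Python sketch, with path, text, and read_text all chosen for this example:

```python
import os
import tempfile


def read_text(path: str) -> str:
    # path is always a filename; text is always file contents.
    with open(path, encoding="utf-8") as f:
        text = f.read()
    return text


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")
    print(read_text(path))
```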

mistermcgruff · on Nov 21, 2012

I use descriptive names unless programming in R in which case I use obscenely short names to make me feel more like a mathematician.

nwmcsween · on Nov 21, 2012

I generally code inline documentation first and code second and usually utilize 1-3 char variable names.

ArekDymalski · on Nov 21, 2012

It's more an explanation than an excuse: habits from a BASIC childhood.

    10 A = 0
    20 A = A + 1
    30 goto 10

;)