Quantcast
Channel: prose :: and :: conz by joescii » Software Development
Viewing all articles
Browse latest Browse all 4

On naming

$
0
0

There are only two hard things in Computer Science: cache invalidation and naming things.

– Phil Karlton

At work a teammate and I have a long running joke about how we disagree with identifier naming. He favors long “descriptive” names, and I like short terse names. We never get angry over this disparity in our views, and mostly find good compromises during code review. As long as I’m not working alone, I’m more than willing to lay aside my preferences for the greater good of the team. Yet these past few days, my bias towards shorter names has been triggered into some passionate responses.

The most notable was a recent podcast from Functional Geekery featuring an interview with Adi Bolboaca. In this episode, Adi is discussing his experiences leading code retreats with host Steven Proctor. I really got worked up when Adi was speaking of how he battles functional programmers and their short naming. His claim is that descriptive naming was a principle of clean code which transcends paradigms. I just really felt that he got that one wrong.

But why? Why I do I feel strongly about using shorter names? I’ve put some thought into it and I’ve identified a few factors. Firstly I was a math geek before a software geek. I’m not talking about that math minor they forced you to get with your CS degree where you’re playing the role of the computer and not the mathematician. I’m talking about the “advanced” mathematics courses where the focus is on clearly communicating abstractions. I can’t quite recall names longer than a single character in my math courses. We’d rather reach into the bag of Greek letters to retain the brevity. If a second letter is introduced, it is as a suffix to indicate something else significant. Not a character is wasted. The reader is expected to apply the short names to abstract concepts and roll with it.

Fast-forward to the present where I am now engrossed with functional programming. This paradigm is far more closely related to the discipline of mathematics than traditional imperative programming techniques and theory (if any such thing exists). Given that functional programming is much more like math lends credence to my feeling that “clean code” applied to functional programming doesn’t carry long identifier names with it. Terseness is part of the paradigm. One can even argue that long variable names are an anti-pattern. I think it’s just the unfamiliarity of our Reverse Hungarian notation that led Adi to believe the principle still applies.

However, I feel I can lay aside the relationship to mathematics and it’s use of terse naming in route to understanding the underlying reason for my preference of short names. It all goes back to my attitude towards commenting code. The computer doesn’t care what you name things. The names we use are of importance to human reader of the code. They’re glorified comments. Names which are public and hence get included as part of an API and its respective documentation are certainly important and should carry meaning. Names which are internal to the code are mere comments attached to code artifacts. As I stated previously, I agree that the only useful comment is one that answers the question Why?. The name of an identifier never answers Why?. Rather, it always answers What? and often the context of the code has already answered this question, rendering the name redundant.

Perhaps the most classic example is the good ol’ loop iterator i:

for(int i=0; i<=10; i++) {
    System.out.println(i);
}

Developers worth their salt never use a verbose name like index for an iterating index such as this. The context and structure of the code already tell us that this is an index. By having the shorter name, the operative properties are more pronounced in the construct. That is, the start value 0, the stop condition <=10, and the increment expression i++ are quickly and easily identifiable. Longer names such as index, currentIndex, or indexOfTheLoop clutter this up and distract from these three vital pieces of information. Given that these three dictate what the code actually means versus what the author of the code thinks it should mean, I’m strongly inclined to prefer the more concise version.

In fact, functional languages like Scala take this even further. This construct is so common, we don’t even need as much clutter as the simple example above gives us. We compress it down further to the following.

for { i <- 1 to 10 } {
  println(i)
}

We see that the most important information in this construct is even more prominent: the start value 1 and the stop value 10. Incrementing by 1 is so common, it’s assumed unless you tell it otherwise. Don’t confuse this for more of my Scala evangelism for the Java oppressed. I’m pointing out that the functional paradigm’s embrace of terseness behooves us to reconsider what is regarded as clean code. I believe this results in those who have not adopted this paradigm erroneously believing the best practices transcend. The principles certainly transcend, with code readability being the principle in this discussion, but not necessarily the principles. I’ll go even further and argue that shorter names works well for imperative programming too, just as the for loop above suggests.

So what are my underlying principles with naming?

  • I believe that name length should be roughly proportional to scope size. The i in the above examples is a good specimen of a name with a tiny scope, and hence a tiny length is appropriate.
  • The name length should be inversely proportional to its frequency of use. A name that is repeated constantly through code needs to be short lest we introduce an epidemic of carpal tunnel. A frequently used name even if terse will be well understood. A great example of this is Lift’s S.
  • Don’t repeat the construct in the name. This is why we needn’t include “index” in the above name, neither should “accumulator” be used in a fold or reduce expression.
  • Don’t clutter the name with the context of the name. If you have a class named BankAccount, don’t use accountBalance. “Account” is already there, just let it be balance. Or if you have an actor, don’t name it’s internal state as currentState unless you are distinguishing it from other states such as previousState. Just let it be state.

Chances are good that I’m wrong that longer names are detrimental to readability. At least that was the consensus on twitter. If you take nothing else away from this post, let’s at least agree that longer isn’t always better.

Leave a reply below, or send me a tweet.



Viewing all articles
Browse latest Browse all 4

Latest Images

Trending Articles





Latest Images