Saturday, January 29, 2005 - Posts

Tabs for Indenting Code

Feel free to VOTE on whether you use/prefer tabs or spaces for indenting.

After reading Brad's “Internal Coding Guidelines” again, I notice one of his suggestions reads:

2.1 Tabs & Indenting
Tab characters (\0x09) should not be used in code. All indentation should be done with 4 space characters.

I just don't get it. I never understood this. Some people have said that tabs don't show up well in primitive editors like notepad. Well, whose fault is that?

I can see a number of benefits to using tabs over spaces for indenting code:

1. If you choose to use spaces, you're forcing the next person to use the same layout that you're using. If you use tabs, then with the right editor you let the next developer customize the layout of the code to their own preferences.

2. Navigating across indents takes 1/4 of the keystrokes: one arrow-key press per tab instead of four per space-indent.

3. Entering indents takes 1/4 of the keystrokes.

4. Deleting indents takes 1/4 of the keystrokes.

5. Files end up being smaller.

6. Much easier to vertically align. With spaces, it's easier to make mistakes and have jagged alignment... I see it all the time in space-indented files.

7. It encourages people to use better editors than notepad.
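Points 1 and 5 are easy to check for yourself. Here is a minimal Python sketch, assuming a made-up C#-style snippet (`sample`) and a helper I've called `spaced`; neither comes from Brad's guidelines. It converts tab indents to 4 spaces, measures how much the text grows, and then shows the same tab-indented text at a few different tab widths.

```python
# Hypothetical sample: a C#-style snippet indented with one tab per level.
sample = "if (ready)\n{\n\tfor (;;)\n\t{\n\t\tRun();\n\t}\n}\n"


def spaced(source: str, width: int = 4) -> str:
    """Rewrite leading tabs as `width` spaces each, leaving the rest alone."""
    lines = []
    for line in source.splitlines(keepends=True):
        body = line.lstrip("\t")
        depth = len(line) - len(body)  # number of leading tabs on this line
        lines.append(" " * (width * depth) + body)
    return "".join(lines)


converted = spaced(sample)

# Every tab became four characters, so the text grows by three per tab.
growth = len(converted) - len(sample)
print(growth, "extra characters for", sample.count("\t"), "tabs")

# The tab version can be displayed at whatever width the reader prefers.
for width in (2, 4, 8):
    print(f"--- tab width {width} ---")
    print(sample.expandtabs(width), end="")
```

The space-indented version is locked to one width, while the tab-indented one renders at 2, 4 or 8 columns without touching the file. The size saving is real but small, and it only holds when tabs are used for leading indentation alone.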

What do you think? Vote here!

Thoughts on Naming Convention

Naming convention is something that all programmers come up against at some time or another, and it's a pretty important subject.

So yesterday I was watching the "Designing .NET Class Libraries" series and listening to Brad Abrams talk about the naming conventions they have adopted at Microsoft and that they recommend to others.

A lot has been said on the subject already. But you know, each developer has their own experiences and own views, so I've summarized the points he made and added some of my own views after each one.

1. Don't re-invent the wheel. Microsoft have spent hours upon hours debating naming conventions, so why rehash the same arguments when you can just follow their detailed guidelines? A lot of the arguments are unresolvable, so sometimes it's best to just choose a standard and stick to it.

2. Brad says that "privates are your own business". Microsoft's guidelines only apply to publicly exposed members of your classes. That said, the guidelines are detailed and broad enough that they can easily be applied to private members too. [Correction: Brad Abrams has now provided some guidance on naming of privates].

I actually think it's a bad idea to tell people to "do what they like" with privately scoped members. Most programmers work on code that is either owned by their company, or at least shared with other developers. Without any real guidance on convention here, you'll get one developer prefixing fields with underscores, another prefixing with "pv", another prefixing with "m_", another using Pascal casing… it ends up a nightmare. And I've seen it happen too.

3. Two styles of casing: Pascal and caMel. Pascal casing has each word in the identifier starting with a capital letter. Camel casing is similar but the first letter of the identifier is lower cased.

This sounds straightforward enough, but it does imply that there's no need for any other method to delineate words in your identifiers. You don't need underscores. You really don't need prefixes.

4. Public methods, and all classes, properties and events use Pascal casing. They start with a capital initial letter.

Something that really bothers me with Java code is the way people use camel casing for method names. It makes it very difficult to see where the object name ends and the method name starts. Didn't Sun do any readability tests?

5. All parameters, locals and private fields use camel casing. To distinguish between private fields and locals, access the fields through "this.".

Some people take this further and put a "this." prefix on ALL member accesses. I think that's a bit over the top, and in some ways defeats the purpose of object orientation. Personally I avoid ambiguities like that in the first place, even if it means being creative in naming parameters.

6. Do not prefix enumerators with any letter.

7. Avoid abbreviations in words. If you must abbreviate (and it's a well known abbreviation) treat the abbreviation as if it were a word in itself, and case it accordingly. (One technique for seeing if it's a well known abbreviation? According to Brad, Google it).

8. One or two letter acronyms should be in upper-case. For example, IO or UI.

They did add that it's a bit arbitrary that this rule applies only to two letters, while three letters should use camel casing. But if you look at the difference between two and three letter caps in a word, you'll understand that's a good place to draw the line.

9. Abandon Hungarian naming. Even though it's fairly well documented, in a lot of cases the rules used to determine the prefix were arbitrary, so it was difficult for developers to decode, which led to inconsistency and a dependency on documentation. With the dawn of more strongly typed languages and IntelliSense, the need for Hungarian notation has diminished.

I definitely agree here. Once you use Hungarian though, it's quite difficult to give it up. I sometimes find myself doing this for control names on forms, for example. But it's really not necessary. I stop myself now, and just append the type to the end. For example, NameTextBox rather than txtName.

10. Don't use underscores. Pascal and camel casing allow us to delineate words with the use of casing. It's generally accepted that this style is easier on "readability flow" than the use of underscores to separate words. Underscores break the flow and make code more difficult to read.

Underscores are evil. If you think underscores help readability then you need glasses. Use a bigger font. Eat carrots. See a doctor. Anything. Just don't use underscores.

11. Don't use all-caps. Again this makes the code more difficult to read, so all-caps should be avoided.

12. Plurals for collections. Methods that return collections, arrays or just multiple items should use the plural. For example, GetMember returns one, GetMembers returns many. Seems obvious but it's important to clarify.

13. Don't prefix classes with C. Microsoft felt dropping the C didn't make for any loss of meaning in the class's name.

14. Do prefix interfaces with I. There are generally fewer interfaces than classes, and it's important to differentiate them from each other, therefore it was chosen to prefix interfaces with the letter 'I'.

15. Consistency within the context of all your software is more important than anything else. One example Brad gives is a time he was working with a load of developers from England and they all wanted to use the Colour spelling because that was natural to them. They finally made the call to adopt the US spelling simply for the sake of consistency with the rest of the framework.

Sometimes I think it's actually better to be consistently wrong than to realize you're wrong 3/4 of the way through and make the last quarter right. The other day someone asked my advice on something: he'd developed all of his classes to have a DL prefix to indicate they were part of the data layer. With one class left to write, he realized that it was unnecessary because the namespace already indicated that. He wanted to know whether he should continue with the DL prefix for the last class, or just do it the right way. I actually recommended he keep the DL prefix, because without it I think the user of his class library will be confused about why that one class is named differently. At a later date he can fix all of them.

16. Consistency with prior art is also important. Look at similar classes in the framework, or similar, leading products in the same industry and see what terminology and naming convention they use, and try to remain consistent with that.

This is very important. A company where I'm consulting right now has developed a loan management system. They developed the system internally, and are pretty consistent on the terminology they use in the application. It wasn't until they started looking at integrating with third-party systems that they determined some of their terminology differed from that used by others in the industry. The result? Increased training required for new employees, confused users, and nightmare coding where you can't decide whether to use one naming system or the other. It's good to research this stuff before you start.
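The casing rules in points 3, 7, 8, 10 and 11 are mechanical enough to sketch in a few lines. This is only a rough Python illustration, not anything from Microsoft's guidelines: the helper names (`cased`, `pascal`, `camel`, `problems`) are my own, and it assumes the caller passes acronyms in upper case so that two-letter ones like IO can be recognized.

```python
from typing import List, Sequence


def cased(word: str) -> str:
    """Case one word for use inside an identifier.

    Acronyms of one or two letters stay upper-case (IO, UI). Anything longer,
    including abbreviations such as XML, is treated like an ordinary word.
    """
    if len(word) <= 2 and word.isupper():
        return word
    return word[:1].upper() + word[1:].lower()


def pascal(words: Sequence[str]) -> str:
    """Pascal casing: every word starts with a capital letter."""
    return "".join(cased(word) for word in words)


def camel(words: Sequence[str]) -> str:
    """Camel casing: like Pascal, but the first word is all lower-case."""
    if not words:
        return ""
    return words[0].lower() + pascal(words[1:])


def problems(name: str) -> List[str]:
    """Flag the two habits points 10 and 11 rule out (a crude check)."""
    found = []
    if "_" in name:
        found.append("underscore")
    if len(name) > 2 and name.isupper():
        found.append("all-caps")
    return found
```

With these, `pascal(["get", "members"])` gives `GetMembers`, `pascal(["XML", "reader"])` gives `XmlReader`, and `camel(["IO", "stream"])` gives `ioStream`. The check is deliberately simple: it would not catch Hungarian prefixes like txtName, which need a human (or a much smarter tool) to spot.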

Since I started this I noticed that Brad has posted a short list of internal guidelines they use at Microsoft which covers a lot of what I've summarized. His guidelines include suggestions on code structure too, along with some guidelines for private member naming. You can read that here.

Anyway I'm glad Microsoft provide such detailed guidelines for naming, and that they are constantly clarifying and communicating these conventions through their blogs. I've had many arguments with development teams about naming styles and conventions, and like I said sometimes it's best just to stop debating and follow the crowd. While the convention you end up with may not be perfect, at least it's common and easily understood.