I'm always trying to get people to write better code. Come to think of it, I'm always trying to write better code myself. And the more I analyze how programmers approach writing software, the more I realize that it's often a thoughtless process: it's easy to bang away on the keyboard tackling each challenge almost on a line-by-line basis, disregarding the big picture of your application's overall design. Call me pessimistic, but it's so easy that well thought-out code is quite a rare find these days.
Recently I've been experimenting with an idea I had to encourage developers to improve their code. The idea involves enforcing an arbitrary line limit on the code they write. For example, I may tell them to implement a certain feature, but that the code they write cannot be longer than, say, 100 lines. That may sound like a stupid idea, but it always ends up with better and more well thought-out code. Let me explain the logic behind this.
Given enough time, a programmer of any standard can implement pretty much any application. The result may end up a completely unmaintainable mess, but it'll work for sure. Sometimes. Eventually. But at what point does an application become a mess? Usually when it's too big: when there are too many lines of code.
Have you ever had to take on someone else's code and complained that the code is unmanageable because it's too small? I doubt it. Generally, even unmaintainable code can easily become maintainable if it's small enough, because we can simply rewrite the complicated sections. However, when you're faced with an unmaintainable application consisting of millions of lines of code, the likelihood of a bad design increases, and with it the likelihood of a complete rewrite. By keeping your project's line count low, you're helping keep your project maintainable.
Now, I know it's easy to make these blanket statements that the more lines of code, the more complex the application becomes - but it's not as random as you might think. The concept certainly isn't new, but I do think it deserves more attention.
When you're forced to write code in fewer lines, you suddenly take on an entirely different mindset and approach to programming. Consider how it changes your approach:
- You're forced to re-use code because it might be the only way you can reduce the number of lines.
- You start thinking twice before copying and pasting code.
- You look at your classes to see how you can use them polymorphically.
- You derive more classes and build a class hierarchy.
- You might be forced to use declarative programming techniques, because by moving the essence of your application's initialization into configuration you might find you've ended up with less duplication of logic.
The bottom line is that there are quite a lot of good programming practices that are forced onto you when you're given a constraint on the size of the code you write.
And the key isn't necessarily to set an arbitrary line limit, but to ask the question: can this code be any smaller than it is? By setting an arbitrary line limit, you're forcing developers to make it as small as it can be. The line limit is just a carrot dangling on the end of a stick.
If you're a lazy programmer, which I am, then some of this may come naturally to you. Of course, I don't mean mentally lazy, I mean physically lazy. Yes, so lazy you can't be bothered to move your fingers to type out all those lines. Lazy programmers don't need a team lead telling them to write fewer lines of code; the very thought of having to maintain, let alone type out, all that code is enough to constrain the amount of code they write in a day.
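As one reading of the last bullet above, here is a minimal sketch of moving per-case logic into data. The class name, loan kinds and rates are all hypothetical and invented purely for illustration; the point is only that one table lookup can stand in for a chain of near-identical branches.

```csharp
using System;
using System.Collections.Generic;

static class LoanRates
{
    // Hypothetical data: these loan kinds and monthly rates are made up for the example.
    static readonly Dictionary<string, float> MonthlyRate = new Dictionary<string, float>
    {
        { "mortgage", 0.004f },
        { "car",      0.006f },
        { "business", 0.008f },
    };

    // A single lookup replaces one if/else branch per loan kind.
    public static float MonthlyInterest(string kind, float principal)
    {
        float rate;
        if (!MonthlyRate.TryGetValue(kind, out rate))
            throw new ArgumentException("Unknown loan kind: " + kind);
        return principal * rate;
    }
}
```

Adding a new loan kind then means adding one line to the table rather than another branch of logic.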
The opposite tendency, code that grows without restraint, is what some call "code bloat", and I think it's sufficiently widespread to warrant some major action. You don't have to look very far to find the evidence: it seems the more memory and hard disk space available on average machines, the less care goes into keeping code small. It really wasn't all that long ago, relatively speaking, that a useful operating system could fit on a 180 KB floppy disk. And now Windows XP on my machine takes up almost 2 gigabytes.
Let's look at an example of what I mean. Take this C# code here:
class Mortgage
{
    public bool IsLoanActive;
    public float Principal;
    public float Interest;
}
class CarLoan
{
    public bool IsLoanActive;
    public float Principal;
    public float Interest;
}
class BusinessLoan
{
    public bool IsLoanActive;
    public float Principal;
    public float Interest;
}
// returns the total interest for a year across all active loans
// (Interest is assumed to be a monthly amount, hence the * 12)
float GetTotalInterestForThisYear(Mortgage[] mortgages, CarLoan[] loans1, BusinessLoan[] loans2)
{
    float totalInterest = 0;
    foreach (Mortgage mort in mortgages)
    {
        if (mort.IsLoanActive)
        {
            totalInterest = totalInterest + mort.Interest * 12;
        }
    }
    foreach (CarLoan loan1 in loans1)
    {
        if (loan1.IsLoanActive)
        {
            totalInterest = totalInterest + loan1.Interest * 12;
        }
    }
    foreach (BusinessLoan loan2 in loans2)
    {
        if (loan2.IsLoanActive)
        {
            totalInterest = totalInterest + loan2.Interest * 12;
        }
    }
    return totalInterest;
}
// returns a list of all active loan amounts
float[] GetAllLoanAmounts(Mortgage[] mortgages, CarLoan[] loans1, BusinessLoan[] loans2)
{
    int totalActiveLoans = 0;
    // find out how many active loans there are
    foreach (Mortgage mort in mortgages)
    {
        if (mort.IsLoanActive)
        {
            totalActiveLoans = totalActiveLoans + 1;
        }
    }
    foreach (CarLoan loan1 in loans1)
    {
        if (loan1.IsLoanActive)
        {
            totalActiveLoans = totalActiveLoans + 1;
        }
    }
    foreach (BusinessLoan loan2 in loans2)
    {
        if (loan2.IsLoanActive)
        {
            totalActiveLoans = totalActiveLoans + 1;
        }
    }
    float[] amounts = new float[totalActiveLoans];
    int count = 0;
    // put the amounts into the array
    foreach (Mortgage mort in mortgages)
    {
        if (mort.IsLoanActive)
        {
            amounts[count] = mort.Principal;
            count = count + 1;
        }
    }
    foreach (CarLoan loan1 in loans1)
    {
        if (loan1.IsLoanActive)
        {
            amounts[count] = loan1.Principal;
            count = count + 1;
        }
    }
    foreach (BusinessLoan loan2 in loans2)
    {
        if (loan2.IsLoanActive)
        {
            amounts[count] = loan2.Principal;
            count = count + 1;
        }
    }
    return amounts;
}
The code inside the function bodies totals about 75 lines. It may not look that bad to the untrained eye, but just look at all that redundancy - the code that almost looks like it was copied and pasted. I hate seeing visual patterns in my source code. If I see a pattern, then that tells me the code needs to be refactored.
You can actually reduce those 75 lines to about 10 lines (while retaining the same method signatures), and maybe fewer than that, with only a little additional thought.
using System.Collections; // needed for ArrayList

class Loan
{
    public bool IsLoanActive;
    public float Principal;
    public float Interest;
}
class Mortgage : Loan { }
class CarLoan : Loan { }
class BusinessLoan : Loan { }
float GetTotalInterestForThisYear(Mortgage[] mortgages,
                                  CarLoan[] loans1,
                                  BusinessLoan[] loans2)
{
    float total = 0;
    foreach (Loan[] loans in new Loan[][] { mortgages, loans1, loans2 })
        foreach (Loan loan in loans)
            if (loan.IsLoanActive) total += loan.Interest * 12;
    return total;
}
// returns a list of all active loan amounts
float[] GetAllLoanAmounts(Mortgage[] mortgages,
                          CarLoan[] loans1,
                          BusinessLoan[] loans2)
{
    ArrayList list = new ArrayList();
    foreach (Loan[] loans in new Loan[][] { mortgages, loans1, loans2 })
        foreach (Loan loan in loans)
            if (loan.IsLoanActive) list.Add(loan.Principal);
    return (float[])list.ToArray(typeof(float));
}
In this code, we're making good use of polymorphism. We need fewer comments because it's harder to get lost while reading the code. We're also making some innovative use of jagged arrays to eliminate the need to repeat the code for each parameter. This works because C# arrays are covariant: a Mortgage[] can be used wherever a Loan[] is expected. And with some additional understanding of .NET's collection types, we can create dynamic arrays that save us from having to determine the size up-front.
Of course there are dangers to telling developers to reduce the number of lines of code. Remember those days when C programmers would write lines so cryptic it was impossible to tell whether they were obfuscated or not, like 'if' statements that contained multiple assignments, pointer dereferences and increment operators all in one line? Anybody who does this in the name of "code bloat antidote" is missing the point. Reducing the number of lines isn't really about the number of lines - it's about reducing the amount of code. Just because you can write the following to reduce the lines of code:
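ArrayList predates generics. Assuming you're on .NET 2.0 or later, where generics are available, the same function can be sketched with List<float>, which avoids boxing each float and drops the cast at the end. The LoanReport wrapper class is just a hypothetical container for the method:

```csharp
using System.Collections.Generic;

class Loan
{
    public bool IsLoanActive;
    public float Principal;
    public float Interest;
}
class Mortgage : Loan { }
class CarLoan : Loan { }
class BusinessLoan : Loan { }

static class LoanReport
{
    // returns a list of all active loan amounts
    public static float[] GetAllLoanAmounts(Mortgage[] mortgages,
                                            CarLoan[] loans1,
                                            BusinessLoan[] loans2)
    {
        List<float> amounts = new List<float>();
        // Array covariance lets each derived-type array stand in for a Loan[].
        foreach (Loan[] loans in new Loan[][] { mortgages, loans1, loans2 })
            foreach (Loan loan in loans)
                if (loan.IsLoanActive) amounts.Add(loan.Principal);
        return amounts.ToArray();
    }
}
```

The body is the same length as the ArrayList version, so this is less about saving lines and more about knowing which collection type fits.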
if ((age2 = age++) > (age3--) ? (age3 = age2++) == 0 : false)
…doesn't mean you should. You're not reducing the code here, you're just putting it all on one line. Plus you're making it more cryptic at the same time. If there's any tenet as useful as reducing the lines of code, it's making your code more readable. Reducing code is important, but never at the cost of clarity. Unfortunately there's no easy way to detect code like this, as it doesn't raise a compiler warning or break any FxCop rules.
Also bear in mind that shorter, more compact code will not necessarily perform better. Using polymorphism can reduce code but make method calls slower, because virtual calls go through an extra v-table lookup. Most of the time, however, the trade-off is justified.
So to conclude, let me summarize the points I've covered:
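To see just how much is packed into that one line, here is a sketch of the same logic unpacked into separate steps. It assumes age, age2 and age3 are plain int variables and relies on C#'s left-to-right operand evaluation; the AgeCheck class and Evaluate method are hypothetical names used only to make the snippet self-contained.

```csharp
static class AgeCheck
{
    // Step-by-step equivalent of:
    //   (age2 = age++) > (age3--) ? (age3 = age2++) == 0 : false
    public static bool Evaluate(ref int age, ref int age2, ref int age3)
    {
        age2 = age;              // (age2 = age++) stores the old value of age...
        age = age + 1;           // ...and then increments age
        int previousAge3 = age3; // (age3--) yields the old value of age3...
        age3 = age3 - 1;         // ...and then decrements it
        if (age2 > previousAge3)
        {
            age3 = age2;         // (age3 = age2++) stores the old value of age2...
            age2 = age2 + 1;     // ...and then increments age2
            return age3 == 0;
        }
        return false;
    }
}
```

Six assignments and two comparisons were hiding in that single condition. The amount of code is the same either way; only the line count changed.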
- Write less code.
- Think more about the code you do write.
- Learn APIs thoroughly, to save you 'reinventing the wheel'.
- Always consider how readable your code is.
Hope you found this advice useful. So how about the next time you write a function, set yourself a line limit of 20 lines and see what difference it makes to the quality of the code you write.
And if you want to keep track of your "total lines of code" for C# projects, check out my line count utility.