Programming Tip: Don't Write Code That Does Nothing


Like, duh

The title of this article may sound so obvious that you ask, Why bother writing an article about it? But in fact, I see programs every day that include code that does nothing. By "nothing" here I mean "nothing of value". You may well be writing such code without even realizing it. Let me give a few examples.

(Note: I'm writing the examples here in Visual Basic because VB is verbose and thus, I think, it is easier for, say, a Java programmer to understand VB than for a VB programmer to understand Java. If you disagree, well, let's argue about that somewhere else. In some cases I've added short explanations where I think it might be non-obvious to non-VB programmers.)

Simple Mistakes

Some useless code is simply a mistake. For example, the programmer meant to type "x=x+10" but instead he typed "x=x+0". Of course that does nothing, so the line is pointless. But it was supposed to do something; it's just a typo. So let's skip this kind of useless code as boring. Obviously mistakes have to be fixed. Let's move on.

Pre-initialization

Programmers often set a variable to a default value that will never be used. That is, they write code like this:

amount=0
record=read_record(key)
amount=record.value

They assign a default value, and then promptly do something that gets the real value and replaces the default.

Sometimes it's a shade more complicated, like:

amount=0
if tx_type = DEPOSIT then
  amount=deposit_amount
else
  amount=withdrawal_amount
end if

But again, whether the IF test is true or false, one way or the other we will overwrite the default value. The default value is never used.

This is bad for several reasons.

  1. It wastes CPU time. But that's really the least of what's wrong with it.
  2. It is misleading to someone who reads the code in the future. It gives the impression that the default value may be used under some circumstances, and he may then waste time and effort trying to figure out when. In the examples above the code is short enough that it's not much effort to see that the default is never used, but in a real program there may be hundreds of lines of code to read through before you can figure that out.
  3. It bypasses error checking. Many modern compilers will give you an error or warning if you attempt to use a variable that is not set on every possible path through the program. By setting a default at the top, you satisfy the compiler that a value is always set. But this value isn't real. If there is a path through the program where the real value is never set, the compiler will not alert you to the error when it could have, if only you hadn't lied to it about what you were up to. Even if your compiler does not have such a feature, it is almost surely better for the program to fail when the value is never set than for it to proceed with a meaningless default value.

Of course I am not talking about cases where there really is a legitimate default value. I'm talking about cases where the assigned default is never, or should never, be used.
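
The cure is usually just to delete the useless default. Stripped down that way, the same two examples become:

record=read_record(key)
amount=record.value

and

if tx_type = DEPOSIT then
  amount=deposit_amount
else
  amount=withdrawal_amount
end if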

Post-reset

Perhaps the flip side of pre-initialization is post-reset. This is when the programmer performs some clean-up on data that will never be used again.

Just yesterday I worked on a program that included this:

sub LoadData()
  dim sCollected as string ' define sCollected as a local string variable
  ... do various work to get the value for sCollected ...
  write(sCollected)
  sCollected = ""
end sub

Get it? sCollected is a local variable. It's not visible outside this function. And just before the function exits, it sets it to an empty string. Why? In the very next line the variable goes away. It's like sending in a cleaning crew to vacuum and wax the floors of a building just before you send in the wrecking crew to demolish it. Who cares?
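
The fix is simply to delete that last assignment; the sub behaves exactly the same without it:

sub LoadData()
  dim sCollected as string ' define sCollected as a local string variable
  ... do various work to get the value for sCollected ...
  write(sCollected)
end sub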

Ignore the results of a calculation or read

I often see programs that do some very long and complex calculation, or worse still a database or file read, and then after doing it, decide that it was unnecessary and ignore the results. For example, I recently saw a block of code that essentially said this (to avoid explaining a complex system for one example, I'm simplifying this):

ship_date=get_shipdate_from_db(stock_number)
if item_status = SOLD then
  status_date = ship_date
elseif item_status = RETURNED then
  status_date = return_date
else
  status_date = purchase_date
end if

Do you see the problem? It reads the shipping date from the database -- which in this system was a complex operation, because it had to chase through multiple legs of a shipment to find where the item actually left the factory. (It can't just take the earliest date, because an item might be sold and shipped to a customer, returned, and then sold and shipped to a different customer.) So it does this complex database query ... and then in many cases it never looks at the result! If the status is not "SOLD" it just ignores it and uses a different date. It does a whole bunch of work, and THEN decides whether there was any point in doing that work.

In this case, the simple thing to do is to put the database query inside the IF, in the case where it is used.
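
Using the same simplified names, that would look something like this, so that the expensive query runs only in the one branch that actually uses it:

if item_status = SOLD then
  status_date = get_shipdate_from_db(stock_number)
elseif item_status = RETURNED then
  status_date = return_date
else
  status_date = purchase_date
end if

(If ship_date is genuinely needed elsewhere, it can still be assigned inside that same branch.)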

Skip the calculation

But the lesson from the previous section can be over-learned. The opposite error is cases like this:

if tax<>0 then
  amount=amount+tax
end if

What does the IF statement accomplish? Suppose we left it out, and always added the tax to the amount. If the tax is zero, then adding zero to the amount will leave it unchanged. The IF statement will never change the final result.

I suppose that programmers do this because they think it will make the program faster. In the case where the tax is zero, they skip doing the arithmetic, and thus save CPU time. In fact, this is counter-productive. Think it through: there are two possibilities, the tax is zero or it is non-zero. If the tax is non-zero, then the computer does the test and then it does the add. So in that case we are doing both the test and the add, which is clearly more work than just doing the add. If the tax is zero, then we do the test and thus determine that we don't need the add. So instead of doing an add, we do a test. On most computer systems a compare-and-branch takes at least as much CPU time as an add, and considerably more if the branch is mispredicted. So to save the time it takes to do an add, we are doing a test that takes as much time or more. We lose either way. Even if, on your particular computer or in the language you are using, the test is faster, you only gain in the cases where the value is zero, and you lose in the cases where it is non-zero. Do you know what percentage of the time the value is zero? You might still have a net waste of time.

Here's a similar example I see all the time:

if flag=true then
  flag=false
end if

What does the compare accomplish? If the flag was true, we change it to false. Presumably if it is not true then it must be false. So we are carefully avoiding changing it to false in the case where it is already false.

Again, maybe the programmer thinks that he is saving time by skipping the assignment when it will not change anything. The same logic applies here as for the "if zero" case. When the flag is true, we are doing the compare and then doing the assignment anyway, so we are clearly taking more time. When the flag is false, we do a compare instead of doing an assignment. But a compare almost surely takes the computer longer to do than an assignment. So we are losing in either case.
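
In both cases the guarded statement can simply stand on its own:

amount=amount+tax ' adding zero changes nothing, so no test is needed
flag=false ' assigning false to a flag that is already false changes nothing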

Don't try to avoid work by doing something that's more work than the work you're trying to avoid. It's like the classic exchange when parents tell their children to do some household chore, and the child makes a long argument about how it's his brother's turn to do it, or how he has too much homework, etc etc, until finally the parent says in exasperation, "If you had just done it when I asked it would have taken less time than you've spent arguing about it!" Yes, it makes good sense to write a couple of lines of code to avoid a time-consuming task when the test is clearly faster than the task. But it makes no sense at all when the test takes longer than the task: just do it. When in doubt, skip the test. Adding more code makes the program more difficult to read, and if there's no clear benefit, you may be paying for nothing.

Of course, this only applies if it doesn't matter whether or not we do the task when the test fails, such as when we would just be adding zero or looking up a value that we never use. If we must be sure NOT to do the task when the test fails, because executing it would give incorrect results, that's a different story.
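
For contrast, here is the kind of test that does earn its keep: a cheap check that skips a genuinely expensive task. This is only a made-up sketch reusing the get_shipdate_from_db example from earlier; the shipdate_cache dictionary, and the idea that it lives somewhere with a longer lifetime than a single call, is invented for illustration:

dim shipdate_cache as new Dictionary(of string, Date) ' hypothetical cache, assumed to persist between lookups

if not shipdate_cache.ContainsKey(stock_number) then ' a cheap in-memory lookup...
  shipdate_cache(stock_number) = get_shipdate_from_db(stock_number) ' ...guards the expensive database read
end if
ship_date = shipdate_cache(stock_number)

Here the test is far cheaper than the task it avoids, so it pays for itself.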

Deja Vu Tests

Another one I just saw yesterday:

if not sql is nothing and sql<>"" then ' is nothing is the equivalent of == NULL in many other languages
  if not sql is nothing and sql<>"" and sql.contains("@") then
  ... etc ...

What is the point of the inner test? The first two conditions are exact duplicates of the outer test. If they weren't true, we wouldn't be here. This example was particularly silly because the two lines came right in a row like that. Sometimes it's not as obvious because there is more code in the middle.

The program may work just fine. The first two terms of the test are always true, so this boils down to "if sql.contains("@")". Maybe that's all that needs to be tested. But we're left to wonder: If the author thought this test was necessary, under what circumstances did he think it might be false? What was he trying to accomplish? At the very least it wastes the reader's time figuring out that it does nothing. At worst it indicates a flaw in the program logic. If the author thought that it was possible for this test to be false, then he doesn't actually understand the flow of the program. It might be an indication of other logic errors.
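
If that really is all that needs to be tested, the inner IF reduces to just the one new condition:

if not sql is nothing and sql<>"" then
  if sql.contains("@") then
  ... etc ...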

I've occasionally seen code that does the opposite. I'll simplify this way down to be clear:

if x then
  ... other code ...
  if not x then
    ... do something
  end if
end if

The code inside the "if not X" will never be executed. Whenever I see this, I wonder what it was supposed to do. Presumably the author thought that it would sometimes be executed, or he wouldn't have taken the time to type it in. He must have thought that it was possible to reach this point in the program with X false, even though the enclosing IF guarantees that it is true. Again, it calls the whole logic of the program into question.

What's the Big Deal?

So what's wrong with useless code? If it does nothing, it's harmless, right?

  1. It wastes CPU time. Even though the code accomplishes nothing, the computer doesn't know that and still grinds through it. This may or may not be an important issue. Yes, computers are fast today, and a little wasted code is often a trivial cost. But computers are not infinitely fast, and performance still matters. Even so, this is often the least of the problems with useless code.
  2. It clutters up the program and wastes the reader's time. Sooner or later someone other than you has to read your code. Or you have to come back and read it long after you've forgotten what you were thinking when you wrote it. Every line of useless code is one more line that someone has to read and comprehend as he tries to understand the program. Anyone who's been in this business very long knows that you can often spend hours trying to find the right place to make a one-line change. Every extra line of code is one more line that someone has to hack through.
  3. It confuses the reader. It may take the reader some time to figure out that it does nothing. Then he has to wonder why it is there. Did the writer think that it did something? If not, why did he include it? What did the writer think it was going to accomplish? Or worse, the reader doesn't realize that it is useless, and he gets some mistaken idea about what the program is doing. Then he makes changes to a program based on an incorrect understanding of the program logic.

© 2013 by Jay Johansen

