Sunday, 26 January 2014

Guideline #4. Avoid fall-through in nonempty cases of "switch" statements

Guideline

Each case label in a switch statement shall be followed by one of the following:

  • A set of one or more statements, each in its own line, terminated by a break instruction.
  • Another case label.
  • A default label.
We may call "empty case" a case label which is followed by either another case label or  default. Empty here does not mean, of course, that no action is executed if the code flows to that label; it is just that the action executed is shared between several labels.

A "nonempty" case is one which is followed directly by a set of one or more statements. The guideline states that these statements shall always be terminated by a break instruction. These statements shall not lead to a fall through another case or default label. This is a legal possibility, of course, but this is why I am writing this guideline in the first place - if it weren't legal, the compiler would take care of it!

Discussion

A switch statement is a controlled jump. Every time the code execution encounters the keyword switch, it will jump to a different place. The question is where.

The answer is:

  • The execution will jump to the line after the case which is equal to the value of the expression in the switch, if there is one;
  • If there isn't one such case, but there is a default case, the execution will jump to the line after the default.
  • If there is none of the above (that is, no case equal to the value of the expression in the switch, and no default label), the execution will jump to the line after the closing bracked of the switch block.
It is interesting to note that this behaviour does not depend on the order of the case and default labels. default could be the first label to appear inside the switch block, some cases could go after it, and the language rules above would still hold. If there is a default , the execution jumps to it when no other case expression is equal to the switch expression, no matter where the other cases are located.

The lines with the case and default keywords are not instructions by themselves. Instead, they are just labels. They mark spots where the code execution may jump, depending on the value of the switch expression.

Ok. So what could go wrong?

One of the few things that could go wrong is that in C++, there is no implicit goto between the end of each case block and the end of the switch. This is good because it allows several cases to lead to the same action. This is done by putting several empty cases in consecutive lines, followed by a nonempty case. But it has one drawback, too: It also allows nonempty cases to fall through and continue executing below any further case labels they encounter, until a break is found or the switch block ends.

"What's wrong with that?", you say. "After all, this is how the language was defined. Every programmer must know how this kind of code works and write their switch blocks accordingly. If they're not aware of the need to write break to jump to the end of the switch block, it's their fault. I won't do such basic mistakes - if I fall through a case label it will be because I wanted to do so! If other people make mistakes when using a simple switch, what will they do with pointers?"

You are right. You are totally entitled to fall through case labels. I have no argument against that. You write correct C++ code and there is nothing I can obect to that.

Or is it?

Please listen to me for a minute. I will make one or two points and then you'll decide what you do with your switches.

First of all: Yes, I admit that you are confident and experienced in C++, and that you write correct code in this language. You never, ever forget to write a break, so whenever you use fall-through it's because you willingly choose to do so. But the next time I see one of them, how will I know it was you who wrote it? How will I know that it wasn't some beginner programmer who was confused about the need to write break and thought that his code would jump right to the end of the switch after the case was "finished", and forgot to test that particular part of the code? How will I, a humble reader, tell one unintended fall-through from your very well informed ones? I'll tell you how: In no way. I'll have to try and guess it from the context, or just trust that the existing code is correct in the absence of evidence to the contrary.

One more thing: Have you ever forgotten to pick your keys? I have, and I know perfectly how my door works. I know I won't be able to open it without them. Nevertheless, a few times in my life I forgot to pick the keys and I got locked out.

I know perfectly how switch works, but even so I have forgotten to write a break statement in certain occasions. I found out shortly after, because I tested the code and it didn't do what I expected. But nevertheless I'd like to prevent such simple mistakes in the future.

In case I haven't convinced you yet, let me quote this StackOverflow answer, which is itself quoting a designer of the C# language, who arguments against fall-through between cases being built into the language in the first place:
This implicit fall-through behavior is often used to reduce the amount of code needed and often isn't an issue the first time that code is written. However, as code moves from the initial development phase into a maintenance phase, the code above can lead to subtle errors that are very hard to debug. These errors result from the very common mistake of the developer adding a case, yet forgetting to put a break at the end of the block.
In C#, the switch statement requires that explicit flow control occur at the end of a case, either a break, goto, return, or throw. If the developer desires fall-through semantics, it can be achieved by an explicit goto at the end of a case statement.
There is not one single truth, except maybe for the fact that you need to know the syntax of your language and use it correctly. But I am strongly inclined to treat C++ as if it was C# in this particular item: do not use  fall-through, and always use break at the end of a nonempty switch statement. This will make your C# and C++ switch statements look the same, and behave in the same way - a robust, less error prone way.

Example

Don't

If write a switch statement with fallthrough, such as:

switch (option)
{
  case 1:
  case 2:
  case 3:
    // do something
  case 4:
  case 5:
    // do something else
    break;
}

You'll leave future readers of your code intrigued as to whether you forgot to write a break after the "do something". It is good that you solve that doubt, just in case they don't succeed at inventing a time machine so that they can come to ask you. 

Do

You could at least add a comment stating that your fallthrough is intended:

switch (option)
{
  case 1:
  case 2:
  case 3:
    // do something
    // FALLTHROUGH
  case 4:
  case 5:
    // do something else
    break;
}

This violates my guideline, but at least makes it clear that you knew what you were doing and that you didn't just forget the break .

A better option is to not use fallthrough at all:

switch (option)
{
  case 1:
  case 2:
  case 3:
    // do something
    // do something else
    break;
  case 4:
  case 5:
    // do something else
    break;
}

Of course, "do something else" should better be a very simple instruction or a function call, because you're now repeating that instruction. If it was originally a set of eight instructions, it is not a good idea to just copy and paste them in the above case, because that introduces code duplication. If it is just a trivial line or a function call, it's ok.

While we're at it, it's not a bad idea to always put a default statement:

switch (option)
{
   case 1:
   case 2:
   case 3:
      // do something
      // do something else
      break;
   case 4:
   case 5:
      // do something else
      break;
   default:
      break;
}

When a new case 6 comes, the more clear, readable and well structured is your switch statement, the less likely you will be to make a mistake.

Given that this is the best way to write a switch statement, why not build it like this from the beginning? For example, you could start by writing:

switch (option)
{
   case 1:
   case 2:
   case 3:
   case 4:
   case 5:
   default:
}

Then consider which cases should be grouped, and divide them by adding the break statements:

switch (option)
{
   case 1:
   case 2:
   case 3:
      // action for 1, 2, 3
      break;
   case 4:
   case 5:
      // action for 4, 5
      break;
   default:
      break;
}

And in a third step, add the instructions before each break, so that your actually does something. If you get used to write switch blocks this way, you'll minimize your mistakes and, more importantly, you will generate readable and well structured code.

Bibliography

[McCONNELL 2004] This book discusses switch statements in Chapter 15: "Using conditionals", Section 15.2 "case statements" (pages 361-365).

No comments:

Post a Comment