Nuclear Fruit Salad

Is 'continue' an antipattern?

Why you should avoid using 'continue' in any high-level language

published Wednesday, 28 Jun - 2023  by

'continue' was initially introduced in low-level languages like C, and then moved on to infect other languages such as C# and Java.

It is essentially a restricted form of the older GOTO statement, which fell out of favour in structured programming from around 1968, when Edsger Dijkstra's letter "Go To Statement Considered Harmful" was published, because of the issues it encouraged, such as spaghetti code and hidden bugs. Jump statements like it were nonetheless kept in, or re-added to, later languages, partly to retain backward compatibility.

Continue is essentially a skip-early condition for looping: it ends the current iteration and moves straight on to the next one. It is typically used where you want to run some code for all objects, but only do a bit extra on objects that meet certain conditions or whose state is set a particular way.
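As a minimal sketch of that behaviour (the class ContinueDemo and the method Process are made-up names for illustration, not from any real codebase), every number is logged, but only the even ones get the extra step:

```csharp
using System.Collections.Generic;

public static class ContinueDemo
{
    // Hypothetical helper: logs every number, then squares only the even ones.
    public static List<string> Process(IEnumerable<int> numbers)
    {
        var log = new List<string>();
        foreach (var n in numbers)
        {
            log.Add($"seen {n}");          // runs for every element

            if (n % 2 != 0)
            {
                continue;                  // skip the rest of this iteration
            }

            log.Add($"squared {n * n}");   // only reached for even numbers
        }
        return log;
    }
}
```

Calling ContinueDemo.Process(new[] { 1, 2, 3 }) returns four entries: "seen 1", "seen 2", "squared 4" and "seen 3".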

Why is it used?

Normally, breaking out of a loop early is a good thing: we avoid running unnecessary commands or allocating more memory for temporary objects.

In low-level languages, adding extra code in order to exit early from a loop is well worth it, and the loops themselves often do not provide any other means of doing this.

But in higher-level languages we have things like foreach loops, which on each iteration declare and assign a variable before executing any of the code inside the loop.

Use of continue in these circumstances should be avoided, simply to reduce the internal complexity of the loop and to save at least one allocation and object declaration.

 

Why is it an antipattern?

In most languages a loop is made up of the loop keyword, then the loop control, and then a body that holds the code you wish to loop through.

So in the case of a foreach loop in C#:

foreach(var item in objectList){
    //do stuff here
}

The line at the top starts with the keyword foreach. The var item in objectList that follows is the control, which assigns each element of objectList in turn to a new variable, item.

By using continue within the do stuff here area, we take control away from the definition of the loop, thus breaking the pattern of the loop's control definition.

On top of that, as code is added to the loop over the years, it becomes less obvious where the continue is or why it is present.

In 5 legacy projects I have worked on, I have seen continue used inside loops cause spaghetti code and be one of the top 2 causes of long-term bugs. To my mind these 2 reasons make it an antipattern.

How does this cause issues?

[Diagram: loop code using continue]

Readability issues

Example: we have a function that only runs through a loop doing 2 things.

As the codebase grows, other programmers may have to add extra code inside this loop, either before the continue statement or after it.

Over a few years this can snowball, and suddenly the continue is buried in the middle of the code.

Worse yet, you could end up with multiple continues inside the loop, or the values behind the continue condition could be changed before it is reached.

Changing the loop behaviour from inside the loop

The control pattern of the loop says, 'here are the objects I want to edit', but then a line somewhere in the middle of the loop says, 'oh, but I don't want to edit this one'.

If you use a continue, you are splitting the control of the loop across 2 different areas: one in the control pattern of the loop, and another somewhere inside the loop itself. Someone unfamiliar with the codebase who has to read it will most likely miss the continue condition in the middle.

In the following example, the control zones are red and the inside of the loop is purple. As you can see, part of the control is inside the loop. If there is a lot of code in the loop, this can be very hard to see, as the majority of it will be the two processing steps. If you miss it, you will believe every element in the list has both steps performed on it, yet some of your output will not have had the 2nd step performed.

In the diagram on the right, we have removed the continue and instead only perform step 2 if the condition is met. This means all the code for step 2 is indented inside an if statement, which makes it far more obvious that it is conditional.
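As a rough sketch of the two shapes described above (WorkItem, NeedsStepTwo, WithContinue and WithIf are made-up names for illustration, and the trim and upper-case calls are just placeholders for the two processing steps):

```csharp
using System.Collections.Generic;

// Hypothetical item type, used only for this illustration.
public sealed class WorkItem
{
    public string Text { get; set; } = "";
    public bool NeedsStepTwo { get; set; }
}

public static class LoopShapes
{
    // Shape 1: part of the loop's control sits between the two steps.
    public static void WithContinue(List<WorkItem> items)
    {
        foreach (var item in items)
        {
            item.Text = item.Text.Trim();              // step 1: every item

            if (!item.NeedsStepTwo)
            {
                continue;                              // easy to miss in a long body
            }

            item.Text = item.Text.ToUpperInvariant();  // step 2: only some items
        }
    }

    // Shape 2: step 2 is visibly conditional because it is indented in an if.
    public static void WithIf(List<WorkItem> items)
    {
        foreach (var item in items)
        {
            item.Text = item.Text.Trim();              // step 1: every item

            if (item.NeedsStepTwo)
            {
                item.Text = item.Text.ToUpperInvariant();  // step 2
            }
        }
    }
}
```

Both methods produce the same result for the same input; the only difference is how visible the condition is to someone reading the loop.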

Overuse of Memory & CPU


Another issue with continue is the over-allocation it causes, along with the extra CPU cycles and RAM that are used.

By definition continue is inside a loop, usually after the code has already called methods or allocated variables, sometimes many times.

This can result in more CPU cycles and memory being used. In the following example, the version that uses continue gets the item from the array and only then checks whether it is valid, which means the memory has already been allocated and the CPU time for those steps has already been spent.

[Diagram: extra memory and steps allocation when using continue]

The example on the right has shorter and less complex code within the loop, and if you read the loop declaration (the control) you know exactly what it is processing.

While the loop control will be more complex, you do not have to hunt through 40+ lines of code to make sure there is no gotcha moment with a continue, and you know exactly what the loop will ignore and what it will include.

Example

The following code is a basic example of code I encountered that was converting between user objects. The array we are getting data from only has one name field, while the object we are copying to has firstname and lastname fields, so we need to do the following:

  1. Split the name field by space (" ")
  2. Allocate the first part to FirstName, and the last part to LastName.
  3. If a name is null then ignore the record
  4. If the lastname is null or empty on the output then ignore the record.

item is allocated for every entry in the array, even the ones we are skipping. I have added the extra bits before the if statements to mimic how I have seen things done in real legacy code.

foreach (var item in objectArray) {
    // Legacy ordering kept on purpose: this throws if item.Name is null,
    // because the split runs before the null check below.
    var splitName = item.Name.Split(' ');

    if (string.IsNullOrWhiteSpace(item.Name)) {
        continue;
    }

    var outputObj = new user {
        FirstName = splitName.First(),
        LastName = splitName.Last()
    };

    if (string.IsNullOrWhiteSpace(outputObj.LastName)) {
        continue;
    }

    var address = RetrieveOldAddress(outputObj.LastName);
    outputObj.Address.Street1 = address.Street1;
    outputObj.Address.Town = address.Town;
}

So several things happen in the code above simply because we use continue instead of having the loop decide which elements to loop over.

  1. The variable item is declared and allocated in memory (CPU + RAM).
  2. The split occurs and splitName is declared and allocated (CPU + RAM). The null/empty check should have come before this first line inside the loop, as we can get a null reference error here.
  3. The first continue can occur.
  4. A possible null reference error when creating outputObj.
  5. The variable outputObj is declared and allocated (CPU + RAM).
  6. Then the check is made on LastName.

So that is 4 declarations and allocations of objects in RAM, plus possible errors. But even if we fixed the other errors and moved the continue to the first line inside the loop, item would still be declared and allocated.

A better way (although by no means the best) of doing this is to make sure our loop's control is correct in the first place and it only loops through the objects we want.

foreach (var item in objectArray.Where(x => x.Name?.Trim().Contains(' ') ?? false)) {
    // Trim here as well, so the split matches what the filter checked.
    var splitName = item.Name.Trim().Split(' ');

    var outputObj = new user {
        FirstName = splitName[0],
        LastName = splitName[splitName.Length - 1]  // last part of the name
    };

    var address = RetrieveOldAddress(outputObj.LastName);
    outputObj.Address.Street1 = address.Street1;
    outputObj.Address.Town = address.Town;
}

The above example is easier to read because the control pattern is back in one place. We also no longer allocate item for records we are going to skip, and outputObj is never created when it is not needed. We have saved ourselves complication and cut object allocations and declarations by 3x.
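For anyone who wants to try this out, here is a self-contained sketch of the filtered loop. The types LegacyUser, User and Address, the helper HasFirstAndLastName and the stubbed RetrieveOldAddress are all assumptions of mine, since the original code does not show them:

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: the real ones are not shown in the original code.
public sealed class LegacyUser { public string? Name { get; set; } }

public sealed class Address
{
    public string Street1 { get; set; } = "";
    public string Town { get; set; } = "";
}

public sealed class User
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public Address Address { get; } = new Address();
}

public static class UserConverter
{
    // Stub standing in for the real address lookup, which is not shown.
    private static Address RetrieveOldAddress(string lastName) =>
        new Address { Street1 = "1 " + lastName + " Street", Town = "Unknown" };

    // The loop control: keep only names with at least two parts.
    private static bool HasFirstAndLastName(LegacyUser u) =>
        u.Name?.Trim().Contains(' ') ?? false;

    public static List<User> ConvertUsers(IEnumerable<LegacyUser> source)
    {
        var output = new List<User>();
        foreach (var item in source.Where(HasFirstAndLastName))
        {
            // Name cannot be null here: the filter already rejected nulls.
            var parts = item.Name!.Trim().Split(' ');
            var user = new User
            {
                FirstName = parts[0],
                LastName = parts[parts.Length - 1]
            };

            var address = RetrieveOldAddress(user.LastName);
            user.Address.Street1 = address.Street1;
            user.Address.Town = address.Town;
            output.Add(user);
        }
        return output;
    }
}
```

Pulling the condition out into a named method is just one option; it keeps the foreach line short while still leaving all of the control in one place.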

Now in .NET 7, Microsoft have implemented efficiencies in C# that remove excess allocations and speed up processing, particularly when methods are repeated, so this will not make a huge difference there. On .NET 6 and below, and in other languages, you will see a larger decrease in memory and CPU time when removing the excess in this manner. Below you can see the difference between using continue and not using continue in .NET 7.