. .
.
Estimation of test coverage metrics & structural complexity
.
.

# Objectives

After completing this experiment you will be able to:

## Time Required

Around 0.00 hours

## Control Flow Graph

A control flow graph describes the sequence in which the different instructions of a program get executed. In other words, a control flow graph describes how the control flows through the program. In order to draw the control flow graph of a program, all the statements of a program must be numbered first. The different numbered statements serve as nodes of the control flow graph. An edge from one node to another node exists if the execution of the statement representing the first node can result in the transfer of control to the other node.

Figure - 01 shows a snippet of code to compute the square of an integer along with it's CFG.

1 int x = 10, x_2 = 0;
2 x_2 = x * x;

3 return x_2;

Figure - 01: A simple program and it's CFG

Figure - 02 shows the CFG for sequence, selection and iteration kind of statements in order.

Figure - 02: CFG for different types of statements

## Path

A path through a program is a node and edge sequence from the starting node to a terminal node of the control flow graph of a program.

There can be more than one terminal node in a program.

Linearly Independent Path

A linearly independent path is any path through the program that introduces at least one new edge that is not included in any other linearly independent paths. If a path has one new node compared to all other linearly independent paths, then the path is also linearly independent. This is because, any path having a new node automatically implies that it has a new edge. Thus, a path that is subpath of another path is not considered to be a linearly independent path. The path-coverage testing of a program does not require coverage of all paths but only coverage of linearly independent paths.

## McCabe's Cyclomatic Complexity

For more complicated programs it is not easy to determine the number of independent paths of the program. McCabe's cyclomatic complexity defines an upper bound for the number of linearly independent paths through a program. Also, the McCabe's cyclomatic complexity is very simple to compute. Thus, the McCabe's cyclomatic complexity metric provides a practical way of determining the maximum number of linearly independent paths in a program. Though the McCabe's metric does not directly identify the linearly independent paths, but it informs approximately how many paths to look for.

There are three different ways to compute the cyclomatic complexity. The answers computed by the three methods are guaranteed to agree.

Method #1:

Given a control flow graph G of a program, the cyclomatic complexity V(G) can be computed as:
V(G) = E - N + 2
where N is the number of nodes of the control flow graph and E is the number of edges in the control flow graph.

Method #2:

An alternative way of computing the cyclomatic complexity of a program from an inspection of its control flow graph is as follows:
V(G) = Total number of bounded areas + 1
In the program's control flow graph G, any region enclosed by nodes and edges can be called as a bounded area. This is an easy way to determine the McCabe's cyclomatic complexity. But, what if the graph G is not planar, i.e. however you draw the graph, two or more edges intersect? Actually, it can be shown that structured programs always yield planar graphs. But, presence of GOTO's can easily add intersecting edges. Therefore, for non-structured programs, this way of computing the McCabe's cyclomatic complexity cannot be used. The number of bounded areas increases with the number of decision paths and loops. Therefore, the McCabe's metric provides a quantitative measure of testing difficulty and the ultimate reliability. This method provides a very easy way of computing the cyclomatic complexity of CFGs, just from a visual examination of the CFG. On the other hand, the other method of computing CFGs is more amenable to automation, i.e. it can be easily coded into a program which can be used to determine the cyclomatic complexities of arbitrary CFGs.

Method #3:

The cyclomatic complexity of a program can also be easily computed by computing the number of decision statements of the program. If N is the number of loops and decision statements of a program, then the McCabe's metric,
V(G) = N + 1

## Limiting Cyclomatic Complexity to 10

There are many good reasons to limit cyclomatic complexity. Overly complex modules are more prone to error, are harder to understand, are harder to test, and are harder to modify. Deliberately limiting complexity at all stages of software development, for example as a departmental standard, helps avoid the pitfalls associated with high complexity software. Many organizations have successfully implemented complexity limits as part of their software programs. The precise number to use as a limit, however, remains somewhat controversial. The original limit of 10 as proposed by McCabe has significant supporting evidence, but limits as high as 15 have been used successfully as well. Limits over 10 should be reserved for projects that have several operational advantages over typical projects, for example experienced staff, formal design, a modern programming language, structured programming, code walkthroughs, and a comprehensive test plan. In other words, an organization can pick a complexity limit greater than 10, but only if it is sure it knows what it is doing and is willing to devote the additional testing effort required by more complex modules.

Somewhat more interesting than the exact complexity limit are the exceptions to that limit. For example, McCabe originally recommended exempting modules consisting of single multiway decision ("switch" or "case") statements from the complexity limit. The multiway decision issue has been interpreted in many ways over the years, sometimes with disastrous results. Some naive developers wondered whether, since multiway decisions qualify for exemption from the complexity limit, the complexity measure should just be altered to ignore them. The result would be that modules containing several multiway decisions would not be identified as overly complex. One developer started reporting a "modified" complexity in which cyclomatic complexity was divided by the number of multiway decision branches. The stated intent of this metric was that multiway decisions would be treated uniformly by having them contribute the average value of each case branch. The actual result was that the developer could take a module with complexity 90 and reduce it to "modified" complexity 10 simply by adding a ten-branch multiway decision statement to it that did nothing.

Consideration of the intent behind complexity limitation can keep standards policy on track. There are two main facets of complexity to consider: the number of tests and everything else (reliability, maintainability, understandability, etc.). Cyclomatic complexity gives the number of tests, which for a multiway decision statement is the number of decision branches. Any attempt to modify the complexity measure to have a smaller value for multiway decisions will result in a number of tests that cannot even exercise each branch, and will hence be inadequate for testing purposes. However, the pure number of tests, while important to measure and control, is not a major factor to consider when limiting complexity. Note that testing effort is much more than just the number of tests, since that is multiplied by the effort to construct each individual test, bringing in the other facet of complexity. It is this correlation of complexity with reliability, maintainability, and understandability that primarily drives the process to limit complexity.

Complexity limitation affects the allocation of code among individual software modules, limiting the amount of code in any one module, and thus tending to create more modules for the same application. Other than complexity limitation, the usual factors to consider when allocating code among modules are the cohesion and coupling principles of structured design: the ideal module performs a single conceptual function, and does so in a self-contained manner without interacting with other modules except to use them as subroutines. Complexity limitation attempts to quantify an "except where doing so would render a module too complex to understand, test, or maintain" clause to the structured design principles. This rationale provides an effective framework for considering exceptions to a given complexity limit.

Rewriting a single multiway decision to cross a module boundary is a clear violation of structured design. Additionally, although a module consisting of a single multiway decision may require many tests, each test should be easy to construct and execute. Each decision branch can be understood and maintained in isolation, so the module is likely to be reliable and maintainable. Therefore, it is reasonable to exempt modules consisting of a single multiway decision statement from a complexity limit. Note that if the branches of the decision statement contain complexity themselves, the rationale and thus the exemption does not automatically apply. However, if all branches have very low complexity code in them, it may well apply. Although constructing "modified" complexity measures is not recommended, considering the maximum complexity of each multiway decision branch is likely to be much more useful than the average. At this point it should be clear that the multiway decision statement exemption is not a bizarre anomaly in the complexity measure but rather the consequence of a reasoning process that seeks a balance between the complexity limit, the principles of structured design, and the fundamental properties of software that the complexity limit is intended to control. This process should serve as a model for assessing proposed violations of the standard complexity limit. For developers with a solid understanding of both the mechanics and the intent of complexity limitation, the most effective policy is: "For each module, either limit cyclomatic complexity to 10 (as discussed earlier, an organization can substitute a similar number), or provide a written explanation of why the limit was exceeded."

Cite this Simulator:

.....
..... .....