GotW #98 Solution: Assertion levels (Difficulty: 5/10)

This special Guru of the Week series focuses on contracts. We covered basic assertions in GotW #97… but not all asserted conditions are created equal. Given some assertion facility that can be used like this: 1. Give one example each of an asserted condition whose run-time evaluation is: a) super cheap Without resorting to constexpr … Continue reading GotW #98 Solution: Assertion levels (Difficulty: 5/10) →

Feb 2, 2025 - 11:46
 0
GotW #98 Solution: Assertion levels (Difficulty: 5/10)

This special Guru of the Week series focuses on contracts. We covered basic assertions in GotW #97… but not all asserted conditions are created equal.

Given some assertion facility that can be used like this:

MyAssert( condition );  // expresses that ‘condition’ must be true

1. Give one example each of an asserted condition whose run-time evaluation is:

a) super cheap

Without resorting to constexpr expressions, it’s hard to find one cheaper than the one we saw in GotW #97 Example 3, which we can simplify down to this:

// Example 1(a): A dirt cheap assertion (from GotW #97 Example 3)

int min = /* some computation */;
int max = /* some other computation */;
MyAssert( min <= max );

This is always going to be dirt cheap. Not only is the integer comparison operation cheap, but min and max are already being accessed by this function and so they’re already “hot” in registers or cache.

b) arbitrarily expensive

“A condition that’s expensive? That sounds pretty easy,” you might say, and you’re right!

One commonly cited example is is_sorted. Just to emphasize how expensive it can be both in absolute terms and also relative to the program’s normal execution, let’s put it inside a well-known function… this is a precondition, but we’ll write it as an assertion in the function body for now which doesn’t affect this question: [1]

// Example 1(b): An arbitrarily expensive assertion

template 
bool binary_search( Iter begin, Iter end, const T& value ) {
    MyAssert( is_sorted(begin, end) );
    // ...
}

Checking that all the container’s elements are in ascending order requires visiting them all, and that O(N) linear complexity is arbitrarily expensive when the container’s size N can be arbitrarily large.

The icing on the cake: In this example, just evaluating the assertion requires doing more work than the entire function it appears in, which is only O(log N) complexity!

On a humorous note, O(N) remains O(N) no matter how hard we try to make it efficient:

MyAssert( std::is_sorted( std::execution::par, begin(s), end(s) ) );
                  // still O(N) arbitrarily expensive, but good try!

2. What does the answer to Question 1 imply for assertion checking? Explain.

We want a way to enable checking for some assertions but not others in the same code, because we may not always be able to afford expensive checks. That’s true whether we’re enabling checking at test time (e.g., on the developer’s machine) or at run time (e.g., in production).

Some conditions are so expensive that we may never check them without a good reason, even in testing. Example 1(b)’s is_sorted is a great example: You probably won’t ever enable it in production, and likely not by default in testing, except by explicitly turning it on during a bug hunt after enabling checking for cheaper assertions wasn’t enough or pointed at this data structure for deeper investigation. [2]

Other conditions are so cheap we’ll probably always check them absent a good reason not to, even in production. Example 1(a)’s min <= max is at this other end of the scale: It’s so dirt cheap to check that it’s unlikely we’ll ever have a performance reason to disable it. [2]

So it makes perfect sense that if Examples 1(a) and 1(b) appear in the same source file, the developer will want to enable checking for 1(b)’s assertion only by some kind of manual override to explicitly request it, and enable checking for 1(a)’s assertion all the time.

3. Give an example of an asserted condition that is in general impossible to evaluate, and so cannot be checked.

One common example is is_reachable for pointers or other iterators, to say that if we increment an iterator enough times we can make it equal to (refer to the same object as) a second iterator:

// Example 3: Very reasonable, but generally not checkable

auto first_iterator = /*...*/;
auto last_iterator  = /*...*/;
MyAssert( is_reachable(first_iterator, last_iterator) );
std::find( first_iterator, last_iterator, value );

In general, there’s no way to write is_reachable. You could try to increment first_iterator repeatedly until it becomes equal to last_iterator, but when the latter is not reachable that might never happen and even just trying would often be undefined behavior.

You might be tempted to test is_reachable using std::distance:

MyAssert( std::distance(first_iterator, last_iterator) >= 0 );

… but that would be horribly wrong. Can you see why?

Take a moment to think about it before continuing…

… okay. The answer is that std::distance itself requires that last_iterator is reachable from first_iterator, otherwise it’s undefined behavior. So this maybe-tempting-looking alternative actually assumes what we want to prove, and so it’s not useful for this purpose. (GotW #100 will consider in detail the general question of preconditions of contract subexpressions, which covers examples like this one.)

Can these kinds of conditions still be useful?

Yes. In practice, these kinds of conditions spell out “this is a formal comment.” Static analyzers and other tools may be able to test such a condition in a subset of cases; for example, at some call sites an analyzer may be able to infer statically that two iterators point into different containers and so one isn’t reachable from the other. Alternatively, the tools might support special pseudofunction names that they recognize when you use them in assertion expressions to give information to the tool. So the conditions can still be useful, even if they can’t generally be checked the normal way, by just evaluating them and inspecting the result.

4. How do these questions help answer:

a) what “levels” of asserted conditions we should be able to express?

There’s a wide spectrum of “expensiveness” of assertion conditions, ranging from cheap to arbitrarily high to even impossible. In the post-C++20 contracts proposal at [3], this is partly captured by the proposed basic levels of default, audit, and axiom, roughly intended to represent “cheap,” “expensive,” and “impossible” respectively.

Because we need to check these with different frequencies (or not at all), we need a way to enable and disable checking for subsets of assertions independently, even when they’re in the same piece of code.

GUIDELINE: Distinguish between (at least) “cheap,” “expensive,” and “impossible” to evaluate assertions. If you develop your own assertion system for in-house use, support enabling/disabling at least these kinds of assertions independently. [1] I say “at least” because what’s “expensive” is subjective and will vary from program to program, from team to team… and even within a program from your code to your third-party library’s code that you’re calling. Having just two preset “cheap” and “expensive” levels is minimal, but useful.

b) why the assertions we can “practically” write are a subset of all the ones we might “ideally” like to write?

It can be useful to distinguish between all ideal assertions, meaning everything that has to be true at various points in the program for it to run correctly, and the practical assertions, meaning the subset of those that can be reasonably expressed as a C++ expression and checked. In GotW #97 question 3, part of the solution says that “if an assertion fails” then…

there is a program bug, possibly in the assertion itself. The first place to look for the bug is in this same function, because if prior contracts were well tested then likely this function created the first unexpected state.

If we could write all ideal assertions, and exercise all control flow and data flow during testing, then a failed assertion would definitely mean a bug in the same function where it was written. Because we realistically can’t write and exercise them all, though, we could be observing a secondary effect from a bug that happened earlier. Still, this function is the first place to start looking for the bug.

Notes

[1] Upcoming GotWs will cover preconditions and violation handling. For handlers, we’ll cover additional distinctions such as categories of violations (e.g., to distinguish safety-related checks vs. other checks).

[2] As always, any checks left on in production would often install a different handler, such as a log-and-continue handler rather than a terminating handler; see GotW #97 Question 4, and note [1].

[3] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018).

Acknowledgments

Thank you to the following for their feedback on this material: Joshua Berne, Guy Davidson, J. Daniel Garcia, Gábor Horváth, Maciej J., Andrzej Krzemieński.