1 | <html> |
---|
2 | <head> |
---|
3 | <title> libsm : Assert and Abort </title> |
---|
4 | </head> |
---|
5 | <body> |
---|
6 | |
---|
7 | <a href="index.html">Back to libsm overview</a> |
---|
8 | |
---|
9 | <center> |
---|
10 | <h1> libsm : Assert and Abort </h1> |
---|
11 | <br> $Id: assert.html,v 1.1.1.1 2003-04-08 15:12:31 zacheiss Exp $ |
---|
12 | </center> |
---|
13 | |
---|
14 | <h2> Introduction </h2> |
---|
15 | |
---|
16 | This package contains abstractions |
---|
17 | for assertion checking and abnormal program termination. |
---|
18 | |
---|
19 | <h2> Synopsis </h2> |
---|
20 | |
---|
21 | <pre> |
---|
22 | #include &lt;sm/assert.h&gt; |
---|
23 | |
---|
24 | /* |
---|
25 | ** abnormal program termination |
---|
26 | */ |
---|
27 | |
---|
28 | void sm_abort_at(char *filename, int lineno, char *msg); |
---|
29 | typedef void (*SM_ABORT_HANDLER)(char *filename, int lineno, char *msg); |
---|
30 | void sm_abort_sethandler(SM_ABORT_HANDLER); |
---|
31 | void sm_abort(char *fmt, ...); |
---|
32 | |
---|
33 | /* |
---|
34 | ** assertion checking |
---|
35 | */ |
---|
36 | |
---|
37 | SM_REQUIRE(expression) |
---|
38 | SM_ASSERT(expression) |
---|
39 | SM_ENSURE(expression) |
---|
40 | |
---|
41 | extern SM_DEBUG_T SmExpensiveRequire; |
---|
42 | extern SM_DEBUG_T SmExpensiveAssert; |
---|
43 | extern SM_DEBUG_T SmExpensiveEnsure; |
---|
44 | |
---|
45 | #if SM_CHECK_REQUIRE |
---|
46 | #if SM_CHECK_ASSERT |
---|
47 | #if SM_CHECK_ENSURE |
---|
48 | |
---|
49 | cc -DSM_CHECK_ALL=0 -DSM_CHECK_REQUIRE=1 ... |
---|
50 | </pre> |
---|
51 | |
---|
52 | <h2> Abnormal Program Termination </h2> |
---|
53 | |
---|
54 | The functions sm_abort and sm_abort_at are used to report a logic |
---|
55 | bug and terminate the program. They can be invoked directly, |
---|
56 | and they are also used by the assertion checking macros. |
---|
57 | |
---|
58 | <dl> |
---|
59 | <dt> |
---|
60 | void sm_abort_at(char *filename, int lineno, char *msg) |
---|
61 | <dd> |
---|
62 | This is the low level interface for causing abnormal program |
---|
63 | termination. It is intended to be invoked from a |
---|
64 | macro, such as the assertion checking macros. |
---|
65 | |
---|
66 | If filename != NULL then filename and lineno specify the line |
---|
67 | of source code on which the logic bug is detected. These |
---|
68 | arguments are normally either set to __FILE__ and __LINE__ |
---|
69 | from an assertion checking macro, or they are set to NULL and 0. |
---|
70 | |
---|
71 | The default action is to print an error message to smioerr |
---|
72 | using the arguments, and then call abort(). This default |
---|
73 | behaviour can be changed by calling sm_abort_sethandler. |
---|
74 | <p> |
---|
75 | <dt> |
---|
76 | void sm_abort_sethandler(SM_ABORT_HANDLER handler) |
---|
77 | <dd> |
---|
78 | Install 'handler' as the callback function that is invoked |
---|
79 | by sm_abort_at. This callback function is passed the same |
---|
80 | arguments as sm_abort_at, and is expected to log an error |
---|
81 | message and terminate the program. The callback function should |
---|
82 | not raise an exception or perform cleanup: see Rationale. |
---|
83 | |
---|
84 | sm_abort_sethandler is intended to be called once, from main(), |
---|
85 | before any additional threads are created: see Rationale. |
---|
86 | You should not use sm_abort_sethandler to |
---|
87 | switch back and forth between several handlers; |
---|
88 | this is particularly dangerous when there are |
---|
89 | multiple threads, or when you are in a library routine. |
---|
90 | <p> |
---|
91 | <dt> |
---|
92 | void sm_abort(char *fmt, ...) |
---|
93 | <dd> |
---|
94 | This is the high level interface for causing abnormal program |
---|
95 | termination. It takes printf-style arguments. There is no need to |
---|
96 | include a trailing newline in the format string; a trailing newline |
---|
97 | will be printed if appropriate by the handler function. |
---|
98 | </dl> |
---|
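The relationship between sm_abort_at, the handler, and sm_abort_sethandler can be sketched as follows. This is a minimal, self-contained illustration, not the libsm implementation: every name beginning with demo_ or app_ is a hypothetical stand-in that only mirrors the signatures documented above, and it writes to stderr rather than smioerr.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in mirroring the documented SM_ABORT_HANDLER type. */
typedef void (*DEMO_ABORT_HANDLER)(char *filename, int lineno, char *msg);

/* Build "file:line: msg", or just "msg" when no location is known. */
int
demo_format_abort(char *buf, size_t len,
		  const char *filename, int lineno, const char *msg)
{
	if (filename != NULL)
		return snprintf(buf, len, "%s:%d: %s", filename, lineno, msg);
	return snprintf(buf, len, "%s", msg);
}

/* Assumed default behaviour: report the bug, then abort() to dump core. */
void
demo_default_handler(char *filename, int lineno, char *msg)
{
	char buf[512];

	demo_format_abort(buf, sizeof buf, filename, lineno, msg);
	fprintf(stderr, "%s\n", buf);
	abort();
}

DEMO_ABORT_HANDLER demo_handler = demo_default_handler;

/* Install a replacement; meant to be called once, early in main(). */
void
demo_abort_sethandler(DEMO_ABORT_HANDLER handler)
{
	demo_handler = handler;
}

/* Low level entry point: hand off to whichever handler is installed. */
void
demo_abort_at(char *filename, int lineno, char *msg)
{
	demo_handler(filename, lineno, msg);
	abort();	/* a handler should not return; terminate regardless */
}

/* An application handler: tag the message, do no cleanup, terminate. */
void
app_abort_handler(char *filename, int lineno, char *msg)
{
	char buf[512];

	demo_format_abort(buf, sizeof buf, filename, lineno, msg);
	fprintf(stderr, "myprog: logic bug: %s\n", buf);
	abort();
}
```

A program would call demo_abort_sethandler(app_abort_handler) at the top of main(), before creating any threads. With libsm itself, the corresponding call would be sm_abort_sethandler.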
99 | |
---|
100 | <h2> Assertions </h2> |
---|
101 | |
---|
102 | The assertion handling package |
---|
103 | supports a style of programming in which assertions are used |
---|
104 | liberally throughout the code, both as a form of documentation, |
---|
105 | and as a way of detecting bugs in the code by performing runtime checks. |
---|
106 | <p> |
---|
107 | There are three kinds of assertion: |
---|
108 | <dl> |
---|
109 | <dt> |
---|
110 | SM_REQUIRE(expr) |
---|
111 | <dd> |
---|
112 | This is an assertion used at the beginning of a function |
---|
113 | to check that the preconditions for calling the function |
---|
114 | have been satisfied by the caller. |
---|
115 | <p> |
---|
116 | <dt> |
---|
117 | SM_ENSURE(expr) |
---|
118 | <dd> |
---|
119 | This is an assertion used just before returning from a function |
---|
120 | to check that the function has satisfied all of the postconditions |
---|
121 | that it is required to satisfy by its contract with the caller. |
---|
122 | <p> |
---|
123 | <dt> |
---|
124 | SM_ASSERT(expr) |
---|
125 | <dd> |
---|
126 | This is an assertion that is used in the middle of a function, |
---|
127 | to check loop invariants, and for any other kind of check that is |
---|
128 | not a "require" or "ensure" check. |
---|
129 | </dl> |
---|
130 | If any of the above assertions fails, then sm_abort_at |
---|
131 | is called. By default, a message is printed to stderr and the |
---|
132 | program is aborted. For example, if SM_REQUIRE(arg > 0) fails |
---|
133 | because arg <= 0, then the message |
---|
134 | <blockquote><pre> |
---|
135 | foo.c:47: SM_REQUIRE(arg > 0) failed |
---|
136 | </pre></blockquote> |
---|
137 | is printed to stderr, and abort() is called. |
---|
138 | You can change this default behaviour using sm_abort_sethandler. |
---|
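Here is a small sketch of how the three kinds of assertion divide the work inside one function. The DEMO_* macros are hypothetical stand-ins defined here so that the example builds without libsm; in real code you would include the libsm header and use SM_REQUIRE, SM_ASSERT and SM_ENSURE instead. The function demo_isqrt is likewise invented for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the libsm macros: report the failure and abort. */
#define DEMO_CHECK(kind, expr) \
	((expr) ? (void) 0 : \
	 (fprintf(stderr, "%s:%d: %s(%s) failed\n", \
		  __FILE__, __LINE__, kind, #expr), abort()))

#define DEMO_REQUIRE(expr)	DEMO_CHECK("DEMO_REQUIRE", expr)
#define DEMO_ASSERT(expr)	DEMO_CHECK("DEMO_ASSERT", expr)
#define DEMO_ENSURE(expr)	DEMO_CHECK("DEMO_ENSURE", expr)

/* Integer square root: the largest r such that r * r <= n. */
long long
demo_isqrt(int n)
{
	long long r = 0;

	/* Precondition: it is the caller's job to pass a non-negative n. */
	DEMO_REQUIRE(n >= 0);

	while ((r + 1) * (r + 1) <= n)
	{
		++r;
		/* Loop invariant: r never overshoots the root. */
		DEMO_ASSERT(r * r <= n);
	}

	/* Postcondition: this is what the function promises its caller. */
	DEMO_ENSURE(r * r <= n && (r + 1) * (r + 1) > n);
	return r;
}
```

A call such as demo_isqrt(-1) would fail the "require" check and abort, which points at a bug in the caller rather than in demo_isqrt.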
139 | |
---|
140 | <h2> How To Disable Assertion Checking At Compile Time </h2> |
---|
141 | |
---|
142 | You can use compile time macros to selectively enable or disable |
---|
143 | each of the three kinds of assertions, for performance reasons. |
---|
144 | For example, you might want to enable SM_REQUIRE checking |
---|
145 | (because it finds the most bugs), but disable the other two types. |
---|
146 | <p> |
---|
147 | By default, all three types of assertion are enabled. |
---|
148 | You can selectively disable individual assertion types |
---|
149 | by setting one or more of the following cpp macros to 0 |
---|
150 | before &lt;sm/assert.h&gt; is included for the first time: |
---|
151 | <blockquote> |
---|
152 | SM_CHECK_REQUIRE<br> |
---|
153 | SM_CHECK_ENSURE<br> |
---|
154 | SM_CHECK_ASSERT<br> |
---|
155 | </blockquote> |
---|
156 | Or, you can define SM_CHECK_ALL as 0 to disable all assertion |
---|
157 | types, then selectively define one or more of SM_CHECK_REQUIRE, |
---|
158 | SM_CHECK_ENSURE or SM_CHECK_ASSERT as 1. For example, |
---|
159 | to disable all assertions except for SM_REQUIRE, you can use |
---|
160 | these C compiler flags: |
---|
161 | <blockquote> |
---|
162 | -DSM_CHECK_ALL=0 -DSM_CHECK_REQUIRE=1 |
---|
163 | </blockquote> |
---|
164 | |
---|
165 | After &lt;sm/assert.h&gt; is included, the macros |
---|
166 | SM_CHECK_REQUIRE, SM_CHECK_ENSURE and SM_CHECK_ASSERT |
---|
167 | are each set to either 0 or 1. |
---|
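One way a header could produce the behaviour described above is sketched below. This is an assumption about the mechanism, not a copy of the libsm header: the DEMO_* names are hypothetical, and the two #define lines at the top stand in for the compiler flags -DDEMO_CHECK_ALL=0 -DDEMO_CHECK_REQUIRE=1. The counter demo_checks_run exists only to make the effect observable.

```c
#include <stdlib.h>

/* Stand-in for: cc -DDEMO_CHECK_ALL=0 -DDEMO_CHECK_REQUIRE=1 ... */
#define DEMO_CHECK_ALL		0
#define DEMO_CHECK_REQUIRE	1

/* Assumed defaulting: everything is on unless the user says otherwise. */
#ifndef DEMO_CHECK_ALL
# define DEMO_CHECK_ALL		1
#endif
#ifndef DEMO_CHECK_REQUIRE
# define DEMO_CHECK_REQUIRE	DEMO_CHECK_ALL
#endif
#ifndef DEMO_CHECK_ENSURE
# define DEMO_CHECK_ENSURE	DEMO_CHECK_ALL
#endif
#ifndef DEMO_CHECK_ASSERT
# define DEMO_CHECK_ASSERT	DEMO_CHECK_ALL
#endif

int demo_checks_run = 0;	/* counts checks that were compiled in */

#define DEMO_DO_CHECK(expr) \
	(++demo_checks_run, (expr) ? (void) 0 : abort())
#define DEMO_NO_CHECK(expr)	((void) 0)

#if DEMO_CHECK_REQUIRE
# define DEMO_REQUIRE(expr)	DEMO_DO_CHECK(expr)
#else
# define DEMO_REQUIRE(expr)	DEMO_NO_CHECK(expr)
#endif

#if DEMO_CHECK_ENSURE
# define DEMO_ENSURE(expr)	DEMO_DO_CHECK(expr)
#else
# define DEMO_ENSURE(expr)	DEMO_NO_CHECK(expr)
#endif

/* Clamp a non-negative value to the range 0..100. */
int
demo_clamp_percent(int v)
{
	int r;

	DEMO_REQUIRE(v >= 0);		/* compiled in */
	r = v > 100 ? 100 : v;
	DEMO_ENSURE(r <= 100);		/* compiled out in this configuration */
	return r;
}
```

With these settings only the "require" check generates code, so one call to demo_clamp_percent increments demo_checks_run exactly once.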
168 | |
---|
169 | <h2> How To Write Complex or Expensive Assertions </h2> |
---|
170 | |
---|
171 | Sometimes an assertion check requires more code than a simple |
---|
172 | boolean expression. |
---|
173 | For example, it might require an entire statement block |
---|
174 | with its own local variables. |
---|
175 | You can code such assertion checks by making them conditional on |
---|
176 | SM_CHECK_REQUIRE, SM_CHECK_ENSURE or SM_CHECK_ASSERT, |
---|
177 | and using sm_abort to signal failure. |
---|
178 | <p> |
---|
179 | Sometimes an assertion check is significantly more expensive |
---|
180 | than one or two comparisons. |
---|
181 | In such cases, it is not uncommon for developers to comment out |
---|
182 | the assertion once the code is unit tested. |
---|
183 | Please don't do this: it makes it hard to turn the assertion |
---|
184 | check back on for the purposes of regression testing. |
---|
185 | What you should do instead is make the assertion check conditional |
---|
186 | on one of these predefined debug objects: |
---|
187 | <blockquote> |
---|
188 | SmExpensiveRequire<br> |
---|
189 | SmExpensiveAssert<br> |
---|
190 | SmExpensiveEnsure |
---|
191 | </blockquote> |
---|
192 | By doing this, you bring the cost of the assertion checking code |
---|
193 | back down to a single comparison, unless expensive assertion checking |
---|
194 | has been explicitly enabled. |
---|
195 | By the way, the corresponding debug category names are |
---|
196 | <blockquote> |
---|
197 | sm_check_require<br> |
---|
198 | sm_check_assert<br> |
---|
199 | sm_check_ensure |
---|
200 | </blockquote> |
---|
201 | What activation level should you check for? |
---|
202 | Higher levels correspond to more expensive assertion checks. |
---|
203 | Here are some basic guidelines: |
---|
204 | <blockquote> |
---|
205 | level 1: < 10 basic C operations<br> |
---|
206 | level 2: < 100 basic C operations<br> |
---|
207 | level 3: < 1000 basic C operations<br> |
---|
208 | ... |
---|
209 | </blockquote> |
---|
210 | |
---|
211 | <p> |
---|
212 | Here's a contrived example of both techniques: |
---|
213 | <blockquote><pre> |
---|
214 | void |
---|
215 | w_munge(WIDGET *w) |
---|
216 | { |
---|
217 | SM_REQUIRE(w != NULL); |
---|
218 | #if SM_CHECK_REQUIRE |
---|
219 | /* |
---|
220 | ** We run this check at level 3 because we expect to check a few hundred |
---|
221 | ** table entries. |
---|
222 | */ |
---|
223 | |
---|
224 | if (sm_debug_active(&SmExpensiveRequire, 3)) |
---|
225 | { |
---|
226 | int i; |
---|
227 | |
---|
228 | for (i = 0; i < WIDGET_MAX; ++i) |
---|
229 | { |
---|
230 | if (w[i] == NULL) |
---|
231 | sm_abort("w_munge: NULL entry %d in widget table", i); |
---|
232 | } |
---|
233 | } |
---|
234 | #endif /* SM_CHECK_REQUIRE */ |
---|
235 | </pre></blockquote> |
---|
236 | |
---|
237 | <h2> Other Guidelines </h2> |
---|
238 | |
---|
239 | You should resist the urge to write SM_ASSERT(0) when the code has |
---|
240 | reached an impossible place. It's better to call sm_abort, because |
---|
241 | then you can generate a better error message. For example, |
---|
242 | <blockquote><pre> |
---|
243 | switch (foo) |
---|
244 | { |
---|
245 | ... |
---|
246 | default: |
---|
247 | sm_abort("impossible value %d for foo", foo); |
---|
248 | } |
---|
249 | </pre></blockquote> |
---|
250 | Note that I did not bother to guard the default clause of the switch |
---|
251 | statement with #if SM_CHECK_ASSERT ... #endif, because there is |
---|
252 | probably no performance gain to be had by disabling this particular check. |
---|
253 | <p> |
---|
254 | Avoid including code that has side effects inside of assert macros, |
---|
255 | or inside of SM_CHECK_* guards. You don't want the program to stop |
---|
256 | working if assertion checking is disabled. |
---|
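The following sketch shows why side effects inside an assertion are dangerous. It simulates a build with assertion checking turned off, using a hypothetical DEMO_ASSERT macro in place of SM_ASSERT; the helper demo_pop and both demo_sum_two functions are invented for the illustration.

```c
#include <stdlib.h>

/* Simulate a build made with assertion checking disabled. */
#define DEMO_CHECK_ASSERT	0

#if DEMO_CHECK_ASSERT
# define DEMO_ASSERT(expr)	((expr) ? (void) 0 : abort())
#else
# define DEMO_ASSERT(expr)	((void) 0)
#endif

/* Pop the top value of a small stack into *out; return 1 on success. */
int
demo_pop(int *stack, int *depth, int *out)
{
	if (*depth <= 0)
		return 0;
	*out = stack[--*depth];
	return 1;
}

/* WRONG: the pops live inside the macro and vanish when checks are off. */
int
demo_sum_two_bad(int *stack, int *depth)
{
	int a = 0, b = 0;

	DEMO_ASSERT(demo_pop(stack, depth, &a));
	DEMO_ASSERT(demo_pop(stack, depth, &b));
	return a + b;
}

/* RIGHT: do the work unconditionally, then assert on the saved result. */
int
demo_sum_two_good(int *stack, int *depth)
{
	int a = 0, b = 0;
	int ok;

	ok = demo_pop(stack, depth, &a) && demo_pop(stack, depth, &b);
	DEMO_ASSERT(ok);
	(void) ok;	/* avoid an "unused" warning when checks are off */
	return a + b;
}
```

With checking disabled, demo_sum_two_bad never touches the stack and returns the sum of its two zero-initialized locals, while demo_sum_two_good behaves the same in both configurations.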
257 | |
---|
258 | <h2> Rationale for Logic Bug Handling </h2> |
---|
259 | |
---|
260 | When a logic bug is detected, our philosophy is to log an error message |
---|
261 | and terminate the program, dumping core if possible. |
---|
262 | It is not a good idea to raise an exception, attempt cleanup, |
---|
263 | or continue program execution. Here's why. |
---|
264 | <p> |
---|
265 | First of all, to facilitate post-mortem analysis, we want to dump core |
---|
266 | on detecting a logic bug, disturbing the process image as little as |
---|
267 | possible before dumping core. We don't want to raise an exception |
---|
268 | and unwind the stack, executing cleanup code, before dumping core, |
---|
269 | because that would obliterate information we need to analyze the cause |
---|
270 | of the abort. |
---|
271 | <p> |
---|
272 | Second, it is a bad idea to raise an exception on an assertion failure |
---|
273 | because this places unacceptable restrictions on code that uses |
---|
274 | the assertion macros. |
---|
275 | The reason is this: the sendmail code must be written so that |
---|
276 | anywhere it is possible for an exception to be raised, the code |
---|
277 | will catch the exception and clean up if necessary, restoring |
---|
278 | data structure invariants and freeing resources as required. |
---|
279 | If an assertion failure were signalled by raising an exception, |
---|
280 | then every time you added an assertion, you would need to check |
---|
281 | both the function containing the assertion and its callers to see |
---|
282 | if any exception handling code needed to be added to clean up properly |
---|
283 | on assertion failure. That is far too great a burden. |
---|
284 | <p> |
---|
285 | It is a bad idea to attempt cleanup upon detecting a logic bug |
---|
286 | for several reasons: |
---|
287 | <ul> |
---|
288 | <li>If you need to perform cleanup actions in order to preserve the |
---|
289 | integrity of the data that the program is handling, then the |
---|
290 | program is not fault tolerant, and needs to be redesigned. |
---|
291 | There are several reasons why a program might be terminated unexpectedly: |
---|
292 | the system might crash, the program might receive a signal 9, |
---|
293 | the program might be terminated by a memory fault (possibly as a |
---|
294 | side effect of earlier data structure corruption), and the program |
---|
295 | might detect a logic bug and terminate itself. Note that executing |
---|
296 | cleanup actions is not feasible in most of the above cases. |
---|
297 | If the program has a fault tolerant design, then it will not lose |
---|
298 | data even if the system crashes in the middle of an operation. |
---|
299 | <p> |
---|
300 | <li>If the cause of the logic bug is earlier data structure corruption, |
---|
301 | then cleanup actions intended to preserve the integrity of the data |
---|
302 | that the program is handling might cause more harm than good: they |
---|
303 | might cause information to be corrupted or lost. |
---|
304 | <p> |
---|
305 | <li>If the program uses threads, then cleanup is much more problematic. |
---|
306 | Suppose that thread A is holding some locks, and is in the middle of |
---|
307 | modifying a shared data structure. The locks are needed because the |
---|
308 | data structure is currently in an inconsistent state. At this point, |
---|
309 | a logic bug is detected deep in a library routine called by A. |
---|
310 | How do we get all of the running threads to stop what they are doing |
---|
311 | and perform their thread-specific cleanup actions before terminating? |
---|
312 | We may not be able to get another thread, B, to clean up and terminate cleanly until |
---|
313 | A has restored the invariants on the data structure it is modifying |
---|
314 | and releases its locks. So, we raise an exception and unwind the stack, |
---|
315 | restoring data structure invariants and releasing locks at each level |
---|
316 | of abstraction, and performing an orderly shutdown. There are certainly |
---|
317 | many classes of error conditions for which using the exception mechanism |
---|
318 | to perform an orderly shutdown is appropriate and feasible, but there |
---|
319 | are also classes of error conditions for which exception handling and |
---|
320 | orderly shutdown are dangerous or impossible. The abnormal program |
---|
321 | termination system is intended for this second class of error conditions. |
---|
322 | If you want to trigger orderly shutdown, don't call sm_abort: |
---|
323 | raise an exception instead. |
---|
324 | </ul> |
---|
325 | <p> |
---|
326 | Here is a strategy for making sendmail fault tolerant. |
---|
327 | Sendmail is structured as a collection of processes. The "root" process |
---|
328 | does as little as possible, except spawn children to do all of the real |
---|
329 | work, monitor the children, and act as traffic cop. |
---|
330 | We use exceptions to signal expected but infrequent error conditions, |
---|
331 | so that the process encountering the exceptional condition can clean up |
---|
332 | and keep going. (Worker processes are intended to be long lived, in |
---|
333 | order to minimize forking and increase performance.) But when a bug |
---|
334 | is detected in a sendmail worker process, the worker process does minimal |
---|
335 | or no cleanup and then dies. A bug might be detected in several ways: |
---|
336 | the process might dereference a NULL pointer, receive a signal 11, |
---|
337 | core dump and die, or an assertion might fail, in which case the process |
---|
338 | commits suicide. Either way, the root process detects the death of the |
---|
339 | worker, logs the event, and spawns another worker. |
---|
340 | |
---|
341 | <h2> Rationale for Naming Conventions </h2> |
---|
342 | |
---|
343 | The names "require" and "ensure" come from the writings of Bertrand Meyer, |
---|
344 | a prominent evangelist for assertion checking who has written a number of |
---|
345 | papers about the "Design By Contract" programming methodology, |
---|
346 | and who created the Eiffel programming language. |
---|
347 | Many other assertion checking packages for C also have "require" and |
---|
348 | "ensure" assertion types. In short, we are conforming to a de facto |
---|
349 | standard. |
---|
350 | <p> |
---|
351 | We use the names <tt>SM_REQUIRE</tt>, <tt>SM_ASSERT</tt> |
---|
352 | and <tt>SM_ENSURE</tt> in preference to <tt>REQUIRE</tt>, |
---|
353 | <tt>ASSERT</tt> and <tt>ENSURE</tt> because at least two other |
---|
354 | open source libraries (libisc and libnana) define <tt>REQUIRE</tt> |
---|
355 | and <tt>ENSURE</tt> macros, and many libraries define <tt>ASSERT</tt>. |
---|
356 | We want to avoid name conflicts with other libraries. |
---|
357 | |
---|
358 | </body> |
---|
359 | </html> |
---|