This document describes guidelines for code which is to be incorporated into the Athena source tree. Here are the areas covered: Conventions for C code Conventions for shell scripts Software engineering practices Symbol namespaces Portability considerations for code we maintain Portability considerations for code we release Portability considerations for code we get from third parties Conventions for C code ---------------------- If you are working on an existing module which does not follow these conventions, be consistent with the existent source code rather than these conventions. You should follow the section on "Writing C" in the Gnu coding standards (available in emacs info under "standards"), with the following changes: * When making a function call, do not put a space before the open parenthesis. Do put spaces before the open parenthesis after "while", "if", or "for", though. * When defining a function, do not drop to the next line for the name of the function; put the function name on the same line as the return type. * Don't insert form feed characters into source files to divide programs into pages. * Ignore the sections on "Portability between System Types" and "Portability between CPUs"; see the sections on portability in this file instead. Also ignore the section on "Internationalization." * In the section "Calling System Functions," ignore everything about string functions. These days you can assume that using "#include ", strchr(), and strrchr() will work on just about all platforms. You can use "M-x set-c-style" "GNU" in emacs to have emacs take care of the gnu-style indentation for you. (This is the default on Athena if you haven't set it to something else in your .emacs.) Remember to drop to the next line before each open brace. When writing return statements, do not put parentheses around the return value; just write "return foo;". (This convention is not specified in the Gnu coding standards, but is consistent with the examples in the standards.) Put braces around the bodies of control constructs (if, while, for, do, switch) if they are more than one line long. Each C source file should begin with an MIT copyright statement, a comment describing what the file does, and a line: static char rcsid[] = "$Id$"; Each C header file should begin with: /* $Id$ */ followed by an MIT copyright statement and a comment describing what the file does. MIT copyright statements may be omitted for source code which is not distributed outside of MIT. The preferred convention for comments is /* comment text */ for one line comments, and /* comment text line 1 * comment text line 2 */ for multi-line comments. (To make "M-q" in Emacs leave the closing "*/" on a line by itself, set the variable c-hanging-comment-ender-p to nil.) As recommended in the Gnu coding standards, your comments should normally be complete, capitalized sentences ending with a period. Write prototypes for all of your functions. Include argument names in your prototypes, to help the reader understand what they mean. Conventions for shell scripts ----------------------------- Use /bin/sh to write shell scripts, not csh. Begin each shell script with: #!/bin/sh # $Id$ followed by an MIT copyright if the shell script will be distributed outside of MIT. Use two spaces for indentation. Avoid putting the result or alternative parts of an if statement on the same line as the "if" or "else". Avoid creating lines longer than 79 characters. Write case statements according to the format: case expression in pattern) statements ;; esac Environment variable names should be in all capitals; shell variable names should be in all lowercase. Avoid writing ${varname} for variable expansion except when necessary; just write $varname. In most cases, if you intend to expand a variable as a single word, you should write "$varname" with double quotes. This is not necessary in the switch expression of a case statement or in variable assignments, though. Test expressions in shell scripts can be a mine field of ungraceful failures if the variables being tested are empty, contain spaces or glob characters, or have a value of "!". To handle all such corner cases gracefully without adding unnecessary hair, use the following test expressions: To test if a variable is... Write --------------------------- ----- empty -z "$varname" non-empty -n "$varname" equal to a literal literal = "$varname" equal to another variable "x$varname" = "x$othervarname" set "${varname+set}" = set not set "${varname+set}" != set Software engineering practices ------------------------------ The following software engineering practices are strongly encouraged: * Restricting the operations which can access a given type of data object, and grouping them together. * Documenting data invariants (i.e. conditions on the representations of data objects which are assumed to always be true) and the meaning of data representations. * Documenting non-obvious requirements and effects of procedures. * Use of prototypes for all functions. * Automated testing of both program components ("unit testing") and whole programs ("integration testing"). The following software engineering practices are discouraged: * Use of global variables. Remember that the complexity of any function's interface is increased by the global variables it uses, even if that doesn't show up in the function prototype. Global variables are marginally acceptable to represent configuration parameters determined by command-line arguments or a configuration file, but even then it is preferable to find an alternative. * Use of global process state. You should avoid using alarm(), and you should avoid using getuid() or getenv() inside libraries or other "deep" interfaces. Code that uses global process state tends to interact poorly with other code in large programs. Symbol namespaces ----------------- If you are writing a library, you should pick a prefix for your library. You should ensure that all symbols which interact with the application's namespace (both at link time and on inclusion of the header file) begin with your prefix (or the all-caps form of your prefix, in the case of preprocessor symbols and enumeration constants) followed by an underscore. Symbols which are not intended for use by the application should begin with your prefix followed by two underscores. For instance, if your prefix is "hes", then functions exported to the user should begin with "hes_" (e.g. "hes_resolve"). Functions used internally by the library should begin with "hes__" (e.g. "hes__resolve" for the internal resolver function). Preprocessor symbols intended for use by the application should begin with "HES_" (e.g. "HES_ER_OK"); preprocessor symbols not intended for use by the application should begin with "HES__" (e.g. "HES__HESIOD_H", used to protect hesiod.h from multiple inclusion). Names of structures should begin with your prefix, but structure fields don't need to. Strictly speaking, structure fields interact with the user's namespace because the user might have "#define"'d them to something before including your header file, but applications generally shouldn't be "#define"ing lots of lowercase symbols, so this is not a real worry. Portability considerations for code we maintain ----------------------------------------------- In general, your code should assume a POSIX-compliant and ANSI C-compliant development environment, except to work around specific bugs in platforms we care about. You should use Autoconf to handle portability concerns; see the file "build-system" in this directory for how a package in the athena hierarchy should use autoconf. If you must perform an operating system test (because you are using a plain Makefile, typically for something in the packs hierarchy), do it in two steps; for instance: #define HAVE_STRERROR #if defined(sun) && !defined(SOLARIS) /* SunOS 4.1.3_U1 doesn't have strerror(). Use sys_errlist * instead. */ #undef HAVE_STRERROR #endif #ifndef HAVE_STRERROR extern const char *const sys_errlist[]; #define strerror(x) (sys_errlist[x]) #endif This way, if the source tree is ever converted to use feature tests, the person porting the code will know exactly what needs to be tested for. If you can anticipate the preprocessor symbol which would be used with Autoconf (as in this example), that's even better. Note particularly the comment instead the operating system test; it should specify: * What special consideration is needed for that operating system. * The version number of the operating system for which this approach was determined to be necessary. This will help future maintainers determine if one can eliminate the special consideration altogether when an OS upgrade has happened. Following is a list of appropriate preprocessor symbols to use to detect platforms we care about, when using plain Makefiles: SunOS: #if defined(sun) && !defined(SOLARIS) Solaris: #ifdef SOLARIS IRIX #ifdef sgi Linux: #ifdef linux NetBSD: #ifdef __NetBSD__ or #include and #ifdef BSD4_4 if applicable to all BSD 4.4 systems. SOLARIS is not automatically defined by the compiler on Solaris systems; we make sure it's defined when we build code not using autoconf. There are no reliable automatically defined constants for Solaris systems. For some highly non-portable operations you may need to do a platform test even when using Autoconf. In that case, use AC_CANONICAL_HOST rather than the above ifdefs. Portability considerations for code we release ---------------------------------------------- All of the standards in the previous section apply; however, we generally care about more platforms for code we release to the outside works. It is discouraged, but acceptable, to care about platforms which are not POSIX-compliant. Code that cares about such platforms should determine whether the platform supports POSIX interfaces by using AC_CHECK_HEADERS(unistd.h) to determine whether it can #include , and then checking whether _POSIX_VERSION is defined. Such code should always use POSIX interfaces rather than less portable interfaces when they are available. Portability considerations for code we get from third parties ------------------------------------------------------------- The overriding principle for code obtained from third parties is to make as few changes as possible. A lot of third-party code has a very bad approach to portability, or cares about a lot of platforms we don't care about. You should attempt to follow the portability approach used by the rest of the program, such as it may be. Ideally, any changes you make should be made in such a manner that they can be incorporated into the source tree maintained by the third party.