This document is an updated version of the Indian Hill C Style and Coding Standards paper, with modifications by the last three authors. It describes a recommended coding standard for C programs. The scope is coding style, not functional organization.
This document is a modified version of a document from a committee formed at AT&T's Indian Hill labs to establish a common set of coding standards and recommendations for the Indian Hill community. The scope of this work is C coding style. Good style should encourage consistent layout, improve portability, and reduce errors. This work does not cover functional organization, or general issues such as the use of gotos. We[see following footnote] have tried to combine previous work [1,6,8] on C style into a uniform set of standards that should be appropriate for any project using C, although parts are biased towards particular systems. Of necessity, these standards cannot cover all situations. Experience and informed judgement count for much. Programmers who encounter unusual situations should consult either experienced C programmers or code written by experienced C programmers (preferably following these rules).
[Footnote: The opinions in this document do not reflect the opinions of all authors. This is still an evolving document. Please send comments and suggestions to pardo@cs.washington.edu or {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo]
The standards in this document are not of themselves required, but individual institutions or groups may adopt part or all of them as a part of program acceptance. It is therefore likely that others at your institution will code in a similar style. Ultimately, the goal of these standards is to increase portability, reduce maintenance, and above all improve clarity.
Many of the style choices here are somewhat arbitrary. Mixed coding style is harder to maintain than bad coding style. When changing existing code it is better to conform to the style (indentation, spacing, commenting, naming conventions) of the existing code than it is to blindly follow this document.
``To be clear is professional; not to be clear is unprofessional.'' - Sir Ernest Gowers.
2. File Organization
A file consists of various sections that should be separated by several blank lines. Although there is no maximum length limit for source files, files with more than about 1000 lines are cumbersome to deal with. The editor may not have enough temp space to edit the file, compilations will go more slowly, etc. Many rows of asterisks, for example, present little information compared to the time it takes to scroll past, and are discouraged. Lines longer than 79 columns are not handled well by all terminals and should be avoided if possible. Excessively long lines which result from deep indenting are often a symptom of poorly-organized code.
2.1. File Naming Conventions
File names are made up of a base name, and an optional period and suffix. The first character of the name should be a letter and all characters (except the period) should be lower-case letters and numbers. The base name should be eight or fewer characters and the suffix should be three or fewer characters (four, if you include the period). These rules apply to both program files and default files used and produced by the program (e.g., ``rogue.sav'').
Some compilers and tools require certain suffix conventions for names of files [5]. The following suffixes are required:
C++ has compiler-dependent suffix conventions, including .c, ..c, .cc, .c.c, and .C. Since much C code is also C++ code, there is no clear solution here.
In addition, it is conventional to use ``Makefile'' (not ``makefile'') for the control file for make (for systems that support it) and ``README'' for a summary of the contents of the directory or directory tree.
2.2. Program Files
The suggested order of sections for a program file is as follows:
2.3. Header Files
Header files are files that are included in other files prior to compilation by the C preprocessor. Some, such as stdio.h, are defined at the system level and must be included by any program using the standard I/O library. Header files are also used to contain data declarations and defines that are needed by more than one program. Header files should be functionally organized, i.e., declarations for separate sub-systems should be in separate header files. Also, if a set of declarations is likely to change when code is ported from one machine to another, those declarations should be in a separate header file.
Avoid private header filenames that are the same as
library header filenames. The statement #include "math.h"
will include the standard library math header file if the
intended one is not found in the current directory. If this
is what you want to happen, comment this fact. Don't use
absolute pathnames for header files. Use the
Header files that declare functions or external variables
should be included in the file that defines the function
or variable. That way, the compiler can do type checking
and the external declaration will always agree with the
definition.
Defining variables in a header file is often a poor
idea. Frequently it is a symptom of poor partitioning of
code between files. Also, some objects like typedefs and
initialized data definitions cannot be seen twice by the
compiler in one compilation. On some systems, repeating
uninitialized declarations without the extern
keyword also causes problems. Repeated declarations can
happen if include files are nested and will cause the
compilation to fail.
Header files should not be nested. The prologue for a
header file should, therefore, describe what other headers
need to be #included for the header to be functional. In
extreme cases, where a large number of header files are to
be included in several different source files, it is acceptable
to put all common #includes in one include file.
It is common to put the following into each .h file to
prevent accidental double-inclusion.
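A minimal sketch of such a guard is shown here. The guard name EXAMPLE_H and the declarations inside it are hypothetical, chosen only for illustration.

```c
/* example.h (hypothetical name): safe to include more than once */
#ifndef EXAMPLE_H
#define EXAMPLE_H

#define EX_MAXNAME	32		/* longest name, including the NUL */

typedef struct ex_point {
	int	x;			/* horizontal position */
	int	y;			/* vertical position */
} ex_point;

#endif /* EXAMPLE_H */
```

A second inclusion finds EXAMPLE_H already defined, so the preprocessor skips the body and the typedef is seen only once.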
2.4. Other Files
It is conventional to have a file called ``README'' to
document both ``the bigger picture'' and issues for the program
as a whole. For example, it is common to include a
list of all conditional compilation flags and what they
mean. It is also common to list files that are machine
dependent, etc.
3. Comments
``When the code and the comments disagree, both
are probably wrong.'' - Norm Schryer
The comments should describe what is happening, how it
is being done, what parameters mean, which globals are used
and which are modified, and any restrictions or bugs.
Avoid, however, comments that are clear from the code, as
such information rapidly gets out of date. Comments that
disagree with the code are of negative value. Short comments
should be what comments, such as ``compute mean
value'', rather than how comments such as ``sum of values
divided by n''. C is not assembler; putting a comment at
the top of a 3-10 line section telling what it does overall
is often more useful than a comment on each line describing
micrologic.
Comments should justify offensive code. The justification
should be that something bad will happen if unoffensive
code is used. Just making code faster is not enough to rationalize
a hack; the performance must be shown to be unacceptable
without the hack. The comment should explain the
unacceptable behavior and describe why the hack is a
``good'' fix.
Comments that describe data structures, algorithms,
etc., should be in block comment form with the opening /* in
columns 1-2, a * in column 2 before each line of comment
text, and the closing */ in columns 2-3. An alternative is
to have ** in columns 1-2, and put the closing */ also in
1-2.
Note that grep '^.\*' will catch all block comments in
the file[see following footnote].
Very long block comments such as drawn-out discussions
and copyright notices often start with /* in
columns 1-2, no leading * before lines of text, and the
closing */ in columns 1-2. Block comments inside a function
are appropriate, and they should be tabbed over to the same
tab setting as the code that they describe. One-line comments
alone on a line should be indented to the tab setting
of the code that follows.
Very short comments may appear on the same line as the
code they describe, and should be tabbed over to separate
them from the statements. If more than one short comment
appears in a block of code they should all be tabbed to the
same tab setting.
4. Declarations
Global declarations should begin in column one. All
external data declarations should be preceded by the extern
keyword. If an external variable is an array that is defined
with an explicit size, then the array bounds must be
repeated in the extern declaration unless the size is always
encoded in the array (e.g., a read-only character array that
is always null-terminated). Repeated size declarations are
particularly beneficial to someone picking up code written
by another.
The ``pointer'' qualifier, `*', should be with the
variable name rather than with the type.
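The reason can be sketched with two declarations. The variable names are hypothetical; the point is only that the `*' binds to each declarator, not to the type.

```c
/* The `*' belongs to the name that follows it, not to the type. */
char	*s, *t, *u;		/* three pointers to char */
char*	 p, q;			/* misleading: p is a pointer, q is a plain char */
```

Written the second way, a reader can easily take q for a pointer as well.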
Unrelated declarations, even of the same type, should
be on separate lines. A comment describing the role of the
object being declared should be included, with the exception
that a list of #defined constants does not need comments if
the constant names are sufficient documentation. The names,
values, and comments are usually tabbed so that they line up
underneath each other. Use the tab character rather than
blanks (spaces). For structure and union template declarations,
each element should be alone on a line with a comment
describing it. The opening brace ({) should be on the same
line as the structure tag, and the closing brace (}) should
be in column one.
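A structure laid out this way might look like the sketch below. The boat structure, its members, and the type codes are illustrative names only, not part of any real interface.

```c
struct boat {
	int	wllength;	/* water line length in meters */
	int	type;		/* see the defines below */
	long	sailarea;	/* sail area in square mm */
};

/* defines for boat.type */
#define	KETCH	(1)
#define	YAWL	(2)
#define	SLOOP	(3)
#define	SQRIG	(4)
#define	MOTOR	(5)
```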
These defines are sometimes put right after the declaration
of type, within the struct declaration, with enough tabs
after the `#' to indent define one level more than the
structure member declarations. When the actual values are
unimportant, the enum facility is better [see following footnote].
Any variable whose initial value is important should be
explicitly initialized, or at the very least should be commented
to indicate that C's default initialization to zero
is being relied upon. The empty initializer, ``{}'', should
never be used. Structure initializations should be fully
parenthesized with braces. Constants used to initialize
longs should be explicitly long. Use a capital `L' suffix; for
example the long constant ``2l'' looks a lot like ``21'', the number
twenty-one.
In any file which is part of a larger whole rather than
a self-contained program, maximum use should be made of the
static keyword to make functions and variables local to single
files. Variables in particular should be accessible
from other files only when there is a clear need that cannot
be filled in another way. Such usage should be commented to
make it clear that another file's variables are being used;
the comment should name the other file. If your debugger
hides static objects you need to see during debugging, declare
them as STATIC and #define STATIC as needed.
The most important types should be highlighted by
typedeffing them, even if they are only integers, as the
unique name makes the program easier to read (as long as
there are only a few things typedeffed to integers!).
Structures may be typedeffed when they are declared. Give
the struct and the typedef the same name.
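As a short sketch, with a hypothetical structure and hypothetical member names:

```c
typedef struct splodge_t {
	int	sp_count;		/* number of entries in use */
	char	*sp_name;		/* primary name */
	char	*sp_alias;		/* alternate name, may be a null pointer */
} splodge_t;
```

Both ``struct splodge_t'' and ``splodge_t'' then name the same type.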
The return type of functions should always be declared.
If function prototypes are available, use them. One common
mistake is to omit the declaration of external math functions
that return double. The compiler then assumes that
the return value is an integer and the bits are dutifully
converted into a (meaningless) floating point value.
``C takes the point of view that the programmer is always right.''
- Michael DeCorte
5. Function Declarations
Each function should be preceded by a block comment
prologue that gives a short description of what the function
does and (if not clear) how to use it. Discussion of non-trivial
design decisions and side-effects is also appropriate.
Avoid duplicating information clear from the code.
The function return type should be alone on a line,
(optionally) indented one stop[see following footnote].
Do not default to int; if
the function does not return a value then it should be given
return type void[see second following footnote].
If the value returned requires a long
explanation, it should be given in the prologue; otherwise
it can be on the same line as the return type, tabbed over.
The function name (and the formal parameter list) should be
alone on a line, in column one. Destination (return value)
parameters should generally be first (on the left). All
formal parameter declarations, local declarations and code
within the function body should be tabbed over one stop.
The opening brace of the function body should be alone on a
line beginning in column one.
Each parameter should be declared (do not default to
int). In general the role of each variable in the function
should be described. This may either be done in the function
comment or, if each declaration is on its own line, in
a comment on that line. Loop counters called ``i'', string
pointers called ``s'', and integral types called ``c'' and
used for characters are typically excluded. If a group of
functions all have a like parameter or local variable, it
helps to call the repeated variable by the same name in all
functions. (Conversely, avoid using the same name for different
purposes in related functions.) Like parameters
should also appear in the same place in the various argument
lists.
Comments for parameters and local variables should be
tabbed so that they line up underneath each other. Local
variable declarations should be separated from the
function's statements by a blank line.
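The layout rules of this section can be sketched on one small function. The function count_char is hypothetical, and the sketch assumes a compiler that accepts ANSI prototype syntax for the parameter list.

```c
#include <stddef.h>

/*
 *	Count the characters in s that are equal to c.
 *	s must be NUL-terminated.  Returns the number of matches.
 */
	size_t
count_char(const char *s, int c)
{
	size_t	n = 0;			/* matches seen so far */

	for (; *s != '\0'; s++) {
		if (*s == c)
			n++;
	}
	return n;
}
```

The return type sits alone on an indented line, the function name starts in column one, and a blank line separates the local declaration from the statements.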
Be careful when you use or declare functions that take
a variable number of arguments (``varargs''). There is no
truly portable way to do varargs in C. Better to design an
interface that uses a fixed number of arguments. If you
must have varargs, use the library macros for declaring
functions with variant argument lists.
If the function uses any external variables (or functions)
that are not declared globally in the file, these
should have their own declarations in the function body using
the extern keyword.
Avoid local declarations that override declarations at
higher levels. In particular, local variables should not be
redeclared in nested blocks. Although this is valid C, the
potential confusion is enough that lint will complain
about it when given the -h option.
6. Whitespace
Use vertical and horizontal whitespace generously. Indentation
and spacing should reflect the block structure of
the code; e.g., there should be at least two blank lines
between the end of one function and the comments for the
next.
A long string of conditional operators should be split
onto separate lines.
Keywords that are followed by expressions in parentheses
should be separated from the left parenthesis by a blank.
(The sizeof operator is an exception.) Blanks should also
appear after commas in argument lists to help separate the
arguments visually. On the other hand, macro definitions
with arguments must not have a blank between the name and
the left parenthesis, otherwise the C preprocessor will not
recognize the argument list.
7. Examples
8. Simple Statements
There should be only one statement per line unless the
statements are very closely related.
The null body of a for or while loop should be alone on a
line and commented so that it is clear that the null body is
intentional and not missing code.
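A minimal sketch of a commented null body follows. The function copy_string is hypothetical.

```c
/* Copy the NUL-terminated string src into dest, terminator included. */
static void
copy_string(char *dest, const char *src)
{
	while ((*dest++ = *src++) != '\0')
		;	/* VOID */
}
```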
Do not default the test for non-zero; that is, write the comparison out explicitly rather than relying on an implicit test against zero.
The non-zero test is often defaulted for predicates and
other functions or expressions which meet the following restrictions:
It is common practice to declare a boolean type
``bool'' in a global include file. The special names improve
readability immensely.
Even with these declarations, do not check a boolean value
for equality with one (TRUE, YES, etc.); instead test for inequality
with zero (FALSE, NO, etc.). Most functions are
guaranteed to return zero if false, but only non-zero if true.
Thus,
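The comparison rule above can be sketched as follows. The names bool_t, has_flag, and count_flagged are hypothetical, and has_flag deliberately returns 4 rather than TRUE to show why an equality test against TRUE would fail.

```c
typedef int	bool_t;			/* hypothetical name for the boolean type */
#define FALSE	0
#define TRUE	1

/* Returns non-zero (4, not TRUE) when bit 2 of n is set. */
static bool_t
has_flag(int n)
{
	return n & 4;
}

/* Count the elements of v[0..len-1] that have the flag set. */
static int
count_flagged(const int *v, int len)
{
	int	i;
	int	hits = 0;

	for (i = 0; i < len; i++) {
		if (has_flag(v[i]) != FALSE)	/* not: == TRUE */
			hits++;
	}
	return hits;
}
```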
There is a time and a place for embedded assignment
statements. In some constructs there is no better way to
accomplish the results without making the code bulkier and
less readable.
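One such construct is a loop that fetches a value and tests it in the same expression. The sketch below uses a hypothetical function, count_blanks, that walks a string.

```c
/* Count the blanks in the NUL-terminated string s. */
static int
count_blanks(const char *s)
{
	int	c;			/* character under inspection */
	int	blanks = 0;

	while ((c = *s++) != '\0') {
		if (c == ' ')
			blanks++;
	}
	return blanks;
}
```

Without the embedded assignment the fetch would have to be written twice, once before the loop and once at the end of its body.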
The ++ and -- operators count as assignment statements. So,
for many purposes, do functions with side effects. Using
embedded assignment statements to improve run-time performance
is also possible. However, one should consider the
tradeoff between increased speed and decreased maintainability
that results when embedded assignments are used in artificial
places. For example,
Goto statements should be used sparingly, as in any
well-structured code. The main place where they can be usefully
employed is to break out of several levels of switch,
for, and while nesting, although the need to do such a thing
may indicate that the inner constructs should be broken out
into a separate function, with a success/failure return
code.
When a goto is necessary the accompanying label should be
alone on a line and tabbed one stop to the left of the code
that follows. The goto should be commented (possibly in the
block header) as to its utility and purpose. Continue
should be used sparingly and near the top of the loop.
Break is less troublesome.
Parameters to non-prototyped functions sometimes need
to be promoted explicitly. If, for example, a function expects
a 32-bit long and gets handed a 16-bit int instead,
the stack can get misaligned. Problems occur with pointer,
integral, and floating-point values.
9. Compound Statements
A compound statement is a list of statements enclosed
by braces. There are many common ways of formatting the
braces. Be consistent with your local standard, if you have
one, or pick one and use it consistently. When editing
someone else's code, always use the style used in that code.
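One common layout is sketched here on two hypothetical functions, clamp and digits. The opening brace of each control statement ends the line that starts it, and the closing brace lines up with the keyword.

```c
/* Force v into the range lo..hi. */
static int
clamp(int v, int lo, int hi)
{
	if (v < lo) {
		v = lo;
	} else if (v > hi) {
		v = hi;
	}
	return v;
}

/* Number of decimal digits needed to print n. */
static int
digits(unsigned n)
{
	int	count = 0;

	do {
		count++;
		n /= 10;
	} while (n != 0);
	return count;
}
```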
The style above is called ``K&R style'', and is preferred if
you haven't already got a favorite. With K&R style, the
else part of an if-else statement and the
while part of a do-while statement should appear
on the same line as the close brace. With most other styles, the
braces are always alone on a line.
When a block of code has several labels (unless there are a lot of
them), the labels are placed on separate lines. The fall-through
feature of the C switch statement, (that is, when there is
no break between a code segment and the next case statement) must
be commented for future maintenance. A lint-style comment/directive
is best.
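A minimal sketch of a commented fall-through follows. The option codes and the function verbosity are hypothetical.

```c
/* Hypothetical option codes, for illustration only. */
#define OPT_QUIET	1
#define OPT_VERBOSE	2
#define OPT_DEBUG	3

/* Map an option code to a verbosity level; unknown codes give 0. */
static int
verbosity(int opt)
{
	int	level = 0;

	switch (opt) {
	case OPT_DEBUG:
		level += 1;
		/*FALLTHROUGH*/
	case OPT_VERBOSE:
		level += 1;
		break;
	case OPT_QUIET:
		level = -1;
		break;
	}
	return level;
}
```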
Here, the last break is technically unnecessary, but is required
by this standard because it prevents a fall-through error if another case is
added later after the last one. The default case, if used,
should be last and does not require a break if it is last.
Whenever an if-else statement has a compound statement
for either the if or else section, the statements of both
the if and else sections should be enclosed in braces
(called fully bracketed syntax).
Braces are also essential in if-if-else sequences with
no second else such as the following, which will be parsed
incorrectly if the brace after (ex1) and its mate are omitted:
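A sketch of the fully braced form, using a hypothetical function choose whose parameters stand in for the two conditions:

```c
/* Returns 1 when both hold, 2 when ex1 is false, 0 otherwise. */
static int
choose(int ex1, int ex2)
{
	int	result = 0;

	if (ex1) {
		if (ex2) {
			result = 1;
		}
	} else {
		result = 2;		/* runs only when ex1 is false */
	}
	return result;
}
```

Without the outer braces the else would attach to the inner if, and the second branch would run when ex1 is true and ex2 is false.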
An if-else with else if statements should be
written with the else conditions left-justified.
The format then looks like a generalized switch statement
and the tabbing reflects the switch between exactly one of
several alternatives rather than a nesting of statements.
Do-while loops should always have braces around the
body.
The following code is very dangerous:
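The hazard can be reproduced with the sketch below. The macro name CLOSE_CIRCUIT, the function close_circ it would call, and the function count_steps are assumptions made for illustration; CIRCUIT is deliberately left undefined so that the macro expands to nothing.

```c
#ifdef CIRCUIT
#	define CLOSE_CIRCUIT(circno)	{ close_circ(circno); }
#else
#	define CLOSE_CIRCUIT(circno)
#endif

/* Looks as if i is always incremented once; it is not. */
static int
count_steps(int expr)
{
	int	i = 0;

	if (expr)
		i += 10;
	else
		CLOSE_CIRCUIT(7)
	++i;			/* BUG: this has become the else body */
	return i;
}
```

With CIRCUIT undefined, count_steps returns 10 rather than 11 when expr is true, because the increment is skipped.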
Note that on systems where CIRCUIT is not defined the statement
``++i;'' will only get executed when expr is false!
This example points out both the value of naming macros with
CAPS and of making code fully-bracketed.
Sometimes an if causes an unconditional control
transfer via break, continue,
goto, or return. The else
should be implicit and the code should not be indented.
10. Operators
Unary operators should not be separated from their single
operand. Generally, all binary operators except `.' and
`->' should be separated from their operands by blanks.
Some judgement is called for in the case of complex expressions,
which may be clearer if the ``inner'' operators are
not surrounded by spaces and the ``outer'' ones are.
If you think an expression will be hard to read, consider
breaking it across lines. Splitting at the lowest-precedence
operator near the break is best. Since C has
some unexpected precedence rules, expressions involving
mixed operators should be parenthesized. Too many
parentheses, however, can make a line harder to read
because humans aren't good at parenthesis-matching.
There is a time and place for the binary comma operator,
but generally it should be avoided. The comma operator
is most useful to provide multiple initializations or operations,
as in for statements. Complex expressions, for instance
those with nested ternary ?: operators, can be
confusing and should be avoided if possible. There are some
macros like getchar where both the ternary operator and comma
operators are useful. The logical expression operand before
the ?: should be parenthesized and both return values
must be the same type.
11. Naming Conventions
Individual projects will no doubt have their own naming
conventions. There are some general rules however.
In general, global names (including enums) should have
a common prefix identifying the module that they belong
with. Globals may alternatively be grouped in a global
structure. Typedeffed names often have ``_t'' appended to
their name.
Avoid names that might conflict with various standard
library names. Some systems will include more library code
than you want. Also, your program may be extended someday.
12. Constants
Numerical constants should not be coded directly. The
#define feature of the C preprocessor should be used to give
constants meaningful names. Symbolic constants make the
code easier to read. Defining the value in one place also
makes it easier to administer large programs since the constant
value can be changed uniformly by changing only the
define. The enumeration data type is a better way to declare
variables that take on only a discrete set of values,
since additional type checking is often available. At the
very least, any directly-coded numerical constant must have
a comment explaining the derivation of the value.
Constants should be defined consistently with their
use; e.g. use 540.0 for a float instead of 540 with an implicit
float cast. There are some cases where the constants
zero and one may appear as themselves instead of as defines.
For example if a for loop indexes through an array, then
Simple character constants should be defined as character
literals rather than numbers. Non-text characters are
discouraged as non-portable. If non-text characters are
necessary, particularly if they are used in strings, they
should be written using an escape sequence of three octal
digits rather than one (e.g., '\007'). Even so, such usage
should be considered machine-dependent and treated as such.
13. Macros
Complex expressions can be used as macro parameters,
and operator-precedence problems can arise unless all occurrences
of parameters have parentheses around them. There
is little that can be done about the problems caused by side
effects in parameters except to avoid side effects in expressions
(a good idea anyway) and, when possible, to write
macros that evaluate their parameters exactly once. There
are times when it is impossible to write macros that act exactly
like functions.
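The precedence problem can be sketched with two hypothetical macros, SQUARE_BAD and SQUARE, and a function, square, that evaluates its argument exactly once.

```c
#define SQUARE_BAD(x)	x * x		/* parameter not parenthesized */
#define SQUARE(x)	((x) * (x))	/* safe against precedence only */

/* A function evaluates its argument once, so side effects are safe. */
static int
square(int x)
{
	return x * x;
}
```

SQUARE_BAD(1 + 2) expands to 1 + 2 * 1 + 2. Even the parenthesized SQUARE still evaluates its argument twice, so an argument such as i++ must not be passed to it.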
Some macros also exist as functions (e.g., getc and
fgetc). The macro should be used in implementing the function
so that changes to the macro will be automatically reflected
in the function. Care is needed when interchanging
macros and functions since function parameters are passed by
value, while macro parameters are passed by name substitution.
Carefree use of macros requires that they be declared
carefully.
Macros should avoid using globals, since the global
name may be hidden by a local declaration. Macros that
change named parameters (rather than the storage they point
at) or may be used as the left-hand side of an assignment
should mention this in their comments. Macros that take no
parameters but reference variables, are long, or are aliases
for function calls should be given an empty parameter list,
e.g.,
Macros save function call/return overhead, but when a
macro gets long, the effect of the call/return becomes
negligible, so a function should be used instead.
In some cases it is appropriate to make the compiler
ensure that a macro is terminated with a semicolon.
If the semicolon is omitted after the call to SP3, then the
else will (silently!) become associated with the if in the
SP3 macro. With the semicolon, the else doesn't match any
if! The macro SP3 can be written safely as
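One way to do this is to wrap the body in a do-while that runs once. The sketch below is not the original definition of SP3; the variable total and the functions fetch and step are hypothetical stand-ins.

```c
static int	total;			/* hypothetical accumulator */

/* Hypothetical helper: stores a value through px and returns it. */
static int
fetch(int *px)
{
	*px = 3;
	return *px;
}

/* Wrapped in do-while so that `SP3;' is exactly one statement. */
#define SP3 \
	do { \
		if (total >= 0) { \
			int x; \
			total += fetch(&x); \
		} \
	} while (0)

static int
step(int want)
{
	if (want)
		SP3;
	else
		total = -1;
	return total;
}
```

The else in step now pairs with the outer if, and leaving out the semicolon after SP3 becomes a compile-time error instead of a silent change in meaning.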
Writing out the enclosing do-while by hand is awkward and
some compilers and tools may complain that there is a constant
in the ``while'' conditional. A macro for declaring
statements may make programming easier.
Declare SP3 with
Using STMT will help prevent small typos from silently
changing programs.
Except for type casts, sizeof, and hacks such as the
above, macros should contain keywords only if the entire
macro is surrounded by braces.
14. Debugging
If you use enums, the first enum constant should have a
non-zero value, or the first constant should indicate an error.
Check for error return values, even from functions that
``can't'' fail. Consider that close() and fclose() can and
do fail, even when all prior file operations have succeeded.
Write your own functions so that they test for errors and
return error values or abort the program in a well-defined
way. Include a lot of debugging and error-checking code and
leave most of it in the finished product. Check even for
``impossible'' errors. [8]
Use the assert facility to insist that each function is
being passed well-defined values, and that intermediate
results are well-formed.
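A minimal sketch of such checks, on a hypothetical function mean:

```c
#include <assert.h>
#include <stddef.h>

/* Return the mean of the n values in v; n must be positive. */
static double
mean(const double *v, size_t n)
{
	double	sum = 0.0;
	size_t	i;

	assert(v != NULL);		/* caller must pass real storage */
	assert(n > 0);			/* avoids division by zero below */

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}
```

Compiling with NDEBUG defined removes the checks, so they should not contain expressions the function depends on.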
Build in the debug code using as few #ifdefs as possible.
For instance, if ``mm_malloc'' is a debugging memory
allocator, then MALLOC will select the appropriate allocator,
avoid littering the code with #ifdefs, and make clear
the difference between allocation calls being debugged and
extra memory that is allocated only during debugging.
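The selection can be sketched as follows. The body of mm_malloc here is a hypothetical stand-in that only counts calls, and the flag name DEBUG is an assumption.

```c
#include <stdlib.h>

static unsigned long	mm_count;	/* allocations made via mm_malloc */

/* Hypothetical debugging allocator: counts calls, then defers to malloc. */
void *
mm_malloc(size_t size)
{
	mm_count++;
	return malloc(size);
}

#ifdef DEBUG
#	define MALLOC(size)	(mm_malloc(size))
#else
#	define MALLOC(size)	(malloc(size))
#endif
```

Code that calls MALLOC is checked when DEBUG is defined. Code that calls malloc directly is never checked, which marks it as memory used only for debugging.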
Check bounds even on things that ``can't'' overflow. A
function that writes to variable-sized storage should
take an argument maxsize that is the size of the destination.
If there are times when the size of the destination
is unknown, some `magic' value of maxsize should mean ``no
bounds checks''. When bound checks fail, make sure that the
function does something useful such as abort or return an
error status.
In all, remember that a program that produces wrong
answers twice as fast is infinitely slower. The same is
true of programs that crash occasionally or clobber valid
data.
``C Code. C code run. Run, code, run... PLEASE!!!'' - Barbara Tongue
15. Conditional Compilation
Conditional compilation is useful for things like
machine-dependencies, debugging, and for setting certain options
at compile-time. Beware of conditional compilation.
Various controls can easily combine in unforeseen ways. If
you #ifdef machine dependencies, make sure that when no
machine is specified, the result is an error, not a default
machine. (Use ``#error'' and indent it so it works with
older compilers.) If you #ifdef optimizations, the default
should be the unoptimized code rather than an uncompilable
program. Be sure to test the unoptimized code.
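The rule for optimizations can be sketched as follows; the machine-dependency rule is not shown. The flag USE_MEMCPY and the function copy_bytes are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy n bytes from src to dst; the regions must not overlap. */
static void
copy_bytes(char *dst, const char *src, size_t n)
{
#ifdef USE_MEMCPY			/* hypothetical optimization flag */
	memcpy(dst, src, n);
#else
	size_t	i;

	for (i = 0; i < n; i++)		/* default: plain, unoptimized loop */
		dst[i] = src[i];
#endif
}
```

When no flag is given the program still compiles and runs the simple loop, so the unoptimized path is the one exercised by default.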
Note that the text inside of an #ifdeffed section may
be scanned (processed) by the compiler, even if the #ifdef
is false. Thus, even if the #ifdeffed part of the file never
gets compiled (e.g., #ifdef COMMENT), it cannot be arbitrary
text.
Put #ifdefs in header files instead of source files
when possible. Use the #ifdefs to define macros that can be
used uniformly in the code. For instance, a header file for
checking memory allocation might look like (omitting definitions
for REALLOC and FREE):
Conditional compilation should generally be on a
feature-by-feature basis. Machine or operating system
dependencies should be avoided in most cases.
16. Portability
``C combines the power of assembler with the portability
of assembler.''
- Anonymous, alluding to Bill Thacker.
The advantages of portable code are well known. This
section gives some guidelines for writing portable code.
Here, ``portable'' means that a source file can be compiled
and executed on different machines with the only change being
the inclusion of possibly different header files and the
use of different compiler flags. The header files will contain
#defines and typedefs that may vary from machine to
machine. In general, a new ``machine'' is different
hardware, a different operating system, a different compiler,
or any combination of these. Reference [1] contains
useful information on both style and portability. The following
is a list of pitfalls to be avoided and recommendations
to be considered when designing portable code:
type     pdp11  vax  68000  Cray-2  Unisys  Harris  80386
char     8      8    8      8       9       8       8
Some machines have more than one possible size for a
given type. The size you get can depend both on the
compiler and on various compile-time flags. The following
table shows ``safe'' type sizes on the majority
of systems. Unsigned numbers are the same bit size as
signed numbers.
Type     Minimum  No
char     8
If you need `magic' pointers other than NULL, either
allocate some storage or treat the pointer as a machine
dependence.
This example has lots of problems. The stack may grow
up or down (indeed, there need not even be a stack!).
Parameters may be widened when they are passed, so a
char might be passed as an int, for instance. Arguments
may be pushed left-to-right, right-to-left, in
arbitrary order, or passed in registers (not pushed at
all). The order of evaluation may differ from the order
in which they are pushed. One compiler may use
several (incompatible) calling conventions.
17. ANSI C
Modern C compilers support some or all of the ANSI proposed
standard C. Whenever possible, write code to run
under standard C, and use features such as function prototypes,
constant storage, and volatile storage. Standard C
improves program performance by giving better information to
optimizers. Standard C improves portability by ensuring
that all compilers accept the same input language and by
providing mechanisms that try to hide machine dependencies
or emit warnings about code that may be machine-dependent.
17.1. Compatibility
Write code that is easy to port to older compilers.
For instance, conditionally #define new (standard) keywords
such as const and volatile in a global .h file.
Standard compilers pre-define the preprocessor symbol __STDC__[see
following footnote].
The void* type is hard to get right simply, since some older
compilers understand void but not void*. It is easiest to
create a new (machine- and compiler-dependent) VOIDP type,
usually char* on older compilers.
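The conditional definitions described in this section might be collected in one header along the lines of the sketch below. It assumes only two cases, standard and non-standard; a real project would select among specific compilers. The function first_byte is hypothetical.

```c
#if defined(__STDC__) && __STDC__
	typedef void	*VOIDP;		/* standard compiler: void* is available */
#else
#	define const			/* older compiler: drop the new keywords */
#	define volatile
	typedef char	*VOIDP;		/* char* stands in for void* */
#endif

/* Return the first byte at p as an unsigned value. */
static int
first_byte(VOIDP p)
{
	const unsigned char	*bytes = (const unsigned char *)p;

	return bytes[0];
}
```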
Note that under ANSI C, the `#' for a preprocessor
directive must be the first non-whitespace character on a
line. Under older compilers it must be the first character
on the line.
When a static function has a forward declaration, the
forward declaration must include the storage class. For
older compilers, the class must be ``extern''. For ANSI
compilers, the class must be ``static'' but global functions
must still be declared as ``extern''. Thus, forward
declarations of static functions should use a #define such
as FWD_STATIC that is #ifdeffed as appropriate.
An ``#ifdef NAME'' should end with either ``#endif'' or
``#endif /* NAME */'', not with ``#endif NAME''. The comment
should not be used on short #ifdefs, as it is clear
from the code.
ANSI trigraphs may cause programs with strings containing
``??'' to break mysteriously.
17.2. Formatting
The style for ANSI C is the same as for regular C, with
two notable exceptions: storage qualifiers and parameter
lists.
Because const and volatile have strange binding rules,
each const or volatile object should have a separate declaration.
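The sketch below shows why separate declarations help. The position of const decides whether the pointed-to object or the pointer itself is read-only. The names cell, reader, writer and bump_cell are hypothetical.

```c
static int cell = 5;

int const *reader = &cell;	/* pointer to const int: *reader cannot be assigned */
int *const writer = &cell;	/* const pointer to int: writer cannot be re-pointed */

int
bump_cell(void)
{
	*writer += 1;		/* legal: the int itself is not const */
	return *reader;		/* legal: reading through a pointer to const */
}
```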
Prototyped functions merge parameter declaration and
definition into one list. Parameters should be commented
in the function comment.
17.3. Prototypes
Function prototypes should be used to make code more
robust and to make it run faster. Unfortunately, the prototyped
declaration
The prototype says that c is to be passed as the most natural
type for the machine, possibly a byte.
The non-prototyped (backwards-compatible) definition implies that c
is always passed as an int[see following footnote].
If a function has promotable parameters then the caller and callee
must be compiled identically.
Either both must use function prototypes or neither can use prototypes.
The problem can be avoided if parameters are promoted
when the program is designed.
For example, bork can be defined to take an int parameter.
The above declaration works if the definition is prototyped.
Unfortunately, the prototyped syntax will cause non-ANSI
compilers to reject the program.
It is easy to write external declarations that work
with both prototyping and with older compilers[see
following footnote].
In the end, it may be best to write in only one style
(e.g., with prototypes). When a non-prototyped version is
needed, it can be generated using an automatic conversion tool.
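Returning to the promotable-parameter problem described in this section, the sketch below declares bork with an int parameter, so the argument has the same type whether or not the caller sees a prototype. The variable last_seen is hypothetical and is there only to make the effect observable.

```c
extern void bork(int c);	/* c is already the promoted type */

static int last_seen;		/* hypothetical: records the last argument */

void
bork(int c)
{
	last_seen = (char)c;	/* narrow inside the function if a char is wanted */
}
```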
17.4. Pragmas
Pragmas are used to introduce machine-dependent code in
a controlled way. Obviously, pragmas should be treated as
machine dependencies. Unfortunately, the syntax of ANSI
pragmas makes it impossible to isolate them in machine-dependent
headers.
Pragmas are of two classes. Optimizations may safely
be ignored. Pragmas that change the system behavior (``required
pragmas'') may not. Required pragmas should be #ifdeffed
so that compilation will abort if no pragma is
selected.
Two compilers may use a given pragma in two very different
ways. For instance, one compiler may use ``haggis''
to signal an optimization. Another might use it to indicate
that a given statement, if reached, should terminate the
program. Thus, when pragmas are used, they must always be
enclosed in machine-dependent #ifdefs. Pragmas must always
be #ifdeffed out for non-ANSI compilers. Be sure to indent
the `#' character on the #pragma, as older preprocessors
will halt on it otherwise.
``The `#pragma' command is specified in the ANSI
standard to have an arbitrary implementation-defined
effect. In the GNU C preprocessor,
`#pragma' first attempts to run the game `rogue';
if that fails, it tries to run the game `hack'; if
that fails, it tries to run GNU Emacs displaying
the Tower of Hanoi; if that fails, it reports a
fatal error. In any case, preprocessing does not
continue.''
- Manual for the GNU C preprocessor for GNU CC 1.34.
18. Special Considerations
This section contains some miscellaneous dos and
don'ts.
19. Lint
Lint is a C program checker [2][11] that examines C
source files to detect and report type incompatibilities,
inconsistencies between function definitions and calls, potential
program bugs, etc. The use of lint on all programs
is strongly recommended, and it is expected that most projects
will require programs to use lint as part of the official
acceptance procedure.
It should be noted that the best way to use lint is not
as a barrier that must be overcome before official acceptance
of a program, but rather as a tool to use during and
after changes or additions to the code. Lint can find obscure
bugs and ensure portability before problems occur.
Many messages from lint really do indicate something wrong.
One fun story is about a program that was missing
an argument to `fprintf'.
Most options are worth learning. Some options may complain
about legitimate things, but they will also pick up
many botches. Note that -p[see following footnote]
checks function-call type-consistency
for only a subset of library routines, so programs
should be linted both with and without -p for the best
``coverage''.
Lint also recognizes several special comments in the
code. These comments both shut up lint when the code
otherwise makes it complain, and also document special code.
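A short sketch of two such comments in use is given below. ARGSUSED and FALLTHROUGH are traditional lint directives, but the set recognized may vary between versions of lint. The functions ignore_second and classify are hypothetical.

```c
/*ARGSUSED*/
int
ignore_second(int used, int unused)
{
	return used + 1;	/* `unused' is kept only for interface compatibility */
}

int
classify(int c)
{
	int score = 0;

	switch (c) {
	case 'a':
		score += 10;
		/*FALLTHROUGH*/
	case 'b':
		score += 1;
		break;
	default:
		break;
	}
	return score;
}
```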
20. Make
One other very useful tool is make[7]. During
development, make recompiles only those modules that have
been changed since the last time make was used. It can be
used to automate other tasks, as well. Some common conventions
include:
all
always makes all binaries
clean
remove all intermediate files
debug
make a test binary 'a.out' or 'debug'
depend
make transitive dependencies
install
install binaries
lint
run lint
print/list
make a hard copy of all source files
shar
make a shar of all source files
spotless
make clean, use revision control to put away sources
sources
undo what spotless did
tags
run ctags (using the -t flag is suggested)
rdist
distribute sources to other hosts
file.c
check out the named file
In addition, command-line defines can be given to define either
Makefile values (such as ``CFLAGS'') or values in the
program (such as ``DEBUG'').
21. Project-Dependent Standards
Individual projects may wish to establish additional
standards beyond those given here. The following issues are
some of those that should be addressed by each project program
administration group.
22. Conclusion
A set of standards has been presented for C programming
style. Among the most important points are:
As with any standard, it must be followed if it is to
be useful. If you have trouble following any of these standards
don't just ignore them. Talk with your local guru, or
an experienced programmer at your institution.
References
#ifndef EXAMPLE_H
#define EXAMPLE_H
... /* body of example.h file */
#endif /* EXAMPLE_H */
This double-inclusion mechanism should not be relied upon,
particularly to perform nested includes.
/*
* Here is a block comment.
* The comment text should be tabbed or spaced over uniformly.
* The opening slash-star and closing star-slash are alone on a line.
*/
/*
** Alternate format for block comments
*/
if (argc > 1) {
/* Get input file from command line. */
if (freopen(argv[1], "r", stdin) == NULL) {
perror(argv[1]);
}
}
[Footnote:
Some automated program-analysis packages use different
characters before comment lines as a marker for lines
with specific items of information. In particular, a
line with a `-' in a comment preceding a function is
sometimes assumed to be a one-line summary of the
function's purpose.]
if (a == EXCEPTION) {
b = TRUE; /* special case */
} else {
b = isprime(a); /* works only for odd a */
}
char *s, *t, *u;
instead of
char* s, t, u;
which is wrong, since `t' and `u' do not get declared as
pointers.
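The difference can be checked with sizeof, as in the sketch below. The variable names and the function are hypothetical.

```c
/* The star binds to the name, not to the type. */
char *s1, *t1;	/* both are pointers to char */
char* s2, t2;	/* s2 is a pointer, but t2 is a plain char */

int
t2_is_plain_char(void)
{
	return sizeof t2 == sizeof(char) && sizeof s2 == sizeof(char *);
}
```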
struct boat {
int wllength; /* water line length in meters */
int type; /* see below */
long sailarea; /* sail area in square mm */
};
/* defines for boat.type */
#define KETCH (1)
#define YAWL (2)
#define SLOOP (3)
#define SQRIG (4)
#define MOTOR (5)
enum bt_t { KETCH=1, YAWL, SLOOP, SQRIG, MOTOR };
struct boat {
int wllength; /* water line length in meters */
enum bt_t type; /* what kind of boat */
long sailarea; /* sail area in square mm */
};
[Footnote: enums might be better anyway.]
int x = 1;
char *msg = "message";
struct boat winner[] = {
{ 40, YAWL, 6000000L },
{ 28, MOTOR, 0L },
{ 0 },
};
typedef struct splodge_t {
int sp_count;
char *sp_name, *sp_alias;
} splodge_t;
[Footnote:
``Tabstops'' can be blanks (spaces) inserted by your
editor in clumps of two, four, or eight. Use actual tabs where
possible.]
[Footnote:
#define void or #define void int for compilers without
the void keyword.]
int i;main(){for(;i["]<i;++i){--i;}"];read('-'-'-',i+++"hell\
o, world!\n",'/'/'/'));}read(j,i,p){write(j/p+p,i---j,i/i);}
Dishonorable mention, Obfuscated C Code Contest, 1984. Author
requested anonymity.
if (foo->next==NULL && totalcount<needed && needed<=MAX_ALLOT
&& server_active(current_input)) { ...
Might be better as
if (foo->next == NULL
&& totalcount < needed
&& needed <= MAX_ALLOT
&& server_active(current_input)) { ...
Similarly, elaborate for loops should be split onto different
lines.
for (curr = *listp, trail = listp;
curr != NULL;
trail = &(curr->next), curr = curr->next )
{
...
Other complex expressions, particularly those using the ternary
(?:) operator, are best split on to several lines, too.
c = (a == b)
? d + f(a)
: f(b) - d;
/*
* Determine if the sky is blue by checking that it isn't night.
* CAVEAT: Only sometimes right. May return TRUE when the answer
* is FALSE. Consider clouds, eclipses, short days.
* NOTE: Uses `hour' from `hightime.c'. Returns `int' for
* compatibility with the old version.
*/
int /* TRUE or FALSE */
skyblue()
{
extern int hour; /* current hour of the day */
if (hour < MORNING || hour > EVENING) {
return FALSE; /* black */
} else {
return TRUE; /* blue */
}
}
/*
* Find the last element in the linked list
* pointed to by nodep and return a pointer to it.
* Return NULL if there is no last element.
*/
node_t *
tail(nodep)
node_t *nodep; /* pointer to head of list */
{
register node_t *np; /* advances to NULL */
register node_t *lp; /* follows one behind np */
if (nodep == NULL)
return NULL;
np = lp = nodep;
while ((np = np->next) != NULL) {
lp = np;
}
return lp;
}
case FOO: oogle(zork); boogle(zork); break;
case BAR: oogle(bork); boogle(zork); break;
case BAZ: oogle(gork); boogle(bork); break;
while (*dest++ = *src++)
; /* VOID */
if (f() != FAIL)
is better than
if (f())
even though FAIL may have the value zero which C considers to
be false. An explicit test will help you out later when
somebody decides that a failure return should be -1 instead
of zero. Explicit comparison should be used even if the comparison
value will never change; e.g., ``if (!(bufsize % sizeof(int)))''
should be written instead as ``if ((bufsize % sizeof(int)) == 0)''
to reflect the numeric (not boolean)
nature of the test. A frequent trouble spot is using strcmp
to test for string equality, where the result should never
ever be defaulted. The preferred approach is to define a
macro STREQ.
#define STREQ(a, b) (strcmp((a), (b)) == 0)
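A brief usage sketch follows. The function is_yes is hypothetical.

```c
#include <string.h>

#define STREQ(a, b) (strcmp((a), (b)) == 0)

int
is_yes(const char *reply)
{
	/* strcmp returns zero on a match, so an unadorned
	 * `if (strcmp(...))' would read backwards. */
	return STREQ(reply, "yes");
}
```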
typedef int bool;
#define FALSE 0
#define TRUE 1
or
typedef enum { NO=0, YES } bool;
if (func() == TRUE) { ...
must be written
if (func() != FALSE) { ...
It is even better (where possible) to rename the
function/variable or rewrite the expression so that the
meaning is obvious without a comparison to true or false
(e.g., rename to isvalid()).
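The sketch below shows why a comparison against TRUE can fail. The predicate has_write_bit is hypothetical; it returns the masked bit itself, so its ``true'' value is 4 rather than 1.

```c
#define FALSE 0
#define TRUE 1

/* Returns nonzero when the 04 bit is set; the nonzero value is 4. */
int
has_write_bit(unsigned mode)
{
	return (int)(mode & 04u);
}
```

With mode 06 the test ``has_write_bit(mode) != FALSE'' succeeds, while ``has_write_bit(mode) == TRUE'' does not.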
while ((c = getchar()) != EOF) {
process the character
}
a = b + c;
d = a + r;
should not be replaced by
d = (a = b + c) + r;
even though the latter may save one cycle. In the long run
the time difference between the two will decrease as the optimizer
gains maturity, while the difference in ease of
maintenance will increase as the human memory of what's going
on in the latter piece of code begins to fade.
for (...) {
while (...) {
...
if (disaster)
goto error;
}
}
...
error:
clean up the mess
control {
statement;
statement;
}
switch (expr) {
case ABC:
case DEF:
statement;
break;
case UVW:
statement;
/*FALLTHROUGH*/
case XYZ:
statement;
break;
}
if (expr) {
statement;
} else {
statement;
statement;
}
if (ex1) {
if (ex2) {
funca();
}
} else {
funcb();
}
if (STREQ(reply, "yes")) {
statements for yes
...
} else if (STREQ(reply, "no")) {
...
} else if (STREQ(reply, "maybe")) {
...
} else {
statements for default
...
}
#ifdef CIRCUIT
# define CLOSE_CIRCUIT(circno) { close_circ(circno); }
#else
# define CLOSE_CIRCUIT(circno)
#endif
...
if (expr)
statement;
else
CLOSE_CIRCUIT(x)
++i;
if (level > limit)
return OVERFLOW;
normal();
return level;
The ``flattened'' indentation tells the reader that the
boolean test is invariant over the rest of the enclosing
block.
for (i = 0; i < ARYBOUND; i++)
is reasonable while the code
door_t *front_door = opens(door[i], 7);
if (front_door == 0)
error("can't open %s\n", door[i]);
is not. In the last example front_door is a pointer. When
a value is a pointer it should be compared to NULL instead
of zero. NULL is available either as part of the standard I/O
library's header file stdio.h or in stdlib.h
for newer systems.
Even simple values like one or zero are often better expressed
using defines like TRUE and FALSE (sometimes YES and
NO read better).
#define OFF_A() (a_global+OFFSET)
#define BORK() (zork())
#define SP3() if (b) { int x; av = f(&x); bv += x; }
if (x==3)
SP3();
else
BORK();
#define SP3() \
do { if (b) { int x; av = f(&x); bv += x; }} while (0)
#ifdef lint
static int ZERO;
#else
# define ZERO 0
#endif
#define STMT( stuff ) do { stuff } while (ZERO)
#define SP3() \
STMT( if (b) { int x; av = f(&x); bv += x; } )
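The same do-while wrapping is sketched below on a smaller macro, to show that the expansion behaves as one statement between if and else. SWAP_INT and sort_pair are hypothetical names.

```c
#define SWAP_INT(a, b) \
	do { int swap_tmp = (a); (a) = (b); (b) = swap_tmp; } while (0)

/* Put the pair in ascending order; return 1 if a swap was needed. */
int
sort_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);	/* the semicolon ends the do-while */
	else
		return 0;
	return 1;
}
```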
enum { STATE_ERR, STATE_START, STATE_NORMAL, STATE_END } state_t;
enum { VAL_NEW=1, VAL_NORMAL, VAL_DYING, VAL_DEAD } value_t;
Uninitialized values will then often ``catch themselves''.
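A sketch of how such a value catches itself is given below. It assumes a typedef form of the enum rather than the variable declarations shown above; forgotten and value_is_valid are hypothetical names.

```c
typedef enum { VAL_NEW = 1, VAL_NORMAL, VAL_DYING, VAL_DEAD } value_t;

static value_t forgotten;	/* static storage starts at zero, which is no valid value_t */

int
value_is_valid(value_t v)
{
	return v >= VAL_NEW && v <= VAL_DEAD;
}
```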
#ifdef DEBUG
# define MALLOC(size) (mm_malloc(size))
#else
# define MALLOC(size) (malloc(size))
#endif
/*
* INPUT: A null-terminated source string `src' to copy from and
* a `dest' string to copy to. `maxsize' is the size of `dest'
* or UINT_MAX if the size is not known. `src' and `dest' must
* both be shorter than UINT_MAX, and `src' must be no longer than
* `dest'.
* OUTPUT: The address of `dest' or NULL if the copy fails.
* `dest' is modified even when the copy fails.
*/
char *
copy(dest, maxsize, src)
char *dest, *src;
unsigned maxsize;
{
char *dp = dest;
while (maxsize-- > 0)
if ((*dp++ = *src++) == '\0')
return dest;
return NULL;
}
#ifdef DEBUG
extern void *mm_malloc();
# define MALLOC(size) (mm_malloc(size))
#else
extern void *malloc();
# define MALLOC(size) (malloc(size))
#endif
#ifdef BSD4
long t = time((long *)NULL);
#endif
The preceding code is poor for two reasons: there may be
4BSD systems for which there is a better choice, and there
may be non-4BSD systems for which the above is the best
code. Instead, use define symbols such as TIME_LONG and
TIME_STRUCT and define the appropriate one in a configuration
file such as config.h.
type      pdp11    VAX/11   68000    Cray-2   Unisys   Harris   80386
          series            family            1100     H800
short     16       16       8/16     64(32)   18       24       8/16
int       16       32       16/32    64(32)   36       24       16/32
long      32       32       32       64       36       48       32
char *    16       32       32       64       72       24       16/32/48
int *     16       32       32       64(24)   72       24       16/32/48
int (*)   16       32       32       64       576      24       16/32/48
type      Minimum    No Smaller
          # Bits     Than
short     16         char
int       16         short
long      32         int
float     24
double    38         float
any *     14
char *    15         any *
void *    15         any *
int *p = (int *)malloc(sizeof(int));
free(p);
((int *) 2)
((int *) 3)
[Footnote:
The code may also fail to compile, fault on pointer
creation, fault on pointer comparison, or fault on
pointer dereferences.]
extern int x_int_dummy; /* in x.c */
#define X_FAIL (NULL)
#define X_BUSY (&x_int_dummy)
#define X_FAIL (NULL)
#define X_BUSY MD_PTR1 /* MD_PTR1 from "machdep.h" */
c = foo(getchar(), getchar());
char
foo(c1, c2, c3)
char c1, c2, c3;
{
char bar = *(&c1 + 1);
return bar; /* often won't return c2 */
}
s = "/dev/tty??";
strcpy(&s[8], ttychars);
[Footnote:
Some libraries attempt to modify and then restore
read-only string variables. Programs sometimes won't
port because of these broken libraries. The libraries
are getting better.]
x &= 0177770
Use instead
x &= ~07
which works properly on all machines. Bitfields do not
have these problems.
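The portable form can be wrapped as in the sketch below. The function name clear_low3 is hypothetical.

```c
/* Clear the low three bits of x. */
unsigned
clear_low3(unsigned x)
{
	return x & ~07u;	/* ~07u has ones in every higher bit, whatever the word size */
}
```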
a[i] = b[i++];
In the above example, we know only that the subscript
into b has not been incremented. The index into a
could be the value of i either before or after the increment.
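One unambiguous rewrite is sketched below, with the increment moved into its own statement. The function copy_and_advance is hypothetical, and it assumes the intent was to use the old value of i for both subscripts.

```c
void
copy_and_advance(int *a, const int *b, int *ip)
{
	int i = *ip;

	a[i] = b[i];	/* both subscripts use the same value */
	*ip = i + 1;	/* the increment is a separate statement */
}
```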
struct bar_t { struct bar_t *next; } *bar;
bar->next = bar = tmp;
In the second example, the address of ``bar->next'' may
be computed before the value is assigned to ``bar''.
bar = bar->next = tmp;
In the third example, bar can be assigned before bar->next. Although this appears to violate the rule that
``assignment proceeds right-to-left'', it is a legal
interpretation. Consider the following example:
long i;
short a[N];
i = old;
i = a[i] = new;
The value that ``i'' is assigned must be a value that
is typed as if assignment proceeded right-to-left.
However, ``i'' may be assigned the value
``(long)(short)new'' before ``a[i]'' is assigned.
Compilers do differ.
#define FOO(string) (printf("string = %s",(string)))
...
FOO(filename);
Will only sometimes be expanded to
(printf("filename = %s",(filename)))
Be aware, however, that tricky preprocessors may cause
macros to break accidentally on some machines.
Consider the following two versions of a macro.
#define LOOKUP(chr) (a['c'+(chr)]) /* Works as intended. */
#define LOOKUP(c) (a['c'+(c)]) /* Sometimes breaks. */
The second version of LOOKUP can be expanded in two
different ways and will cause code to break mysteriously.
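Under ANSI C the effect intended by the FOO macro above can be had reliably with the stringizing operator, as in the sketch below. SHOW and show_matches are hypothetical names, and the sketch writes into a buffer rather than printing so that the result can be inspected.

```c
#include <stdio.h>
#include <string.h>

/* #name becomes the spelling of the argument as a string literal. */
#define SHOW(buf, name) (sprintf((buf), #name " = %.100s", (name)))

int
show_matches(const char *filename, const char *expected)
{
	char buf[128];	/* room for the label plus at most 100 characters */

	SHOW(buf, filename);
	return strcmp(buf, expected) == 0;
}
```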
#if __STDC__
typedef void *voidp;
# define COMPILER_SELECTED
#endif
#ifdef A_TARGET
# define const
# define volatile
# define void int
typedef char *voidp;
# define COMPILER_SELECTED
#endif
#ifdef ...
...
#endif
#ifdef COMPILER_SELECTED
# undef COMPILER_SELECTED
#else
{ NO TARGET SELECTED! }
#endif
[Footnote:
Some compilers predefine __STDC__ to be zero, in an attempt
to indicate partial compliance with the ANSI C
standard. Unfortunately, it is not possible to determine
which ANSI facilities are provided. Thus, such
compilers are broken. See the rule about ``don't write
around a broken compiler unless you are forced to.'']
int const *s; /* YES */
int const *s, *t; /* NO */
/*
* `bp': boat trying to get in.
* `stall': a list of stalls, never NULL.
* returns stall number, 0 => no room.
*/
int
enter_pier(boat_t const *bp, stall_t *stall)
{
...
extern void bork(char c);
is incompatible with the definition
void
bork(c)
char c;
...
[Footnote:
Such automatic type promotion is called widening. For
older compilers, the widening rules require that all
char and short parameters are passed as ints and that
float parameters are passed as doubles.]
void
bork(char c)
{
...
#if __STDC__
# define PROTO(x) x
#else
# define PROTO(x) ()
#endif
extern char **ncopies PROTO((char *s, short times));
Note that PROTO must be used with double parentheses.
[Footnote:
Note that using PROTO violates the rule ``don't change
the syntax via macro substitution.'' It is regrettable
that there isn't a better solution.]
#if defined(__STDC__) && defined(USE_HAGGIS_PRAGMA)
#pragma (HAGGIS)
#endif
abool = bbool;
if (abool) { ...
When embedded assignment is used, make the test explicit
so that it doesn't get ``fixed'' later.
while ((abool = bbool) != FALSE) { ...
while (abool = bbool) { ... /* VALUSED */
while (abool = bbool, abool) { ...
fprintf("Usage: foo -bar <file>\n");
The author never had a problem. But the program dumped
core every time an ordinary user made a mistake on the command
line. Many versions of lint will catch this.
[Footnote: Flag names may vary.]
Note: doesn't remove Makefile, although it is a source file
Henry Spencer