In 1983, the American National Standards Institute (ANSI) commissioned a committee, X3J11, to standardize the C language. After a long, arduous process, including several widespread public reviews, the committee's work was finally ratified as ANS X3.159-1989, on December 14, 1989, and published in the spring of 1990. For the most part, ANSI C standardizes existing practice, with a few additions from C++ (most notably function prototypes) and support for multinational character sets (including the much-lambasted trigraph sequences). The ANSI C standard also formalizes the C run-time library support routines.
The published Standard includes a "Rationale," which explains many of its decisions, and discusses a number of subtle points, including several of those covered here. (The Rationale is "not part of ANSI Standard X3.159-1989, but is included for information only.")
The Standard has been adopted as an international standard, ISO/IEC 9899:1990, although the sections are numbered differently (briefly, ANSI sections 2 through 4 correspond roughly to ISO sections 5 through 7), and the Rationale is currently not included.
ANSI X3.159 has been officially superseded by ISO 9899. Copies are available in the United States from

American National Standards Institute
11 W. 42nd St., 13th floor
New York, NY 10036 USA
(+1) 212 642 4900

or

Global Engineering Documents
2805 McGaw Avenue
Irvine, CA 92714 USA
(+1) 714 261 1455
(800) 854 7179 (U.S. & Canada)

In other countries, contact the appropriate national standards body, or ISO in Geneva at:

ISO Sales
Case Postale 56
CH-1211 Geneve 20
Switzerland
The cost is $130.00 from ANSI or $162.50 from Global. Copies of the original X3.159 (including the Rationale) are still available at $205.00 from ANSI or $200.50 from Global. Note that ANSI derives revenues to support its operations from the sale of printed standards, so electronic copies are not available.
The mistitled Annotated ANSI C Standard, with annotations by Herbert Schildt, contains all but a few pages of ISO 9899; it is published by Osborne/McGraw-Hill, ISBN 0-07-881952-0, and sells in the U.S. for approximately $40. (It has been suggested that the price differential between this work and the official standard reflects the value of the annotations.)
The text of the Rationale (not the full Standard) is now available for anonymous ftp from ftp.uu.net (see question 17.12) in directory doc/standards/ansi/X3.159-1989 . The Rationale has also been printed by Silicon Press, ISBN 0-929306-07-4.
Two programs, protoize and unprotoize, convert back and forth between prototyped and "old style" function definitions and declarations. (These programs do not handle full-blown translation between "Classic" C and ANSI C.) These programs were once patches to the FSF GNU C compiler, gcc, but are now part of the main gcc distribution; look in pub/gnu at prep.ai.mit.edu (18.71.0.38), or at several other FSF archive sites.
The unproto program (/pub/unix/unproto5.shar.Z on ftp.win.tue.nl) is a filter which sits between the preprocessor and the next compiler pass, converting most of ANSI C to traditional C on-the-fly.
The GNU GhostScript package comes with a little program called ansi2knr.
Several prototype generators exist, many as modifications to lint. Version 3 of CPROTO was posted to comp.sources.misc in March, 1992. There is another program called "cextract." See also question 17.12.
Finally, are you sure you really need to convert lots of old code to ANSI C? The old-style function syntax is still acceptable.
# to insert the value of a symbolic constant into a message, but it keeps stringizing the macro's name rather than its value.
You must use something like the following two-step procedure to force the macro to be expanded as well as stringized:
#define str(x) #x
#define xstr(x) str(x)
#define OP plus
char *opname = xstr(OP);
This sets opname to "plus" rather than "OP".
An equivalent circumlocution is necessary with the token-pasting operator ## when the values (rather than the names) of two macros are to be concatenated.
References: ANSI Sec. 3.8.3.2, Sec. 3.8.3.5 example p. 93.
const values in initializers and array dimensions, as in
const int n = 5;
int a[n];
The const qualifier really means "read-only"; an object so qualified is a normal run-time object which cannot (normally) be assigned to. The value of a const-qualified object is therefore not a constant expression in the full sense of the term. (C is unlike C++ in this regard.) When you need a true compile-time constant, use a preprocessor #define.
References: ANSI Sec. 3.4 .
"char const *p" and "char * const p"?
"char const *p" is a pointer to a constant character (you can't change the character); "char * const p" is a constant pointer to a (variable) character (i.e. you can't change the pointer). (Read these "inside out" to understand them. See question 10.4.)
References: ANSI Sec. 3.5.4.1 .
char ** to a function which expects a const char **?
You can use a pointer-to-T (for any type T) where a pointer-to-const-T is expected, but the rule (an explicit exception) which permits slight mismatches in qualified pointer types applies only at the top level; it is not applied recursively.
You must use explicit casts (e.g. (const char **) in this case) when assigning (or passing) pointers which have qualifier mismatches at other than the first level of indirection.
References: ANSI Sec. 3.1.2.6 p. 26, Sec. 3.3.16.1 p. 54, Sec. 3.5.3 p. 65.
extern int func(float);
int func(x)
float x;
{...
You have mixed the new-style prototype declaration "extern int func(float);" with the old-style definition "int func(x) float x;".
It is usually safe to mix the two styles (see question 5.9), but not in this case. Old C (and ANSI C, in the absence of prototypes, and in variable-length argument lists) "widens" certain arguments when they are passed to functions. floats are promoted to double, and characters and short integers are promoted to ints. (For old-style function definitions, the values are automatically converted back to the corresponding narrower types within the body of the called function, if they are declared that way there.) The prototype therefore tells callers to pass a float, while the old-style definition expects to receive a widened double, and the two disagree.
This problem can be fixed either by using new-style syntax consistently in the definition:
int func(float x) { ... }
or by changing the new-style prototype declaration to match the old-style definition:
extern int func(double);
(In this case, it would be clearest to change the old-style definition to use double as well, as long as the address of that parameter is not taken.)
It may also be safer to avoid "narrow" (char, short int, and float) function arguments and return types.
References: ANSI Sec. 3.3.2.2 .
Doing so is perfectly legal, as long as you're careful (see especially question 5.8). Note however that old-style syntax is marked as obsolescent, and support for it may be removed some day.
References: ANSI Secs. 3.7.1, 3.9.5 .
extern f(struct x {int s;} *p);
In a quirk of C's normal block scoping rules, a struct declared only within a prototype cannot be compatible with other structs declared in the same source file, nor can the struct tag be used later as you'd expect (it goes out of scope at the end of the prototype).
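To resolve the problem, arrange for struct x to be declared at file scope before the prototype, and have the prototype refer to it only by its tag. If the full definition cannot come first, precede the prototype with the vacuous-looking declaration struct x; which places an incomplete declaration of struct x at file scope, so that the prototype and any later definition all refer to the same type. (A definition with a member list written inside the prototype would still declare a new type local to that prototype.)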
References: ANSI Sec. 3.1.2.1 p. 21, Sec. 3.1.2.6 p. 26, Sec. 3.5.2.3 p. 63.
Under ANSI C, the text inside a "turned off" #if, #ifdef, or #ifndef must still consist of "valid preprocessing tokens." This means that there must be no unterminated comments or quotes (note particularly that an apostrophe within a contracted word could look like the beginning of a character constant), and no newlines inside quotes. Therefore, natural-language comments and pseudocode should always be written between the "official" comment delimiters /* and */. (But see also question 17.14, and 6.7.)
References: ANSI Sec. 2.1.1.2 p. 6, Sec. 3.1 p. 19 line 37.
main as void, to shut off these annoying "main returns no value" messages? (I'm calling exit(), so main doesn't return.)
No. main must be declared as returning an int, and as taking either zero or two arguments (of the appropriate type). If you're calling exit() but still getting warnings, you'll have to insert a redundant return statement (or use some kind of "notreached" directive, if available).
Declaring a function as void does not merely silence warnings; it may also result in a different function call/return sequence, incompatible with what the caller (in main's case, the C run-time startup code) expects.
References: ANSI Sec. 2.1.2.2.1 pp. 7-8.
exit(status) truly equivalent to returning status from main?
Formally, yes, although discrepancies arise under a few older, nonconforming systems, or if data local to main() might be needed during cleanup (due perhaps to a setbuf or atexit call), or if main() is called recursively.
References: ANSI Sec. 2.1.2.2.3 p. 8.
The problem is older linkers, which are under the control of neither the ANSI Standard nor the C compiler developers on the systems which have them. The limitation is only that identifiers be significant in the first six characters, not that they be restricted to six characters in length. This limitation is annoying, but certainly not unbearable, and is marked in the Standard as "obsolescent," i.e. a future revision will likely relax it.
This concession to current, restrictive linkers really had to be made, no matter how vehemently some people oppose it. (The Rationale notes that its retention was "most painful.") If you disagree, or have thought of a trick by which a compiler burdened with a restrictive linker could present the C programmer with the appearance of more significance in external identifiers, read the excellently-worded section 3.1.2 in the X3.159 Rationale (see question 5.1), which discusses several such schemes and explains why they could not be mandated.
References: ANSI Sec. 3.1.2 p. 21, Sec. 3.9.1 p. 96, Rationale Sec. 3.1.2 pp. 19-21.
memcpy and memmove?
memmove offers guaranteed behavior if the source and destination arguments overlap. memcpy makes no such guarantee, and may therefore be more efficiently implementable. When in doubt, it's safer to use memmove.
References: ANSI Secs. 4.11.2.1, 4.11.2.2, Rationale Sec. 4.11.2.
Perhaps it is a pre-ANSI compiler, unable to accept function prototypes and the like. See also questions 5.17 and 17.2.
It's not unusual to have a compiler available which accepts ANSI syntax, but not to have ANSI-compatible header files or run-time libraries installed. See also questions 5.16 and 17.2.
Most compilers support a few non-Standard extensions, gcc more so than most. Are you sure that the code being rejected doesn't rely on such an extension? It is usually a bad idea to perform experiments with a particular compiler to determine properties of a language; the applicable standard may permit variations, or the compiler may be wrong. See also question 4.4.
void * pointer?
The compiler doesn't know the size of the pointed-to objects. Before performing arithmetic, cast the pointer either to char * or to the type you're trying to manipulate (but see question 2.18).
char a[3] = "abc"; legal? What does it mean?
It is legal in ANSI C (and perhaps in a few pre-ANSI systems), though questionably useful. It declares an array of size three, initialized with the three characters 'a', 'b', and 'c', without the usual terminating '\0' character; the array is therefore not a true C string and cannot be used with strcpy, printf's %s, etc.
References: ANSI Sec. 3.5.7 pp. 72-3.
#pragmas and what are they good for?
The #pragma directive provides a single, well-defined "escape hatch" which can be used for all sorts of implementation-specific controls and extensions: source listing control, structure packing, warning suppression (like the old lint /* NOTREACHED */ comments), etc.
References: ANSI Sec. 3.8.6 .
"#pragma once" mean? I found it in some header files.
It is an extension implemented by some preprocessors to help make header files idempotent; it is essentially equivalent to the #ifndef trick mentioned in question 6.4.
Briefly: implementation-defined means that an implementation must choose some behavior and document it. Unspecified means that an implementation should choose some behavior, but need not document it. Undefined means that absolutely anything might happen. In no case does the Standard impose requirements; in the first two cases it occasionally suggests (and may require a choice from among) a small set of likely behaviors.
If you're interested in writing portable code, you can ignore the distinctions, as you'll want to avoid code that depends on any of the three behaviors.
References: ANSI Sec. 1.6, especially the Rationale.