
Section 5. ANSI C

5.1: What is the "ANSI C Standard?"

In 1983, the American National Standards Institute (ANSI) commissioned a committee, X3J11, to standardize the C language.  After a long, arduous process, including several widespread public reviews, the committee's work was finally ratified as ANS X3.159-1989, on December 14, 1989, and published in the spring of 1990.  For the most part, ANSI C standardizes existing practice, with a few additions from C++ (most notably function prototypes) and support for multinational character sets (including the much-lambasted trigraph sequences).  The ANSI C standard also formalizes the C run-time library support routines.

The published Standard includes a "Rationale," which explains many of its decisions, and discusses a number of subtle points, including several of those covered here.  (The Rationale is "not part of ANSI Standard X3.159-1989, but is included for information only.")

The Standard has been adopted as an international standard, ISO/IEC 9899:1990, although the sections are numbered differently (briefly, ANSI sections 2 through 4 correspond roughly to ISO sections 5 through 7), and the Rationale is currently not included.

5.2: How can I get a copy of the Standard?

ANSI X3.159 has been officially superseded by ISO 9899. Copies are available in the United States from

American National Standards Institute
11 W. 42nd St., 13th floor
New York, NY  10036 USA
(+1) 212 642 4900

or

Global Engineering Documents
2805 McGaw Avenue
Irvine, CA  92714 USA
(+1) 714 261 1455
(800) 854 7179  (U.S. & Canada)

In other countries, contact the appropriate national standards body, or ISO in Geneva at:

ISO Sales
Case Postale 56
CH-1211  Geneve 20
Switzerland

The cost is $130.00 from ANSI or $162.50 from Global.  Copies of the original X3.159 (including the Rationale) are still available at $205.00 from ANSI or $200.50 from Global.  Note that ANSI derives revenues to support its operations from the sale of printed standards, so electronic copies are not available.

The mistitled Annotated ANSI C Standard, with annotations by Herbert Schildt, contains all but a few pages of ISO 9899; it is published by Osborne/McGraw-Hill, ISBN 0-07-881952-0, and sells in the U.S. for approximately $40.  (It has been suggested that the price differential between this work and the official standard reflects the value of the annotations.)

The text of the Rationale (not the full Standard) is now available for anonymous ftp from ftp.uu.net (see question 17.12) in directory doc/standards/ansi/X3.159-1989 .  The Rationale has also been printed by Silicon Press, ISBN 0-929306-07-4.

5.3: Does anyone have a tool for converting old-style C programs to ANSI C, or vice versa, or for automatically generating prototypes?

Two programs, protoize and unprotoize, convert back and forth between prototyped and "old style" function definitions and declarations.  (These programs do not handle full-blown translation between "Classic" C and ANSI C.)  These programs were once patches to the FSF GNU C compiler, gcc, but are now part of the main gcc distribution; look in pub/gnu at prep.ai.mit.edu (18.71.0.38), or at several other FSF archive sites.

The unproto program (/pub/unix/unproto5.shar.Z on ftp.win.tue.nl) is a filter which sits between the preprocessor and the next compiler pass, converting most of ANSI C to traditional C on-the-fly.

The GNU GhostScript package comes with a little program called ansi2knr.

Several prototype generators exist, many as modifications to lint.  Version 3 of CPROTO was posted to comp.sources.misc in March, 1992.  There is another program called "cextract."  See also question 17.12.

Finally, are you sure you really need to convert lots of old code to ANSI C?  The old-style function syntax is still acceptable.

5.4: I'm trying to use the ANSI "stringizing" preprocessing operator # to insert the value of a symbolic constant into a message, but it keeps stringizing the macro's name rather than its value.

You must use something like the following two-step procedure to force the macro to be expanded as well as stringized:

	#define str(x) #x
	#define xstr(x) str(x)
	#define OP plus
	char *opname = xstr(OP);

This sets opname to "plus" rather than "OP".

An equivalent circumlocution is necessary with the token-pasting operator ## when the values (rather than the names) of two macros are to be concatenated.
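As a minimal sketch of the same two-step idea applied to ##, the fragment below pastes the values of two macros into one identifier. The names PASTE, XPASTE, PREFIX, SUFFIX, and get_count_total are invented for this illustration.

```c
#define PASTE(a, b)  a ## b        /* pastes the arguments exactly as written */
#define XPASTE(a, b) PASTE(a, b)   /* expands a and b first, then pastes */

#define PREFIX count
#define SUFFIX _total

/* XPASTE(PREFIX, SUFFIX) yields the identifier count_total;
   PASTE(PREFIX, SUFFIX) would instead yield PREFIXSUFFIX. */
int XPASTE(PREFIX, SUFFIX) = 42;

int get_count_total(void)
{
    return count_total;
}
```

The extra level of macro call gives the preprocessor a chance to expand PREFIX and SUFFIX before they become operands of ##.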

References: ANSI Sec. 3.8.3.2, Sec. 3.8.3.5 example p. 93.

5.5: I don't understand why I can't use const values in initializers and array dimensions, as in

                const int n = 5;
                int a[n];

The const qualifier really means "read-only": an object so qualified is a normal run-time object which cannot (normally) be assigned to.  The value of a const-qualified object is therefore not a constant expression in the full sense of the term.  (C is unlike C++ in this regard.)  When you need a true compile-time constant, use a preprocessor #define.
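A short sketch of the #define approach follows. The enum alternative shown beside it is an addition not mentioned above (enumeration constants are also constant expressions in C); the names N, M, a_count, and b_count are chosen only for illustration.

```c
#include <stddef.h>

#define N 5              /* a true compile-time constant */
enum { M = 5 };          /* assumption: an enum constant used as an alternative */

int a[N];                /* accepted: N is a constant expression */
int b[M];                /* accepted: so is M */

size_t a_count(void) { return sizeof a / sizeof a[0]; }
size_t b_count(void) { return sizeof b / sizeof b[0]; }
```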

References: ANSI Sec. 3.4 .

5.6: What's the difference between "char const *p" and "char * const p"?

"char const *p" is a pointer to a constant character (you can't change the character); "char * const p" is a constant pointer to a (variable) character (i.e. you can't change the pointer). (Read these "inside out" to understand them.  See question 10.4.)
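The difference can be seen in a small sketch. The buffer and function names are hypothetical, and the two commented-out lines are the ones a compiler should reject.

```c
static char buf1[] = "abc";
static char buf2[] = "xyz";

int demo_const_pointers(void)
{
    char const *p = buf1;    /* pointer to constant char */
    char * const q = buf1;   /* constant pointer to (variable) char */

    p = buf2;                /* OK: the pointer itself may be changed */
    /* *p = 'X'; */          /* not allowed: the character is read-only via p */

    *q = 'A';                /* OK: the character may be changed */
    /* q = buf2; */          /* not allowed: the pointer is read-only */

    return p == buf2 && buf1[0] == 'A';
}
```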

References: ANSI Sec. 3.5.4.1 .

5.7: Why can't I pass a char ** to a function which expects a const char **?

You can use a pointer-to-T (for any type T) where a pointer-to-const-T is expected.  However, the rule (an explicit exception) which permits slight mismatches in qualified pointer types is not applied recursively, but only at the top level.

You must use explicit casts (e.g. (const char **) in this case) when assigning (or passing) pointers which have qualifier mismatches at other than the first level of indirection.
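The following sketch shows both cases: a top-level mismatch that needs no cast, and a second-level mismatch that does. The function count_strings and the other names are hypothetical, and the function only reads through the pointer it is given.

```c
#include <stddef.h>

/* Hypothetical function: counts strings up to a terminating null pointer. */
size_t count_strings(const char **v)
{
    size_t n = 0;
    while (v[n] != NULL)
        n++;
    return n;
}

static char first[] = "one";
static char second[] = "two";
static char *words[] = { first, second, NULL };

size_t demo_cast(void)
{
    const char *one = first;   /* char * to const char *: no cast needed */
    (void)one;

    /* count_strings(words) would draw a diagnostic: char ** is not
       implicitly converted to const char **. */
    return count_strings((const char **)words);
}
```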

References: ANSI Sec. 3.1.2.6 p. 26, Sec. 3.3.16.1 p. 54, Sec. 3.5.3 p. 65.

5.8: My ANSI compiler complains about a mismatch when it sees

	extern int func(float);

	int func(x)
	float x;
	{...

You have mixed the new-style prototype declaration "extern int func(float);" with the old-style definition "int func(x) float x;".  It is usually safe to mix the two styles (see question 5.9), but not in this case.  Old C (and ANSI C, in the absence of prototypes, and in variable-length argument lists) "widens" certain arguments when they are passed to functions.  floats are promoted to double, and characters and short integers are promoted to ints.  (For old-style function definitions, the values are automatically converted back to the corresponding narrower types within the body of the called function, if they are declared that way there.)

This problem can be fixed either by using new-style syntax consistently in the definition:

	int func(float x) { ... }

or by changing the new-style prototype declaration to match the old-style definition:

	extern int func(double);

(In this case, it would be clearest to change the old-style definition to use double as well, as long as the address of that parameter is not taken.)

It may also be safer to avoid "narrow" (char, short int, and float) function arguments and return types.
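The widening described above can be observed in a variable-length argument list, where it still applies under ANSI C. In this sketch (sum_values and demo_sum are invented names), float arguments are passed, but they must be fetched as double.

```c
#include <stdarg.h>

/* Hypothetical helper: sums n floating-point arguments. */
double sum_values(int n, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    va_start(ap, n);
    for (i = 0; i < n; i++)
        total += va_arg(ap, double);   /* double, not float: arguments were widened */
    va_end(ap);
    return total;
}

double demo_sum(void)
{
    float a = 1.5f, b = 2.25f;
    return sum_values(2, a, b);        /* a and b are promoted to double here */
}
```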

References: ANSI Sec. 3.3.2.2 .

5.9: Can you mix old-style and new-style function syntax?

Doing so is perfectly legal, as long as you're careful (see especially question 5.8).  Note however that old-style syntax is marked as obsolescent, and support for it may be removed some day.

References: ANSI Secs. 3.7.1, 3.9.5 .

5.10: Why does the declaration

	extern f(struct x {int s;} *p);

give me an obscure warning message about "struct x introduced in prototype scope"?

In a quirk of C's normal block scoping rules, a struct declared only within a prototype cannot be compatible with other structs declared in the same source file, nor can the struct tag be used later as you'd expect (it goes out of scope at the end of the prototype).

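To resolve the problem, declare the structure at file scope before the prototype, and mention only its tag in the prototype.  If the full definition is not yet available, precede the prototype with the vacuous-looking declaration

	struct x;

which places an (incomplete) declaration of struct x at file scope, so that a following prototype written with just "struct x *p" refers to that same struct x.  (A definition with a member list inside the prototype would still declare a new type in prototype scope.)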
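A minimal sketch of the forward-declaration approach follows. Here the member list is supplied at file scope rather than inside the prototype, an arrangement chosen for this illustration; the explicit int return type and the name demo_f are likewise assumptions.

```c
struct x;                      /* incomplete declaration at file scope */

extern int f(struct x *p);     /* refers to the file-scope struct x */

struct x { int s; };           /* completes the same type */

int f(struct x *p)
{
    return p->s;
}

int demo_f(void)
{
    struct x v;
    v.s = 7;
    return f(&v);              /* compatible: same struct x throughout */
}
```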

References: ANSI Sec. 3.1.2.1 p. 21, Sec. 3.1.2.6 p. 26, Sec. 3.5.2.3 p. 63.

5.11: I'm getting strange syntax errors inside code which I've #ifdeffed out.

Under ANSI C, the text inside a "turned off" #if, #ifdef, or #ifndef must still consist of "valid preprocessing tokens."  This means that there must be no unterminated comments or quotes (note particularly that an apostrophe within a contracted word could look like the beginning of a character constant), and no newlines inside quotes.  Therefore, natural-language comments and pseudocode should always be written between the "official" comment delimiters /* and */.  (But see also questions 17.14 and 6.7.)

References: ANSI Sec. 2.1.1.2 p. 6, Sec. 3.1 p. 19 line 37.

5.12: Can I declare main as void, to shut off these annoying "main returns no value" messages?  (I'm calling exit(), so main doesn't return.)

No.  main must be declared as returning an int, and as taking either zero or two arguments (of the appropriate type).  If you're calling exit() but still getting warnings, you'll have to insert a redundant return statement (or use some kind of "notreached" directive, if available).

Declaring a function as void does not merely silence warnings; it may also result in a different function call/return sequence, incompatible with what the caller (in main's case, the C run-time startup code) expects.

References: ANSI Sec. 2.1.2.2.1 pp. 7-8.

5.13: Is exit(status) truly equivalent to returning status from main?

Formally, yes, although discrepancies arise under a few older, nonconforming systems, or if data local to main() might be needed during cleanup (due perhaps to a setbuf or atexit call), or if main() is called recursively.

References: ANSI Sec. 2.1.2.2.3 p. 8.

5.14: Why does the ANSI Standard not guarantee more than six monocase characters of external identifier significance?

The problem is older linkers which are under the control of neither the ANSI standard nor the C compiler developers on the systems which have them.  The limitation is only that identifiers be significant in the first six characters, not that they be restricted to six characters in length.  This limitation is annoying, but certainly not unbearable, and is marked in the Standard as "obsolescent," i.e. a future revision will likely relax it.

This concession to current, restrictive linkers really had to be made, no matter how vehemently some people oppose it.  (The Rationale notes that its retention was "most painful.")  If you disagree, or have thought of a trick by which a compiler burdened with a restrictive linker could present the C programmer with the appearance of more significance in external identifiers, read the excellently-worded section 3.1.2 in the X3.159 Rationale (see question 5.1), which discusses several such schemes and explains why they could not be mandated.

References: ANSI Sec. 3.1.2 p. 21, Sec. 3.9.1 p. 96, Rationale Sec. 3.1.2 pp. 19-21.

5.15: What is the difference between memcpy and memmove?

memmove offers guaranteed behavior if the source and destination arguments overlap.  memcpy makes no such guarantee, and may therefore be more efficiently implementable.  When in doubt, it's safer to use memmove.
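For instance, shifting part of a buffer one position to the right involves overlapping source and destination, so memmove is the appropriate call. This is only a sketch; the function name and buffer are hypothetical.

```c
#include <string.h>

const char *demo_memmove(void)
{
    static char text[8];

    strcpy(text, "abcdef");
    /* Copy the five bytes "abcde" to text+1; the two regions overlap,
       so memmove (not memcpy) is used. */
    memmove(text + 1, text, 5);
    return text;               /* now holds "aabcde" */
}
```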

References: ANSI Secs. 4.11.2.1, 4.11.2.2, Rationale Sec. 4.11.2.

5.16: My compiler is rejecting the simplest possible test programs, with all kinds of syntax errors.

Perhaps it is a pre-ANSI compiler, unable to accept function prototypes and the like.  See also questions 5.17 and 17.2.

5.17: Why are some ANSI/ISO Standard library routines showing up as undefined, even though I've got an ANSI compiler?

It's not unusual to have a compiler available which accepts ANSI syntax, but not to have ANSI-compatible header files or run-time libraries installed.  See also questions 5.16 and 17.2.

5.18: Why won't the Frobozz Magic C Compiler, which claims to be ANSI compliant, accept this code?  I know that the code is ANSI, because gcc accepts it.

Most compilers support a few non-Standard extensions, gcc more so than most.  Are you sure that the code being rejected doesn't rely on such an extension?  It is usually a bad idea to perform experiments with a particular compiler to determine properties of a language; the applicable standard may permit variations, or the compiler may be wrong.  See also question 4.4.

5.19: Why can't I perform arithmetic on a void * pointer?

The compiler doesn't know the size of the pointed-to objects. Before performing arithmetic, cast the pointer either to char * or to the type you're trying to manipulate (but see question 2.18).
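A small sketch of both casts, using invented names (advance_bytes, demo_advance): casting to char * steps in bytes, while casting to the actual object type steps in units of that type.

```c
#include <stddef.h>

/* Hypothetical helper: returns the address nbytes past base. */
void *advance_bytes(void *base, size_t nbytes)
{
    return (char *)base + nbytes;    /* arithmetic in units of one byte */
}

int demo_advance(void)
{
    int values[3] = { 10, 20, 30 };
    void *vp = values;
    int *ip = (int *)vp + 2;                       /* arithmetic in units of int */
    int *jp = advance_bytes(vp, 2 * sizeof(int));  /* same address, via bytes */

    return *ip == 30 && ip == jp;
}
```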

5.20: Is char a[3] = "abc"; legal?  What does it mean?

It is legal in ANSI C (and perhaps in a few pre-ANSI systems), though questionably useful.  It declares an array of size three, initialized with the three characters 'a', 'b', and 'c', without the usual terminating '\0' character; the array is therefore not a true C string and cannot be used with strcpy, printf %s, etc.
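The effect can be checked with sizeof, as in this sketch (the second array s and the function name are added for comparison only). Some compilers may warn about the unterminated initializer.

```c
#include <string.h>

char a[3] = "abc";     /* exactly three characters, no terminating '\0' */
char s[]  = "abc";     /* four characters, including the '\0' */

int demo_sizes(void)
{
    /* memcmp with an explicit length is safe on a; strcmp would not be. */
    return sizeof a == 3 && sizeof s == 4 && memcmp(a, s, 3) == 0;
}
```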

References: ANSI Sec. 3.5.7 pp. 72-3.

5.21: What are #pragmas and what are they good for?

The #pragma directive provides a single, well-defined "escape hatch" which can be used for all sorts of implementation-specific controls and extensions: source listing control, structure packing, warning suppression (like the old lint /* NOTREACHED */ comments), etc.

References: ANSI Sec. 3.8.6 .

5.22: What does "#pragma once" mean?  I found it in some header files.

It is an extension implemented by some preprocessors to help make header files idempotent; it is essentially equivalent to the #ifndef trick mentioned in question 6.4.
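For comparison, a portable guard of the #ifndef kind might look like the sketch below. The header name frob.h, the macro FROB_H, and the struct are all hypothetical; the guarded text is repeated within one file here only to stand in for the header being #included twice.

```c
/* First "inclusion" of the hypothetical frob.h */
#ifndef FROB_H
#define FROB_H
struct frob { int id; };
int frob_id(const struct frob *f);
#endif /* FROB_H */

/* Second "inclusion": FROB_H is already defined, so this text is skipped
   and struct frob is not defined a second time. */
#ifndef FROB_H
#define FROB_H
struct frob { int id; };
int frob_id(const struct frob *f);
#endif /* FROB_H */

int frob_id(const struct frob *f)
{
    return f->id;
}
```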

5.23: People seem to make a point of distinguishing between implementation-defined, unspecified, and undefined behavior. What's the difference?

Briefly: implementation-defined means that an implementation must choose some behavior and document it.  Unspecified means that an implementation should choose some behavior, but need not document it.  Undefined means that absolutely anything might happen.  In no case does the Standard impose requirements; in the first two cases it occasionally suggests (and may require a choice from among) a small set of likely behaviors.

If you're interested in writing portable code, you can ignore the distinctions, as you'll want to avoid code that depends on any of the three behaviors.

References: ANSI Sec. 1.6, especially the Rationale.

