The Committee decided that empty declarations are invalid,
except for a special case with tags (see §3.5.2.3)
and the case of enumerations such as enum {zero,one}; (see §3.5.2.2).
While many seemingly silly constructs are tolerated
in other parts of the language
in the interest of facilitating the
machine generation of C,
empty declarations were considered sufficiently easy to avoid.
The practice of placing the storage class specifier other than first in a declaration has been branded as obsolescent. (See §3.9.3.) The Committee feels it desirable to rule out such constructs as
    enum { aaa, aab, /* etc */ zzy, zzz } typedef a2z;
in some future standard.
Because the address of a register
variable cannot be taken,
objects of storage class register
effectively exist in a space distinct from other objects.
(Functions occupy yet a third address space).
This makes them candidates for optimal placement,
the usual reason for declaring registers,
but it also makes them candidates for more aggressive optimization.
The practice of representing register variables as wider types
(as when register char is quietly changed to register int)
is no longer acceptable.
Several new type specifiers have been added: signed, enum, and void.
long float has been retired and long double has been added,
along with a plethora of integer types.
The Committee's reasons for each of these additions, and the one deletion,
are given in §3.1.2.5 of this document.
Three types of bit fields are now defined:
``plain'' int calls for implementation-defined
signedness (as in the Base Document),
signed int calls for assuredly signed fields,
and unsigned int calls for unsigned fields.
The old constraints on bit fields crossing
word boundaries have been relaxed,
since so many properties of bit fields are implementation
dependent anyway.
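The three kinds of bit field can be illustrated with a small sketch. The structure and function names below are hypothetical, chosen only for illustration; the widths are arbitrary.

```c
#include <assert.h>

/* Hypothetical record; the field names and widths are illustrative only. */
struct flags {
    signed int   delta : 3;   /* assuredly signed field */
    unsigned int level : 3;   /* unsigned field: holds 0..7 */
    int          plain : 3;   /* signedness is implementation-defined */
};

/* A signed int field can hold a negative value. */
int signed_field_holds_negative(void)
{
    struct flags f = {0, 0, 0};
    f.delta = -1;
    return f.delta;
}

/* Storing into an unsigned field reduces the value modulo 2^width. */
unsigned int unsigned_field_wraps(unsigned int v)
{
    struct flags f = {0, 0, 0};
    f.level = v;
    assert(f.level <= 7u);
    return f.level;
}
```

Portable code would avoid relying on the sign of the plain int field, since that choice is left to the implementation.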
The layout of structures is determined only to a limited extent:
To clarify what is meant by the notion that ``all the fields of a union occupy the same storage,'' the Standard specifies that a pointer to a union, when suitably cast, points to each member (or, in the case of a bit-field member, to the storage unit containing the bit field).
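As a minimal sketch of this guarantee, the following compares the address of a union with the addresses of its members. The union type and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical union used only to illustrate the address guarantee. */
union number {
    int    i;
    double d;
    char   bytes[sizeof(double)];
};

/* Returns 1 when a suitably cast pointer to the union points to each member. */
int members_share_address(void)
{
    union number n;
    void *base = &n;

    assert(sizeof n >= sizeof n.d);
    return base == (void *)&n.i
        && base == (void *)&n.d
        && base == (void *)n.bytes;
}
```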
As with all block structured languages that also permit forward references, C has a problem with structure and union tags. If one wants to declare, within a block, two mutually referencing structures, one must write something like:
    struct x { struct y *p; /*...*/ };
    struct y { struct x *q; /*...*/ };
But if struct y is already defined in a containing block,
the first field of struct x will refer to the older declaration.
Thus special semantics has been given to the form:
    struct y;
It now hides the outer declaration of y,
and ``opens'' a new instance in the current block.
A declaration of the form struct x; is therefore no longer innocuous.
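The effect of the new form can be sketched as follows. The member names and the function name are hypothetical; the point is that the inner struct x refers to the inner struct y, which has a member the outer one lacks.

```c
#include <assert.h>

struct y { int outer_marker; };     /* outer declaration of tag y */

int inner_tag_is_distinct(void)
{
    struct y;                       /* hides the outer y in this block */
    struct x { struct y *p; int id; };
    struct y { struct x *q; int id; };   /* completes the inner y */

    struct x a;
    struct y b;

    a.id = 1;
    b.id = 2;
    a.p = &b;
    b.q = &a;
    assert(a.p->q == &a);

    /* a.p->id compiles only because p points to the inner struct y. */
    return a.p->id + a.p->q->id;
}
```

Without the struct y; line, the member p would have pointed to the outer type, and the access a.p->id would not have compiled.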
The Committee has added to C two type qualifiers: const and volatile.
Individually and in combination they specify the assumptions a
compiler can and must make when accessing an object through an lvalue.
The syntax and semantics of const were adapted from C++;
the concept itself has appeared in other languages.
volatile is an invention of the Committee;
it follows the syntactic model of const.
Type qualifiers were introduced in part to provide greater control over optimization. Several important optimization techniques are based on the principle of ``cacheing'': under certain circumstances the compiler can remember the last value accessed (read or written) from a location, and use this retained value the next time that location is read. (The memory, or ``cache'', is typically a hardware register.) If this memory is a machine register, for instance, the code can be smaller and faster using the register rather than accessing external memory.
The basic qualifiers can be characterized by the restrictions they impose on access and cacheing:
const
volatile
A translator design with no cacheing optimizations can effectively ignore the type qualifiers, except insofar as they affect assignment compatibility.
It would have been possible, of course,
to specify a nonconst keyword instead of const,
or nonvolatile instead of volatile.
The senses of these concepts in the Standard were
chosen to assure that
the default, unqualified, case was the most common,
and
that it corresponded most clearly to traditional practice in the
use of lvalue expressions.
Four combinations of the two qualifiers are possible; each defines a useful set of lvalue properties. The next several paragraphs describe typical uses of these qualifiers.
The translator may assume, for an unqualified lvalue, that it may read or write the referenced object, that the value of this object cannot be changed except by explicitly programmed actions in the current thread of control, but that other lvalue expressions could reference the same object.
const is specified in such a way that an implementation is at
liberty to put const objects in read-only storage,
and is encouraged to diagnose obvious attempts to modify them,
but is not required to track down all
the subtle ways that such checking can be subverted.
If a function parameter is declared const,
then the referenced object is not changed (through that lvalue)
in the body of the function: the parameter is read-only.
A static volatile object is an appropriate model for a
memory-mapped I/O register.
Implementors of C translators should take into account relevant hardware
details on the target systems when implementing accesses to volatile objects.
For instance, the hardware logic of a system may require that a
two-byte memory-mapped register not be accessed with byte operations;
a compiler for such a system would have to assure that no such
instructions were generated, even if the source code only accesses one
byte of the register.
Whether read-modify-write instructions can be used on such device
registers must also be considered.
Whatever decisions are adopted on such issues must be documented,
as volatile access is implementation-defined.
A volatile object is an appropriate model for a variable shared
among multiple processes.
A static const volatile object appropriately models
a memory-mapped input port, such as a real-time clock.
Similarly, a const volatile object models a variable which
can be altered by another process but not by this one.
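The input-port model might be sketched as below. No real hardware is assumed: simulated_clock and simulate_tick are hypothetical stand-ins for a device register and for the hardware that updates it, and on a real target the port would normally be reached through a fixed, implementation-specific address.

```c
#include <assert.h>

/* Hypothetical stand-in for a hardware clock register. */
static volatile unsigned int simulated_clock;

/* The program's view of the port: it may be read but not written through
   this lvalue, and its value may change between reads. */
static const volatile unsigned int *const clock_port = &simulated_clock;

/* Hypothetical stand-in for the hardware advancing the clock. */
void simulate_tick(unsigned int n)
{
    simulated_clock += n;
}

unsigned int elapsed_ticks(unsigned int work)
{
    unsigned int start;
    unsigned int end;

    assert(clock_port == &simulated_clock);
    start = *clock_port;      /* this read must actually be performed */
    simulate_tick(work);
    end = *clock_port;        /* and this one may not reuse 'start' */
    return end - start;
}
```

Because the lvalue is volatile, the translator may not satisfy the second read from a retained copy of the first; because it is also const, an assignment through clock_port would be diagnosed.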
Although the type qualifiers are formally treated as defining new types they actually serve as modifiers of declarators. Thus the declarations
    const struct s {int a,b;} x;
    struct s y;
declare x as a const object, but not y.
The const property can be associated with the aggregate type
by means of a type definition:
    typedef const struct s {int a,b;} stype;
    stype x;
    stype y;
In these declarations the const property is associated with the
declarator stype, so x and y are both const objects.
The Committee considered making const and volatile storage classes,
but this would have ruled out any number of desirable constructs,
such as const members of structures
and variable pointers to const types.
A cast of a value to a qualified type has no effect; the qualification
(volatile, say) can have no effect on the access since it has occurred
prior to the cast. If it is necessary to access a non-volatile object
using volatile semantics, the technique is to cast the address of the
object to the appropriate pointer-to-qualified type, then dereference
that pointer.
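A minimal sketch of the technique follows, contrasted with the ineffective cast of a value. The object sample_counter and the function names are hypothetical.

```c
#include <assert.h>

int sample_counter = 0;     /* an ordinary, non-volatile object */

/* Cast the address, then dereference: the access itself is volatile. */
int read_with_volatile_semantics(void)
{
    return *(volatile int *)&sample_counter;
}

void write_with_volatile_semantics(int v)
{
    *(volatile int *)&sample_counter = v;
    assert(sample_counter == v);
}

/* Casting the value has no effect on the access: sample_counter has
   already been read, as a plain int, before the cast applies. */
int cast_of_value_changes_nothing(void)
{
    return (volatile int)sample_counter;
}
```

Some translators warn that the qualifier in the last cast is ignored, which is consistent with the point made above.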
The function prototype syntax was adapted from C++. (See §3.3.2.2 and §3.5.4.3.)
Some current implementations have a limit of six type modifiers (function returning, array of, pointer to), the limit used in Ritchie's original compiler. This limit has been raised to twelve since the original limit has proven insufficient in some cases; in particular, it did not allow for FORTRAN-to-C translation, since FORTRAN allows for seven subscripts. (Some users have reported using nine or ten levels, particularly in machine-generated C code.)
A pointer declarator may have its own type qualifiers, to specify the attributes of the pointer itself, as opposed to those of the reference type. The construct is adapted from C++.
const int * means (variable) pointer to constant int,
and int * const means constant pointer to (variable) int,
just as in C++, from which these constructs were adopted.
(And mutatis mutandis for the other type qualifiers.)
As with other aspects of C type declarators,
judicious use of typedef statements can clarify the code.
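The two placements of const, and the use of typedef to name them, can be sketched as follows. The typedef names and the function name are hypothetical.

```c
#include <assert.h>

typedef const int *ptr_to_const_int;   /* (variable) pointer to constant int */
typedef int *const const_ptr_to_int;   /* constant pointer to (variable) int */

int qualifier_demo(void)
{
    int a = 1;
    int b = 2;
    ptr_to_const_int p = &a;
    const_ptr_to_int q = &a;

    p = &b;        /* allowed: the pointer itself is variable */
    *q = 10;       /* allowed: the int it points to is variable */
    /* *p = 3;     not allowed: the referenced int is const through p */
    /* q = &b;     not allowed: the pointer q is const */

    assert(q == &a);
    return *p + a; /* b plus the updated a */
}
```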
The concept of composite types (§3.1.2.6) was introduced to provide for the accretion of information from incomplete declarations, such as array declarations with missing size, and function declarations with missing prototype (argument declarations). Type declarators are therefore said to specify compatible types if they agree except for the fact that one provides less information of this sort than the other.
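For the array case, the accretion of information might look like the sketch below. The names table and table_length are hypothetical, and only the missing-size case is shown; the missing-prototype case follows the same pattern.

```c
#include <assert.h>
#include <stddef.h>

extern int table[];              /* incomplete: the size is missing */
int table[4] = { 1, 2, 3, 4 };   /* compatible declaration supplying the size */

/* After both declarations the composite type of table is int[4],
   so sizeof can be applied to it. */
size_t table_length(void)
{
    assert(sizeof table % sizeof table[0] == 0);
    return sizeof table / sizeof table[0];
}
```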
The declaration of 0-length arrays is invalid, under the general principle of not providing for 0-length objects. The only common use of this construct has been in the declaration of dynamically allocated variable-size arrays, such as
    struct segment {
        short int count;
        char c[N];
    };

    struct segment * new_segment( const int length )
    {
        struct segment * result;
        result = malloc( sizeof(struct segment) + (length-N) );
        result->count = length;
        return result;
    }
In such usage, N would be 0 and (length-N) would be written
as length. But this paradigm works just as well, as written,
if N is 1. (Note, by the by, an alternate way of
specifying the size of result:
    result = malloc( offsetof(struct segment,c) + length );
This illustrates one of the uses of the offsetof macro.)
The function prototype mechanism is one of the most useful additions to the C language. The feature, of course, has precedent in many of the Algol-derived languages of the past 25 years. The particular form adopted in the Standard is based in large part upon C++.
Function prototypes provide a powerful translation-time error detection capability. In traditional C practice without prototypes, it is extremely difficult for the translator to detect errors (wrong number or type of arguments) in calls to functions declared in another source file. Detection of such errors has either occurred at runtime, or through the use of auxiliary software tools.
In function calls not in the scope of a function prototype,
integral arguments have the
integral widening conversions applied and
float arguments are widened to double.
It is thus impossible in such a call to pass an unconverted
char or float argument.
Function prototypes give the programmer
explicit control over the function argument type conversions,
so that the
often inappropriate and sometimes inefficient default widening rules
for arguments can be suppressed by the implementation.
Modifications of function interfaces
are easier in cases where the actual arguments
are still assignment compatible with the new formal parameter type:
only the function definition and its prototype need to be rewritten
in this case;
no function calls need be rewritten.
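A brief sketch of the conversion control that a prototype provides: with the prototype in scope, an int argument is converted to the declared parameter type, as if by assignment, rather than being widened by the default rules. The function names are hypothetical.

```c
#include <assert.h>

float half(float x);        /* prototype in scope at the call below */

float half_of_three(void)
{
    float r = half(3);      /* the int 3 is converted to float */
    assert(r > 1.0f);
    return r;
}

float half(float x)
{
    return x / 2.0f;
}
```

If the parameter type of half were later changed to double, only the prototype and the definition would need editing; the call half(3) would remain valid.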
Allowing an optional identifier to appear in a function prototype serves two purposes:
    extern int compare(const char * string1, const char * string2) ;

    void func2(int x)
    {
        char * str1, * str2 ;
        /* ... */
        x = compare(str1, str2) ;
        /* ... */
    }
The optimizer knows that the pointers passed to compare
are not used to assign
new values to any objects that the pointers reference.
Hence the optimizer can make less conservative assumptions about
the side effects of compare than would otherwise be necessary.
The Standard requires that calls to functions taking a variable
number of arguments must occur in the presence of a prototype
(using the trailing ellipsis notation ,...).
An implementation may thus assume that all other functions are
called with a fixed argument list,
and may therefore use possibly more efficient calling sequences.
Programs using old-style headers in which the number of arguments
in the calls and the definition differ may not work in implementations
which take advantage of such optimizations.
This is not a Quiet Change, strictly speaking, since the program
does not conform to the Standard. A word of warning is in order,
however, since the style is not uncommon in extant code, and since
a conforming translator is not required to diagnose such mismatches
when they occur in separate translation units.
Such trouble spots can be made manifest
(assuming an implementation provides reasonable diagnostics)
by providing new-style
function declarations in the translation units with
the non-matching calls. Programmers who currently rely on being
able to omit trailing arguments are advised to recode using the
<stdarg.h> paradigm.
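A minimal sketch of that paradigm follows, assuming a hypothetical function sum_ints whose first argument says how many int arguments follow.

```c
#include <assert.h>
#include <stdarg.h>

/* The prototype with the trailing ellipsis must be visible at every call. */
int sum_ints(int count, ...);

int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    assert(count >= 0);
    va_start(ap, count);            /* start after the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next int argument */
    va_end(ap);
    return total;
}
```

The caller, not the translator, is responsible for making count agree with the arguments actually passed.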
Function prototypes may be used to define function types as well:
    typedef double (*d_binop) (double A, double B);
    struct d_funct {
        d_binop f1;
        int (*f2)(double, double);
    };
The structure d_funct has two fields, both of which hold pointers
to functions taking two double arguments; the function types differ
in their return type.
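The declarations above might be used as in the following sketch. The functions add, less_than, make_table, apply_f1 and apply_f2 are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>

typedef double (*d_binop)(double A, double B);

struct d_funct {
    d_binop f1;
    int (*f2)(double, double);
};

/* Hypothetical functions matching the two pointer types. */
static double add(double a, double b)    { return a + b; }
static int less_than(double a, double b) { return a < b; }

struct d_funct make_table(void)
{
    struct d_funct t;
    t.f1 = add;
    t.f2 = less_than;
    return t;
}

double apply_f1(const struct d_funct *t, double a, double b)
{
    assert(t->f1 != 0);
    return t->f1(a, b);
}

int apply_f2(const struct d_funct *t, double a, double b)
{
    assert(t->f2 != 0);
    return t->f2(a, b);
}
```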
Empty parentheses within a type name are always taken as meaning function with unspecified arguments and never as (unnecessary) parentheses around the elided identifier. This specification avoids an ambiguity by fiat.
A typedef may only be redeclared in an inner block with a declaration
that explicitly contains a type name.
This rule avoids the ambiguity about whether to take the
typedef as the type name or the candidate for redeclaration.
Some implementations of C
have allowed type specifiers to be added to a type defined using typedef.
Thus
    typedef short int small ;
    unsigned small x ;
would give x the type unsigned short int.
The Committee decided that since this interpretation may be
difficult to provide in many implementations,
and since it defeats much of the utility of typedef
as a data abstraction mechanism,
such type modifications are invalid.
This decision is incorporated in the rules of §3.5.2.
A proposed typeof operator was rejected on the grounds of insufficient utility.
An implementation might conceivably have codes for floating zero and/or null pointer other than all bits zero. In such a case, the implementation must fill out an incomplete initializer with the various appropriate representations of zero; it may not just fill the area with zero bytes.
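The visible effect of this requirement can be sketched as below: members left out of an initializer compare equal to zero and to the null pointer, whatever their bit representation. The structure and names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record with integer, floating and pointer members. */
struct record {
    int    id;
    double weight;
    char  *label;
};

/* Only id is given; weight and label receive floating zero and a
   null pointer, not merely zero bytes. */
static struct record partial = { 7 };

int rest_is_zero(void)
{
    assert(partial.id == 7);
    return partial.weight == 0.0 && partial.label == NULL;
}
```

By contrast, clearing the same object with memset would produce all-bits-zero, which need not be the representation of 0.0 or of a null pointer on every implementation.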
The Committee considered proposals for permitting automatic aggregate initializers to consist of a brace-enclosed series of arbitrary (execute-time) expressions, instead of just those usable for a translate-time static initializer. However, cases like this were troubling:
    int x[2] = { f(x[1]), g(x[0]) };
Rather than determine a set of rules which would avoid pathological cases and yet not seem too arbitrary, the Committee elected to permit only static initializers. Consequently, an implementation may choose to build a hidden static aggregate, using the same machinery as for other aggregate initializers, then copy that aggregate to the automatic variable upon block entry.
A structure expression, such as a call to a function returning the appropriate structure type, is permitted as an automatic structure initializer, since the usage seems unproblematic.
For programmer convenience, even though it is a minor irregularity in initializer semantics, the trailing null character in a string literal need not initialize an array element, as in:
    char mesg[5] = "help!" ;
(Some widely used implementations provide precedent.)
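A consequence worth making explicit is that such an array is not a string. The sketch below, with a hypothetical function name, compares it by length rather than with the string functions.

```c
#include <assert.h>
#include <string.h>

char mesg[5] = "help!";   /* five characters; no trailing null is stored */

/* Compare by explicit length: strlen or strcmp would read past the array. */
int mesg_matches(void)
{
    assert(sizeof mesg == 5);
    return memcmp(mesg, "help!", sizeof mesg) == 0;
}
```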
The Base Document allows a trailing comma in an initializer at the end of an initializer-list. The Standard has retained this syntax, since it provides flexibility in adding or deleting members from an initializer list, and simplifies machine generation of such lists.
Various implementations have parsed aggregate initializers with partially elided braces differently. The Standard has reaffirmed the (top-down) parse described in the Base Document. Although the construct is allowed, and its parse well defined, the Committee urges programmers to avoid partially elided initializers: such initializations can be quite confusing to read.
This rule has a parallel with the initialization of structures. Members of structures are initialized in the sequence in which they are declared. The same can now be said of unions, with the significant difference that only one union member (the first) can be initialized.
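A short sketch of the rule as described here, using a hypothetical union: the brace-enclosed initializer applies to the first member only.

```c
#include <assert.h>

/* Hypothetical union; as_int is first, so it is the member initialized. */
union value {
    int   as_int;
    float as_float;
};

static union value v = { 42 };

int first_member(void)
{
    assert(sizeof v >= sizeof v.as_int);
    return v.as_int;
}
```

A union whose most useful initial value belongs to some other member can be arranged so that member is declared first.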