.tab stops 8,16,24,32,40,48,56,64,72,80 .number chapter 3 .right margin 65 .fill .chapter C Style Guide .subtitle Disclaimer .page .c Disclaimer .c ---------- .bl 1 This document is intended to make C code under CP/M more portable and more maintainable. There are some hard and fast rules which must be followed to acheive these objectives. Rules, however, should not be used as substitutes for rational thought. From time to time, there will be situations which call for bending or breaking of the rules. The rules should be modified as necessary in these situations. .bl 1 This manual is not a tutorial on either the C programming language or on the software engineering discipline. The reader is assumed to be a professional programmer with some knowledge of the C language. We will deal primarily with aspects of the C language as they relate to a software engineering environment. .subtitle Goals .hl 1 Goals The goals of this document fall into three areas: Portability, Maintainability, and Readability. .hl 2 Portability Software written in C should be easily transported from machine to machine without conversion. Unfortunately, due to the differences in compilers and machines, it is possible to write C code which is hopelessly tied to a given machine or compiler. .bl 1 The guidelines presented here are intended to help the novice (and experienced!) C programmer from falling into the pitfalls that exist. We will also present ways in which efficiency may be gained using the C constructs. .hl 2 Maintainability Portable software tends to have a long lifespan. The author has seen programs written in COBOL in 1964, for instance, which underwent their third conversion to different hardware in 1979, and are still being used today. .bl 1 Maintenance accounts for as much as 80% of software expenditures in some companies. The microprocessor community is waking up to the need for more maintainable software as old products mature and require revamping. Consider the cost to applications software houses, for instance, to recode payroll programs to account for the 81-83 income tax cuts. Consider again that Congress tried to repeal the 83 tax cut, and it becomes obvious that programs must be written in a maintainable manner. .bl 1 This document will present guidelines which will enhance maintainability. A great deal of maintenance efficiency can be gained just from having all programs written in the same style. .hl 2 Readability It is often said that software source code has two classes of readers: humans and machines. The hardware / software cost ratio is approaching the point at which the emphasis in writing software must be on the human audience rather than the machine audience. .subtitle Modularity .hl 1 Modularity Properly modular programs can go a long way in reducing porting and maintenance costs. Programs should be modularized such that all the routines which perform a given function (say symbol table management) are grouped into a single module. This has two benefits: the maintenance programmer is allowed to treated most modules as "black boxes" for modification purposes, and the nature of data structures is "hidden" from the rest of the program. In a properly modular program, it should be possible to completely revamp any major data structure with changes to only one module. .hl 2 Module Size Conventional software engineering wisdom states that 500 lines is a good maximum size for modules. Perhaps a more reasonable limitation is that a module should be as big as necessary to perform a given function AND NO BIGGER. .hl 2 Inter-Module Communication Modules should communicate via procedure calls wherever possible. Global data areas should be avoided where possible. Where one or more compilations requires the same data structure, a header file should be used. .hl 2 Header Files Header files are used for files which are compiled separately, to define types, symbolic constants, and data structures in exactly the same fashion for all modules. .bl 1 Some specific rules for using header files are as follows: .ls .leb;Use the '_#include "file.h"' format for header files which are project-specific. Use '_#include ' for system-wide files. NEVER hard-code device and/or filenames in an include statement. .leb;Do not nest include files. .leb;Do not define variables (other than global data references) in a header file. NEVER initialize a global variable in a header file. .leb;In writing macro definitions, include parenthesis around each use of the parameters, to avoid possible precedence mixups. .els .subtitle Mandatory Coding Conventions .hl 1 Mandatory Coding Conventions Following is a list of practices which MUST be followed to preserve program portability. Specific items which these conventions are targeted to prevent are: .ls .leb;The length of a C 'int' variable varies from machine to machine. This leads to representation problems, and to problems with binary I/O involving 'int' quantities. .leb;The byte order of multi-byte binary variables differs from machine to machine. This leads to problems if a binary variable is ever viewed as a byte stream by any piece of code. .leb;Naming conventions and the maximum length of identifiers differ from machine to machine. Upper case / lower case distinction is not always present. .leb;Character and short variables are sign-extended to 'int' by some compilers during arithmetic operations. Some compilers don't do this. .leb;The data length of constants is a problem; Some compilers view a hex or octal constant as an 'unsigned int', some do not. In particular, the following sequence does not always work as expected: .bl 1 .lm +5 .nofill LONG data; . . . printf("%ld_\n",(data _& 0xffff)); .bl 1 .fill .lm -5 The purpose of the "printf" statement is to print the lower 16 bits of the long data item "data". However, some compilers SIGN EXTEND the hex constant "0xffff" (!). .leb;One must be careful of evaluation order dependencies, particularly in compound BOOLEAN conditions. .els In addition, much grief can be avoided simply by having everyone use the same coding style. .hl 2 Variable and constant names Local variable names should be unique in 8 characters. Global variable names and procedure names should be unique in 6 characters. All variable and procedure names should be completely lower case. .bl 1 Names defined via a "_#define" statement should normally be entirely upper case. The only exceptions are functions which are defined as macros; e.g., "getc", "isascii", etc. These names should also be unique in 8 characters. .bl 1 Global names should not be redefined as local variables within a procedure. .hl 2 Variable Typing Due to the differences in C compiler standard type definitions, using these standard types is unsafe in programs which are designed to be portable. A set of types and storage classes defined via "typedef" or "_#define" must be used instead: .bl 1 .test page 20 .lm 8 .nofill Type C Base Type ---- ----------- LONG signed long (32 bits) WORD signed short (16 bits) UWORD unsigned short (16 bits) BOOLEAN short (16 bits) BYTE signed char ( 8 bits) UBYTE unsigned char ( 8 bits) VOID void (function return) DEFAULT int (16/32 bits) .bl 1 Class C Base Class ----- ------------ REG register variable LOCAL auto variable MLOCAL Module static variable GLOBAL Global variable definition EXTERN Global variable reference .bl 1 .lm 0 .fill In addition, global variables must be declared at the beginning of the module. Local variables must be defined at the beginning of the function in which they are used. The storage class and type must always be specified, even though the C language does not require this. .hl 2 Expressions and Constants All expressions and constants should be written in an implementation independent manner. Parenthesis must be used to remove any possible ambiguities. In particular, the construct: .bl 1 .i 5 if(c = getchar() == '_\n') .bl 1 does NOT assign the value returned by "getchar" to c. Rather, the value returned by "getchar" is compared to '_\n', and c receives the value 0 or 1 (the true / false output of the comparison). The value returned by "getchar" is lost. Parenthesizing the assignment: .bl 1 .i 5 if((c = getchar()) == '_\n') .bl 1 fixes the problem. .bl 1 Constants used for masking should be written in a way such that the underlying 'int' size is irrelevant. In the 'printf' example used previously, specifying .bl 1 .i 5 printf("%ld_\n",(data _& 0xffffL); .bl 1 (a long masking constant), fixes the problem for all compilers. Specifying the "one's complement" often yields this effect: "_~0xff" instead of "0xff00". .bl 1 Character constants should consist of a single character for portability. Multi-character constants must be placed in string variables. .bl 1 Commas used to separate arguments in functions are not operators. Evaluation order is not guaranteed. In particular, function calls like: .bl 1 .i 5 printf("%d %d_\n",i++,i++); .bl 1 may perform differently on different machines. .hl 2 Pointer Arithmetic Pointers should never be manipulated as 'int's or other arithmetic variables. C allows the addition / subtraction of an integer to / from a pointer variable. No logical operations (e.g. ANDing or ORing) should be attempted on pointers. A pointer to one type of object may be converted to a pointer to another (smaller size) data type with complete generality. Conversion in the reverse direction may yield alignment problems. .bl 1 Pointers may be tested for equality with other pointer variables and constants (notably NULL). Arithmetic comparisons (e.g., ">=") will not work on all compilers and can generate machine dependent code. .bl 1 In evaluating the size of a data structure, bear in mind that the compiler may leave holes in a data structure to allow for proper alignment. ALWAYS use the "sizeof" operator. .hl 2 String Constants Strings must be allocated in such a way as to facilitate foreign language conversions. The preferred method is to use an array of pointers to constant strings which is initialized in a separate file. Each string reference would then reference the proper element of the pointer array. .bl 1 NEVER modify a specific location in a constant string, e.g.: .bl 1 .nofill BYTE string_[_] = "BDOS Error On x:" . . . string[14] = 'A'; .bl 1 .fill Foreign language equivalents are likely to be longer or shorter than the English version of the message. .bl 1 Never use the high-order bit of an ASCII string for bit flags, etc. Extended character sets make extensive use of the characters above 0x7F. .hl 2 Data and BSS Sections .bl 1 Normally, C programs have three sections, ".text" (program instructions), ".data" (Initialized data), and ".bss" (uninitialized data). Modification of initialized data is to be avoided if at all possible. Programs which do not modify the ".data" segment can aid the swapping performance and disk utilization of a multi-user system. Also, if a program does not modify the ".data" segment, the program can be placed in ROM with no conversion. .bl 1 This means that the program should not modify static variables which are initialized. Modification of initialized automatic variables is not subject to this restriction. .hl 2 Module Layout The layout of a module (compilation unit) should follow this basic general organization: .ls .le;A comment paragraph or two at the very beginning of the file which describe: .ls .la;The purpose of the module. .la;The major (which should be the only) entry points to the module from the outside world. .la;Any global data areas required by the module. .la;Any machine or compiler dependencies. .els .le;Include file statements. .le;Module-specific "_#define" statements. .le;Global variable references and definitions. Each variable should have a comment which describes the variable's purpose. .le;Procedure definitions. Each procedure definition should contain: .ls .la;A comment paragraph which describes the procedure's function, input parameters, and return parameters. Any unusual coding techniques should be mentioned here. .la;The procedure header. The procedure return type MUST be explicitly specified. Use VOID when no value is returned. .la;Argument definitions. All arguments MUST have the storage class and variable type EXPLICITLY declared. .la;Local variable definitions. All local variables must be defined before any executable code. Storage class and variable type MUST be explicitly declared. .la;Procedure Code. .els .els Refer to "Appendix A" for a sample program. .subtitle Suggested Coding Conventions .hl 1 Suggested Coding Conventions The following recommendations are actually personal prejudices of the author, and not strictly required for program portability and maintenance. .ls .le;Keep source code within 80 character margins, for easier screen editing. .le;Use a standard indention technique. The suggested standard is: .ls .la;Statements within a procedure should begin one tab stop (column 8) from the left margin. .la;Statements controlled by an "if", "else", "while", "for", or "do" should be indented one tab stop. If multiple nested indentions are required, use 2 spaces for each nesting level. Avoid going more than 5 levels deep. .la;The bracket characters surrounding each compound statement should appear on a separate line, using the indention level of the controlling statement. For instance: .bl 1 .lm +8 .test page 9 .nofill for(i=0;i UPPER) j = UPPER; output(j); _} .bl 1 .lm -8 .fill .la;A null statement controlled by a "for", "while", etc. should be placed on a separate, indented line for readability purposes. .els .le;Use plenty of comments, particularly for code which must be obtuse. (If code need not be obtuse, don't make it that way). There is no such thing as "self-documenting code". This is just wishful thinking on the part of language designers. .le;Put all the maintenance documentation in the source code itself. This is the only way it has any prayer of being updated when the code is changed. .le;Use blank lines, form feeds, etc. to improve readability. .els