Thoughts on Programming Style

File Closing


When opening a file, immediately add the statement that closes it, to avoid bugs caused by forgetting to close files. Then add the file-processing statements between the file open-close pair (à la begin-end).
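A minimal sketch of the habit, assuming a Free Pascal (or Turbo Pascal style) compiler. The filename TSTCLOSE.TXT and the two lines written are made up for illustration. In each pair, Close was typed straight after the open call and the processing statements were filled in afterwards.

```pascal
program FileClosingSketch;

const
  demo_filename = 'TSTCLOSE.TXT';   { hypothetical test filename }

var
  f    : Text;
  line : string;
begin
  { Open-close pair 1: create a small file. }
  Assign(f, demo_filename);
  Rewrite(f);
  WriteLn(f, 'first line');         { processing added between the pair }
  WriteLn(f, 'second line');
  Close(f);

  { Open-close pair 2: read the file back and echo it. }
  Assign(f, demo_filename);
  Reset(f);
  while not Eof(f) do
  begin
    ReadLn(f, line);
    WriteLn(line);
  end;
  Close(f);
end.
```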

Loop Counting


When implementing a count for a loop (repeat until, while):

1. Start count at 0

2. Increment count at start of loop process

3. Test for count = maximum count


Note: Alternatively, increment after the loop process, use count + 1 within the loop process, and only increment count if the process was successful. This method provides a form of error reporting: count is the number of processes successfully completed.
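The variant in the note might look like the sketch below. ProcessItem is a hypothetical stand-in for the loop process, rigged to fail on its fourth item so that the error-reporting effect is visible; max_count is likewise an assumed name.

```pascal
program LoopCountSketch;

const
  max_count = 5;

{ Hypothetical process: succeeds for every item except the fourth. }
function ProcessItem(item : integer) : boolean;
begin
  ProcessItem := item <> 4;
end;

var
  count : integer;
  ok    : boolean;
begin
  count := 0;                        { start count at 0 }
  ok := true;
  while ok and (count < max_count) do
  begin
    ok := ProcessItem(count + 1);    { use count + 1 in the loop process }
    if ok then
      count := count + 1;            { only count successful processes }
  end;
  WriteLn(count, ' of ', max_count, ' processes completed');
end.
```

Run as written, this reports 3 of 5 processes completed, because the fourth call fails and count is left at the number of successes.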

Data Files Compatibility/Extendibility


How can (flat ASCII) data files be made extendible, so that future programs (which expect the data file to contain extra items) can handle data files from earlier versions that lack the extra items? Separate data blocks with blank lines, or include the block size at the head of each block?

Data File Format


Record the number of values that follow (within a block) at the start of the block. This allows items to be added to the end of a value block without affecting value reading, because the value-reading process is terminated by the number of values expected. It also offers an extra means of separating items and of checking data integrity (generate an error if the number does not match the quantity of values).
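One possible reading of this format is sketched below, assuming one value per line and a count on the first line of the block. The filename, the constant known_values and the sample values are all invented for the example. The reader understands two values but is given a block of three, as if the file came from a later program version.

```pascal
program BlockCountSketch;

const
  demo_filename = 'TSTBLOCK.DAT';   { hypothetical test filename }
  known_values  = 2;                { values this reader version understands }

var
  f     : Text;
  n, i  : integer;
  value : integer;
begin
  { Write a block: the count first, then that many values. }
  Assign(f, demo_filename);
  Rewrite(f);
  WriteLn(f, 3);
  WriteLn(f, 10);
  WriteLn(f, 20);
  WriteLn(f, 30);
  Close(f);

  { Read the block back; the count at its head terminates the reading. }
  Assign(f, demo_filename);
  Reset(f);
  ReadLn(f, n);
  for i := 1 to n do
  begin
    if Eof(f) then
    begin
      { Integrity check: the count does not match the quantity of values. }
      WriteLn('Error: block declares ', n, ' values but holds ', i - 1);
      Close(f);
      Halt(1);
    end;
    ReadLn(f, value);
    if i <= known_values then
      WriteLn('value ', i, ' = ', value)
    else
      WriteLn('value ', i, ' ignored (unknown to this version)');
  end;
  Close(f);
end.
```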

Procedures and Parameters


Consider the following:

procedure SaveFiles(const filename : string);

{ sub-procedures called by SaveFiles (bodies omitted) }
procedure SaveInWordFormat(const filename : string;
                           const data     : array of integer);

procedure SaveInTextFormat(const filename : string;
                           const data     : array of integer);

{ within SaveFiles }
var data : array of integer;

   SaveInWordFormat(filename, data);
   SaveInTextFormat(filename, data);

The Word format files have the extender .DOC, while the Text format files have the extender .TXT.

How should the extenders be declared/implemented? Should they be declared by SaveFiles and passed to the sub-procedures (SaveIn~Format)? Or should they be declared in the sub-procedures, with SaveFiles passing only the path (without the extender)?

Filename Conventions and Procedure Structure


Second thoughts on previous entry (13/10/95).

If any of the potential filenames already exist, a renamed-filename can be created to apply to all child procedures. For instance, say one wished to create the files example.txt and example.doc. If example.txt (but not example.doc) already exists, then it is necessary to rename to example1.txt (say).

However, this presents a problem for the .DOC saving. If the procedures name files independently, this will create EXAMPLE.DOC which does not correspond to EXAMPLE1.TXT. For consistency, the filenames produced by each child procedure should be the same (apart from the extender).

In order to avoid inconsistent filenames, the parent procedure should detect the presence of the specified filename. However, in order to do this, the procedure must test the filename with all specified extenders; this implies that the parent procedure must know the extenders.

Alternatively, the parent procedure could ignore the extender and simply test for the existence of a filename corresponding to the specified name (regardless of the extender). This solution, however, is not very elegant and can lead to unnecessary renaming (if a matching filename with an unused extender exists).

As the parent procedure already knows the extenders (assuming the proposed solution) it may as well pass the full filename to the child procedures. Also, the parent procedure should test for existing filenames given the specified extenders.

Currently, there seem to be no disadvantages to controlling the filename extenders from the parent, rather than the child, procedure.
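A sketch of the parent-side test, assuming Free Pascal with the SysUtils unit (FileExists, IntToStr). The function names NameInUse and FreeBaseName are hypothetical. The parent knows the extenders, finds one base name that is free for all of them, and would then pass the full filenames on to the child procedures.

```pascal
program ConsistentNamesSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ True if base_name already exists with any of the given extenders. }
function NameInUse(const base_name : string;
                   const extenders : array of string) : boolean;
var
  i : integer;
begin
  Result := false;
  for i := Low(extenders) to High(extenders) do
    if FileExists(base_name + extenders[i]) then
      Result := true;
end;

{ Appends 1, 2, 3... to base_name until no extender clashes. }
function FreeBaseName(const base_name : string;
                      const extenders : array of string) : string;
var
  suffix : integer;
begin
  Result := base_name;
  suffix := 0;
  while NameInUse(Result, extenders) do
  begin
    suffix := suffix + 1;
    Result := base_name + IntToStr(suffix);
  end;
end;

begin
  { Prints example if neither file exists, example1 if either one does. }
  WriteLn(FreeBaseName('example', ['.txt', '.doc']));
end.
```

Because the test covers only the extenders actually in use, a stray example.bak would not trigger a rename.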

Conditional Compilation


Conditional-compile switches should be created so that they are DEFINEd for debugging and UNDEFined for the release version. This ensures consistency: when creating a release version, it is only necessary to check that all conditional-compile switches are UNDEFined, rather than checking the sense of each one individually.
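For illustration, a small sketch with two switches that both follow this rule. The switch names TRACE_CALLS and SHORT_TRIALS are invented. Both are defined here, which makes this the debugging build.

```pascal
program DebugSwitchSketch;

{ Debugging build: every switch is defined. For a release build, delete
  both DEFINE lines so that every switch is undefined. }
{$DEFINE TRACE_CALLS}
{$DEFINE SHORT_TRIALS}

const
{$IFDEF SHORT_TRIALS}
  trial_count = 3;      { shortened run while debugging }
{$ELSE}
  trial_count = 100;    { full run in the release version }
{$ENDIF}

var
  trial : integer;
begin
  for trial := 1 to trial_count do
  begin
{$IFDEF TRACE_CALLS}
    WriteLn('debug: starting trial ', trial);
{$ENDIF}
    { trial code here }
  end;
  WriteLn(trial_count, ' trials run');
end.
```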

Identifier Naming


For time lengths, use _dur_ instead of _len_. _dur_ (for DURation) is more appropriate and specific to a time identifier than _len_ (for LENgth).

Identifier Naming for Units of Measurement


When an identifier holds a measurement value, suffix the name with a two-character unit, thus:

_us = microseconds;
_ms = milliseconds;
_sc = seconds;
_mn = minutes;
_hr = hours

Note: Using _min for minute is ambiguous as _min can be (and usually is) used to mean minimum.

File Saving


For projects, use the extender to denote the project (program) rather than the file type; for example, .hik for HICK and .hpr for HPR, rather than .txt or the subject number.

When saving files, report if the file already exists and offer the choice of overwriting. Don't overwrite or rename without giving a warning.

Conditional Compilation


When creating compiler directives for program-test code, it may be best to include all of these within a single IFDEF DEBUG switch, so that all can be turned off at once. Compiler options (e.g. run-time error checking) could also be switched off in this section, but only if testing has been completed.
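A sketch of the single-switch arrangement. The sub-switch TRACE_CALLS is a made-up name, and {$R+} / {$R-} is the range-checking option as found in Turbo Pascal and Free Pascal. Turning range checking off in the release branch is shown only because the note mentions it; it assumes testing really is complete.

```pascal
program MasterSwitchSketch;

{$DEFINE DEBUG}         { delete this one line for a release build }

{$IFDEF DEBUG}
  {$DEFINE TRACE_CALLS} { all test-code switches hang off DEBUG }
  {$R+}                 { run-time range checking on }
{$ELSE}
  {$R-}                 { off only once testing has been completed }
{$ENDIF}

var
  readings : array[1..3] of integer;
  index    : integer;
begin
  for index := 1 to 3 do
  begin
    readings[index] := index * 10;
{$IFDEF TRACE_CALLS}
    WriteLn('debug: readings[', index, '] = ', readings[index]);
{$ENDIF}
  end;
  WriteLn('done');
end.
```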

Conditional Procedures


Where a procedure contains code that is only executed if a condition is satisfied, should the conditional test be made inside the procedure (after it is called) or outside it (before it is called, preventing the call if the condition is not satisfied)?

If the test is inside the procedure, the advantage is that all code relevant to the procedure is contained within it (which also simplifies the calling procedure). The disadvantage is that it is less efficient: the procedure gets called even if the condition is not satisfied and no code gets executed.

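{ test outside: the caller checks before calling }
procedure ControlExperiment
   if not stimulus_finished then
      ControlStimulus

{ test inside: the procedure checks for itself }
procedure ControlStimulus
   if not stimulus_finished then
      {code here}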



Only optimise if it's too slow. "If it ain't broke, don't fix it"

There's no point in turning off the compiler's run-time error checking if the program runs satisfactorily with it on, even if all bugs have been fixed (as if!). The same applies to optimising the algorithms and data for speed or size gains if the program already works.

See McConnell, Code Complete

Stats Analysis


Write the stats analysis and features as a separate program from the data collection program. This increases modularity (the collection and analysis programs can still be integrated by using a batch file). The major advantage is that the raw data can be re-analysed if the stats results file is damaged but the raw data is still intact. This also eases the source-code testing process.

Prototype/Testing Filenames


When creating files that are just for testing purposes, and thus temporary (to be deleted when testing is complete), the first three letters of the filenames should be "TST".

Procedure Nesting


Nesting procedures within procedures can be confusing. Beyond about 1 or 2 levels, the nesting makes the parent procedure larger, more cumbersome and textually fragmented (the procedure header, including the parameter list, gets separated from its body).

Although nesting procedures within main procedures (one level deep) is usually ok, nesting within nested procedures (two levels deep) complicates the source code.

No Nesting       One-level Nesting    Two-level Nesting

MainProc1        MainProc1            MainProc1
MainProc2          Nest1Proc1           Nest1Proc1
MainProc3          Nest1Proc2           Nest1Proc2
                 MainProc2                Nest2Proc1
Begin {main}     MainProc3                Nest2Proc2
...                                   MainProc2
End; {main}      Begin {main}         MainProc3
                 End; {main}
                                      Begin {main}
                                      End; {main}

Using only one-level nesting would cause a conflict between consistency and needing to nest a procedure within a nested procedure. If a procedure is physically nested at the first level, but logically subordinate to a procedure at the same physical nesting level, the situation becomes misleading.

It would seem better not to nest at all. This is already the convention for C programs. An advantage of this is that local variables can only be used within their specific routine. Although it is possible to declare local identifiers after the declaration of nested routines (but before the begin-end), the formal parameters must be declared first. This means that the formal parameter identifiers may inadvertently (and incorrectly) be used by a nested routine. Keeping all routines un-nested would prevent this problem arising.
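The problem with formal parameters can be shown with a small sketch (the routine names are invented, and it assumes a compiler such as Turbo Pascal or Free Pascal that allows a var section after a nested routine). The local variable total is declared after the nested function and so is hidden from it, but the parent's parameter count cannot be hidden, and the nested function uses it by mistake.

```pascal
program NestingSketch;

procedure ReportTotal(count : integer);

  { Nested routine: meant to use only its own parameter, limit. }
  function SumSquares(limit : integer) : integer;
  var
    i, sum : integer;
  begin
    sum := 0;
    for i := 1 to count do     { bug: should be limit, yet it compiles }
      sum := sum + i * i;
    SumSquares := sum;
  end;

var
  total : integer;   { declared after the nested routine, so hidden from it }
begin
  total := SumSquares(2);
  WriteLn('total = ', total, ' (intended 5)');
end;

begin
  ReportTotal(3);
end.
```

This prints total = 14, since the loop runs to the parent's count of 3 rather than the limit of 2. If SumSquares were declared un-nested, the reference to count would be rejected by the compiler.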


Although nesting deeply can create difficulties, it may be sensible to nest to one level. Any procedures which are logically subordinate to a nested procedure are then nested only at the first level (the same level as the procedure they are subordinate to). Whilst inconsistent, a limited degree of nesting may be the best compromise between demonstrating (and ensuring, via the compiler) logical subordinacy by nesting, and the readability and maintainability benefits of not nesting.

Loop Variable Incrementing


For consistency, use WHILE loops and set the index variable to zero and increment at the start of the loop.

Control Variable Naming

15/04/93 (back-dated from 23/06/97)

In fixed-length loops (FOR-NEXT) use _loop as a suffix to the identifier for the index.

In variable-length loops (REPEAT-UNTIL or WHILE) use _count as a suffix to the index identifier.

This convention emphasises the fixed-number-of-iterations that a FOR-NEXT loop offers. In REPEAT-UNTIL and WHILE loops, the index is less important, as the loop is usually controlled by a condition, which is independent of the index number.
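A short sketch of the two suffixes in use. The identifiers and the halving task are invented for the example.

```pascal
program LoopNamingSketch;

const
  max_trial = 3;

var
  trial_loop    : integer;   { FOR index: fixed number of iterations }
  halving_count : integer;   { WHILE index: loop ends on a condition }
  value         : integer;
begin
  for trial_loop := 1 to max_trial do
    WriteLn('trial ', trial_loop);

  { Halve value until it drops below 10; the count is incidental. }
  value := 100;
  halving_count := 0;
  while value >= 10 do
  begin
    halving_count := halving_count + 1;   { increment at start of loop }
    value := value div 2;
  end;
  WriteLn('value ', value, ' reached after ', halving_count, ' halvings');
end.
```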

Suggested Project Documentation

06/01/93 (back-dated from 23/06/97)

Software update list

Array Coding Strategy for Types

08/01/93 (back-dated from 23/06/97)

To ease source-code changes of the types of array variables, the following approaches are suggested:


First, declare the maximum and minimum values of the array elements as constants:

For example,

   min_temperature_c = 10;
   max_temperature_c = 100;

Then, use these limits to create a type for the array index.

For example,

   t_temperature_c = min_temperature_c..max_temperature_c;

When declaring a variable to index the array, if it is of the type declared, the compiler can generate run-time error checking to flag out-of-range array accesses.

NOTE: When declaring minimum values for measures, zero values may be a special case. For instance, if a program analyses a set of files, zero files would be a special case in that it would be an unexpected (perhaps-faulty) condition. Whether or not a zero value should be assigned to a range type will depend on the specific circumstances.


The array element's type should be declared as a custom-type. Using a built-in type prevents find-and-replace actions on that type because it would change all other instances where that type is used. For example, instead of declaring all variables holding a subject's age to be BYTE, declare them to be type t_age, and declare t_age to be byte.

Using this strategy has two advantages. Firstly, it improves the readability of the source-code by abstracting the variables as ages rather than bytes. Also, it enables all age-containing variables to be re-assigned to a different range if the ages were changed. For instance, human ages could be contained in bytes if in units of years, but if fossil ages in years or human ages in days were to be supported, a byte-type would be inadequate.


Having created abstract types for the array's elements and index, it is a simple matter to declare the array:

For example,

const
   max_month = 12;
   min_month = 1;
   max_temperature_c = 100;
   min_temperature_c = -100;

type
   t_month = min_month..max_month;
   t_temperature_c = min_temperature_c..max_temperature_c;
   ta_average_temperature_c = array[t_month] of t_temperature_c;

The value of creating specific custom types for the elements and index of an array is that these types can be used throughout a program to define sub-arrays and variables for use as index and element buffers for the array.

Although these types are used many times in a program, they are defined only once, at the top in the declarations, so it is easy to change them.

For instance, suppose an array of 100 integer elements is declared, and two variables are declared to act as index and element buffers.

Now, suppose we wanted to change the array to 1000 elements of real. Problems would arise in that the index buffer would overflow (if declared as, say, a byte) and the element buffer would mismatch the array element. Linking these types together in the declarations links together any changes made in the two.
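Putting the pieces together, a complete program might look like the sketch below. The variable names and the placeholder data are assumptions. The index and element buffers take their types from the same declarations as the array, so changing a constant or a type at the top changes all three together. Range checking ({$R+}) is switched on so that out-of-range values are flagged at run time.

```pascal
program ArrayTypesSketch;
{$R+}   { run-time range checking on }

const
  min_month = 1;
  max_month = 12;
  min_temperature_c = -100;
  max_temperature_c = 100;

type
  t_month = min_month..max_month;
  t_temperature_c = min_temperature_c..max_temperature_c;
  ta_average_temperature_c = array[t_month] of t_temperature_c;

var
  averages    : ta_average_temperature_c;
  month       : t_month;           { index buffer }
  temperature : t_temperature_c;   { element buffer }
begin
  for month := min_month to max_month do
    averages[month] := month * 2;  { placeholder data, not real readings }

  month := 6;
  temperature := averages[month];
  WriteLn('month ', month, ': ', temperature, ' C');
end.
```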

Obscure Logical Bug


if first_flag = true and second_flag then …

This will be true even if both flags are false because the expression is interpreted as:

if first_flag = (true and second_flag) then …

To avoid this problem, either drop the comparison with true or use parentheses to enclose each condition. Any of the following would be correct:

if first_flag and second_flag then …

if (first_flag = true) and second_flag then …

if (first_flag = true) and (second_flag = true) then …

This bug arises from the compiler's order of precedence, which places the and before the =. If the parentheses are omitted in the source code, true and second_flag is therefore evaluated first, as shown above.
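The effect can be checked with a short test program, sketched here with both flags set to false. The unparenthesised condition reduces to first_flag = second_flag, which is true whenever the two flags are equal.

```pascal
program PrecedenceSketch;

var
  first_flag, second_flag : boolean;
begin
  first_flag := false;
  second_flag := false;

  { Parsed as first_flag = (true and second_flag), that is false = false. }
  if first_flag = true and second_flag then
    WriteLn('unparenthesised test: TRUE (wrong)')
  else
    WriteLn('unparenthesised test: FALSE');

  if (first_flag = true) and second_flag then
    WriteLn('parenthesised test: TRUE')
  else
    WriteLn('parenthesised test: FALSE (right)');
end.
```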

File Format Notes

05/01/95 (back-dated from 24/06/97)

Source-Code Unit Changes




Source Code Changes




Conditional Compilation Problems


When using many, or complex, conditional-compile features, it can be very easy to miss a bug. Unless all combinations of the compile conditions are tested, some part of the source code will not have been compiled. If an error lies within uncompiled code, it may be missed until much later in the development process. The later the problem is discovered, the bigger a problem it becomes.

File Output Format for Debugging


If the program produces output files that are difficult to read, implement a command-line option that changes the normal format to a more readable one. Make it a command-line option rather than a compile condition so that the user has the choice.
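A sketch of such an option, using the standard ParamCount and ParamStr routines. The option name -readable and the two output values are placeholders, and output goes to the screen rather than to a file to keep the example short.

```pascal
program ReadableOutputSketch;

const
  readable_option = '-readable';   { hypothetical option name }

var
  readable : boolean;
  i        : integer;
begin
  { Look for the option anywhere on the command line. }
  readable := false;
  for i := 1 to ParamCount do
    if ParamStr(i) = readable_option then
      readable := true;

  if readable then
  begin
    { Labelled, one item per line: easier to read when debugging. }
    WriteLn('subject : 7');
    WriteLn('trials  : 40');
  end
  else
    WriteLn('7,40');               { normal compact format }
end.
```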

Copyright © Neil Carter

Content last updated: 2000-01-21