Main Contents Page
CTEC1401 Programming in C
Quick Reference
Week 23: Data Structures
Week 24: Practice and Consolidation
Main Contents Page

Traditional Design Considerations

At this stage all the main building blocks for writing programs have been put into place. We have

  1. Baisc types, variables, expressions, and assignment
  2. Control Structures (sequence, selection, iteration)
  3. Data Structures (arrays and structs)
  4. Functional Abstraction (functions and paramter passing)
With these structures we can build small to medium sized programs using a monolithic design. (All of the features required for the program to work appear in a main program file (with the exception of standard library calls).

Developing larger programs means that we need to consider a number of important design issues:

  1. Top-down design is a traditional approach to decomposing a large problem into a series of smaller problems. It is an approach that is particularly effective when using so-called traditional computer programming languages of which C is an prime example. A high-level design will be written that concentrates on the overall controlling logic and relegates lower level design decisions to individual functions. The high-level control logic is written using the standard control abstractions. Throughout the 1960s and 1970s eminent computer scientists were realising that the hitherto typical unstructured control flow in programs (written in FORTRAN, BASIC, and COBOL) was leading to poorly designed, error prone software. A classic paper of the time was Edsger Dijkstra's famous 1968 letter to the ACM entitled "Goto statement considered harmful". Computer Scientists including Tony Hoare and Niklaus Wirth strived to incorporate structured programming concepts into new programming languages such as Algol, Pascal, Modula, and others. However, the techniques are independent of any particular language and can be applied successfully in, for example, C.

    A high-level design for a program is given below. (We have written it as a C main program). The problem involves initialising a (virtual) shopping basket; populating it with items; writing them out to the screen; then calculating and printing the cost of the basket:

    int main(void) {
       Basket b;
       clearBasket(&b);
       readItems(&b);
       writeItems(b);
       printf("Total cost is %i\n", cost(b));
       return 0;
    }
    
    In this high-level design you will notice that it is independent of the actual implementation details of the type Basket. No assumption has been made about its internal representation (a point we will return to shortly). At this level of design the actual implementation of the basket is not important - all that matters is that we can manipulate an instance of a basket through the use of function calls. Furthermore, you will see that the implementation of each of the functions is not needed at this level either - all that is important is that their signatures are correct so that we can see how the data is passed around the system. Precisely how a basket of items will be cleared, or written out, etc., will be considered in the next level of refinement. For example, we may design the writeItems() function using structured English in the first instance:
    void writeItems(Basket b)
        loop through each item in the basket
        for each item, b[i], format it and print to the screen
    
    We may then decide to delegate the process of formatting and printing an individual item to another lower level function:
    void writeItems(Basket b)
        loop through each item in the basket
        for each item, writeItem(b[i])
    
    This process of successively refining the sequence of events into smaller, self-contained chunks is what we call stepwise refinement. In C this is achieved by delegating chunks of work to (sub)functions, and, in turn, these functions may also delegate work to other functions. The use of parameters is therefore crucial as a communication mechanism - passing data back and forth between a function and its subordinates.

  2. Data visibility. In 1972 the concept of data hiding was published by David Parnas (Communications of the ACM). Structured programming techniques allowed for stepwise refinement of a problem and for "clean" (i.e. single entry - single exit (mostly)) control structures to be used. However, there remained the problem of where to declare the data types and where to store the data structures. There are three options:
    • Option 1: Declare types and variables at the top of the main program file. This has the advantage of making visible the implementation of the data and its instances to the whole program. This, of course, means that every function in the main program file has access to this information. Given the shopping basket example we could, for example, decide upon an implementation for the data structures
        typedef struct { char name[40]; int price; } ShopItem;
        typedef struct { ShopItem items[SIZE]; int nextItem; } Basket;
        
      and then create a global instance of the data structure towards the top of the main program file (after the typedefs):
        Basket b;
        
      The intention here is that there is only one shopping basket being processed at any one time so all functions have access to that single instance. Therefore we might as well declare it as global to the program. Our main program would no longer need to pass the basket around because it would be available globally.
        int main(void) {
           clearBasket();
           readItems();
           writeItems();
           printf("Total cost is %i\n", cost());
           return 0;
        }
        
      Compare this with our previous high-level design. The basket no longer needs to be passed between the functions and the parameter lists are empty.

      This approach is acceptable to a point. However, in any nontrivial program there will be other functions responsible for other aspects of a system. I.e. the whole program will not be solely concerned with the maintenance of one single data structure. For example, as well as the shopping basket we may wish to model the payment process which would involve other data structures for handling money. As soon as the program starts to use multiple data structures in this way making them global becomes a problem: e.g. why should a function that handles the payment process have any access to the shopping basket structure? It doesn't need to access it. At best this is just an unnecessary visibility, at worst it is dangerous if the function deliberately or accidentally modifies the basket (and don't think this won't happen!) The best protection against such inappropriate access is to hide data structures from functions that do not need to see them.

    • Option 2: Declare types at the top of the program file but declare instances locally. This approach makes the implementation of the data structure visible to every function in the file but does not make a global instance visible to all functions. Thus, some protection is afforded to that variable. This requires that the variable is passed between function calls. This is the approach adopted here (we saw this earlier) in which the single instance of a shopping basket is declared in the main function and passed between functions that are delegated to implement the second level design.
            int main(void) {
               Basket b;
               clearBasket(&b);
               readItems(&b);
               writeItems(b);
               printf("Total cost is %i\n", cost(b));
               return 0;
            }
            
      This approach is safer than putting the single instance globally when there are other functions around that do not need to access or modify the data structure.

      NB: Such an approach encourages the development of functions that manipulate single instances of a data structure and for the particular instance to be passed as a parameter. Of course, if it were known that, say, 20 shopping baskets were needed then these could be declared globally - either individually or, more likely, using an array. However, the problems associated with unconstrained access to one instance of a global data structure would now be multiplied by 20 (or possibly much much more given the potential for complex interactions between functions accessing global data).

    • Option 3: Declare the data types and functions in a separate file. The best approach is to develop a library to handle instances of a particular data structure. The library can be developed in a separate C file and will only contain functions that have legitimate access to the data structure and its components. The main program (or any other calling function) must only manipulate instances of the data structure through the functions provided in the library. The main program must not make any assumptions about the implementation of the data structure and neither must it exploit any information it may have access to. (C cannot control all the visibility of this type of information in ways that modern lanagues can so we must work with the language and adopt a disciplined approach.)

      We will develop a solution using this strategy shortly. In computer science text books you will often see this described as programming with abstract data types.

  3. Separate compilation. Following on from the previous section we need to consider how to structure our program across multiple files. The intention is to implement data hiding as much as possible because this leads to safer and better structured software. Such programs are also easier to read. We move away from monolithic programs and towards modular programs. The principle of stepwise refinement will remain crucial as the design strategy for developing the individual modules.

To complete this section we will present two versions of the shopping basket case study. In first version we will declare the shopping basket as a global variable leaving the function paramter lists empty. In the second verision we will show how the data types remain global but the instance is declared locally (in the main function) and passed between the functions using parameters.

#include<stdio.h>
#define SIZE 100

/* Uses a single global shopping basket */

typedef struct { char name[40]; int price; } ShopItem;
typedef struct { ShopItem items[SIZE]; int nextItem; } Basket;

Basket b; // Shopping basket instance declared (globally)

void clearBasket() {
   b.nextItem = 0;
}

int isFull() {
    return b.nextItem == SIZE;
}

int addItem(ShopItem item) {
   if (isFull())
       return 0;
   else {
       b.items[b.nextItem] = item;
       b.nextItem ++;
       return 1;
   }
}

int cost() {
   int i;
   int c = 0;
   for (i=0; i<b.nextItem; i++)
       c += b.items[i].price;
   return c;
}

int readItem(ShopItem *item) {
   int r = scanf("(\"%[^\"]\", %d)\n", (*item).name, &((*item).price));
   return r && (r!=EOF);
}

void readItems() {
   int i;
   ShopItem item;
   while ( readItem(&item) ) {
       if ( addItem(item) )
           continue;
       else
           break;
   }
}

void writeItem(ShopItem item) {
   printf("Name = %s\tPrice = %i\n", item.name, item.price);
}

void writeItems() {
   int i;
   for (i=0; i<b.nextItem; i++)
       writeItem(b.items[i]);
}

int main(void) {
   clearBasket();
   readItems();
   writeItems();
   printf("Total cost is %i\n", cost());
   return 0;
}
#include<stdio.h>
#define SIZE 100

/* Passes a shopping basket between the functions */

typedef struct { char name[40]; int price; } ShopItem;
typedef struct { ShopItem items[SIZE]; int nextItem; } Basket;

void clearBasket(Basket * b) {
   (*b).nextItem = 0;
}

int isFull(Basket b) {
    return b.nextItem == SIZE;
}

int addItem( Basket * b, ShopItem item ) {
   if (isFull(*b))
       return 0;
   else {
       (*b).items[(*b).nextItem] = item;
       (*b).nextItem ++;
       return 1;
   }
}

int cost(Basket b) {
   int i;
   int c = 0;
   for (i=0; i<b.nextItem; i++)
       c += b.items[i].price;
   return c;
}

int readItem(ShopItem *item) {
   int r = scanf("(\"%[^\"]\", %d)\n", (*item).name, &((*item).price));
   return r && (r!=EOF);
}

void readItems(Basket * b) {
   int i;
   ShopItem item;
   while ( readItem(&item) ) {
       if ( addItem(b, item) )
           continue;
       else
           break;
   }
}

void writeItem(ShopItem item) {
   printf("Name = %s\tPrice = %i\n", item.name, item.price);
}

void writeItems(Basket b) {
   int i;
   for (i=0; i<b.nextItem; i++)
       writeItem(b.items[i]);
}

int main(void) {
   Basket b; // Shopping basket instance declared (locally)
   clearBasket(&b);
   readItems(&b);
   writeItems(b);
   printf("Total cost is %i\n", cost(b));
   return 0;
}

You should compile and run both of these programs. For sample input data you can use:

("Orange: 2L", 99)
("Crisps: 25g",47)
("Choc: 400g", 105)
("Soup: 330ml", 76)
If you put this into a file then you can redirect standard input into the program - this is easier than typing!

Creating Libraries

We will now devlop the shopping basket case study into a solution that uses a library. This will require using multiple C files and compiling them separately; then linking them together to form a complete program.

In this scenario we will adopt the model in which the user defines the instance of the data structure in the client program. The library will be written to handle any number of instances of a shopping basket. Later we will develop a solution to another problem in which we will encapsulate a single instance of a data structure within a library module.

  1. basket.h

    Open a text editor and put the following code into a file called basket.h

        #ifndef BASKET_H
        #define BASKET_H
        #define SIZE 100
        
        typedef struct {
            char name[40];
            int price;
        } ShopItem;
        
        typedef struct {
            ShopItem items[SIZE];
            int nextItem;
        } Basket;
        
        void clearBasket(Basket * b);
        int  isFull(Basket b);
        int  addItem( Basket * b, ShopItem item );
        int  cost(Basket b);
        void readItems(Basket * b);
        void writeItems(Basket b);
        
        #endif
        

    • Note that the lines beginning with a hash (#) are preprocessor directives. The C compiler begins by preprocessing the source file and reacting to directives beginning with the hash symbol (#). Up to now you have seen directives for including library files (#include) and defining values (#define). The use of the construct (#ifndef ... #define ... #endif) tells the preprocessor to include all the code between the #ifndef and the #endif provided that the value BASKET_H has not been previously defined. This ensures that this header file is only included once even if it is included multiple times. This is more generally an issue with larger programs but we should adopt this good style from the outset.
    • See that the type definitions are included here and will be available to any other C file that #includes this header file.
    • Observe the function headers. We place here the signatures of the functions that will be made publicly available for use in other files. Each header includes the name of the function and its input and output types - but there is no body to the function - the signature is terminated with a semicolon. Such definitions give the compiler sufficient information to check that the function is being used properly but does not give the compiler the code that implements the functions.
    • Of specific interest is the lack of some functions. We have omitted the readItem() and writeItem() functions. These are part of the stepwise refinement of the readItems() and writeItems() functions respectively but they are not to be exported for more general use. We say that readItem() and writeItem() are private functions and for this reason they do not appear in the header file. Whatever is in the header file is considered public.
  2. basket.c

    Open a text editor and put the following code into a file called basket.c

        #include <stdio.h>
        #include "basket.h"
        
        void clearBasket(Basket * b) {
           (*b).nextItem = 0;
        }
        
        int isFull(Basket b) {
            return b.nextItem == SIZE;
        }
        
        int addItem( Basket * b, ShopItem item ) {
           if (isFull(*b))
               return 0;
           else {
               (*b).items[(*b).nextItem] = item;
               (*b).nextItem ++;
               return 1;
           }
        }
        
        int cost(Basket b) {
           int i;
           int c = 0;
           for (i=0; i<b.nextItem; i++)
               c += b.items[i].price;
           return c;
        }
        
        int readItem(ShopItem *item) {
           int r = scanf("(\"%[^\"]\", %d)\n", (*item).name, &((*item).price));
           return r && (r!=EOF);
        }
        
        void readItems(Basket * b) {
           int i;
           ShopItem item;
           while ( readItem(&item) ) {
               if ( addItem(b, item) )
                   continue;
               else
                   break;
           }
        }
        
        void writeItem(ShopItem item) {
           printf("Name = %s\tPrice = %i\n", item.name, item.price);
        }
        
        void writeItems(Basket b) {
           int i;
           for (i=0; i<b.nextItem; i++)
               writeItem(b.items[i]);
        }
        

    This file contains the implementation of the basket abstract data type. The public functions have been implemented and the code includes private functions that are used in the decomposition of some of the public methods.

    Note, however, that there is no main function. This means that the file is not an executable program: rather, it is a collection of function implementations. It is a library. It can be compiled like this:

        gcc basket.c -c
        
    The -c flag is important. It tells the compiler that this is a library module not a main program. If the source code is legal C then it will compile to an object file: basket.o. Make sure that you have successfully compiled this library file.

  3. myshop.c

    Now we are in a position to write a main program that uses the basket library. Put the following code into a text file myshop.c

        #include <stdio.h>
        #include "basket.h"
        
        int main(void) {
           Basket b;
           clearBasket(&b);
           readItems(&b);
           writeItems(b);
           printf("Total cost is %i\n", cost(b));
           return 0;
        }
        
    Now you can compile the complete program like this:
        gcc myshop.c basket.o -o myshop
        
    This command compiles the main program and links the precompiled library (object) file and combines these into a single executable. The idea extends to as many library files as you need for a particular application.

    Note that what we have achieved here is essentially a refactoring of the earlier solution to this problem. However, this is not merely a refactoring, but rather a demonstration of how to build C programs from library components.

  4. phonebook.h

    Consider a simple phonebook application that stores names and phone numbers. It is proposed that the phonebook itself is maintained within its own C library file phonebook.c. Initial development discussions have suggested the following basic data type:

        #define MAX_ENTRIES 2048
        #define NAME_LENGTH 100
        
        typedef unsigned long int number_t;
        
        typedef struct {
            char name[NAME_LENGTH];
            number_t number;
        } entry;
        
    The public interface to the library should consist of the following operations:
        int isFull()
        // Returns true when the phonebook is full
        
        int isEmpty()
        // Returns true when the phonebook is empty
    
        void initialise() 
        // initialises the phonebook so there are no entries
        
        int valid(int k) {
        // returns true if the index is valid (refers to a valid entry)
        
        entry getEntry(int k)
        // returns the entry at index k
        // the return value is undefined (random entry) if k is invalid
    
        int add(entry e)
        // adds a new entry to the phonebook
        // the name should not already be in the phonebook
        // returns true if the addition was successful
    
        int findName(char * name)
        // returns the index of the entry with the given name
        // if there is no entry then returns -1
        
        int findNumber(number_t number)
        // returns the first index of the entry with the given number
        // if there is no entry then returns -1
        
        int deleteName(char * name)
        // removes the entry with the given name from the phonebook
        // returns true if the name is successfully removed
        
        int deleteNumber(number_t number)
        // removes all entries with the given number from the phonebook
        // returns the number of entries deleted
        
        void print()
        // prints the phonebook to the standard output (terminal)
        
    NB: For the given application numbers are stored with their country code (so there will be no leading zeros). Therefore it is suitable to use an unsigned long integer for each number.

    Write the header file phonebook.h

  5. phonebook.c

    Given the following data structure (that must be global to the phonebook library)

        entry phonebook[MAX_ENTRIES];
        int next_entry = 0;
        

    Implement the phonebook.c library. You should test this incrementally by making a simple test program testphonebook.c that grows as the library develops. (I.e. do not try and build the whole library and then test it all at once). Remember the command to compile and run your program:

        gcc phonebook.c -c
        gcc testphonebook.c phonebook.o -o testphonebook
        
    A skeleton test program is provided below for you to modify (comment out the options (cases) you don't need and uncomment them as the program evolves.
        #include <stdio.h>
        #include <string.h>
        #include "phonebook.h"
        
        char getCommand() {
            char c;
            printf("Enter command ");
            scanf(" %c", &c);
            return c;
        }
    
        int main() {
            char c;
            entry e;
            char filename[80];
            char name[NAME_LENGTH];
            int k;
            number_t number;
            do {
                 switch(c = getCommand()) {
                     
                 case 'i' : initialise();
                            break;
                            
                 case 'f' : printf("Find whose number? ");
                            scanf(" %[^\n]",name);
                            if (valid(k=findName(name)))
                                writeEntry(stdout,getEntry(k));
                            else
                                printf("Don't know\n");
                            break;
                            
                 case 'w' : printf("Who called? Number was: ");
                            scanf(" %lu",&number);
                            if (valid(k=findNumber(number)))
                                writeEntry(stdout,getEntry(k));
                            else
                                printf("Don't know\n");
                            break;
                            
                 case 'a' : printf("Add who? ");
                            scanf(" %[^\n]",e.name);
                            printf("Number? ");
                            scanf(" %lu", &(e.number));
                            add(e);
                            break;
                
                 case 'd' : printf("Remove who? ");
                            scanf(" %[^\n]",name);
                            deleteName(name);
                            break;
                
                 case 'p' : print();
                            break;
                             
                 default  : break;
                 }
            } while(c!='q');
        }