MSc Computer Systems - Draft Teaching Plan

Hardware Section

I can be contacted on email eg@dmu.ac.uk

Substantial web-based learning material can be found on Ian Sexton’s website, www.cse.dmu.ac.uk/~sexton/WWWPages, under ELEC1099.

The notes below were compiled from various sources, including Ian Sexton, Morteza Safazi, and Kath Garnett. Most of the originals can be found on their respective websites, or on Ian Sexton’s ELEC1099 WWWpages.

 

1 NUMBER REPRESENTATION

1.1 Decimal Numbers

In Europe we use the Latin alphabet to represent our diverse languages. However we ceased using Latin Numbers a long time ago. You are unlikely to see the date that I wrote these notes represented as -

XXVII:VI:MMIII Latin Numerals

Instead we use Arabic numerals – 0,1,2,3,4,5,6,7,8,9 – to represent our number systems, so our date looks like -
27:06:2003 Arabic Numerals


Our number system is based on powers of 10, for no other reason than that we have 10 fingers (and thumbs) to count on. There are other number systems than ours; one lost Polynesian culture only recognised three numbers, One, Two and Many. Even today there is much debate as to whether or not Zero is a true number.


We represent our numbers as polynomials of powers of 10.
For example the number three hundred and eighty-six is written as 386. But this is just a convention; what we are really writing is

(3 x 100) + (8 x 10) + (6 x 1)

Counting from the right:
The first column’s numbers are 1’s or 10^0
The second column’s numbers are 10’s or 10^1
The third column’s numbers are 100’s or 10^2

And so on


Numbers less than one are represented by –ve powers of 10. So 27.34 really means


(2 * 10^1) + (7 * 10^0) + (3 * 10^-1) + (4 * 10^-2)
Where 10^-1 is 1/10, 10^-2 is 1/100 etc.

We represent –ve numbers by placing a – or minus symbol in front of the numbers. This does not really make the numbers –ve, we just have a convention that tells us how to interpret that symbol.


Our everyday number system is infinite. We can create almost any number we like, from the infinitely big to the infinitely small; they can be positive or negative, they can even be imaginary. Computers are not so clever, as we will see.



1.2 Binary Numbers
Computers are not as clever as we are, they are made up of electronic circuits known as LOGIC GATES. These gates are either ON or OFF, and these states are used to represent just two numbers - 0 or 1.


Computers deal with groups of binary digits called words. A word is a string of binary digits that the computer regards as a single unit of data. An n-bit computer is one in which the length of the most frequently used word is n and the main data paths are parallel n-bit paths. The value of n depends on the computer's vintage, the purpose for which it was designed, and its cost. It varies from 4 (in some simple microprocessors) to 64 and beyond; the Pentium, for example, has 32-bit registers and a 64-bit external data bus.


We can build up a number system that is just as rich as our decimal numbers, with just two numbers. The trick is to line them up in columns, and assign different values based on the power of 2 to each column.


If we stick with 4 columns for now we can represent 16 different numbers –

    2^3 2^2 2^1 2^0 ‘Binade’

    8 4 2 1         Equivalent Decade

    0 0 0 0         0
    0 0 0 1         1
    0 0 1 0         2
    0 0 1 1         3
    0 1 0 0         4
    0 1 0 1         5
    0 1 1 0         6
    0 1 1 1         7
    1 0 0 0         8
    1 0 0 1         9
    1 0 1 0         10
    1 0 1 1         11
    1 1 0 0         12
    1 1 0 1         13
    1 1 1 0         14
    1 1 1 1         15

We can extend our binary number if we wish, but for the moment we will stick with 4 bits of binary data – better known as a NIBBLE.
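The column weights in the table can be turned into a few lines of C. This is only a sketch; the function name binary_to_value is my own invention, and it assumes the string holds nothing but the characters '0' and '1'.

```c
#include <stddef.h>

/* Convert a string of '0' and '1' characters into its numeric value.
   Each new digit moves the running total one column to the left
   (multiply by 2) and then adds the digit itself. */
unsigned binary_to_value(const char *bits)
{
    unsigned value = 0;
    for (size_t i = 0; bits[i] != '\0'; i++) {
        value = value * 2 + (unsigned)(bits[i] - '0');
    }
    return value;
}
```

For example, binary_to_value("1011") gives 11, which is 8 + 2 + 1 from the table.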

1.3 HEX Notation

Even though computers like binary, it is not easy for us to think in binary. Instead we use a compromise notation known as HEXADECIMAL. HEX arithmetic is a number system based on powers of 16 – therefore we need 16 characters to represent our numbers. For the first 10 numerals we use Arabic notation, for the additional 6 numerals we use Latin letters.

Binary Decimal HEX
0000     0             0
0001     1             1
0010     2             2
0011     3             3
0100     4             4
0101     5             5
0110     6             6
0111     7             7
1000     8             8
1001     9             9
1010     10           A
1011     11           B
1100     12           C
1101     13           D
1110     14           E
1111     15           F

We can now create larger numbers using multiple HEX digits. The first column (on the right) counts 1's (16^0), the second column counts 16's (16^1), and so on. As each HEX digit replaces 4 binary BITS, 4 HEX digits are the equivalent of 16 binary BITS.

For example the HEX number 3E7B is the same as –

Powers of 16          16^3   16^2   16^1   16^0
Equivalent decimal    4096   256    16     1
HEX Digits            3      E      7      B
Polynomial            (3*4096) + (14*256) + (7*16) + (11*1)
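The polynomial for 3E7B works out as 12288 + 3584 + 112 + 11 = 15995 decimal. The same sum can be sketched in C as follows; the names hex_digit_value and hex_to_value are hypothetical, and I have chosen to return -1 for anything that is not a HEX digit.

```c
#include <stddef.h>

/* Value of a single HEX digit: '0'-'9' give 0-9, 'A'-'F' (or 'a'-'f')
   give 10-15. Returns -1 for any other character. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Evaluate a HEX string as a polynomial in powers of 16.
   Returns -1 if the string contains an invalid digit. */
long hex_to_value(const char *hex)
{
    long value = 0;
    for (size_t i = 0; hex[i] != '\0'; i++) {
        int digit = hex_digit_value(hex[i]);
        if (digit < 0) return -1;
        value = value * 16 + digit;   /* move one column left, add digit */
    }
    return value;
}
```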

1.4 Octal Notation

Octal notation is a number system based on powers of 8. It is rarely, if ever, used these days, but you may come across it in old textbooks.

Binary Decimal OCTAL
000     0             0
001     1             1
010     2             2
011     3             3
100     4             4
101     5             5
110     6             6
111     7             7

1.5 Finite Number Sets

Our number system is infinite. However computers do not have infinite storage capacity, so computer number systems are finite.
A memory location is defined by its ‘width’, which is another way of saying how many binary bits can be stored in that memory location. Typical widths are 1, 4, 8 and 16 bits. These different widths are also known as a BIT (1), NIBBLE (4), BYTE (8) and WORD (16).

If a memory location is only a byte wide, then the biggest number that it can store is 255 Decimal, or FF HEX. A word memory can go up to 65535 Decimal or FFFF HEX. What happens if we overflow? Try it with a pocket calculator: enter the biggest number that you can think of, say 99999999999 exp 99, and add 1 to it; the probable result is that the calculator displays ‘ERROR’ – why? Because computers are stupid. They have finite word lengths and cannot ‘think’ of numbers greater than the ability of their memory to store them.

If you add 1 to a byte memory holding FF, it rolls around back to zero. We will see later how computers handle overflows, by use of what is known as the CARRY FLAG.

Consider our finite number system, using HEX notation. If we have 8 bits our number system is a RING –

00 01 02 03 .... FD FE FF

When we add one to FF it wraps back to 00.

If these are considered to be +ve numbers only, the decimal equivalents to above sequence are –

+0 +1 +2 .... +253 +254 +255

When we add 1 to 255 it goes back to zero. So if we subtract 1 from zero we get FF HEX

So 00 – 01 = FF

The HEX number FF therefore represents the negative number –1

Let us rearrange our RING

FD FE FF 00 01 02 03
Negative Positive

This notation is known as 2’s complement. The rule is that if the most significant binary bit is a ‘1’ then the value is –ve, otherwise it is +ve.


If we have 8 binary bits we can use the 256 available patterns to represent an UNSIGNED BYTE. The numeric range is 00 HEX to FF HEX, or 0 decimal to 255 decimal. Or we can use them to represent a SIGNED BYTE. The numeric range is 80 HEX to 7F HEX, or –128 decimal to +127 decimal.

Here is a simple demonstration that it works.

00 = 0
FF = -1
FE = -2
FD = -3

What is –3 +4 = ?

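FD + 04 = 01 (the full sum is 101 HEX, but the carry out of bit 7 does not fit in a byte, so we are left with 01, which is +1)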
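The ring behaviour can be tried out in C, where the type uint8_t from stdint.h is an 8-bit unsigned value. This is a sketch under my own naming (add_byte and as_signed are not standard functions); it shows that the same bit pattern can be read as either an unsigned or a signed byte.

```c
#include <stdint.h>

/* Add two bytes on the 8-bit ring: any carry out of bit 7 is discarded. */
uint8_t add_byte(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Interpret a byte pattern as a 2's complement signed value.
   If bit 7 is set the value is negative, so we subtract 256. */
int as_signed(uint8_t pattern)
{
    return (pattern & 0x80) ? (int)pattern - 256 : (int)pattern;
}
```

With these, add_byte(0xFF, 0x01) wraps to 0x00, add_byte(0xFD, 0x04) gives 0x01, and as_signed(0xFD) gives -3.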

1.7 Converting +ve Numbers to –ve Numbers

There is a simple technique that allows us to switch a +ve number to a –ve number, and back again. Write out the number in binary, swap all the bits over (every 0 becomes a 1 and every 1 becomes a 0), and then add 1.

Try it –

03 = 00000011 +3 decimal to binary
11111100 swap the bits
11111101 add 1
-3 = FD convert to HEX

It works the other way around as well –

FD = 11111101 -3 in binary
00000010 swap the bits
00000011 add 1
03 = 03 convert to HEX
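The same two steps can be written in C using the ~ operator, which swaps every bit. This is a minimal sketch for byte-sized values only; the function name twos_complement is my own.

```c
#include <stdint.h>

/* Switch an 8-bit value between +ve and -ve: swap the bits, then add 1. */
uint8_t twos_complement(uint8_t value)
{
    uint8_t swapped = (uint8_t)~value;   /* swap the bits */
    return (uint8_t)(swapped + 1);       /* add 1 */
}
```

Note one awkward case: 80 HEX (–128) converts to itself, because +128 does not fit into a signed byte.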

1.8 Sign Extension

If we expand a signed byte, to a signed word, then we must maintain the status of the sign bit. This is known as sign extension.

So the number +3 is 00000011 in binary. If it is sign extended to a 16-bit value it becomes 0000000000000011. The sign bit is extended into the new bit fields.

Now consider –3, which is 11111101 in binary. When it is sign extended to 16 bits it becomes 1111111111111101.
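A sketch of sign extension in C is shown below. A C compiler will normally do this for you when converting between signed types; the hypothetical function sign_extend_8_to_16 simply spells out the rule on raw bit patterns.

```c
#include <stdint.h>

/* Expand a signed byte pattern into a signed word pattern by copying
   the sign bit (bit 7) into all eight new upper bits. */
uint16_t sign_extend_8_to_16(uint8_t value)
{
    if (value & 0x80) {
        return (uint16_t)(0xFF00u | value);   /* -ve: fill with 1's */
    }
    return (uint16_t)value;                   /* +ve: fill with 0's */
}
```

So 03 HEX becomes 0003 HEX, and FD HEX becomes FFFD HEX.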

1.9 Multiplication and Division

Start with the decimal number 37. To multiply it by 10 we simply add a zero to get 370. In effect we have shifted our number to the LEFT, in order to multiply it by 10.

The same is true for binary, but in this instance shifting left multiplies by 2. So if we start with 0011 (3 decimal), and shift it left we get 0110 (decimal 6).

Shifting right divides a number by the base. So 370 becomes 37, and 0110 becomes 0011.

1.10 Arithmetic Shift, Logical Shift and Rotate
Most computers support three different types of shift and rotate instruction. This is a generalisation; you must consult the computer’s Assembly Language Manual for the precise details of how that computer works.

‘Arithmetic shifts’ are used for signed numbers, and maintain the status of the sign bit. Consider the negative number –6, which is 11111010. If we Arithmetically Shift Right we obtain 11111101 which is –3. Note that all the bits are shifted right, the least significant bit is lost, and the most significant bit remains unchanged. By maintaining the sign bit we can successfully divide signed –ve numbers by 2.

Arithmetic Shifts Left multiply by two. Some computers maintain the sign bit, others just allow bit 6 to shift into the bit 7 position (such as the 6805). As long as our result does not exceed the available numeric range of –128 to +127, all will be OK.

Logical Shifts simply shuffle all the bits left or right as required. If we shift left, bit 7 is lost, and bit 0 is set to 0. If we shift right then bit 0 is lost and bit 7 becomes 0. These instructions can be used to multiply & divide unsigned values, or just for bit manipulation.

Rotations consider the number as a ring, but with the addition of an extra bit called the CARRY FLAG. If we rotate left then bit 7 goes into the carry flag, bit 6 goes into bit 7 and so on, and bit 0 is filled with the previous contents of the carry flag. Rotate right does it the other way around.
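The three types of operation can be sketched in C on byte values. The function names (asr8, lsr8, lsl8, rol8) are made up for this example, and the carry flag is modelled as an ordinary variable holding 0 or 1; a real CPU may differ in detail, so check its manual.

```c
#include <stdint.h>

/* Arithmetic shift right: every bit moves right, bit 0 is lost,
   and bit 7 (the sign bit) keeps its old value. */
uint8_t asr8(uint8_t v)
{
    return (uint8_t)((v >> 1) | (v & 0x80));
}

/* Logical shift right: bit 0 is lost and bit 7 becomes 0. */
uint8_t lsr8(uint8_t v)
{
    return (uint8_t)(v >> 1);
}

/* Logical shift left: bit 7 is lost and bit 0 becomes 0. */
uint8_t lsl8(uint8_t v)
{
    return (uint8_t)(v << 1);
}

/* Rotate left through the carry: bit 7 goes into the carry flag and
   the previous contents of the carry flag go into bit 0. */
uint8_t rol8(uint8_t v, unsigned *carry)
{
    unsigned new_carry = (v >> 7) & 1u;
    uint8_t result = (uint8_t)((v << 1) | (*carry & 1u));
    *carry = new_carry;
    return result;
}
```

For example asr8(0xFA) gives 0xFD (–6 becomes –3), whereas lsr8(0xFA) gives 0x7D because the sign bit is not maintained.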

1.11 What is the Carry Flag

You have seen the carry flag mentioned a few times. Its role is to capture the bit that overflows (from bit 7) or underflows (from bit 0) as a result of any relevant arithmetic or logical operation. So if we add together two numbers, with a result greater than FF HEX then the carry flag is set to 1, else it is set to 0. It is in this case the ‘ninth’ bit. Similarly if we subtract a big number from a little number it is set to 1, to indicate a ‘borrow’.

It is also used to capture overflows for shifts and rotates.

Some instructions will use the carry flag. For example the OP-CODE ADD will add together two numbers, whereas the OP-CODE ADC will also add in the current value of the carry flag. In this way we can perform arithmetic on numbers that exceed the word width of our memory locations.

ADD16:
LDA LOWBYTE1   ; Get low byte of 16-bit value 1
ADD LOWBYTE2   ; Add low byte of 16-bit value 2
STA LOWRESULT  ; Store the low byte of the result
LDA HIGHBYTE1  ; Get the high byte of 16-bit value 1
ADC HIGHBYTE2  ; Add high byte of 16-bit value 2, and carry
               ; from low byte addition
STA HIGHRESULT ; Store the high byte of the result

Note that if our final answer is greater than FFFF HEX, then the carry flag is set.
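The ADD16 routine can be mimicked in C, one byte at a time. This is only an illustration of the idea (C can of course add 16-bit numbers directly); the names add_with_carry and add16 are hypothetical, and the carry flag is passed around as an ordinary variable.

```c
#include <stdint.h>

/* Add two bytes plus an incoming carry (0 or 1). The outgoing carry is
   the 'ninth' bit of the result. */
uint8_t add_with_carry(uint8_t a, uint8_t b, unsigned carry_in,
                       unsigned *carry_out)
{
    unsigned sum = (unsigned)a + b + (carry_in & 1u);
    *carry_out = (sum > 0xFFu) ? 1u : 0u;
    return (uint8_t)sum;
}

/* 16-bit addition built from two 8-bit additions, as in ADD16. */
uint16_t add16(uint16_t x, uint16_t y, unsigned *carry)
{
    unsigned low_carry;
    /* ADD: low bytes, no incoming carry */
    uint8_t low = add_with_carry((uint8_t)(x & 0xFF), (uint8_t)(y & 0xFF),
                                 0, &low_carry);
    /* ADC: high bytes, plus the carry from the low byte addition */
    uint8_t high = add_with_carry((uint8_t)(x >> 8), (uint8_t)(y >> 8),
                                  low_carry, carry);
    return (uint16_t)(((unsigned)high << 8) | low);
}
```

Adding 01FF and 0001 gives 0200 with the final carry clear; adding FFFF and 0001 gives 0000 with the final carry set.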

Most computers have a CARRY FLAG. They also have what is known as the ZERO FLAG, which is set whenever the last arithmetic or logical operation resulted in a zero.


1.12 Ian Sexton’s Notes Including Floating Point Notation

http://www.cse.dmu.ac.uk/~sexton/WWWPages/info.html

2 Flags

Computers store numbers in registers. If our register can only store a byte, then the largest value that it can remember is 255. What happens if our result is greater than 255? Similarly what happens if we subtract a large number from a small number? We need a mechanism to indicate that the limited capability of our calculator store has resulted in an incorrect answer, because there was either a CARRY or a BORROW.

This is indicated using a special 1-bit store called THE CARRY FLAG. If any arithmetic or logical operation generates a ‘carry’, then the carry flag is SET, else it is RESET. Examples of logical operations that can cause a carry are the SHIFT and ROTATE operations that are discussed later. Almost all computers have a CARRY FLAG.

The other flag that you will always find is known as the ZERO FLAG. It is SET if the result of the last operation was a zero, else it is reset. There are other types of FLAG that you will come across, such as SIGN, HALF-CARRY, PARITY and INTERRUPT. These are usually stored in a special register called the CONDITION CODE REGISTER (CCR) or just known simply as THE FLAGS.

The half carry indicates that a carry has occurred between bits 3 & 4 of the accumulator. It is used for BCD arithmetic.

The sign flag indicates if the last arithmetic operation resulted in a –ve result (i.e. the sign bit is set to a 1).

The interrupt flag indicates whether or not interrupts are active. If it is set to a 1 then the Interrupt Mask is SET, so interrupts are disabled.

The parity flag is set if the parity of the accumulator is even.
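As a rough illustration, the C sketch below works out how the flags described above would look after adding two bytes. The structure and function names are my own, the interrupt flag is left out because it is not affected by arithmetic, and a real CPU may define some flags slightly differently.

```c
#include <stdbool.h>
#include <stdint.h>

struct flags {
    bool carry;       /* result was greater than FF HEX */
    bool zero;        /* 8-bit result was zero */
    bool sign;        /* bit 7 of the result is a 1 */
    bool half_carry;  /* carry from bit 3 into bit 4 */
    bool parity;      /* result holds an even number of 1 bits */
};

/* Work out the flags that an 8-bit ADD of a and b would produce. */
struct flags flags_after_add(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    uint8_t result = (uint8_t)sum;
    unsigned ones = 0;
    struct flags f;

    for (int bit = 0; bit < 8; bit++) {
        ones += (result >> bit) & 1u;     /* count the 1 bits */
    }
    f.carry = sum > 0xFFu;
    f.zero = (result == 0);
    f.sign = (result & 0x80) != 0;
    f.half_carry = ((a & 0x0F) + (b & 0x0F)) > 0x0F;
    f.parity = (ones % 2) == 0;
    return f;
}
```

Adding FF and 01, for instance, leaves a result of 00 with both the carry and zero flags set.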

3 My First Computer

The core of a computer is a component known as the Arithmetic & Logic Unit, or ALU. It is able to perform a range of arithmetic and logical operations, such as ADD, SUBTRACT, AND, OR etc.


The ALU shown here can perform one of 8 different operations. The Accumulator is always a source of data, and the destination of the result. The other source of data (if required) will be from any one of the other registers available in the computer.

How do we decide which instruction to perform? We could use a decoder to select the operation.



We need 3 control lines to select one of 8 different arithmetic or logical operations, and 2 control lines to select the other source of data for that operation. We can now construct some truth tables for our control lines:

    Control 0   Control 1   Control 2   Operation
    0           0           0           ADD
    1           0           0           SUB
    0           1           0           INC
    1           1           0           DEC
    0           0           1           SHR
    1           0           1           SHL
    0           1           1           AND
    1           1           1           OR

    Control 3   Control 4   Register
    0           0           0
    1           0           1
    0           1           2
    1           1           3

So how do we ADD REGISTER 2 to the ACCUMULATOR?

To select ADD we set control lines 0,1 & 2 to 000, and to select register 2 we set control lines 3 & 4 to 01. When the binary pattern 00001 (written with control line 0 first) is put onto the control lines, the contents of register 2 will be added to the accumulator – that is the way the circuit is designed.
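The two truth tables can be captured in a short C sketch that builds the 5-character control pattern, written with control line 0 first as in the text. The enum and the function control_pattern are hypothetical names; the numbering of the operations simply follows the rows of the first table.

```c
/* Operations in truth-table order, so that bit 0 of the enum value is
   control line 0, bit 1 is control line 1 and bit 2 is control line 2. */
enum alu_op { OP_ADD, OP_SUB, OP_INC, OP_DEC, OP_SHR, OP_SHL, OP_AND, OP_OR };

/* Write the levels of control lines 0 to 4 into out[] as text.
   reg is the register number 0-3; out must hold at least 6 characters. */
void control_pattern(enum alu_op op, unsigned reg, char out[6])
{
    /* control lines 3 and 4 carry the register number, line 3 first */
    unsigned lines = (unsigned)op | ((reg & 3u) << 3);

    for (int i = 0; i < 5; i++) {
        out[i] = ((lines >> i) & 1u) ? '1' : '0';
    }
    out[5] = '\0';
}
```

Calling control_pattern(OP_ADD, 2, buffer) fills the buffer with 00001, matching the worked example.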

3.1 Exercises

What bit pattern do we use to perform the following operations :

SUBTRACT REGISTER 1 FROM THE ACCUMULATOR

AND REGISTER 3 WITH THE ACCUMULATOR

OR REGISTER 0 WITH THE ACCUMULATOR

3.2 Programmes

We have now seen that if we apply a binary pattern to the control lines of our computer it performs a known operation. The types of operation that we can perform are very simple – the above sequence (section 3 and exercise 3.1) consisted of 4 operations.

ADD REGISTER 2 TO THE ACCUMULATOR

SUBTRACT REGISTER 1 FROM THE ACCUMULATOR

AND REGISTER 3 WITH THE ACCUMULATOR

OR REGISTER 0 WITH THE ACCUMULATOR

This sequence is known as a PROGRAMME. A programme is a sequence of 1’s and 0’s which when applied to the control lines of a computer will force it to perform a known sequence of operations. This is the link between the ‘programmer’ and the engineers that designed the computer.

Binary patterns are recognised by our computer and are known as MACHINE CODE. They are usually represented in HEX as opposed to binary. There is a limit to the number of control lines that are found in a computer. A typical microcontroller would only have 8 control lines, which means that there are a maximum of 256 possible machine code instructions. These are known collectively as the computer’s INSTRUCTION SET.

Human beings prefer words to numbers, so for each machine code there is a mnemonic code that represents that instruction. For example our 4-line programme could be expressed as -

ADD 2

SUB 1

AND 3

OR 0

This type of programming is known as ASSEMBLER CODE or ASSEMBLER LANGUAGE. There is one assembler instruction for each available machine code. Note that each instruction has two parts, one part is the actual operation itself, known as the OPCODE (e.g. ADD), the other part is the memory location that contains the data that is operated upon, or the OPERAND.

We will see later that there are different ways of selecting a unique memory location, known as ADDRESSING MODES. For now we have defined only two types of ADDRESS, register addressing and inherent addressing. Register Addressing selects a unique Register (in this case 0,1,2 or 3). Inherent means that the OPCODE itself refers to a specific address, such as INC and DEC, which always operate on the accumulator.

4 Memory
Computers require memory to store programmes and data. But what do we need to store in the memory? The simplest partition of the memory is into 3 parts – CODE, DATA & STACK. The code partition holds our programmes. The data partition holds our variables. The stack partition is used to manage programme flow, parameter passing and interrupts.

Memory comes in a hierarchy, ranging from ‘slow, big and low-cost’ to ‘fast, small and high-cost’.

The picture below shows the simplest partition of our memory.



Code is usually placed at the bottom of memory, data comes next, then we have the ‘stack’.

4.1 How Do We Manage Programme Flow

First reconsider our 4-line assembler programme from earlier

ADD 2
SUB 1
AND 3
OR 0

Each of these commands will need to be stored in consecutive locations in our CODE SEGMENT.

So how does our computer know where to find the next opcode? This is achieved using a special register known as the INSTRUCTION POINTER (IP) or PROGRAMME COUNTER (PC).

Let us review our picture – We need to add a DATA BUS that is used to move opcodes and data from memory to the CPU, and results back into memory. We also need an ADDRESS BUS that is used by the CPU to uniquely select a specific memory location, either to access data or to obtain the next opcode.

This discussion assumes that we are using a computer with an 8-bit data bus and a 16-bit address bus. A 16-bit address bus allows us access to 2^16 (65536) memory locations. An 8-bit data bus means that our computer can support a maximum of 256 op-codes.

How does our computer manage the address bus? The programme counter is used to point to the next op-code, so we cannot use that to point to data. We need another pointer to select a data memory location; one method is to employ a special 16-bit register known as the INDEX REGISTER. Our instruction pointer is automatically set to 0000H when our computer powers up.


 

So the first op-code that will be executed is always at the bottom of memory. Our computer then enters the ‘instruction-fetch-execute’ cycle. Each op-code is broken down into a series of steps, which can be generalised as -

1) FETCH THE OP-CODE
2) FETCH THE OPERANDS
3) EXECUTE THE OP-CODE
4) STORE THE RESULT

So the op-code is not the lowest layer of our computer’s command structure, there is a lower layer that breaks down each op-code into a series of ‘micro-instructions’ or ‘micro-code’.

When each op-code is ‘fetched’ from the code segment, the instruction pointer is automatically incremented to point to the next available code segment memory location. This may be an extension to the current op-code (e.g. data or an address) or it may be the next op-code. When our op-code has completed execution, the instruction pointer now ‘addresses’ the next op-code.

As our instruction pointer is a register, we can either make it increment as already discussed, or we could reload it with a totally new address. Consider the next piece of code –


In this instance, the op-code at location 2 loads the instruction pointer with the operand 0006. So our next instruction will be located at address 6 – it is the equivalent of the BASIC instruction GOTO, and is the means by which we can implement a branch in our programme flow.
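To make the idea concrete, here is a toy fetch-execute loop in C. Everything about it is an assumption made for illustration: the three op-code values, the choice of a 16-bit jump operand stored high byte first, and the function name run_programme. It is not the instruction set of any real computer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical op-codes for a toy computer. */
enum { OP_HALT = 0x00, OP_INC = 0x01, OP_JMP = 0x02 };

/* Run the programme held in memory[] and return the accumulator.
   The instruction pointer starts at the bottom of memory. A step
   limit stops a programme that jumps in circles. */
uint8_t run_programme(const uint8_t *memory, size_t size)
{
    size_t ip = 0;              /* instruction pointer */
    uint8_t accumulator = 0;

    for (int steps = 0; steps < 1000 && ip < size; steps++) {
        uint8_t opcode = memory[ip];   /* fetch the op-code */
        ip++;                          /* point to the next location */

        if (opcode == OP_INC) {
            accumulator++;
        } else if (opcode == OP_JMP && ip + 1 < size) {
            /* the operand is a new address: reload the pointer */
            ip = ((size_t)memory[ip] << 8) | memory[ip + 1];
        } else {
            break;                     /* HALT, or something we don't know */
        }
    }
    return accumulator;
}
```

If memory holds INC, INC, JMP 0006, INC, INC, HALT starting at address 0, the JMP at location 2 skips the INC at location 5, so the accumulator finishes at 3 rather than 4.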

4.2 Conditional Branching

Overwriting the instruction pointer using a JMP instruction always changes the programme flow. This type of branch is known as UNCONDITIONAL, as it will always happen. The really powerful instructions are those that combine the contents of the FLAGS with the branch instruction; these are the CONDITIONAL BRANCH op-codes. Consider just two of the flags, CARRY and ZERO. The carry flag is set if the last arithmetic or logical operation generated a carry, the zero flag is set if the result was a zero.

All computers support the following four conditional instructions, in one form or another –

    JZ Jump if zero
    JNZ Jump if not zero

    JC Jump if carry

    JNC Jump if not carry

It is the existence of these instructions that gives a computer the ability to take decisions. All high level constructs, such as IF THEN ELSE, DO WHILE, FOR etc., are compiled down to one of these types of op-code.
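The C sketch below shows the same loop written twice: once with a WHILE construct, and once using only tests and gotos, which is roughly the shape a compiler reduces it to. The comments naming JZ and JMP are my own annotation of where such op-codes might appear; real compiler output will differ.

```c
/* Sum the numbers n, n-1, ... 1 using a high level construct. */
unsigned sum_with_while(unsigned n)
{
    unsigned total = 0;
    while (n != 0) {
        total += n;
        n--;
    }
    return total;
}

/* The same loop using only a conditional and an unconditional branch. */
unsigned sum_with_branches(unsigned n)
{
    unsigned total = 0;
loop:
    if (n == 0) goto done;   /* like JZ done  : leave when n is zero */
    total += n;
    n--;
    goto loop;               /* like JMP loop : unconditional branch */
done:
    return total;
}
```

Both functions return 10 when given 4.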

4.3 Subroutines

Another form of branching that we need to support are subroutine calls. At a low level you would use commands such as –

CALL SUBROUTINE ; RUN ‘FUNCTION’ USING A 16-BIT ABSOLUTE ADDRESS
JSR SUBROUTINE ; RUN ‘FUNCTION’ USING A 16-BIT ABSOLUTE ADDRESS
BSR SUBROUTINE ; RUN ‘FUNCTION’ USING AN 8-BIT RELATIVE ADDRESS

The CPU saves the current contents of the Instruction Pointer, and then loads it with the address of the first line of the subroutine. The very last instruction of the subroutine will be something like RET or RTS. When this opcode is executed the contents of the instruction pointer are restored, so that execution continues from the point that we called the subroutine from.

The ‘return address’ is temporarily stored in the STACK SEGMENT. Access to the stack is via a special purpose pointer register known as the STACK POINTER. When a return address is placed onto the stack it is said to be ‘pushed’ – the stack pointer automatically decrements after each push operation, such that it points to the next free entry.
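A stack of return addresses can be sketched in C as an array plus a stack pointer. The size of 16 entries and all of the names are arbitrary choices for this example; the stack pointer starts at the top of the array and decrements after each push, as described above.

```c
#include <stdint.h>

#define STACK_SIZE 16   /* hypothetical size of our tiny stack segment */

struct stack {
    uint16_t slots[STACK_SIZE];
    int sp;             /* stack pointer: index of the next free entry */
};

/* An empty stack has its pointer at the top of the segment. */
void stack_init(struct stack *s)
{
    s->sp = STACK_SIZE - 1;
}

/* Push: store the value, then decrement the stack pointer.
   Returns 0 on success, or -1 if the stack is full. */
int push(struct stack *s, uint16_t value)
{
    if (s->sp < 0) return -1;
    s->slots[s->sp] = value;
    s->sp--;
    return 0;
}

/* Pop: increment the stack pointer, then read the value back.
   Returns 0 on success, or -1 if the stack is empty. */
int pop(struct stack *s, uint16_t *value)
{
    if (s->sp >= STACK_SIZE - 1) return -1;
    s->sp++;
    *value = s->slots[s->sp];
    return 0;
}
```

Because the last address pushed is the first one popped, nested subroutine calls return in the correct order.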


Other uses for the stack segment include passing parameters to subroutines, receiving returned values from a subroutine, temporary storage of local variables and management of interrupts.

4.4 Speed Of Access – Backup, Cache and Pipelines

At the time of writing the typical access time for MAIN MEMORY is 60ns. Our main memory holds our code, data and stack segments. However it is only part of our memory hierarchy. When we are not using programmes or data they can be stored on large, slow BACKUP MEMORY devices such as hard disks, CD-ROMs, floppy disks and ZIP-Drives; we only need to load them into our main memory when we want to use them.

Once our code and data are loaded into main memory, and we are allocated a stack segment we can start executing the programme.

60ns sounds fast, but as today’s programmes consist of megabytes of code, and data storage for images can be vast, it all adds up to a lot of time. To speed things up we use another type of memory called THE CACHE. This is very high-speed memory that is located close to our CPU, and runs at between 2ns and 25ns depending upon how it is organised.

Main memory is usually made from DYNAMIC RAM; each memory cell contains far fewer active devices than a static RAM cell, which is why it is cheaper and can be packed more densely. However the data that is stored in it decays (i.e. the data is lost), so it has to be regularly REFRESHED, to restore the stored data. Cache is usually made from STATIC RAM, similar to the D Type latches discussed earlier. Once data is written to it, it is stored for as long as power is applied to the memory device; no refresh is needed, and it operates faster.

Cache is used to speed up data segment access. If a data variable is used, then it is probable that it will be used again very soon; so instead of putting it back into main memory a copy is retained in the cache. In this way the CPU can access the most recently used data very quickly. Cache locations are allocated dynamically, as they are required. When the cache is full, the oldest data is put back into main memory freeing up a new location.
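A very small read-only cache can be sketched in C to show the ‘oldest entry is replaced’ idea. This is a simplification with made-up names and a made-up size of 4 entries: it only handles reads, so nothing ever needs to be written back to main memory, and real caches are organised quite differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINES 4   /* hypothetical, tiny cache */

struct cache {
    uint16_t address[CACHE_LINES];
    uint8_t  data[CACHE_LINES];
    bool     valid[CACHE_LINES];
    int      next;      /* entry to be replaced next (the oldest) */
};

void cache_init(struct cache *c)
{
    for (int i = 0; i < CACHE_LINES; i++) c->valid[i] = false;
    c->next = 0;
}

/* Read one byte through the cache. *hit reports whether the data was
   already in the cache or had to be fetched from main memory. */
uint8_t cached_read(struct cache *c, const uint8_t *main_memory,
                    uint16_t addr, bool *hit)
{
    for (int i = 0; i < CACHE_LINES; i++) {
        if (c->valid[i] && c->address[i] == addr) {
            *hit = true;
            return c->data[i];           /* fast path */
        }
    }
    *hit = false;                        /* slow path: go to main memory */
    int slot = c->next;
    c->address[slot] = addr;
    c->data[slot] = main_memory[addr];
    c->valid[slot] = true;
    c->next = (slot + 1) % CACHE_LINES;  /* oldest entry moves on */
    return c->data[slot];
}
```

Reading the same address twice gives a miss followed by a hit; reading five different addresses in a row pushes the first one out again.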

What about CODE? If we have just executed an op-code at location n, then the next instruction is probably going to be at location n+1. Instead of waiting for instruction n to finish, we might as well get the op-code at location n+1 whilst the CPU is executing instruction n. This is known as PIPELINING. The fetch part of the CPU's control logic fetches more than just the next instruction; it fetches a batch of sequential instructions, ready for execution. Some more advanced devices look out for branch operations, and fill up two pipelines ready with both possible routes, so that they are always ready to deliver the next op-code in the shortest time possible.

5 MEMORY MANAGEMENT

Memory Management Units are responsible for allocating available main memory space to functions. This is a brief overview of how this task is achieved.

5.1 Virtual Memory

http://computer.howstuffworks.com/virtual-memory.htm

The above link gives a good description of virtual memory, with better pictures than mine. As we have already discussed, modern programmes need a lot of memory – and if you start executing a lot of programmes then you are liable to run out of main memory to keep them in. Virtual memory overcomes this problem. The Virtual Memory Manager looks for available RAM to store a new programme in. If there is not enough space then it looks for a block of RAM that has not been used for some time (i.e. a programme that is ‘running’ but not actually doing anything at that moment of time), and copies that programme’s RAM onto a storage medium such as a hard disk. This frees up main memory for the new task. When the dormant programme starts up again, its ‘ram image’ is restored to main memory.

5.2 Segmentation

This is a totally different process that is used to allocate the available main memory RAM to a new task.

When we compile a programme it creates an executable that requires 3 memory segments, CODE, DATA & STACK. These segments are referenced using pointers, such as the INSTRUCTION POINTER, the INDEX REGISTER and the STACK POINTER. Let us assume that these pointers are 16-bits wide; then each segment consists of only 64k bytes.

However our main memory is much larger than this, a MEGABYTE has 1048576 memory locations. Segmentation is a technique that allows us to locate our programme segments anywhere within the main memory.

Our programme address space is only 16-bits wide – which means that all memory pointers have a range of 0000 to FFFF. However if we have 16 megabytes to play with, our real memory space has a range of 000000 to FFFFFF, which is a 24-bit address (a single megabyte would only need a 20-bit address, 00000 to FFFFF).

In effect we have 256 ‘blocks’ of 64k each available. Our programme requires 3 of these, for code, data and stack. The memory management unit allocates 3 free blocks for this new task. The simplest way to do this is to have a set of segment pointers, each of which is 8 bits wide, that are used to ‘offset’ our programme’s blocks into the available main memory. The resulting address is known as the EFFECTIVE ADDRESS, as it is the actual location in main memory where we will find our information.

An effective address is obtained by combining the programme's pointer information with the segmentation offset.

        EFFECTIVE ADDRESS = POINTER (lower 16-bits) + SEGMENT OFFSET (upper 8-bits)

With the above method CODE, DATA and STACK can never overlap.

Another technique, as used in PCs, is to allow an overlap between the pointer address and the segment offset. In this case the pointer gives us the lower 16 bits, and the segment gives us bits 4 through to 19 of a 20-bit address. This technique provides a more efficient use of main memory, as unused gaps are removed. However it can cause problems, as poorly written programmes may allow data to be overwritten with code, and vice versa.
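Both ways of forming an effective address can be sketched in C. The function names are my own. The second function assumes the overlapping scheme works the way it does on an 8086 in real mode, where a 16-bit segment value is shifted left by 4 bits and added to the pointer, and the result is kept to 20 bits.

```c
#include <stdint.h>

/* Non-overlapping blocks: an 8-bit segment supplies the upper 8 bits
   of a 24-bit address, the pointer supplies the lower 16 bits. */
uint32_t effective_address_blocks(uint8_t segment, uint16_t pointer)
{
    return ((uint32_t)segment << 16) | pointer;
}

/* Overlapping segments (assumed 8086 real-mode style): the segment
   supplies bits 4 to 19 and is added to the 16-bit pointer. */
uint32_t effective_address_overlap(uint16_t segment, uint16_t pointer)
{
    return (((uint32_t)segment << 4) + pointer) & 0xFFFFFu;
}
```

With the overlapping scheme, segment 1234 with pointer 0010 and segment 1235 with pointer 0000 both give the effective address 12350 HEX, which is exactly how two segments can end up sharing the same memory.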

6 ADDRESSING MODES

So far we have only talked about using memory pointers, such as the Stack Pointer, Instruction Pointer and Index Register. The Index Register allows us to point to any data location of our choice, but it requires us to preload the Index register with an address. There are more direct forms of addressing data that can be used.

    REGISTER
    IMMEDIATE
    DIRECT
    INDIRECT
    INDEXED

6.1 Register Addressing

All CPUs contain some internal registers, which act like local fast data stores. For example an 8086 type CPU contains 4 'general purpose' registers called AX, BX, CX and DX. Programmes can access these registers as an integral part of their op-codes,

    e.g. MOV AX,BX will move the contents of register BX into register AX.

You are unlikely to ever make use of REGISTER ADDRESSING unless you are prepared to abandon your compiler and write your code in assembler. Some C compilers allow you to force a variable to be stored in a REGISTER by a slight modification to the variable declaration, e.g. register unsigned int name, which asks the compiler to try to reserve a register for the variable 'name'. You are not advised to use this facility.

6.2 Immediate Addressing

IMMEDIATE ADDRESSING means that the number that we are loading is part of the instruction itself, and is not found in the data memory.

    e.g. MOV AX,#344 - will load the AX register with the number 344.

The following two lines perform the same operation in C

    int my_variable ; // declare an integer Variable called 'my_variable'
    my_variable = 344 ; // store the number 344 into my_variable

With immediate addressing the data is fixed in the code, and is therefore a CONSTANT. In the example given above the number 344 is the constant.

6.3 Direct Addressing

DIRECT ADDRESSING means that the opcode includes the address of the variable that we wish to load. e.g. MOV AX,my_variable - the accumulator AX is loaded with the contents of my_variable.

Or in C

    int my_variable ; // declare an integer variable called my_variable
    int another_variable ; // declare another variable
    my_variable = another_variable ; // whatever was stored in 'another_variable' is
                                     // copied into 'my_variable'

You may have now realised that variable names are really just alternative ways of using an absolute memory address. When the compiler 'compiles' our source code, it first allocates a unique memory address for each and every variable that is declared. The machine code accesses memory using numeric addresses, but we access memory by using the names that we declare in our source programmes.

With direct addressing our data address is fixed, but the data value itself can vary.

6.4 Indirect & Indexed Addressing

INDIRECT ADDRESSING means that the address of the variable that we wish to address is not included in the op-code at all; instead we are given the address of a different variable that contains the address of the actual variable that we wish to access.

At a low level this can be achieved by means of either an INDEX or a POINTER register. These are special registers in the architecture of the computer that are specifically designed for use as indirect registers.

In an 8086 one of these registers is called BP or base pointer. If we load BP with a number, then that number can then be used as the address of a variable e.g.

    MOV BP,#344 ; Load the base pointer with the number 344
    MOV AX,[BP]  ; Load the AX register with the contents of the memory location pointed to by BP

When an internal register is used as a pointer it is known as INDEXED ADDRESSING.

Alternatively we can use another memory location as our data pointer. e.g.

    MOV 178,#344 ; Load memory location 178 with the number 344
    MOV AX,[178] ; Load the AX register with the contents of the memory location pointed to by the contents of memory location 178

When an external memory location is used as a pointer it is known as INDIRECT ADDRESSING.

In C this is achieved as follows -

    unsigned int my_variable ; // declare a variable for us to play with
    unsigned int *pointer ; // declare a pointer to an unsigned integer

So we have a variable to use, and we have a pointer as well. Remember that when we start our programme the pointer DOES NOT HAVE A VALUE, as yet it does not know what to point at.

I will now introduce another C concept, the 'address' symbol, which is a '&'. The & operator returns the address of the variable that it is applied to.

    my_variable = 344 ; // load 344 into my_variable
    pointer = &my_variable ; // load pointer with the address of my_variable

We can now either directly address my_variable using its name 'my_variable', or we can indirectly address the identical memory location using the pointer. When a pointer is used we use the * symbol to indicate that we wish to access the variable that the pointer 'points to'.

    IMMEDIATE ADDRESSING -
            my_variable = 344 ; // the source data is the constant 344

    DIRECT ADDRESSING -
            another_variable = 344 ;
            my_variable = another_variable ; // the source data is the contents of address 'another_variable'

    INDIRECT ADDRESSING -
            pointer = &another_variable ;
            another_variable = 344 ;
            my_variable = *pointer ; // the source data is the contents of the memory location pointed to by the contents of 'pointer'

7 Input and Output I/O

How do we get information into and out of a computer? There are two main techniques, Serial I/O and Parallel I/O.

Parallel is the easiest to consider first. Data is transferred between the CPU and memory using the DATA BUS. A Data Bus is no more than 8, 16 or 32 wires in parallel. A PARALLEL PORT is an electrical interface to this bus: instead of the data being presented to a memory location, it is presented to a series of contacts on a connector. For example, a printer port on a PC is a parallel interface: 8 bits of data from the data bus are put onto the pins of the printer connector, and another wire called the strobe is asserted to inform the printer that another character is available for it to read. Parallel ports can be uni-directional (as just described), or bi-directional, which means that data can be transferred in both directions using the same connections.

Parallel connections are very fast, because each bit of data has its own wire to move along. They are not practical for long distances, not least because of the cost of the cabling.

For long distances we use SERIAL I/O. Examples of serial I/O are a PC’s COM PORTS, USB, FireWire and Ethernet. Today we will only consider Serial I/O.

A PC’s COM port conforms to the CCITT V24 standard, better known as RS232. Instead of sending out all data bits at the same time, we clock them out individually one after the other over a single wire – thus the name Serial I/O. Each character is ENCAPSULATED between a START BIT (0) and a STOP BIT (1).

7.1 Direct Memory Access DMA

Serial and parallel I/O both require direct intervention by the CPU, by means of a suitable programme (often called Drivers in PCs). An alternative technique is to use DMA.

DMA is a technique that allows data to be moved into and out of memory without any intervention by the CPU. It can be achieved in a number of different ways. An external DMA Controller can take over the CPU's data bus, effectively locking the CPU out of memory. Whilst the CPU is locked out the DMA controller can perform the memory transfer.

Another technique is known as ‘cycle stealing’: the CPU only accesses memory when it needs to (i.e. to get the next op-code or to retrieve data), and whilst it is not using the memory the DMA controller is allowed access.

8 INTERRUPTS

In an ideal world the occurrence of events, and the actions that need to be taken, can be planned for and scheduled into a nice clean multi-tasking multi-user operating system. However we all know that that is not the real world.

External events can, and will, occur whenever they want to, and the computer must react to them within a specified time. For example we may have an electronic control unit (ECU) for a car that measures road speed and updates the driver's speedometer. To achieve this the time of arrival of pulses from the road speed sensor must be logged almost immediately (depending upon the accuracy required) whatever the processor may be doing. So the processor must suspend its current operation, and pass control immediately over to the road speed sensor function. The processor is INTERRUPTED by the external event, which takes over control of the processor and all its resources.

8.1 Examples of Interrupts

In the simplest of systems there will normally be at least two possible interrupt sources. One will be an EXTERNAL INTERRUPT, and the other will be a TIMER INTERRUPT.

8.1.1 External Interrupts

As its name implies, an external interrupt is a hard wired signal taken to the processor from an external source. This could be, for example, a pulsed input from an external frequency source, or a crash sensor on a car. This signal is usually digital, and the interrupt is triggered when that signal changes state. External interrupts can be either LEVEL SENSITIVE or EDGE SENSITIVE.

Edge sensitive signals detect a change of state, e.g. high to low or low to high. Level sensitive signals are sensitive to a high or a low level. Edge sensitivity is more commonly used as level sensitivity can cause multiple interrupts to occur from the same source.

External interrupts are also classified as MASKABLE or NON MASKABLE. A maskable interrupt is one that can be enabled and disabled by software, which means that the processor is capable of accidentally switching it off should there be a software error.

Maskable interrupts are usually used for processing expected external signals, such as limit switches, frequency signals, and peripheral requests for attention. In all these examples it is of advantage to the programmer to be able to disable the interrupt source, e.g. there is no point allowing an interrupt to occur during an initialisation or idle phase.

But what if the interrupt is an emergency warning signal, such as a crash sensor or a watchdog time-out? Fail safe procedures are degraded if it is possible for faulty software to switch off the alarm signals, therefore NON MASKABLE INTERRUPTS must be used for such signals.

TYPES OF EXTERNAL INTERRUPT

    RISING EDGE SENSITIVE
    FALLING EDGE SENSITIVE
    HIGH LEVEL SENSITIVE
    LOW LEVEL SENSITIVE

Any of which can be MASKABLE or NON MASKABLE

8.1.2 Timing Interrupts

Timing interrupts are generated internally by the processor itself. An independent digital divider circuit uses the processor clock as its timing source. This divider can be set up by the programmer to divide the main clock source down to give a slower timebase. Simple processors usually offer only a binary division (i.e. by 128 or 256 etc.); more expensive devices (i.e. £2 or more) offer a division register, typically 16 bits wide, but sometimes only 8 bits. When this register counts down to zero an interrupt is issued to the processor.

So for example if we are using a 4 MHz clock, and we need a 10 ms time base the divider register would be loaded with 40000. 4 MHz / 40000 = 100 Hz that is 10 ms.

One thing to look out for is whether or not the timer automatically reloads the division ratio. Some processors do this (e.g. INTEL 8051 FAMILY, but only 8 bit reload), most DO NOT, therefore the timer will need to be reloaded on every occurrence of the interrupt.

Another trap that is waiting for you is the fact that sometimes the divider COUNTS UP and not down, issuing the interrupt when the register rolls over from FFFF to 0000. In this instance you will need to preload the divider with the complement of the number that you wish to divide by, i.e. 65536 - DIVISION RATIO.

TYPES OF TIMING INTERRUPT

    UP COUNTER - Interrupt when register changes from full (FFFFH or FFH) to zero
    DOWN COUNTER - Interrupt when register changes from 1 to zero

TYPES OF DIVISION

    BINARY DIVISION - Can only divide by 2/4/8/16/32/64/128 etc.
    8 BIT DIVISION - Counter is only 8 bits wide, so can divide by 1 to 255
    16 BIT DIVISION - Counter is 16 bits wide, so can divide by 1 to 65535
    AUTOMATIC RELOAD - Counter automatically reloads a fixed number on interrupt

8.1.3 Serial Interrupts

Most reasonable processors include at least one, and possibly two, serial communications devices on the chip. One will be an asynchronous port, for use with an RS232 or RS485 driver, and the other will be some form of IIC/SPI driver.

As such links are often quite slow with respect to the processor, an interrupt is an essential requirement. When data is received by a serial device it will issue a RECEIVER FULL interrupt, and when it is ready to transmit data it will issue a TRANSMITTER EMPTY interrupt.

Normal software practice is to implement a cyclic buffer under interrupt control for both the receiver and transmitter sections. The background process can then examine the size of the RECEIVE BUFFER to see if any data has arrived, or put a character into the SEND BUFFER for interrupt processing.

The main dangers with this approach are as follows -

    RECEIVER - the RECEIVE BUFFER must be polled, and emptied, by the background faster than data arrives, else the buffer will overrun.
    TRANSMITTER - when there is no data left to send the TRANSMITTER EMPTY INTERRUPT must be disabled, otherwise it will hang up the processor forever.

8.1.4 Watchdog Interrupts

Watchdogs are used to provide system integrity. They are timing devices, often located externally to the processor, but sometimes they are included on the chip itself. They act like an alarm clock: as long as they are RESET by the processor at regular intervals they never time out. If the processor dies, or gets lost in some obscure section of code that has been poorly designed, then the timer does time out, and it issues an interrupt.

Internal devices, as can be found on the PIC family, the Motorola 68HC11 family and some advanced 8051 type devices, use an internal interrupt or force a complete processor reset. External watchdog devices (available from many manufacturers - such as DALLAS and MAXIM) provide a digital output that the designer can either use to RESET the processor or cause an EXTERNAL interrupt.

The simplest method of using such a device is to set the ‘time-out’ to be longer than the complete programme cycle; then at the end of the cycle a signal is sent to the watchdog to RESET its timer back to zero. As long as the programme is operating correctly the watchdog will never time out. This method however cannot be used for asynchronous programmes that use interrupt functions. It is necessary to ensure that each function independently ‘kicks’ the watchdog. This can be achieved by using flags. For example, an interrupt function can set a flag indicating that it has occurred, and the background routine can examine this flag, and only if it is set will it then ‘kick’ the watchdog. In this way we have proven that both functions have been executed within the desired time.

8.1.5 Power Fail Interrupts

It would be helpful if the processor had advance warning of an impending power failure, in order that it can gracefully close down all current operations and protect its memory. This is achieved by the use of power fail interrupts, which are to be found in a number of advanced microcontrollers. They work by monitoring the voltage level on the chip. If it falls below a minimum threshold the interrupt is issued before the power falls too far for the device to operate; you then have a small amount of time to do something to make everything safe and secure.

8.1.6 Software Interrupts (Traps)

A totally separate group of interrupts may also exist, that are designed to ‘trap’ illegal operations. For example, we may try to write to a section of memory that has been designated as protected, perhaps because it is reserved for the stack, or we may try to perform a hardware divide by zero. Illegal operations such as these can cause an interrupt, so that some form of housekeeping software can be invoked to tidy up the mess.

8.2 Vectors & Saving the Processor Status

If we have so many different types of interrupt how can the processor distinguish between them?

This is achieved by an INTERRUPT VECTOR TABLE. The vector table is located at a fixed location in the code space, typically at the bottom of code space (location 0000 to 00FF) or at the top of the code space (FF00 to FFFF). Each interrupt has a fixed location in the vector table for its own use, into which must be stored the starting address of its unique interrupt function.

When any interrupt occurs, the processor stops running whatever function is currently operating, and executes a SUBROUTINE CALL to the address stored in the vector. Therefore the minimal interrupt code must be a single RETURN instruction, so that control will then return to the interrupted function.

However to ensure that an interrupt cannot itself be interrupted, it is usual for all other interrupts to be disabled whilst any interrupt function is being executed. As the interrupt may wish to call subroutines, a special type of return is used that signals to the processor that interrupts can now be reenabled. In the INTEL family of devices this opcode is RETI, MOTOROLA uses the opcode RTI for the same purpose.

e.g. For an Intel 8051

    Code address   Use
    0000           reset vector
    0003           external interrupt 0 (vectors to, say, 0100)
    000B           timer 0 overflow interrupt
    0013           external interrupt 1
    001B           timer 1 overflow interrupt
    0023           serial interrupt

    0100           start of EXT0 interrupt function (reached via the vector at 0003)
                   RETI

We are also using the processor's internal resources, so any data that was resident in its registers must be saved before the interrupt code is executed, otherwise it may be destroyed. This is achieved by pushing the contents of any registers used by the interrupt function onto the stack, and restoring them before executing the return operation. e.g.

Start of EXT0 interrupt function

    PUSH ACC ; SAVE REGISTER A (THE ACCUMULATOR)
    PUSH B ; SAVE REGISTER B
    PUSH DPH ; SAVE DATA POINTER
    PUSH DPL ; SAVE DATA POINTER

    ; INTERRUPT CODE CAN NOW USE REGISTERS A/B/DPH AND DPL

    POP DPL ; RESTORE IN THE REVERSE ORDER
    POP DPH
    POP B
    POP ACC
    RETI

All this pushing and popping takes time, and interrupts must be quick, so only save the contents of registers that are used by the interrupt routine.

What about our processor status flags? These are not registers, but they represent the state of the machine at the time the interrupt occurred. For example, what was the state of the ZERO and CARRY flags? If they are not correctly restored then any subsequent conditional branch operation in the interrupted routine may fail. These flags must be saved as well, and special instructions are always present to allow this to be done.

Start of EXT0 interrupt function

    PUSH PSW ; SAVE FLAGS
    PUSH ACC ; SAVE REGISTER A (THE ACCUMULATOR)
    PUSH B ; SAVE REGISTER B
    PUSH DPH ; SAVE DATA POINTER
    PUSH DPL ; SAVE DATA POINTER

    ; INTERRUPT CODE CAN NOW USE REGISTERS A/B/DPH AND DPL

    POP DPL ; RESTORE IN THE REVERSE ORDER
    POP DPH
    POP B
    POP ACC
    POP PSW ; RESTORE FLAGS
    RETI

So, to summarise, an interrupt must cause the following to happen:

* THE CURRENT PROGRAMME COUNTER ADDRESS MUST BE PUSHED ONTO THE STACK
* THE START OF THE INTERRUPT FUNCTION MUST BE READ FROM THE INTERRUPT VECTOR TABLE
* THE PROCESSOR STATUS WORD MUST BE SAVED ON THE STACK
* THE CONTENTS OF ALL REGISTERS USED BY THE INTERRUPT MUST BE SAVED ON THE STACK
* EXECUTE THE INTERRUPT CODE
* RESTORE THE SAVED REGISTERS
* RESTORE THE PROCESSOR STATUS WORD
* REENABLE INTERRUPTS EXPLICITLY, OR IMPLICITLY WITH A SPECIAL RETURN
* RETURN TO THE CALLING FUNCTION

9 My PC

So how does all the above relate to the PC on your desk?

9.1 The CPU and MOTHERBOARD

Most of the above hardware will be found on the ‘motherboard’ of your PC. The Central Processing Unit (e.g. a Pentium) contains the ALU, the instruction fetch and execute logic, the pipeline, the flags, the registers, the timing circuitry, the stack pointer, the programme counter and the data pointers.

Close to the CPU will be found the Cache, which is fast static RAM. Also on the motherboard you will find the main memory, the address bus and data bus, and probably some of the I/O devices.

Now some terminology. So far we have referred to the address bus and the data bus; there is also a control bus that includes control lines such as ‘read from memory’ or ‘write to memory’. Collectively these are called the xxx BUS, where xxx may be ISA or PCI or VME or whatever BUS standard this particular computer manufacturer has chosen to use.

9.2 Memory
Backup memory is provided by hard disk drives, floppy disk drives and CD-ROMs. All three of these devices have file structures that are created and managed by the computer’s Operating System or OS. Other types of backup media, such as Zip drives, do not use the OS file structure.

9.3 I/O
Serial communication can be obtained via a COM port, or a USB port.

COM ports use the RS232/V24 asynchronous communication standard. PC manufacturers are trying to get rid of them, but they are extensively used in telecommunications and will not become obsolete despite some ridiculous claims by the mass-market PC manufacturers. A typical use for a COM port is to connect to a dial-up MODEM; they are also used to connect to ‘Packet Switch Networks’ such as an X.25 PAD (Packet Assembler and Disassembler), and to the mobile phone network via GSM MODEMS.

USB ports are a high speed serial standard, now widely used to connect a range of peripheral devices to a PC. Currently, the USB Specification, Revision 2.0, covers three speeds 480 Mbps, 12 Mbps, and 1.5 Mbps. The term "Hi-Speed USB" refers to just the 480 Mbps portion of the USB Specification. We now use the term "USB" to refer to the 12Mbps and 1.5Mbps speeds.

See http://www.intel.com/technology/usb/faq.htm for more information.

You may also come across the concept of ‘networking’. A network is a cabling system that connects a group of computers together. This allows users at one computer to access resources and files on other computers that are connected to that network. It is the job of the OS to allow users to apply ‘attributes’ to the files and directory structures in order to allow or bar other users from accessing their data.

ETHERNET is an example of a networking standard that allows messages to be moved from one computer to another; it is not a complete networking solution, as it does not offer any form of file management. You may have also heard of Novell; this is a network standard that can overlay Ethernet, and provides the ‘networking’ layer that is lacking in Ethernet. There are other standards available.

PCMCIA slots are very popular on portable PCs. These are in effect a collapsed BUS that allows developers to design slim-line peripherals that can be plugged into the computer to provide a new facility, such as extra memory or a MODEM.

All PCs have a parallel port, which is usually used as a printer connection.

9.4 Human Machine Interfaces

As a minimum these will be a screen, keyboard and a mouse.

Screens come in two main technologies, CRTs (cathode ray tubes) and LCDs (liquid crystal displays). CRTs are big and bulky, and use the same technology as a TV: three different coloured spot sources (Red, Green and Blue) are scanned across the screen. The more lines that your CRT can scan the better the vertical resolution. TVs have 625 lines, but in practice some of these are blank, and as the image is ‘interlaced’ the effective resolution is halved to about 300 lines. As each spot is scanned across your screen its intensity is changed. The speed at which this happens determines the horizontal resolution.

So a screen's overall resolution is defined in terms of (horizontal spot changes per line) x (total number of lines). A basic VGA screen offers 640 x 480 resolution, which means that there are 480 lines, each with 640 horizontal dots. Each dot is known as a Picture Element, or Pixel for short.

LCDs are made of individual dots, each of which can be switched on or off. You may be able to see them if you look closely at an LCD or Plasma screen.

Keyboards are matrix-scanned devices, which convert each key depression into an 8-bit character (not ASCII) or a 16-bit character pair. This information is transmitted to the PC using a serial standard called Keyboard Wedge. Other devices that use the keyboard wedge standard are barcode scanners, infrared sensors and RFID (radio frequency ID) sensors.

Mice can use a variety of techniques, USB, serial or keyboard wedge.

10 C Programming Language

This section is not a complete introduction to C; such information can be found from a variety of sources at DMU. It is intended only to relate some C language concepts to material that we have already covered.

10.1 Data Types

At the beginning of these notes we referred to different lengths of binary bits that are used to represent numbers. The names that we used were BIT, NIBBLE, BYTE & WORD. Numbers can be unsigned or signed. Later on we discovered how floating point numbers are represented.

In the C programming language most of these methods of representing numbers can be found. They are known as DATA TYPES.

An 8-bit variable is known as a char. It can be signed or unsigned. We can create a char variable for use in our programme by declaring a variable. The programme line -

    char x ;
will create an 8-bit variable that is called ‘x’. In this case it will be a signed variable, as that is the default. So x can have a numeric value in the range –128 to +127.

If instead we wanted an unsigned variable it would be declared as -

    unsigned char y ;
This creates an 8-bit variable that is called ‘y’. It will be an unsigned variable. So y can have a numeric value in the range 0 to 255.

16-bit variables are declared using the data type int.

    int x ;
Will create a 16-bit signed variable, with a numeric range of –32768 to +32767.

    unsigned int x ;
Will create a 16-bit unsigned variable, with a numeric range of 0 to 65535.

C supports a 32-bit variable type, which used to be known as a long int, but is now just known as long.

    long x ;
Will create a 32-bit signed variable, with a numeric range of –2147483648 to +2147483647.

    unsigned long x ;
Will create a 32-bit unsigned variable, with a numeric range of 0 to 4294967295.

C supports two floating point variables types.

    float x ;
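Will create a 32-bit floating point variable. On a PC this normally uses the IEEE 754 format, which has a sign bit, an 8-bit exponent and a 24-bit mantissa (23 bits are stored, the leading 1 is implied).

    double x ;
Will create a 64-bit floating point variable. In the IEEE 754 format this has a sign bit, an 11-bit exponent and a 53-bit mantissa (52 bits are stored, the leading 1 is implied).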


10.2 Operators

C supports a range of mathematical and logical operators, most of which are self explanatory. The maths operators are

    + - / *

For Addition Subtraction Division and Multiplication. The usual BODMAS rules apply, which means that multiplication & division take precedence over addition and subtraction; but it is best to use brackets to ensure that you get the correct mathematical operation.

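The logical operators are

    & | ^ ~

For And Or Exclusive-Or and Not. These operate bit by bit on their operands.

The logical operators should not be confused with the interrogative operators that are used to build conditional statements

    && || !

For And Or and Not. These treat each operand as a single TRUE (non-zero) or FALSE (zero) value.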


So the line
    a = b & c ;
Will perform a bitwise logical AND operation and place the result into variable 'a'

Whereas
    a = b && c ;
Will set variable 'a' TRUE (1) if both variable 'b' and variable 'c' are TRUE (non-zero), otherwise variable 'a' is set to FALSE (0)