Within one company, such as Intel, there is commercial pressure to ensure that each succeeding microprocessor understands the binary codes of the previous designs. This is referred to as being ‘upward compatible’. There is nothing to prevent a company from designing a microprocessor that has the same pins and programming capability as another. The Zilog Z80180, for instance, was designed as an updated version of the Zilog Z80 which, in turn, was a revised version of the Intel 8080A. It was a pin for pin compatible plug-in replacement. This may be irritating for the original designers but is accepted provided the internal design has not been copied. Indeed, it often does the original company no harm. If several compatible microprocessors are being sold, it will induce many programmers to write programs using this code. This will increase the sales of these microprocessors and, perhaps, no one will suffer.

The Intel Pentium was under similar attack during 1997/8 from other microprocessors like the Athlon series made by A.M.D. These can run Pentium programs and, for some purposes, are superior to the Pentium.

Machine code

The binary code that is understood by the microprocessor is called machine code and consists of streams of binary bits. They are fed from the RAM or ROM memory chips in blocks of 8, 16, 32 or 64 bits depending on the microprocessor in use. To us, the binary stream is total gibberish.

Example

If we refer to the Z80180 block diagram in Figure 9.3 we can investigate the instruction necessary to add two numbers together. One of the numbers is already stored in the accumulator or register A. Let’s assume this is the number 25H.

Figure 9.3 The microprocessor adds two numbers

In comes the instruction: 11000110 00010101. It is in two parts. The first byte, 11000110, means add the following number to the number stored in the accumulator. This first byte, which contains the instruction, is called the operation code, usually abbreviated to ‘op code’. The second byte, 00010101, is the number 15H. This particular instruction has two bytes. Some instructions have only one byte and others have three or more. The additional bytes contain the data to be used by the op code and are called the operand.
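To see that the two bytes really are just numbers, here is a minimal Python sketch that splits the instruction into its op code and operand and shows each in hex. The variable names are invented for illustration and are not part of any microprocessor toolset.

```python
# The two bytes of the instruction, written in binary exactly as in the text.
instruction = "11000110 00010101"

# Split the stream into its two bytes.
op_code_bits, operand_bits = instruction.split()

op_code = int(op_code_bits, 2)   # first byte: the op code
operand = int(operand_bits, 2)   # second byte: the operand

# Show each byte in hex, with the trailing 'H' used in this chapter.
op_code_hex = f"{op_code:02X}H"
operand_hex = f"{operand:02X}H"

print(op_code_hex, operand_hex)  # C6H 15H
```

The hex form C6H 15H carries exactly the same information as the sixteen ones and zeros, but is far easier to read and to type.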

And here’s the action

1 The first byte goes into the instruction decoder where it is decoded into a sequence of internal operations.

2 It then copies the number 15H from the data buffer into the ALU, the arithmetic and logic unit.

3 The number 25H from the accumulator is then copied into the ALU to join the 15H which is already there. The two numbers are added.

4 And the result is copied back to the accumulator.

What is the result?

It is 3AH. Be careful not to let your brain jump back to decimal mode and shout 40 at you.
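The four steps can be mimicked in a few lines of Python. This is only a sketch of the idea, not a model of the real circuitry: the function name is invented, and the result is kept to 8 bits on the assumption that the accumulator is an 8-bit register.

```python
def add_immediate(accumulator, operand):
    """Mimic 'add the following number to the number in the accumulator'."""
    alu_first = operand        # step 2: operand copied from the data buffer into the ALU
    alu_second = accumulator   # step 3: accumulator contents copied into the ALU
    # The ALU adds the two numbers; keep only the low 8 bits (assumed register width).
    result = (alu_first + alu_second) & 0xFF
    return result              # step 4: result is copied back to the accumulator

accumulator = add_immediate(0x25, 0x15)

print(f"{accumulator:02X}H")   # 3AH
print(accumulator)             # 58, the same value in decimal
```

Working in decimal as a check, 25H is 37 and 15H is 21, giving 58, which is 3AH.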

At the end of a machine-code program, we must include an instruction to stop the program, otherwise we can get some unexpected or unwanted results. You may remember that, at least in the development stages, the program is often stored in RAM.

Now, RAM locations that have not been written to hold random values left over from switch-on, so when the program has completed its last instruction, the microprocessor will carry on and start executing the random values as if they were a program. These instructions may, of course, do anything at all. They could even delete or change the program that we have just written. Overall, the effect is like an aircraft overshooting the end of a runway.
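The overshoot can be demonstrated with a toy model in Python. It understands only two op codes: C6H, the add instruction from the example, and 76H, which is the Z80 halt op code (an addition here, since the text above does not name the stop instruction). The leftover bytes are made up to stand in for whatever the RAM happened to contain, and, unlike a real microprocessor, the model simply skips any byte it does not recognise.

```python
ADD_A_N = 0xC6  # op code from the text: add the next byte to the accumulator
HALT = 0x76     # Z80 halt op code (assumed here as the 'stop' instruction)

def run(memory, accumulator):
    """Step through memory from address 0 until a halt or the end of memory."""
    pc = 0  # program counter: address of the next byte to fetch
    while pc < len(memory):
        op_code = memory[pc]
        if op_code == HALT:
            break
        if op_code == ADD_A_N and pc + 1 < len(memory):
            accumulator = (accumulator + memory[pc + 1]) & 0xFF
            pc += 2
        else:
            pc += 1  # toy model only: skip bytes it does not understand
    return accumulator

# Made-up leftover RAM contents that happen to look like instructions.
leftover = [0xC6, 0x40, 0x00, 0xC6, 0x01]

with_halt = run([ADD_A_N, 0x15, HALT] + leftover, 0x25)
without_halt = run([ADD_A_N, 0x15] + leftover, 0x25)

print(f"{with_halt:02X}H")     # 3AH
print(f"{without_halt:02X}H")  # 7BH
```

With the halt in place the accumulator keeps the expected 3AH. Without it, the leftover bytes are treated as two more add instructions and the result is spoiled.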

The problems with machine code

There are so many. The program is not friendly: 11000110 00010101 hardly compares with ‘Add 15H to the number 25H’ for easy understanding. There is nothing about 11000110 which reminds us of its meaning ‘add the following number to the number stored in the accumulator’ so a program would need to be laboriously decoded byte by byte.

Typing in streams of ones and zeros is so boring that we will make many mistakes, particularly when we remember that a real program may be ten thousand times longer than this. Can you imagine typing in half a million bits, finding the program does not run correctly and then settling down to look for the mistakes?

Another problem is that the programmer must be aware of the internal structure of the microprocessor. How else could you know which register to use, or even which registers exist? So you master all this and then change to another microprocessor and then what? The whole learning process has to start again – new instructions, new registers, and new coding requirements. It’s all too horrible.

The difficulties with machine code hardly mattered in the early days of the microprocessor. The people who programmed them were fanatics who loved the complexity, and there were few serious jobs for the microprocessor to do. This first programming language was called a ‘low-level’ language to differentiate it from our own verbal communication language, which was called a ‘high-level’ language. Machine code was later referred to as the ‘first generation’ language (see Figure 9.4).

Figure 9.4 High- and low-level languages

Very soon, the microprocessor was used for an increasing range of tasks and revolutionary ideas like ‘speed and ease’ crept into the discussions. This resulted in a new language called Assembly which overcame the most immediate failings of machine code.

Assembly language, the second generation language

Assembly language was designed to do the same work as machine code but be much easier to use. It replaced all the ones and zeros with letters that were easier to remember but it is still a low-level language.

The assembly equivalent of our machine code example 11000110 00010101 is the code ADD A,m. This means ‘add any number, m, to the value stored in the accumulator’. We can see immediately that it would be far easier to guess the meaning of ADD A,m than 11000110 00010101 and so it makes programming much easier. If we had to choose letters to represent the ‘add’ command, ADD A,m was obviously a good choice and a big improvement over alternatives like XYZ k,g or ABC r,h. The code ADD A,m is called a mnemonic.

A mnemonic (pronounced as nemonic) is just an aid to memory and is used for all assembly codes. Here are a couple of examples:

SLA E shifts the contents of register E to the left.

LD B,25H loads the B register with the number 25H.

Now see how easy it makes it by guessing the meaning of these:

INC H

LD C,48H

Finally, have a go at one that we have not considered yet.

LD B,B′

INC H means increment, or add one to, the contents of register H. In the same way, if SLA E means shift the contents of register E one place to the left, then SRA E means shift the bits one place to the right. LD C,48H means put the number 48H into the C register. LD B,B′ enables us to copy the number stored in the B′ register into the B register. Note: the actual mnemonics differ between microprocessors. The manufacturers issue an ‘instruction set’ that lists all the codes for each of their microprocessors, together with the number of clock cycles taken by each instruction and a summary of the function of each.

Non-destructive readout

Incidentally, microprocessors, in common with memories, always use non-destructive readouts. This means that information is moved from one place to another by copying it and leaving the original number unaltered. For example, after the instruction LD A,C, the registers A and C will both finish up with the same information in them. This enables stored information to be used over and over again.
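A non-destructive copy is easy to picture in Python. In this sketch the registers are modelled as a dictionary and the starting values are invented for the example. The function name `ld` simply echoes the mnemonic.

```python
# Invented starting values: A is empty, C holds 48H.
registers = {"A": 0x00, "C": 0x48}

def ld(registers, destination, source):
    """Copy the source register into the destination; the source is unchanged."""
    registers[destination] = registers[source]

ld(registers, "A", "C")  # the equivalent of loading A from C

print(f"A = {registers['A']:02X}H, C = {registers['C']:02X}H")  # A = 48H, C = 48H
```

After the copy both registers hold 48H, so the value in C is still available for later instructions.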

To allow us to type in ADD A,15H rather than 11000110 00010101, we need another program to do the conversion. This program is called an assembler (see Figure 9.5).

Figure 9.5 An assembler

The program allows us to type in the assembly code, called the source code, and converts it to machine code, referred to as the object code. It can show the object code on a monitor screen, print it out or load it into the RAM ready for use. When starting the assembler, we have to state the RAM starting address that we wish to use. This is normally only a matter of making sure it is in RAM and avoiding the other programs already installed. The object code is shown in hex numbers rather than binary to make it easier for us. An assembler can only work within the instruction set provided by the microprocessor designer. It cannot add any new instructions and is (almost) just a simple converter or translator between mnemonics and machine code.
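The ‘simple converter’ idea can be sketched as a toy assembler in Python. It knows only a handful of Z80 instructions, expects the usual Z80 comma between operands, and ignores matters such as the RAM starting address, so it is a long way from a real assembler. The table and function names are invented for this example.

```python
# Op codes for a few Z80 instructions that take one following data byte.
IMMEDIATE = {"ADD A": 0xC6, "LD B": 0x06, "LD C": 0x0E}
# Op codes for a few single-byte Z80 instructions.
SINGLE_BYTE = {"INC H": 0x24, "LD A,C": 0x79, "HALT": 0x76}

def parse_number(text):
    """Read a number such as 15H (hex) or 21 (decimal) as an 8-bit value."""
    text = text.strip().upper()
    value = int(text[:-1], 16) if text.endswith("H") else int(text)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"number does not fit in one byte: {text}")
    return value

def assemble_line(line):
    """Convert one line of source code into a list of object code bytes."""
    line = " ".join(line.upper().split())  # tidy up case and spacing
    if line in SINGLE_BYTE:
        return [SINGLE_BYTE[line]]
    mnemonic, _, operand = line.partition(",")
    if mnemonic in IMMEDIATE and operand:
        return [IMMEDIATE[mnemonic], parse_number(operand)]
    raise ValueError(f"unknown instruction: {line}")

def assemble(source):
    """Convert a block of source code, one instruction per line."""
    object_code = []
    for line in source.splitlines():
        if line.strip():  # ignore blank lines
            object_code.extend(assemble_line(line))
    return object_code

source = """
LD B,25H
ADD A,15H
INC H
HALT
"""

object_code = assemble(source)
print(" ".join(f"{byte:02X}" for byte in object_code))  # 06 25 C6 15 24 76
```

Notice that the sketch rejects anything missing from its tables. In the same way, a real assembler cannot accept an instruction that the microprocessor designer did not provide.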