The branch prediction logic holds about 256 entries in a cache to aid the Pentium in guessing the next instruction. If we can guess what is coming next before it happens, then the data and instructions can be loaded ready to go.
But how do we guess? There are two possible outcomes: either the branch is taken and we jump to another part of the program, or the branch is not taken and we continue with the next instruction. The branch prediction logic argues that what the microprocessor did last time, it will probably do again. This is true more often than not. The reasoning is that when a loop occurs, the program is sent back to repeat a section several, or many, times. The loop can fail to take the branch only once, when it finally exits, so on average a branch is taken more often than it is not.
The cache stores the instructions immediately before the branch or jump, together with the target address assuming the branch is taken. It also stores statistical information on how often the branch was taken in the past. This information is used to predict the likely outcome of the current situation and is correct about 85% of the time. Once the branch has been resolved, the history information is updated to make the next guess even better.
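To see why keeping a history pays off for loops, here is a minimal sketch of a predictor. The text does not say exactly how the Pentium stores its statistics, so the two-bit counter used here (0 to 3, with 2 or more meaning 'predict taken') is an assumption chosen for illustration, as are the names BranchPredictor and simulate_loop and the addresses used.

```python
class BranchPredictor:
    """Toy branch cache with a 2-bit history counter per entry (assumed scheme)."""

    def __init__(self, entries=256):
        self.entries = entries
        # slot -> (counter, target); a counter of 2 or 3 means "predict taken"
        self.table = {}

    def _slot(self, branch_address):
        return branch_address % self.entries

    def predict(self, branch_address):
        """Return (predicted_taken, stored_target) for this branch."""
        counter, target = self.table.get(self._slot(branch_address), (1, None))
        return counter >= 2, target

    def update(self, branch_address, taken, target):
        """Nudge the history counter towards what actually happened."""
        slot = self._slot(branch_address)
        counter, _ = self.table.get(slot, (1, None))
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.table[slot] = (counter, target)


def simulate_loop(iterations):
    """Count correct guesses for a loop branch taken every time except the last."""
    predictor = BranchPredictor()
    correct = 0
    for i in range(iterations):
        taken = i < iterations - 1          # the final pass falls through
        guess, _ = predictor.predict(0x1000)
        correct += guess == taken
        predictor.update(0x1000, taken, 0x0F00)
    return correct
```

In this toy model a loop of 100 passes is guessed wrongly only twice: once at the start, before any history exists, and once at the exit.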
General purpose registers
The Pentium has seven general-purpose registers, all 32 bits wide. One of them is used as an accumulator and, to maintain compatibility with the 80386 and the 80486, it can be addressed as a single 32-bit register (EAX), as a 16-bit register (AX, the lower half of EAX) or as two 8-bit registers (AH and AL, the two halves of AX). There are three other general-purpose registers that can be similarly split and three that only offer the choice of 32- and 16-bit use.
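As a small illustration of how the narrower register names overlap the wide one, the sketch below pulls the x86 sub-registers AX, AH and AL out of a 32-bit EAX value using masks and shifts. The function name split_register is our own invention for the example.

```python
def split_register(eax):
    """Return the sub-register views of a 32-bit accumulator value."""
    eax &= 0xFFFFFFFF          # keep only 32 bits
    ax = eax & 0xFFFF          # AX is the low 16 bits of EAX
    ah = (ax >> 8) & 0xFF      # AH is the high byte of AX
    al = ax & 0xFF             # AL is the low byte of AX
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}
```

For example, split_register(0x12345678) gives AX = 0x5678, AH = 0x56 and AL = 0x78. Writing to AL therefore changes the bottom byte of AX and of EAX as well, because they are all views of the same register.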
Interrupts
The handling of interrupts has not changed beyond all recognition since we were looking at the Z80.
There are two hardware interrupts available. The NMI or nonmaskable interrupt is activated by the pin voltage going to a logic 1 or high level. Immediately on completion of the current instruction, the Pentium puts the contents of the flag register and the current address onto the stack. It then resets the interrupt flag in the flag register to prevent any further interrupts and services the interrupt. The NMI normally occurs as a result of a hardware failure, so that the damage caused can be limited quickly.
The IRQ or interrupt request is also activated by the appropriate pin going to a logic 1 or high level. In this case, remember that it is only a request and can be blocked by resetting the interrupt flag in the flag register. If more than one interrupt is received, they are checked for priority and the highest one wins. IRQs are generally initiated by peripheral equipment such as a printer.
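The difference between the two kinds of hardware interrupt can be sketched in a few lines. This is only a model of the decision described above, not of the real circuitry: the function name select_interrupt, the device names and the convention that a lower number means a higher priority are all assumptions made for the example.

```python
def select_interrupt(pending, interrupt_flag, nmi=False):
    """Decide which interrupt, if any, is serviced next.

    pending:        dict of device name -> priority (lower number wins, assumed)
    interrupt_flag: True if maskable interrupts are currently allowed
    nmi:            True if the nonmaskable interrupt pin is active
    """
    if nmi:
        return "NMI"                       # cannot be blocked by the flag
    if not interrupt_flag or not pending:
        return None                        # requests are masked, or none waiting
    return min(pending, key=pending.get)   # highest priority request wins
```

With the flag set, select_interrupt({"printer": 7, "keyboard": 1}, True) chooses the keyboard. With the flag reset the same requests are ignored, but an NMI still gets through.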
Exceptions
These interrupts are issued by the microprocessor itself and occur when the microprocessor has found itself in a difficulty that it cannot resolve.
When an exception occurs, an on-screen message often appears announcing that an exception has occurred and the Pentium attempts the instruction again. Some exceptions are caused by asking the Pentium for an impossible answer, such as ‘division by zero’. Dividing any number by zero is not possible and the Pentium cannot respond.
Another one, which often strikes terror into the heart of the user, is ‘General Protection Error’. The software has sent the Pentium off to an address that doesn’t exist and obviously, therefore, no instructions are available.
MMX (MultiMedia eXtensions) is an addition to the standard Pentium designed to increase the speed of multimedia, communications and other applications where large numbers of repetitive calculations are required.
Intel started by analysing a wide range of typical applications: graphics, video, games, speech recognition etc. It was looking for time-consuming characteristics that they had in common. Many cases were found in which a fairly simple instruction, like changing the colour of a pixel, is applied to a large number of pixels. This gave rise to the idea called SIMD (Single Instruction Multiple Data). Using SIMD, we can perform the same operation on several items of data at once, executed in parallel. MMX allows eight pixels to be moved around and processed together. SIMD is the heart of MMX.
MMX technology maintains full compatibility with previous instructions and has added a further 57 instructions. No danger of the RISC approach here!
MMX instructions take over the eight floating-point registers, and there are a further eight registers for holding addresses, loop control, data manipulation instructions etc. The floating-point registers are highly flexible in that the 64-bit mantissa section can be used for eight separate bytes, four 16-bit words, two 32-bit ‘doublewords’ or a single 64-bit ‘quadword’.
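The idea of treating one 64-bit value as eight separate bytes can be imitated in software. The sketch below packs eight pixel values into a single 64-bit number and adds two such numbers byte by byte, with no carry passing from one byte to its neighbour. It behaves like a wraparound packed byte add (the MMX instruction PADDB works this way), but it runs as an ordinary loop, not in parallel, and the helper names are our own.

```python
def pack_bytes(values):
    """Pack eight byte values into one 64-bit quadword, first value lowest."""
    if len(values) != 8:
        raise ValueError("exactly eight byte values are required")
    word = 0
    for i, value in enumerate(values):
        word |= (value & 0xFF) << (8 * i)
    return word


def unpack_bytes(word):
    """Split a 64-bit quadword back into its eight byte values."""
    return [(word >> (8 * i)) & 0xFF for i in range(8)]


def packed_add_bytes(a, b):
    """Add two quadwords as eight independent bytes, each wrapping at 256."""
    sums = [(x + y) & 0xFF for x, y in zip(unpack_bytes(a), unpack_bytes(b))]
    return pack_bytes(sums)
```

Brightening the pixels 10, 20, 30, 40, 50, 60, 70 and 250 by 10 each gives 20, 30, 40, 50, 60, 70, 80 and 4. The last pixel has wrapped round past 255, which is exactly the problem that saturation arithmetic is designed to avoid.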
Saturation arithmetic
In normal fixed-point arithmetic, adding two numbers can cause an overflow in which the most significant bit (msb) is lost. To take a simple example, adding 1 to the four-bit number 1111 gives 10000. With only four bits available, the leading 1 is lost, so the result reads as zero and an overflow has occurred, as seen in Chapter 4. To detect this, the microprocessor would have to take time out to check the status register to see whether the overflow flag has been set. This is time consuming. When applied to graphics, perhaps shading, the sudden return to zero may cause a sudden and unwanted change in colour.
Saturation arithmetic ensures that any increase that would cause a wrap-around effect of returning the value to zero is prevented (see Figure 12.2). If we counted up from 0000, the Pentium would allow the count to proceed normally until it reached the maximum value of 1111 and it would then be held at that value. The colour in our example would reach black but would be prevented from accidentally returning to white.
Figure 12.2 Saturation arithmetic prevents wraparound
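The two behaviours can be compared directly. This is a minimal sketch for unsigned numbers only, using a four-bit width to match the example in the text; the function names are our own.

```python
def wrapping_add(a, b, bits=4):
    """Ordinary addition: any carry out of the top bit is simply lost."""
    return (a + b) & ((1 << bits) - 1)


def saturating_add(a, b, bits=4):
    """Saturating addition: the result is held at the maximum value."""
    return min(a + b, (1 << bits) - 1)
```

Adding 1 to 1111 gives 0000 with wrapping_add but stays at 1111 with saturating_add. Sums that fit within four bits, such as 0101 + 0011 = 1000, are the same either way.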
As we have mentioned before, one of the limits on operational speed is the size of the internal components and, until recently, the smallest detail was limited to 0.18 µm. As the competition between Intel and AMD continued, it was time for the next step: AMD started using 0.13 µm technology and, as expected, the Pentium 4 also moved to the same technology for the faster versions of 1.8 GHz and above. The operating voltage has also been reduced from 1.75 volts to 1.5 volts, allowing closer spacing and a further increase in speed (and a 25% reduction in cost). The new design has allowed the Pentium 4 to increase its transistor headcount from 42 million to 55 million, and the number of connecting pins has risen to 478. Intel has moved a long way from the 16 pins of their 4-bit offering in 1971.
Thermal safety
The power dissipation increases as any integrated circuit works faster and the Pentium 4 is no exception. Bearing in mind that the actual processor circuit is just 10 mm×10 mm (about 0.4 inches square) and consumes 55 watts, we must be very careful to ensure that it doesn’t overheat. This is achieved by using a large heat sink and a cooling fan. The new Pentium also has a thermal safety circuit. If the microprocessor starts to overheat, the cooling fan will increase its revs and the operating speed of the microprocessor will decrease. If things get serious and it reaches a dangerous level of 69°C (156°F), the thermal circuit will call it a day and shut down the computer to prevent the microprocessor from being destroyed.
The system bus
The system bus, also called the FSB or Front Side Bus, is 64 bits wide and ‘Quad Pumped’, which is a fancy way of saying that each clock pulse, presently running at 133 MHz, shifts four lots of data along the bus. Rounding off the figures a bit, 133 MHz×4=533 MHz, so the bus looks like a single 533 MHz bus. Incoming and outgoing information is stored in the 256 kB level 2 cache, which is fed by 256-bit-wide pathways. Intel calls it the ‘Advanced Transfer Cache’; it is not quad pumped but, being wider, still matches the speed of the system bus.
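The quad-pumping arithmetic can be taken one step further to estimate how much data the bus could move at best. The peak figure below is not quoted in the text; it is simply derived from the bus width and clock rate given above, and it ignores every real-world overhead. The function names are our own.

```python
def effective_rate_mhz(clock_mhz, transfers_per_clock):
    """Apparent bus speed when several transfers are made per clock pulse."""
    return clock_mhz * transfers_per_clock


def peak_bandwidth_bytes_per_second(width_bits, clock_hz, transfers_per_clock):
    """Theoretical maximum bytes per second for a bus of the given width."""
    return (width_bits // 8) * clock_hz * transfers_per_clock
```

Using the nominal 133 MHz, effective_rate_mhz(133, 4) gives 532, which the rounding mentioned above turns into 533. A 64-bit bus carries 8 bytes per transfer, so peak_bandwidth_bytes_per_second(64, 133_000_000, 4) gives 4 256 000 000 bytes per second, or a little over 4 gigabytes every second.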