Learning Resources
 

CPU technologies


What is a core?

The core of the processor is the part that decodes and executes instructions.  On early processors this would describe the whole CPU but over the last 20 years processors have gained built-in cache memory and cache controllers.

Operating speed and power requirements are affected by transistor size; the construction process size of transistor circuits is quoted to give an idea of the advance in technology: Pentium III processors used 180nm technology, whereas modern Intel Core 2 CPUs are 65nm.
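As a rough illustration of why process size matters, the sketch below assumes ideal scaling, where the footprint of a transistor shrinks with the square of the process size. Real chips do not scale this cleanly, and the function name is made up for this example.

```python
def area_scale(old_nm: float, new_nm: float) -> float:
    """Return how many times more transistors fit in the same area,
    assuming footprint scales with the square of the process size."""
    return (old_nm / new_nm) ** 2

# Pentium III (180nm) versus Core 2 (65nm), process sizes from the text
print(round(area_scale(180, 65), 1))
```

Under this idealised assumption, the move from 180nm to 65nm allows several times as many transistors in the same area of silicon.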

Manufacturers usually use a codename to identify a particular design of processor core, and this typically indicates the process size (e.g. 90nm) and operational efficiency.

Intel cores

core name     process   processor               socket   typical L2 cache
Willamette    180nm     Pentium 4               478      256 KB
Northwood     130nm     Celeron                 478      128 KB
Northwood     130nm     Pentium 4               478      512 KB
Prescott      90nm      Celeron D / P4          478      256 KB / 1 MB
Prescott      90nm      Celeron D / P4 (5xx)    775      256 KB / 1 MB
Prescott 2M   90nm      Pentium 4 (6xx)         775      2 MB
Cedar Mill    65nm      Pentium 4 (6xx)         775      2 MB
Smithfield    90nm      Pentium D (8xx)         775      2 MB
Presler       65nm      Pentium D (9xx)         775      4 MB
Yonah         65nm      Core Duo / Solo         M        2 MB
Conroe        65nm      Core 2 Duo              775      4 MB

AMD cores

core name      process   processor       socket   typical L2 cache
Thoroughbred   130nm     Athlon XP       A        256 KB
Barton         130nm     Athlon XP       A        512 KB
Palermo        90nm      Sempron         754      128 KB
Clawhammer     130nm     Athlon 64       754      1 MB
Newcastle      130nm     Athlon 64       754      512 KB
Newcastle      130nm     Athlon 64       939      512 KB
Sledgehammer   130nm     Athlon 64/FX    939      1 MB
Winchester     90nm      Athlon 64       939      512 KB
Venice         90nm      Athlon 64       939      512 KB
Manchester     90nm      Athlon 64 X2    939      1 MB
Toledo         90nm      Athlon 64 X2    939      2 MB

Core names are like version numbers; stepping numbers indicate revisions or bug-fixes.  It is important to match core and stepping in multi-processor systems.
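On x86 processors the family, model and stepping are reported together as a packed signature by the CPUID instruction. The sketch below shows how such a signature could be split into its fields and how two processors might be compared before pairing them. The function names and sample signatures are chosen for illustration only; real tools obtain the value from the hardware.

```python
def decode_signature(eax: int) -> dict:
    """Split a CPUID (leaf 1) signature into family, model and stepping."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    # The extended fields only apply to certain base family values
    if family in (6, 15):
        model += ext_model << 4
    if family == 15:
        family += ext_family
    return {"family": family, "model": model, "stepping": stepping}

def same_revision(a: int, b: int) -> bool:
    """True when two signatures describe the same core and stepping."""
    return decode_signature(a) == decode_signature(b)

print(decode_signature(0x6F6))
print(same_revision(0x6F6, 0x6FB))  # same model, different stepping
```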

 

Hyperthreading

Threads are independent parts of a computer program.  Multi-tasking operating systems (e.g. Windows or Linux) work by allocating each thread a certain amount of "CPU time" in which to execute some instructions.  This means that they can run dozens of programs "at the same time".
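The idea of sharing out CPU time can be sketched with a toy round-robin scheduler. This is a simplified model, not how Windows or Linux actually schedule threads; the task names and work amounts are invented for the example.

```python
from collections import deque

def round_robin(tasks: dict[str, int], time_slice: int) -> list[str]:
    """Simulate a scheduler: each task gets `time_slice` units in turn
    until its remaining work reaches zero. Returns the run order."""
    queue = deque(tasks.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)
        remaining -= time_slice
        if remaining > 0:
            # Not finished: go to the back of the queue and wait again
            queue.append((name, remaining))
    return order

print(round_robin({"editor": 3, "browser": 5, "player": 2}, time_slice=2))
```

Because each task only runs for a short slice before the next one takes over, all three appear to make progress together even though only one runs at any instant.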

Multi-processor systems allow the OS to literally run two or more program threads simultaneously on different CPUs.  This means that there is less competition for CPU time and therefore the computer should operate more quickly.  However, multiprocessor systems are expensive.

Hyper-Threading Technology (HTT) was introduced by Intel to give a cost-effective compromise.  By duplicating some parts of the core, a single processor presents itself to the operating system as two logical processors.  This gives a modest speed increase (up to 30%).
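One way to see this from software is to ask the operating system how many processors it can schedule on. The short sketch below uses Python's standard library, which reports logical processors, so a Hyper-Threading core would be counted more than once.

```python
import os

# Number of logical processors the OS can schedule threads on;
# may be None if the platform cannot report it
logical = os.cpu_count()
print(f"Logical processors visible to the OS: {logical}")
```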

Dual Core & multi-core

Dual-core designs have two cores on a single chip, sometimes sharing L2 cache memory and always sharing bus interfaces.  Because the cores share these resources, a dual-core chip is not quite as fast as two separate processors; however, it is typically 25%–75% faster than a single-core processor.
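Why the gain varies so widely can be illustrated with Amdahl's law, which estimates speed-up from the fraction of a program that can run in parallel. This is a simplified model that ignores shared cache and bus contention; the fractions below are illustrative, not measurements.

```python
def speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: the serial part runs at the same speed,
    the parallel part is divided between the cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

for p in (0.4, 0.6, 0.857):
    print(f"{p:.0%} parallel -> {speedup(p, 2):.2f}x on two cores")
```

In this model a program that is 40% parallel gains 25% from a second core, while one that is about 86% parallel gains roughly 75%.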

The Athlon 64 X2 and Pentium D were the first dual-core processors released for the PC, followed by the Intel Core Duo, Core 2 Duo and AMD Athlon 64 FX-60.  Intel has since released a four-core processor, the Core 2 Quad.

 

Instruction set

The x86 family of processors has a common set of instructions that the processor recognises.  This instruction set has been extended on several occasions.  The first major revision was with the 386 processor, which introduced special 32-bit instruction codes.

MMX

Early processors could perform integer arithmetic only (i.e. calculations involving whole numbers).  Manufacturers soon added Floating-Point Units (FPU) to process numbers with decimal points.  These were quickly integrated within the main processor core.

The Pentium MMX introduced extra instructions that make integer maths faster when manipulating several numbers at once.  (MMX is often expanded as Matrix Math Extensions or Multi-Media eXtensions, although Intel treats it as a name rather than an acronym.)  This concept is called SIMD (Single Instruction, Multiple Data) and means that graphics and sound software can run more quickly.
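The SIMD idea can be modelled in a few lines: several small numbers are packed into one wide register and a single operation updates them all. The sketch below imitates a packed add on four 16-bit lanes in a 64-bit value, in the style of MMX. Python still loops over the lanes, so this only models the behaviour; it is not actual SIMD code, and the helper names are invented.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1

def pack(values):
    """Pack four 16-bit numbers into one 64-bit integer."""
    reg = 0
    for i, v in enumerate(values):
        reg |= (v & LANE_MASK) << (i * LANE_BITS)
    return reg

def unpack(reg):
    """Split a 64-bit integer back into its four 16-bit lanes."""
    return [(reg >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]

def packed_add(a, b):
    """Add lane by lane with wraparound; carries never cross lanes."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])

a = pack([1, 2, 3, 65535])
b = pack([10, 20, 30, 1])
print(unpack(packed_add(a, b)))  # last lane wraps round to 0
```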

SSE / SSE2 / SSE3

AMD fought back with 3DNow!, an extension of MMX that added floating-point SIMD instructions.

Intel responded with its own extension, SSE (Streaming SIMD Extensions), which added 70 new instructions in the Pentium III.  This was taken further with the Pentium 4's SSE2 and SSE3 extensions.

A multimedia program that supports SSE3 can run from 10% to 100% faster on an SSE3-compatible processor.

Protected execution

The binary codes used for instructions are indistinguishable from those used for storing data.  If a computer programmer issues an incorrect instruction it is possible to accidentally start executing data codes as if they were proper instructions.  This is surprisingly common and leads to unexpected results and crashed software.

This flaw is used by hackers to create buffer overflow attacks.  These take advantage of programming errors by disguising instruction codes as data.  Thus, when the data is accidentally executed, the CPU carries out the instructions set by the hacker.
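The mechanism can be shown with a harmless toy model. Here "memory" is a small byte array in which an 8-byte input buffer sits directly before a byte that selects which routine runs next. The layout, names and routines are all invented for this sketch; real attacks overwrite return addresses or pointers rather than a table index.

```python
BUF_SIZE = 8
ROUTINES = ["normal routine", "attacker's routine"]

def make_memory():
    # bytes 0-7: input buffer; byte 8: index of the routine to run next
    mem = bytearray(BUF_SIZE + 1)
    mem[BUF_SIZE] = 0  # 0 selects the normal routine
    return mem

def unchecked_copy(mem, data: bytes):
    """Copy input without a length check, like a careless program."""
    for i, byte in enumerate(data):
        mem[i] = byte

def checked_copy(mem, data: bytes):
    """Refuse input that would not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    mem[:len(data)] = data

mem = make_memory()
# Nine bytes of input: the ninth lands on the routine selector
unchecked_copy(mem, b"AAAAAAAA" + bytes([1]))
print(ROUTINES[mem[BUF_SIZE]])
```

The checked version rejects the oversized input, which is the kind of programming error the attack relies on being absent.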

 

Clock speed

The processing of instructions in a CPU is governed by a single repeating signal — the clock — that synchronises the movement of data within the CPU.

It used to be easy to measure the performance of a processor by looking at its clock frequency (also called clock speed or clock rate).  However that is no longer the case...

There is a maximum limit to the clock frequency: this is determined by the signalling voltage and the transistor design.  If a clock goes too fast then internal buses will change state too quickly and numeric codes will not be read properly.
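To get a feel for the time scales involved, the clock period is simply the reciprocal of the clock frequency. The sketch below works this out for a few example frequencies, which are chosen for illustration.

```python
def clock_period_ns(frequency_hz: float) -> float:
    """Time for one clock cycle, in nanoseconds."""
    return 1e9 / frequency_hz

for ghz in (1.0, 2.4, 3.0):
    print(f"{ghz} GHz -> {clock_period_ns(ghz * 1e9):.3f} ns per cycle")
```

At several gigahertz every signal inside the CPU has only a fraction of a nanosecond to settle before it is read.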

VRM

Lower signalling voltages mean faster clock rates, because the signal can slew to the desired levels more quickly.  To ensure that the core has the appropriate level of voltage a Voltage Regulator Module (VRM) is required.

ATX motherboards have a VRM built in.  Older, AT-based systems do not get a 3.3V power line and therefore need heftier VRMs; these sometimes plug in on a separate card beside the processor.

Rated speed vs. actual speed

AMD, however, started producing CPUs that did more work in every clock cycle.  Thus a 2.0 GHz Athlon XP would carry out roughly the same number of instructions per second as a 2.4 GHz Pentium 4.  AMD therefore started identifying its chips by the equivalent speed: the 2.0 GHz Athlon XP was sold as the "Athlon XP 2400+".
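The reasoning behind an "equivalent speed" can be sketched as clock frequency multiplied by the work done per cycle (instructions per clock, or IPC). The IPC figures below are hypothetical and deliberately generic; they are not measurements of any real chip, and this is not how AMD derived its model numbers.

```python
def instructions_per_second(clock_hz: float, ipc: float) -> float:
    """Throughput = cycles per second x instructions per cycle."""
    return clock_hz * ipc

def equivalent_clock_ghz(clock_ghz: float, ipc: float, reference_ipc: float) -> float:
    """Clock a reference chip would need to match the given chip."""
    return clock_ghz * ipc / reference_ipc

# Hypothetical chip: 1.5 GHz, doing 20% more work per cycle than the reference
print(round(equivalent_clock_ghz(1.5, ipc=1.2, reference_ipc=1.0), 2))
```

A chip with a lower clock but higher IPC can therefore match one with a faster clock, which is why clock frequency alone stopped being a fair comparison.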

 

Throttling

CPUs typically operate at a constant speed and this can mean excessive power use when they are idle.  This can be a major factor in laptops.

Processor throttling is the act of lowering the processor workload or slowing it down to reduce power consumption.  This can be done automatically by some types of CPU or by special software.  The Pentium M has TM1 and TM2 thermal monitoring, which respectively switch the internal clock on and off in short bursts (so the core idles for part of the time) and lower the internal clock multiplier along with the core voltage.
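The two approaches, making the core do useful work for only part of the time and lowering the clock multiplier, can be compared with simple arithmetic. The bus speed, multipliers and active fraction below are hypothetical values chosen to keep the numbers round.

```python
def core_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock = external bus clock x internal multiplier."""
    return bus_mhz * multiplier

def effective_rate_mhz(clock_mhz: float, active_fraction: float) -> float:
    """Useful cycles per microsecond when the core only works part of the time."""
    return clock_mhz * active_fraction

full = core_clock_mhz(100, 16)                      # normal operation
lower_multiplier = core_clock_mhz(100, 8)           # multiplier halved
half_active = effective_rate_mhz(full, 0.5)         # idle half the time
print(full, lower_multiplier, half_active)
```

Both methods halve the useful work rate in this example, but only the lower multiplier actually reduces the clock frequency itself.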

Overclocking

Overclocking is the process of running the processor at a clock frequency higher than the manufacturer's rated speed.  This can have a number of side effects:

  • Increased temperature, requiring better cooling systems
  • Occasional crashes due to misread data or illegal instruction codes
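The extra heat can be estimated with the usual approximation for switching power in CMOS circuits, which is proportional to frequency and to the square of the voltage. The sketch below is a rough model that ignores leakage current, and the ratios used are illustrative only.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float = 1.0) -> float:
    """Switching power relative to stock, using P proportional to V^2 x f."""
    return freq_ratio * voltage_ratio ** 2

# 20% overclock at stock voltage, then with a 10% voltage increase as well
print(round(relative_dynamic_power(1.2), 2))
print(round(relative_dynamic_power(1.2, 1.1), 2))
```

Raising the voltage to keep an overclock stable increases power much faster than raising the frequency alone, which is why better cooling is usually needed.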

Many processors can be overclocked and there are numerous websites dedicated to statistics regarding relative performance and stability.