CPU concepts

  1. Clock Rate
    The clock is a digital signal that drives all processor logic. Each CPU instruction may take one or more cycles of the clock (called CPU cycles) to execute. CPUs execute at a particular clock rate; for example, a 4 GHz CPU performs 4 billion clock cycles per second.
  2. Instructions
    CPUs execute instructions chosen from their instruction set. An instruction includes the following steps, each processed by a component of the CPU called a functional unit:
    Instruction fetch
    Instruction decode
    Execute
    Memory access
    Register write-back

    Memory access is the slowest
  3. Instruction Pipeline
    The instruction pipeline is a CPU architecture that can execute multiple instructions in parallel by executing different components of different instructions at the same time.
  4. Branch Prediction
    Modern processors can perform out-of-order execution of the pipeline, where later instructions can be completed while earlier instructions are stalled, improving instruction throughput. 
  5. Instruction Width
    Multiple functional units of the same type can be included, so that even more instructions can make forward progress with each clock cycle. This CPU architecture is called superscalar and is typically used with pipelining to achieve a high instruction throughput.
  6. Instruction Size
    x86, which is classified as a complex instruction set computer (CISC), allows up to 15-byte instructions. ARM, which is a reduced instruction set computer (RISC), has 4 byte instructions with 4-byte alignment
  7. SMT
    Simultaneous multithreading makes use of a superscalar architecture and hardware multithreading support (by the processor) to improve parallelism. It allows a CPU core to run more than one thread, effectively scheduling between them during instructions.
  8. IPC, CPI
    Instructions per cycle (IPC) is an important high-level metric for describing how a CPU is spending its clock cycles and for understanding the nature of CPU utilization. This metric may also be expressed as cycles per instruction (CPI), the inverse of IPC.
  9. Utilization
    CPU utilization is measured by the time a CPU instance is busy performing work during an interval, expressed as a percentage. It can be measured as the time a CPU is not running the kernel idle thread but is instead running user-level application threads or other kernel threads, or processing interrupts.
  10. User Time/Kernel Time
    The CPU time spent executing user-level software is called user time, and kernel-level software is kernel time. Kernel time includes time during system calls, kernel threads, and interrupts. When measured across the entire system, the user time/kernel time ratio indicates the type of workload performed.
  11. Saturation
    A CPU at 100% utilization is saturated, and threads will encounter scheduler latency as they wait to run on-CPU, decreasing overall performance. This latency is the time spent waiting on the CPU run queue or other structure used to manage threads.
  12. Preemption
    Allows a higher-priority thread to preempt the currently running thread and begin its own execution instead. This eliminates the run-queue latency for higher-priority work, improving its performance.
  13. Priority Inversion
    Priority inversion occurs when a lower-priority thread holds a resource and blocks a higher-priority thread from running. This reduces the performance of the higher-priority work, as it is blocked waiting.
  14. Multiprocess, Multithreading
    Most processors provide multiple CPUs of some form. For an application to make use of them, it needs separate threads of execution so that it can run in parallel.
  15. Word Size
    Processors are designed around a maximum word size—32-bit or 64-bit—which is the integer size and register size. 
  16. Compiler Optimization
    Compilers are also frequently updated to take advantage of the latest CPU instruction sets and to implement other optimizations. Sometimes application performance can be significantly improved simply by using a newer compiler.