MicroOps in the Pentium MMX

A short video about how the Pentium P5 and Pentium MMX use micro-operations compared with more traditional processors like the AMD K6, and the implications for the Pentium MMX's length pre-decoder.
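The length pre-decoding mentioned above can be illustrated with a toy sketch: because x86 instructions are variable-length, a pre-decoder must find instruction boundaries in the fetched byte stream before parallel decode can begin. The table below is a hypothetical, drastically simplified subset (real pre-decoders also handle prefixes, ModRM, SIB, and displacement bytes):

```python
# Toy x86 length pre-decoder: marks instruction boundaries in a byte
# stream. Only a handful of fixed-length encodings are recognised here;
# this is an illustrative simplification, not the actual PMMX circuit.

FIXED_LENGTHS = {
    0x90: 1,   # nop
    0x40: 1,   # inc eax
    0x48: 1,   # dec eax
    0x05: 5,   # add eax, imm32
    0xE9: 5,   # jmp rel32
}
for op in range(0xB8, 0xC0):
    FIXED_LENGTHS[op] = 5   # mov r32, imm32

def predecode(code: bytes):
    """Return (start offset, length) for each instruction in the stream."""
    boundaries = []
    i = 0
    while i < len(code):
        op = code[i]
        if op not in FIXED_LENGTHS:
            raise ValueError(f"opcode {op:#x} not in toy table")
        n = FIXED_LENGTHS[op]
        boundaries.append((i, n))
        i += n
    return boundaries

# mov eax, 1 ; add eax, 2 ; nop
print(predecode(bytes([0xB8, 1, 0, 0, 0, 0x05, 2, 0, 0, 0, 0x90])))
# → [(0, 5), (5, 5), (10, 1)]
```

Once the boundaries are known, two adjacent instructions can be steered to the U and V pipes in the same cycle, which is why the pre-decode step matters for dual issue.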

Chapters:
0:00 Intro and PMMX Overview
1:51 Comparison with K6
4:50 PMMX vs K6 uOP Throughput
6:50 Instruction Queue
7:37 uOp Decoding Analysis
9:46 Parameter Elimination
11:02 Opcode table and Simplification
Comments:

MicroOps? More like "Magnificent lectures with quality that's tops!" 👍

PunmasterSTP

This reminds me of Jim Keller's comment that x86 decode "isn't all that bad if you're building a big chip".

qfytidw

I really appreciate you taking the time to make these videos; they're very insightful and understandable even with limited design knowledge. Do you plan to do any videos on how SIMD is implemented on these processors? Your mention of microcode emulation piqued my interest.

billylaws

Excellent as always. I really like your idea of pages; they look like bitplanes for microcode.
* Some instructions can be fused together; could the decoder handle that with the help of a dedicated page?
* Some thoughts about hardware multiplication: using the FPU is tempting (the 80-bit extended-precision format has a 64-bit mantissa), but wouldn't it be funky register-wise? I read between 10 and 20 cycles for a 32x32 multiply on the Pentium, which seems pretty quick without dedicated hardware. I thought a shifter and an adder would take at least 32 cycles in the worst case.
* You point out that the U and V pipelines each have both a load and a store unit on the Pentium and MMX, while all the others (Pentium Pro included) have a load unit and a store unit, each with its own pipeline. I suppose that's due to the out-of-order architecture? Or the widening of the address space to 35-36 bits? About the implementation: depending on the peripherals (like PCIe or SATA on an Artix through LiteX), is it viable to keep a 32-bit address space, push directly to 48 bits to be future-proof, or something in between?
* In a pipelined architecture, doesn't the worst case depend largely on the addressing mode? For two instructions with immediate addressing it's one fetch for both instructions; direct mode adds two more reads, and indirect mode two more again. Cache or not, the CPU still has one memory port with limited bandwidth and latency. Where and how do you arbitrate all these memory accesses?

vincentvoillot