Love how this chronicles the instruction count, starting at 301 million, and then shows how each optimization and compromise cuts xx million instructions from the runtime.

I think the final 6502 version would need to be run in an emulator to get the retired instruction count. On 586+ CPUs, that kind of counter is baked into the hardware.
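
For what it's worth, here is a rough sketch of how an emulator could produce that number: bump a counter once per instruction in the fetch/execute loop. This is a toy that only handles four real 6502 opcodes (LDX #imm, DEX, BNE, BRK), not a full emulator. The name `count_retired`, loading the program at address 0, and treating BRK as "halt" are all my own assumptions for illustration.

```python
def count_retired(program: bytes, max_steps: int = 1_000_000) -> int:
    """Run a tiny 6502 subset and return how many instructions retired."""
    mem = bytearray(65536)
    mem[0:len(program)] = program  # assumption: program is loaded at $0000
    pc, x, zero = 0, 0, False      # program counter, X register, Z flag
    retired = 0

    while retired < max_steps:
        opcode = mem[pc]
        if opcode == 0x00:         # BRK: treated as "halt" in this sketch
            return retired + 1     # count the BRK itself
        elif opcode == 0xA2:       # LDX #imm
            x = mem[(pc + 1) & 0xFFFF]
            zero = (x == 0)
            pc = (pc + 2) & 0xFFFF
        elif opcode == 0xCA:       # DEX
            x = (x - 1) & 0xFF
            zero = (x == 0)
            pc = (pc + 1) & 0xFFFF
        elif opcode == 0xD0:       # BNE rel (signed 8-bit offset)
            offset = mem[(pc + 1) & 0xFFFF]
            if offset >= 0x80:
                offset -= 0x100
            pc = (pc + 2) & 0xFFFF
            if not zero:
                pc = (pc + offset) & 0xFFFF
        else:
            raise ValueError(f"opcode {opcode:#04x} not handled by this sketch")
        retired += 1               # one more instruction completed

    raise RuntimeError("step limit reached before BRK")


# LDX #5 / loop: DEX / BNE loop / BRK
demo = bytes([0xA2, 0x05, 0xCA, 0xD0, 0xFD, 0x00])
print(count_retired(demo))  # 12: 1 LDX + 5 DEX + 5 BNE + 1 BRK
```

A real emulator would do the same thing with the full opcode table, and could keep a cycle count alongside, since 6502 instructions take different numbers of cycles.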