rdtsc - now even less useful!

An interesting extract from the latest IA32 SDM (18.20.5)

The TSC, IA32_MPERF, and IA32_FIXED_CTR2 operate at the same, maximum-resolved frequency of the platform, which is equal to the product of scalable bus frequency and maximum resolved bus ratio.

For processors based on Intel Core microarchitecture, the scalable bus frequency is encoded in the bit field MSR_FSB_FREQ[2:0] at (0CDH), see Appendix B, "Model-Specific Registers (MSRs)". The maximum resolved bus ratio can be read from the following bit field:

  • If XE operation is disabled, the maximum resolved bus ratio can be read in MSR_PLATFORM_ID[12:8]. It corresponds to the maximum qualified frequency.
  • IF XE operation is enabled, the maximum resolved bus ratio is given in MSR_PERF_STAT[44:40], it corresponds to the maximum XE operation frequency configured by BIOS.

In summary, TSC increment = (scalable bus frequency) * (maximum resolved bus ratio). This implies the TSC is incrementing based on some external bus source (any hardware engineers explain what happened for Core here?), and is a departure from simply assuming that the TSC increments once for each CPU cycle.

The interesting bit is that if XE operation is disabled, the bus ratio is assumed to be the maximum qualified frequency. This seems to mean that if you overclock your CPU and your processor is running at higher than the qualified frequency, attempts to measure the CPU speed by counting TSC ticks over a given time may yeild the wrong results (well, will yield the rated result; i.e. the speed of the processor you bought out of the box).

While interesting, this divergence is probably has little practical implications because using the TSC for benchmarking is already fraught with danger. You have to be super careful to make sure the compiler and processor don't reschedule things around you and handle other architectural nuances. If you need this level of information, you're much better using the right tools to get it (my favourite is perfmon2).