Analyzing Geekbench 6 under Intel’s BOT

We’ve spent the past week investigating Intel’s Binary Optimization Tool (BOT). BOT modifies instruction sequences in executables to improve performance, and can only be used with a handful of applications (including Geekbench 6). Intel’s public documentation on BOT is limited, so we decided to dig in ourselves to understand how it works and what optimizations it’s applying to Geekbench.

We tested both Geekbench 6.3 and Geekbench 6.7 on a Panther Lake laptop (an MSI Prestige 16 AI+ with an Intel Core Ultra 9 386H), once with BOT enabled and once with BOT disabled.

Startup Overhead

When running Geekbench 6.3 with BOT enabled, the first run has a 40-second startup delay. Subsequent runs start faster, with a 2-second delay. The startup delay disappears when BOT is disabled.

When running Geekbench 6.7 with BOT enabled, all runs have a 2-second startup delay. The startup delay disappears when BOT is disabled.

Geekbench Results

Geekbench 6.3 scores increase when BOT is enabled compared to when BOT is disabled. On our test system, both the single-core and the multi-core scores increased by 5.5%.

Geekbench 6.3	BOT Disabled	BOT Enabled	Difference
Single-Core	2955	3119	+5.5%
Multi-Core	16786	17705	+5.5%

Some Geekbench 6.3 workload scores also increase, with scores for two workloads (Object Remover and HDR) increasing by up to 30% with BOT enabled. A comparison that includes all workload scores is available on the Geekbench Browser.

Geekbench 6.7 single-core and multi-core scores remained roughly the same with BOT enabled and disabled.

Geekbench 6.7	BOT Disabled	BOT Enabled	Difference
Single-Core	2938	2937	-0.0%
Multi-Core	16892	17045	+0.9%
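
The Difference columns in the two score tables above are percent changes relative to the BOT-disabled score. As a quick way to reproduce them, here is a minimal Python sketch; the function name `percent_change` is our own choice for illustration and is not part of any Geekbench tooling.

```python
def percent_change(baseline: float, value: float) -> float:
    """Return the change from baseline to value as a percentage of baseline."""
    return (value - baseline) / baseline * 100.0


# Scores copied from the tables above: (BOT disabled, BOT enabled).
scores = {
    "Geekbench 6.3 Single-Core": (2955, 3119),
    "Geekbench 6.3 Multi-Core": (16786, 17705),
    "Geekbench 6.7 Single-Core": (2938, 2937),
    "Geekbench 6.7 Multi-Core": (16892, 17045),
}

for name, (disabled, enabled) in scores.items():
    # One decimal place with an explicit sign, matching the table layout.
    print(f"{name}: {percent_change(disabled, enabled):+.1f}%")
```

The one-point drop in the Geekbench 6.7 single-core score is about 0.03%, which is well within what we would treat as run-to-run noise.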

These results indicate that BOT only optimizes specific versions of Geekbench. We examined the work done during the startup delay and found that BOT computes a checksum of the Geekbench executable. This suggests BOT uses the checksum to determine whether it recognizes the binary, and therefore whether it can optimize it.
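
To illustrate the kind of checksum-based identification described above, here is a minimal Python sketch. It is not BOT's implementation: the hash algorithm (SHA-256), the function names, and the `known` lookup table are all assumptions made for illustration, since Intel has not documented how BOT identifies binaries.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so a large executable need not fit in memory."""
    # Assumption: SHA-256 stands in for whatever checksum BOT actually uses.
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.hexdigest()


def lookup_profile(path: Path, known: Dict[str, str]) -> Optional[str]:
    """Return the profile name for a recognized binary, or None if the digest is unknown.

    `known` is a hypothetical table mapping digests to optimization profiles.
    """
    return known.get(file_digest(path))
```

Under a scheme like this, any change to the executable's bytes produces a different digest and no match, which would be consistent with one Geekbench version being optimized while another is left alone.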

BOT Optimizations

Intel’s Software Development Emulator (SDE) is a development tool that can record which instructions execute during a program run. It is particularly useful for understanding which SIMD extensions a workload uses (e.g., SSE2, AVX2, AVX-512).

We used SDE to see which instructions are executed (and how many times they’re executed) during a Geekbench run. To keep our analysis manageable, we focused on the HDR workload since it showed the largest gain under BOT.

Using Geekbench 6.3, we ran the HDR workload for 100 iterations under SDE and examined the results.
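
For readers who want to collect similar data, here is a minimal Python sketch of launching a program under SDE with its instruction-mix report enabled. It is not the exact invocation used for this analysis. SDE's `-mix` and `-omix` options are real, but the SDE path, the report filename, and the target command in the usage comment are placeholders you would need to replace.

```python
import subprocess
from typing import List, Sequence


def build_sde_mix_command(sde_path: str, report_file: str, target: Sequence[str]) -> List[str]:
    """Build an SDE command line that writes an instruction-mix report.

    -mix enables the instruction-mix histogram and -omix names the report file.
    Everything after "--" is the program under test and its own arguments.
    """
    return [sde_path, "-mix", "-omix", report_file, "--", *target]


def run_under_sde(sde_path: str, report_file: str, target: Sequence[str]) -> None:
    """Run the target under SDE and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_sde_mix_command(sde_path, report_file, target), check=True)


# Hypothetical usage (all paths and the target command are placeholders):
# run_under_sde("sde", "mix-report.txt", ["path/to/benchmark", "..."])
```

Expect runs under SDE to be far slower than native runs, because every instruction is instrumented.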

Geekbench 6.3 HDR	BOT Disabled	BOT Enabled	Difference
Total Instructions	1.26 trillion	1.08 trillion	-14%
Scalar Instructions	220 billion	84.6 billion	-62%
Vector Instructions	1.25 billion	18.3 billion	+1366%

The instruction counts show that BOT made significant changes to the HDR workload’s code. The total instruction count fell by 14%. Most of that reduction comes from BOT vectorizing parts of the workload’s code, converting instructions that operate on one value into instructions that operate on eight values. This is a significantly more sophisticated transformation than simple code reordering. Intel’s public documentation only discloses the simpler code-reordering techniques, not the vectorization transformations observed here.
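
As a rough sanity check on the eight-wide claim, the table's numbers can be compared directly: if one vector instruction replaces eight scalar ones, the scalar instructions removed should be about eight times the vector instructions added. The Python sketch below does that arithmetic. It rests on the rounded figures from the table and on the simplifying assumption that the scalar and vector changes are directly linked, so treat the result as indicative only.

```python
# Instruction counts from the SDE table above (HDR workload, 100 iterations).
BILLION = 1e9
disabled = {"total": 1260 * BILLION, "scalar": 220 * BILLION, "vector": 1.25 * BILLION}
enabled = {"total": 1080 * BILLION, "scalar": 84.6 * BILLION, "vector": 18.3 * BILLION}

scalar_removed = disabled["scalar"] - enabled["scalar"]  # about 135.4 billion
vector_added = enabled["vector"] - disabled["vector"]    # about 17.05 billion
total_removed = disabled["total"] - enabled["total"]     # about 180 billion

# Scalar instructions eliminated for each vector instruction introduced.
scalar_per_vector = scalar_removed / vector_added

# Share of the total reduction explained by the net scalar/vector change.
net_share = (scalar_removed - vector_added) / total_removed

print(f"scalar removed per vector added: {scalar_per_vector:.1f}")
print(f"share of total reduction: {net_share:.0%}")
```

The ratio works out to roughly 7.9 scalar instructions removed per vector instruction added, which is consistent with eight-wide vectorization, and the net scalar and vector change accounts for about two-thirds of the total reduction.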

Conclusions

Real-world application code is incredibly varied, and Geekbench is designed to reflect that by using a broad range of workloads written in different styles. BOT undermines this by replacing that varied code with processor-tuned, fully optimized binaries, so the benchmark ends up measuring peak rather than typical performance.

If BOT worked with every application, we wouldn’t have concerns about its use with Geekbench. It is an interesting optimization technique that has some drawbacks (the two-second startup delay being one of them, especially for short-lived processes).

However, right now, BOT only supports a handful of applications, meaning BOT-optimized benchmark results paint an unrealistic picture of how a CPU performs in practice. This makes Intel processors appear faster relative to processors from AMD and other vendors than they would be in typical, real-world usage.

Next Steps

We will continue to flag BOT-optimized results in the Geekbench Browser. BOT optimizations are poorly documented, aggressive in scope, and damage comparability with other CPUs. For example, BOT allows Intel processors to run vector instructions while other processors continue to run scalar instructions. This provides an unfair advantage to Intel, and it’s important that Geekbench users understand what BOT does to Geekbench scores.

Geekbench 6.7 (out later this week) will include a way to check whether BOT is running and flag results when it is detected. This means we’ll be able to remove the warning from Geekbench 6.7 results when BOT is not detected. Geekbench 6.6 and earlier results on Windows will continue to be flagged.


🕒 **Posted on**: 1775016099
