Lean proved this program was correct; then I found a bug.

🔥 Read this awesome post from Hacker News 📖

📂 **Category**:

📌 **What You’ll Learn**:

I fuzzed a verified implementation of zlib and found a buffer overflow in the Lean runtime.

AI agents are getting very good at finding vulnerabilities in
large-scale software systems.

Anthropic, was apparently so spooked by the vulnerability-discovery
capabilities of Mythos, they decided not to release it as it was “too
dangerous” (lol). Whether you believe the hype about these latest
models or not, it seems undeniable that the writing is on the wall:

The cost of
discovering security bugs is collapsing, and the vast majority of
software running today was never built to withstand that kind of
scrutiny. We are facing a looming software crisis.

In the face of this oncoming tsunami, recently there has been
increasing interest in formal verification as a solution. If we state
and prove properties about our code using a mechanical tool, can we
build robust, secure and stable software that can overcome this
oncoming barrage of attacks?

One recent development in the Lean ecosystem has taken steps towards
this question. 10 agents autonomously built and proved an
implementation of zlib, lean-zip, an impressive landmark result.
Quoting from Leo De Moura, the chief architect of the Lean FRO
(here):

With apologies for the AI-slop (Leo has a penchant for it, it seems),
the key result is that lean-zip is not just another implementation of
zlib. It is an implementation that has been verified as correct end to
end, guaranteed by Lean to be entirely free of implementation bugs.

What does “verified as correct” actually look like? Here is one of the
main theorems (github):

theorem zlib_decompressSingle_compress (data : ByteArray) (level : UInt8)
   (hsize : data.size < 1024 * 1024 * 1024) :
    ZlibDecode.decompressSingle
      (ZlibEncode.compress data level) = .ok data

For any byte array less than 1 gigabyte, calling
ZlibDecode.decompressSingle on the output of ZlibEncode.compress
produces the original data. The decompress function is exactly the
inverse of compression. This pair of functions is entirely correct.

Or is it?

I pointed a Claude agent at lean-zip over a weekend, armed with AFL++,
AddressSanitizer, Valgrind, and UBSan. Over 105 million fuzzing
executions, it found:

Zero memory vulnerabilities in the verified Lean application code.
A heap buffer overflow in the Lean 4 runtime (lean_alloc_sarray), affecting every version of Lean to date. (bug report, pending fix)
A denial-of-service in lean-zip‘s archive parser, which was never verified.

💬 **What’s your take?**
Share your thoughts in the comments below!

#️⃣ **#Lean #proved #program #correct #bug**

🕒 **Posted on**: 1776127689

🌟 **Want more?** Click here for more info! 🌟

Lean proved this program was correct; then I found a bug.

By

Leave a Reply Cancel reply