A sufficiently detailed spec is code

✨ Explore this awesome post from Hacker News 📖

📂 **Category**:

✅ **What You’ll Learn**:

This post is essentially
this comic strip
expanded into a full-length post:

For a long time I didn’t need a post like the one I’m about to write. If
someone brought up the idea of generating code from specifications I’d share
the above image with them and that would usually do the trick.

However, agentic coding advocates claim to have found a way to defy gravity and
generate code purely from specification documents. Moreover, they’ve also
muddied the waters enough that I believe the above comic strip warrants
additional commentary for why their claims are misleading.

In my experience their advocacy is rooted in two common misconceptions:

Misconception 1: specification documents are simpler than the
corresponding code

They lean on this misconception when marketing agentic coding to believers
who think of agentic coding as the next generation of outsourcing. They
dream of engineers being turned into managers who author specification
documents which they farm out to a team of agents to do the work, which only
works if it’s cheaper to specify the work than to do the work.
Misconception 2: specification work must be more thoughtful than coding
work

They lean on this misconception when marketing agentic coding to skeptics
concerned that agentic coding will produce unmaintainable slop. The argument
is that filtering the work through a specification document will improve
quality and promote better engineering practices.

I’ll break down why I believe those are misconceptions using a concrete
example.

Thinly-veiled code

I’ll begin from OpenAI’s Symphony
project, which OpenAI heralds as as an example of how to generate a project
from a specification document.

The Symphony project is an agent orchestrator that claims to be generated from
a “specification”
(SPEC.md),
and I say “specification” in quotes because this file is less of a specification
and more like pseudocode in markdown form. If you scratch the surface of the
document you’ll find it contains things like prose dumps of the database schema:

4.1.6 Live Session (Agent Session Metadata)

State tracked while a coding-agent subprocess is running.

Fields:

session_id (string, -)

thread_id (string)

turn_id (string)

codex_app_server_pid (string or null)

last_codex_event (string/enum or null)

last_codex_timestamp (timestamp or null)

last_codex_message (summarized payload)

codex_input_tokens (integer)

codex_output_tokens (integer)

codex_total_tokens (integer)

last_reported_input_tokens (integer)

last_reported_output_tokens (integer)

last_reported_total_tokens (integer)

turn_count (integer)

Number of coding-agent turns started within the current worker lifetime.

… or prose dumps of code:

8.3 Concurrency Control

Global limit:

available_slots = max(max_concurrent_agents - running_count, 0)

Per-state limit:

max_concurrent_agents_by_state[state] if present (state key normalized)

otherwise fallback to global limit

The runtime counts issues by their current tracked state in the running map.

8.4 Retry and Backoff

Retry entry creation:

Cancel any existing retry timer for the same issue.

Store attempt, identifier, error, due_at_ms, and new timer handle.

Backoff formula:

Normal continuation retries after a clean worker exit use a short fixed delay of 1000 ms.

Failure-driven retries use delay = min(10000 * 2^(attempt - 1), agent.max_retry_backoff_ms).

Power is capped by the configured max retry backoff (default 300000 / 5m).

Retry handling behavior:

Fetch active candidate issues (not all issues).

Find the specific issue by issue_id.

If not found, release claim.

If found and still candidate-eligible:

Dispatch if slots are available.

Otherwise requeue with error no available orchestrator slots.

If found but no longer active, release claim.

… or sections added explicitly added to babysit the model’s code generation,
like this:

6.4 Config Fields Summary (Cheat Sheet)

This section is intentionally redundant so a coding agent can implement the config layer quickly.

tracker.kind: string, required, currently linear

tracker.endpoint: string, default https://api.linear.app/graphql when tracker.kind=linear

…

… or outright code¹:

16. Reference Algorithms (Language-Agnostic)

16.1 Service Startup

function start_service():
  configure_logging()
  start_observability_outputs()
  start_workflow_watch(on_change=reload_and_reapply_workflow)

  state = 💬

  validation = validate_dispatch_config()
  if validation is not ok:
    log_validation_error(validation)
    fail_startup(validation)

  startup_terminal_workspace_cleanup()
  schedule_tick(delay_ms=0)

  event_loop(state)

I feel like it’s pretty disingenuous for agentic coding advocates to market
this as a substitute for code when the specification document reads like
code (or in some cases is literally code).

Don’t get me wrong: I’m not saying that specification documents should never
include pseudocode or a reference implementation; those are both fairly
common in specification work. However, you can’t claim that specification
documents are a substitute for code when they read like code.

I bring this up because I believe Symphony illustrates the first misconception
well:

Misconception 1: specification documents are simpler than the corresponding code

If you try to make a specification document precise enough to reliably generate
a working implementation you must necessarily contort the document into code
or something strongly resembling code (like highly structured and formal
English).

Dijkstra explains why this is inevitable:

We know in the meantime that the choice of an interface is not just a
division of (a fixed amount of) labour, because the work involved in
co-operating and communicating across the interface has to be added. We know
in the meantime —from sobering experience, I may add— that a change of
interface can easily increase at both sides of the fence the amount of work
to be done (even drastically so). Hence the increased preference for what are
now called “narrow interfaces”. Therefore, although changing to communication
between machine and man conducted in the latter’s native tongue would greatly
increase the machine’s burden, we have to challenge the assumption that this
would simplify man’s life.

A short look at the history of mathematics shows how justified this challenge
is. Greek mathematics got stuck because it remained a verbal, pictorial
activity, Moslem “algebra”, after a timid attempt at symbolism, died when it
returned to the rhetoric style, and the modern civilized world could only
emerge —for better or for worse— when Western Europe could free itself from
the fetters of medieval scholasticism —a vain attempt at verbal precision!—
thanks to the carefully, or at least consciously designed formal symbolisms
that we owe to people like Vieta, Descartes, Leibniz, and (later) Boole.

Agentic coders are learning the hard way that you can’t escape the “narrow
interfaces” (read: code) that engineering labor requires; you can only
transmute that labor into something superficially different which still demands
the same precision.

Flakiness

Also, generating code from specifications doesn’t even reliably work! I
actually tried to do what the Symphony
README
suggested:

Tell your favorite coding agent to build Symphony in a programming language of your choice:

Implement Symphony according to the following spec:
https://github.com/openai/symphony/blob/main/SPEC.md

I asked Claude Code to build Symphony in a programming language of my choice
(Haskell², if you couldn’t guess from the name of my blog) and it did not work.
You can find the result in my
Gabriella439/symphony-haskell repository.

Not only were there multiple bugs (which I had to prompt Claude to fix and you
can find those fixes in the commit history), but even when things “worked”
(meaning: no error messages) the codex agent just spun silently without
making any progress on the following sample Linear ticket:

Create a new blank repository

No need to create a GitHub project. Just create a blank git repository

In other words, Symphony’s “vain attempt at verbal precision” (to use Dijkstra’s
words) still fails to reliably generate a working implementation³.

This problem also isn’t limited to Symphony: we see this same problem even for
well-known specifications like YAML. The
YAML specification is extremely detailed,
widely used, and includes a
conformance test suite and the vast
majority of YAML implementations still do not conform fully to the spec.

Symphony could try to fix the flakiness by expanding the specification but it’s
already pretty long, clocking in at 1/6 the size of the included Elixir
implementation! If the specification were to grow any further they would
recapitulate Borges’s “On Exactitude in Science” short story:

…In that Empire, the Art of Cartography attained such Perfection that the map
of a single Province occupied the entirety of a City, and the map of the
Empire, the entirety of a Province. In time, those Unconscionable Maps no
longer satisfied, and the Cartographers Guilds struck a Map of the Empire
whose size was that of the Empire, and which coincided point for point with
it. The following Generations, who were not so fond of the Study of
Cartography as their Forebears had been, saw that that vast Map was Useless,
and not without some Pitilessness was it, that they delivered it up to the
Inclemencies of Sun and Winters. In the Deserts of the West, still today,
there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in
all the Land there is no other Relic of the Disciplines of Geography.

Slop

Specification work is supposed to be harder than coding. Typically the
reason we write specification documents before doing the work is to encourage
viewing the project through a contemplative and critical lens, because once
coding begins we switch gears and become driven with a bias to action.

So then why do I say that this is a misconception:

Misconception 2: specification work must be more thoughtful than coding
work

The problem is that this sort of thoughtfulness is no longer something we can
take for granted thanks to the industry push to reduce and devalue labor at
tech companies. When you begin from the premise of “I told people
specification work should be easier than coding” then you set yourself up to
fail. There is no way that you can do the difficult and uncomfortable work
that specification writing requires if you optimize for delivery speed. That’s
how you get something like the Symphony “specification” that looks
superficially like a specification document but then falls apart under closer
scrutiny.

In fact, the Symphony specification reads as AI-written slop.
Section 10.5
is a particularly egregious example of the slop I’m talking about, such as this
excerpt:

linear_graphql extension contract:
Purpose: execute a raw GraphQL query or mutation against Linear using Symphony’s configured
tracker auth for the current session.

Availability: only meaningful when tracker.kind == "linear" and valid Linear auth is configured.
Preferred input shape:
{
  "query": "single GraphQL query or mutation document",
  "variables": {
    "optional": "graphql variables object"
  }
}
query must be a non-empty string.

query must contain exactly one GraphQL operation.

variables is optional and, when present, must be a JSON object.

Implementations may additionally accept a raw GraphQL query string as shorthand input.

Execute one GraphQL operation per tool call.

If the provided document contains multiple operations, reject the tool call as invalid input.

operationName selection is intentionally out of scope for this extension.

Reuse the configured Linear endpoint and auth from the active Symphony workflow/runtime config; do
not require the coding agent to read raw tokens from disk.

Tool result semantics:

transport success + no top-level GraphQL errors -> success=true

top-level GraphQL errors present -> success=false, but preserve the GraphQL response body
for debugging

invalid input, missing auth, or transport failure -> success=false with an error payload

Return the GraphQL response or error payload as structured tool output that the model can inspect
in-session.

That is a grab bag of “specification-shaped” sentences that reads like an
agent’s work product: lacking coherence, purpose, or understanding of the
bigger picture.

A specification document like this must necessarily be slop, even if it
were authored by a human, because they’re optimizing for delivery time rather
than coherence or clarity. In the current engineering climate we can no longer
take for granted that specifications are the product of careful thought and
deliberation.

Conclusion

Specifications were never meant to be time-saving devices. If you are
optimizing for delivery time then you are likely better off authoring the code
directly rather than going through an intermediate specification document.

More generally, the principle of “garbage in, garbage out” applies here. There
is no world where you input a document lacking clarity and detail and get a
coding agent to reliably fill in that missing clarity and detail. Coding
agents are not mind readers and even if they were there isn’t much they can do
if your own thoughts are confused.

{💬|⚡|🔥} **What’s your take?**
Share your thoughts in the comments below!

#️⃣ **#sufficiently #detailed #spec #code**

🕒 **Posted on**: 1773891227

🌟 **Want more?** Click here for more info! 🌟

A sufficiently detailed spec is code

Thinly-veiled code

4.1.6 Live Session (Agent Session Metadata)

8.3 Concurrency Control

8.4 Retry and Backoff

6.4 Config Fields Summary (Cheat Sheet)

16. Reference Algorithms (Language-Agnostic)

16.1 Service Startup

Flakiness

Create a new blank repository

Slop

Conclusion

By

Leave a Reply Cancel reply