Please Do Not A/B Test My Workflow

Claude Code has completely changed how I work, and I’ve been a big fan of Anthropic and the founders’ research since day one. Experiencing my own workflow degrade over the past week was frustrating, and this post was written in that frustration. I’ve since revised it to be more accurate and fair in tone. It’s currently #1 on Hacker News, otherwise I’d probably just delete it.

Anthropic is running A/B tests on Claude Code that actively degrade my workflow. I wish I could opt out.

I don’t think A/B testing is inherently wrong. I don’t think Anthropic is doing this to intentionally degrade anyone’s experience. They’re clearly trying to optimize. But the test design matters, and altering the perceived behavior of a core feature like plan mode, without understanding why, degrades my experience.

I pay $200/month for Claude Code. It’s a professional tool I use to do my job, and I need transparency into how it works and the ability to configure it. What I don’t need is critical functions of the application changing without notice, or being signed up for disruptive testing without opt-in. We need to be responsible with how we steer these AI tools, and we need to be enabled to do so. Transparency is a critical part of that. Configurability is a critical part of that.

Every day, engineers complain about regressions in Claude Code. Sometimes you’re in an A/B test and don’t even know it.

The Proof

[REMOVED] When Claude told me it was following specific system instructions, I wanted to verify that. I wanted to see if that system prompt actually existed. I don’t want to encourage anybody to go down similar paths, and due to the attention this is gathering on Hacker News, I’ve decided to remove the details.

My plans started coming back as terse bullet lists with no context. I asked Claude why it was writing such bad plans. It told me it was following specific system instructions to hard-cap plans at 40 lines, forbid context sections, and “delete prose, not file paths.”

This feels like the opposite of transparency and responsible AI deployment. AI tooling needs more transparency, not less. I need the ability to own my process and guide AI with a human in the loop.

I’m no expert on what it takes to deploy something like Claude Code or the cost models behind it. I think this Hacker News comment provides insight worth considering:

> Presumably Anthropic has to make lots of choices on how much processing each stage of Claude Code uses. If they maxed everything out, they’d make more of a loss/less of a profit on each user. $200/month would cost $400/month. Doing A/B tests on each part of the process to see where to draw the line (perhaps based on task and user) would seem a better way of doing it than arbitrarily choosing a limit.

**Posted on**: March 14, 2026
