Claude's Memory Lapse: A Bug Erased Its Reasoning After an Hour

A surprising number of users reported that Claude began to forget its reasoning mid-conversation. This lapse in memory was traced back to a bug introduced by three separate changes affecting Claude Code, the Claude Agent SDK, and Claude Cowork. These issues, while not impacting the Anthropic API directly, led to noticeable degradation in Claude’s performance for users.

The most critical flaw involved a session idle handling bug that caused Claude to repeatedly clear older thought processes, resulting in forgetfulness and repetitive responses. This was compounded by a system prompt change aimed at reducing verbosity, which unfortunately diminished Claude’s coding quality. Fortunately, all three identified issues have been resolved as of April 20, with the API remaining unaffected throughout these incidents.

A Cost-Saving Measure Backfires, Eroding Context

Anthropic made a deliberate choice to alter Claude Code’s default reasoning effort from high to medium on March 4. This shift, impacting Sonnet 4.6 and Opus 4.6, was intended to improve efficiency but ultimately degraded coding quality. After receiving user feedback, this change was reverted on April 7, indicating a direct acknowledgment that the optimization came at too high a cost to performance.

Further compounding the context retention problem was a bug introduced on March 26, related to clearing older thinking in sessions idle for over an hour. This meant that after just sixty minutes of inactivity, Claude would effectively “forget” what it had been discussing, leading to repetitive and less intelligent interactions. According to technical documentation, this issue was fixed on April 10 in version v2.1.101, but the cumulative effect of these changes was a noticeable decline in user experience.

Performance Trade-offs and a Race to Rebuild Trust

The problem that caused usage limits to drain faster than expected was a complex interplay at the intersection of Claude Code’s context management, the Anthropic API, and extended thinking. Subsequent requests to a bug resulted in cache misses, exacerbating the perception of rapid usage depletion. According to technical documentation, this bug was fixed on April 10 in v2.1.101, addressing a critical flaw in how the system managed user interactions and resource allocation.

On April 16, a system prompt change was implemented to reduce verbosity, specifically instructing model versions to keep text between tool calls to ≤25 words and final responses to ≤100 words unless more detail was required. This directive had an outsized effect on intelligence, showing a 3% drop in one evaluation for both Opus 4.6 and 4.7. The prompt was immediately reverted as part of the April 20 release, underscoring a rapid response to mitigate further negative impacts on model performance.

📊 Key Numbers

Response degradation: Three separate changes to Claude Code, Claude Agent SDK, and Claude Cowork
Bug fix for idle handling: April 10 (v2.1.101)
Bug fix for usage limit drain: April 10 (v2.1.101)
Prompt reverted for verbosity: April 20 (v2.1.116)
Intelligence drop from prompt change: 3% for Opus 4.6 and 4.7
Default reasoning effort change reverted: April 7
Usage limits reset: April 23

🔍 Context

This incident directly addresses a critical gap in reliable context retention for AI assistants, a problem that emerged unexpectedly with recent Claude updates. It challenges the trend of prioritizing reduced latency and cost over consistent intelligent performance, a balance many AI providers are actively seeking. Anthropic’s direct competitor, OpenAI’s ChatGPT, currently offers more consistent context memory over longer conversations, though often with higher latency. The timing is critical as users’ expectations for AI agent reliability have surged in the past six months, making performance regressions particularly damaging to user trust.

💡 AIUniverse Analysis

Our reading: Anthropic inadvertently prioritized efficiency over core functionality, leading to significant user-facing issues. The implementation of a caching optimization bug for session idle handling, coupled with a reduction in default reasoning effort, directly undermined Claude’s ability to maintain conversational context. This trade-off between speed and intelligence, while understandable from a resource management perspective, exposed a vulnerability in their development pipeline and testing protocols.

The “shadow” here lies in the underlying assumption that these performance gains would not critically impact core AI capabilities like memory and reasoning. The fact that Claude Opus 4.7 identified a bug that Opus 4.6 missed, while simultaneously showing a performance dip from prompt changes, suggests an uneven development trajectory. For these changes to matter in 12 months, Anthropic must demonstrate a robust, transparent process for evaluating the intelligence-agnostic performance of prompt and model adjustments, ensuring that efficiency gains do not come at the expense of fundamental AI utility.

⚖️ AIUniverse Verdict

⚠️ Overhyped. The claim of improved efficiency appears overstated given the significant degradation in Claude’s reasoning and context retention, directly impacting user experience and trust.

Founders & Startups: Founders relying on AI agents for complex tasks may need to carefully re-evaluate their AI provider’s reliability and context retention capabilities following this incident.

Developers: Developers integrating with Claude need to be aware of specific version fixes and potential edge cases related to session management and prompt caching.

Enterprise & Mid-Market: Enterprise clients should scrutinize Anthropic’s quality assurance and rollback processes to ensure business-critical AI integrations are not jeopardized by similar performance regressions.

General Users: Users may experience a noticeable improvement in Claude’s coding capabilities and consistency following the resolution of these reported degradation issues.

⚡ TL;DR

What happened: A bug caused Claude to forget its reasoning after an hour of inactivity, and other changes reduced its coding intelligence.
Why it matters: Users experienced forgetful, repetitive responses and degraded coding quality due to prioritizing efficiency over context retention.
What to do: Monitor Anthropic’s new development controls, especially for changes impacting model intelligence and performance.

📖 Key Terms

reasoning effort: The computational intensity a model expends on a given task, impacting response depth and resource usage.
Claude Agent SDK: A software development kit that enables developers to build and deploy AI-powered agents using Claude technology.
prompt caching: A technique to store and reuse previous responses or reasoning steps to speed up subsequent interactions, but which can be prone to errors if not managed correctly.
system prompt: A set of instructions given to an AI model at the beginning of a conversation to guide its behavior, tone, and output format.
verbosity: The tendency of an AI model to produce lengthy and detailed responses, which can be useful for complex tasks but may also be inefficient.

Analysis based on reporting by anthropic.com. Original article here.

Claude’s Memory Lapse: A Bug Erased Its Reasoning After an Hour

ByAI Universe

A Cost-Saving Measure Backfires, Eroding Context

Performance Trade-offs and a Race to Rebuild Trust

📊 Key Numbers

🔍 Context

💡 AIUniverse Analysis

⚖️ AIUniverse Verdict

⚡ TL;DR

📖 Key Terms

By AI Universe

Related Post

Google Unleashes Next-Gen TPUs to Accelerate AI Agents and Cut Development Time

Alibaba’s New AI Model Tackles Complex Coding Tasks, Outperforming Larger Rivals

AI Agents Now Chatting on iMessage, WhatsApp with New Open-Source Tool

Leave a Reply Cancel reply

You missed

Claude’s Memory Lapse: A Bug Erased Its Reasoning After an Hour

OpenAI Rolls Out GPT-5.5, Boosting Agentic Performance But Doubling API Costs

Startup LeCun Pursues $1 Billion Vision for Modular, Efficient AI

Xiaomi’s New AI Can Tackle University-Level Code Tasks in Hours, Not Weeks