Cline Breaks Its Agent Free From the IDE: The Open-Source SDK That Lets AI Sessions Outlive Any App

Agent sessions that die when a developer closes a tab are no longer an architectural inevitability. On May 14, 2026, Cline published @cline/sdk, extracting the internal agent harness that previously lived inside its IDE extension into a standalone, open-source TypeScript SDK. (This AI Universe briefing, dated 2026-05-16, summarizes that release.) The consequence is direct: long-running agent work no longer dies with a UI restart, and sessions can move across surfaces because the agent loop is stateless by design. That is the real story, and it is an architectural one.

The benchmark numbers sharpen the case. Cline’s CLI, running claude-opus-4.7, scored 74.2% on Terminal Benchmark 2.0 — a standard tracked at tbench.ai — against Anthropic’s own published score of 69.4% for Claude Code on the same model. On claude-opus-4.6, Cline CLI reached 71.9% versus Claude Code’s published 65.4%. The Cline team’s own framing for this effort is two words: “Rebuilding the foundation.”

The SDK is available now via npm install @cline/sdk, requires Node.js 22 or later, and supports Anthropic, OpenAI, Google, AWS Bedrock, Mistral, and any OpenAI-compatible endpoint. Full documentation lives at docs.cline.bot/sdk and the official announcement at https://cline.bot/blog/introducing-cline-sdk-the-upgraded-agent-runtime.
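Because the Node.js 22 floor is a hard requirement for the full stack, a startup guard can fail fast with a clear message. The following is a minimal sketch: the helper name meetsMinimumNode is hypothetical and is not part of @cline/sdk.

```typescript
// Hypothetical helper, not part of @cline/sdk.
// Returns true when a Node.js version string such as "v22.3.0" or "22.3.0"
// has a major version at or above the required minimum.
function meetsMinimumNode(version: string, minMajor: number = 22): boolean {
  const match = /^v?(\d+)(?:\.|$)/.exec(version.trim());
  if (match === null) {
    return false; // unparseable strings are treated as unsupported
  }
  return Number(match[1]) >= minMajor;
}

const supported = meetsMinimumNode("v22.1.0"); // true
const tooOld = meetsMinimumNode("v20.11.1"); // false
```

In a Node.js entry point, the string to pass in would typically be process.version.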

A Four-Layer Stack Built to Separate Concerns — and Survive Restarts

The @cline/sdk is not a thin wrapper. It is a deliberately layered TypeScript stack composed of four discrete packages, each with a bounded responsibility. At the foundation sits @cline/shared, which carries types, schemas, tool helpers, hook contracts, and extension registration utilities, with no dependencies on any higher layer. Above it, @cline/llms owns the provider gateway and model catalogs — covering Anthropic, OpenAI, Google, AWS Bedrock, Mistral, LiteLLM, and OpenAI-compatible endpoints such as vLLM, Together, and Fireworks. Crucially, all provider logic is kept out of the agent loop, so switching providers is a configuration change, not a code change.
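To make "a configuration change, not a code change" concrete, here is a minimal sketch of a provider gateway that the calling code never looks behind. Every type and name in it (Provider, ProviderGateway, GatewayConfig, the "model-a" and "model-b" identifiers) is hypothetical, and the stub providers stand in for real network clients. It illustrates the pattern only and is not the @cline/llms API.

```typescript
// Illustrative only: hypothetical types, not the @cline/llms API.
type ChatMessage = { role: "user" | "assistant"; content: string };

interface Provider {
  complete(model: string, messages: ChatMessage[]): Promise<string>;
}

type GatewayConfig = { provider: string; model: string };

class ProviderGateway {
  private readonly providers = new Map<string, Provider>();

  register(name: string, provider: Provider): void {
    this.providers.set(name, provider);
  }

  async complete(config: GatewayConfig, messages: ChatMessage[]): Promise<string> {
    const provider = this.providers.get(config.provider);
    if (provider === undefined) {
      throw new Error(`Unknown provider: ${config.provider}`);
    }
    return provider.complete(config.model, messages);
  }
}

// Stub providers echo the last message instead of calling a real endpoint.
const makeStub = (label: string): Provider => ({
  complete: async (model, messages) =>
    `${label}/${model}: ${messages[messages.length - 1]?.content ?? ""}`,
});

const gateway = new ProviderGateway();
gateway.register("anthropic", makeStub("anthropic"));
gateway.register("openai-compatible", makeStub("openai-compatible"));

// The calling code is identical for every provider; only the config differs.
async function ask(config: GatewayConfig, prompt: string): Promise<string> {
  return gateway.complete(config, [{ role: "user", content: prompt }]);
}
```

Swapping { provider: "anthropic", model: "model-a" } for { provider: "openai-compatible", model: "model-b" } changes where the request goes without touching ask or anything that calls it.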

The third layer, @cline/agents, is the stateless agent execution loop itself: it handles iteration, tool orchestration, and event emission, but deliberately does not own session storage, built-in file or shell tools, or Node-specific orchestration. That separation is what makes it embeddable in browser environments. The top layer, @cline/core, is the Node runtime and orchestration layer, responsible for sessions, storage, built-in tools, hub and remote transports, automation and scheduling, telemetry, and plugin and extension loading. Developers who only need a browser-compatible stateless loop can install just @cline/agents, @cline/shared, and @cline/llms; those who only need an LLM proxy layer can install @cline/llms and @cline/shared alone.
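The separation described above can be sketched as a loop step that takes session state in and hands new state back, leaving persistence to the caller. This is an assumption-laden illustration and not the @cline/agents API: SessionState, ModelAction, Deps, and step are hypothetical names, the decide function stands in for an LLM call, and everything is synchronous for brevity where a real runtime would be asynchronous.

```typescript
// Hypothetical sketch, not the @cline/agents API.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type SessionState = { messages: Message[]; done: boolean };
type ModelAction =
  | { kind: "tool"; name: string; input: string }
  | { kind: "final"; text: string };

type Deps = {
  decide: (messages: Message[]) => ModelAction; // stand-in for an LLM call
  tools: Record<string, (input: string) => string>; // supplied by the host
};

// One iteration: state in, new state out. Nothing is kept in module-level
// variables, so the caller decides where and how the session is stored.
function step(state: SessionState, deps: Deps): SessionState {
  if (state.done) return state;
  const action = deps.decide(state.messages);
  if (action.kind === "final") {
    const reply: Message = { role: "assistant", content: action.text };
    return { messages: [...state.messages, reply], done: true };
  }
  const tool = deps.tools[action.name];
  const output = tool ? tool(action.input) : `unknown tool: ${action.name}`;
  const result: Message = { role: "tool", content: output };
  return { messages: [...state.messages, result], done: false };
}

// Demo dependencies: call the "upper" tool once, then finish with its result.
const deps: Deps = {
  decide: (messages) => {
    const last = messages[messages.length - 1];
    return last !== undefined && last.role === "tool"
      ? { kind: "final", text: `done: ${last.content}` }
      : { kind: "tool", name: "upper", input: last?.content ?? "" };
  },
  tools: { upper: (input) => input.toUpperCase() },
};

// "Surface A" runs one step and serializes the session.
const start: SessionState = { messages: [{ role: "user", content: "hello" }], done: false };
const saved = JSON.stringify(step(start, deps));

// "Surface B" (in principle a different process) restores it and continues.
const resumed = step(JSON.parse(saved) as SessionState, deps);
```

Because step never holds on to state between calls, the serialized string is the whole session. Whichever host loads it next can pick up where the previous one stopped.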

The SDK also ships native multi-agent support (agent teams and subagents) alongside a plugin manifest format called cline.plugins and two exported functions, registerProvider and registerModel, for extending the provider and model registry at runtime. Full TypeScript types are included, and plain JavaScript is also supported. Installation works with npm, yarn, pnpm, or bun, and requires an API key from at least one LLM provider.

Benchmark Gaps, Portability Trade-offs, and the Friction of Adoption

The Terminal Benchmark 2.0 results deserve careful reading. On kimi-k2.6, Cline scored 55.1%, compared to Pi-Code’s 45.5% and OpenCode’s 37.1% on the same model. That 18.0-point gap over OpenCode (55.1 minus 37.1) on a single model is not a rounding error; it suggests the harness itself contributes meaningfully to task completion, independent of the underlying model. The implication for developers is that the runtime layer is no longer a neutral substrate; it is a performance variable.
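The point gaps behind these comparisons are simple subtractions, and they are easy to recompute. The sketch below uses only the Terminal Benchmark 2.0 percentages quoted in this article; the Comparison type and pointGap helper are illustrative names.

```typescript
// Scores are the Terminal Benchmark 2.0 percentages quoted in this article.
type Comparison = { model: string; cline: number; other: string; otherScore: number };

const comparisons: Comparison[] = [
  { model: "claude-opus-4.7", cline: 74.2, other: "Claude Code", otherScore: 69.4 },
  { model: "claude-opus-4.6", cline: 71.9, other: "Claude Code", otherScore: 65.4 },
  { model: "kimi-k2.6", cline: 55.1, other: "Pi-Code", otherScore: 45.5 },
  { model: "kimi-k2.6", cline: 55.1, other: "OpenCode", otherScore: 37.1 },
];

// Subtract in tenths of a point so floating-point noise cannot leak into the result.
function pointGap(a: number, b: number): number {
  return (Math.round(a * 10) - Math.round(b * 10)) / 10;
}

const gaps = comparisons.map((c) => ({ ...c, gap: pointGap(c.cline, c.otherScore) }));
// gap values, in order: 4.8, 6.5, 9.6, 18
```

These are percentage-point differences on the same model, not relative improvements, which is why the kimi-k2.6 rows are the cleanest evidence for a harness effect.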

The critical angle here is worth naming plainly: Cline has chosen depth over simplicity. A monolithic agent design offers faster initial setup; the @cline/sdk’s four-package layered structure introduces dependency management complexity and requires developers to internalize which layer handles which concern. The npx skills add cline/sdk-skill command partially addresses this by allowing Claude Code, Codex, or Cline itself to understand the SDK’s APIs for scaffolding agents and wiring up plugins, but that is a convenience layer on top of an architecture that still demands deliberate adoption. The trade-off is explicit: developer adoption friction in exchange for enhanced architectural robustness and the ability to migrate agent state across diverse environments.

Tool	Key Difference	Best For
Cline CLI (`claude-opus-4.7`)	74.2% on Terminal Benchmark 2.0 (55.1% on `kimi-k2.6`); stateless, portable agent loop	Multi-surface, long-running agent workflows
Pi-Code (`kimi-k2.6`)	45.5% on Terminal Benchmark 2.0	Lighter terminal agent use cases
OpenCode (`kimi-k2.6`)	37.1% on Terminal Benchmark 2.0; lowest of the three on this model	Simpler, lower-complexity terminal tasks

📊 Key Numbers

74.2%: Cline CLI score on Terminal Benchmark 2.0 running claude-opus-4.7 — 4.8 points above Anthropic’s published Claude Code score of 69.4% on the same model
71.9%: Cline CLI score on Terminal Benchmark 2.0 running claude-opus-4.6 — 6.5 points above Claude Code’s published 65.4% on the same model
55.1% vs 37.1%: Cline versus OpenCode on kimi-k2.6 on Terminal Benchmark 2.0. With the model held constant, the 18-point gap points to the runtime harness rather than the model
45.5%: Pi-Code score on Terminal Benchmark 2.0 with kimi-k2.6, placing it between Cline and OpenCode
Node.js 22 or later: Minimum runtime requirement for the full @cline/core layer; browser-compatible subset (@cline/agents) has no Node dependency
4 packages: @cline/shared, @cline/llms, @cline/agents, @cline/core — each independently installable for targeted use cases
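The "independently installable" point can be written down as a lookup from use case to package set. The package names and groupings below come from this article; the UseCase labels and the installCommand helper are hypothetical conveniences, not something the SDK ships.

```typescript
// Package groupings follow the article; the labels and helper are hypothetical.
type UseCase = "llm-proxy" | "browser-loop" | "full-runtime";

const PACKAGES: Record<UseCase, readonly string[]> = {
  "llm-proxy": ["@cline/llms", "@cline/shared"],
  "browser-loop": ["@cline/agents", "@cline/llms", "@cline/shared"],
  "full-runtime": ["@cline/sdk"],
};

// Builds the npm command for a given use case.
function installCommand(useCase: UseCase): string {
  return `npm install ${PACKAGES[useCase].join(" ")}`;
}

const proxyInstall = installCommand("llm-proxy");
// "npm install @cline/llms @cline/shared"
```

Since the article notes that yarn, pnpm, and bun also work, the same package lists carry over to those tools with their own install verbs.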

🔍 Context

The Terminal Benchmark 2.0 scores referenced here are drawn from tbench.ai, the benchmark’s tracking site, and compared against Anthropic’s own published results for Claude Code — making the evaluator a third-party standard rather than Cline’s internal testing alone. The specific gap this SDK addresses is the tight coupling between agent logic and its host environment: previously, an agent session running inside a VS Code extension would not survive a restart or transfer to a CLI or Kanban surface. The @cline/sdk resolves this by isolating the stateless agent loop in @cline/agents, which carries no session storage or Node-specific dependencies. This release responds directly to a trend in agentic AI development where developers are building workflows that span terminals, browsers, and CI pipelines simultaneously — environments that a single IDE extension cannot serve. Rather than competing with a named commercial rival, the architectural contrast here is with bespoke, monolithic agent integrations that embed all logic inside a single application layer, making state migration and surface portability structurally impossible without a full rewrite. The release is tied to a concrete product milestone: Cline’s CLI and Kanban are already running on the SDK, with IDE extensions actively being migrated.

💡 AIUniverse Analysis

Our reading: ★ LIGHT — The genuinely new mechanism here is the separation of the agent loop from session ownership. By making @cline/agents stateless and browser-compatible while pushing session storage and transport into @cline/core, Cline has created a runtime where the same agent logic can execute in a terminal, a browser tab, or a serverless function without modification. That is not a marketing claim — it is a direct consequence of the layered architecture, and the benchmark numbers suggest the harness itself adds measurable task-completion capability beyond what the underlying model provides alone.

★ SHADOW — The fine print is the adoption cost. A four-package dependency graph with distinct installation paths for browser, Node, and proxy use cases is not a drop-in replacement for a monolithic agent script. Developers who need the full @cline/core layer are committing to Node.js 22 or later and to Cline’s specific plugin and extension model — including the cline.plugins manifest format and the registerProvider / registerModel registry pattern. The benchmark scores are compelling, but they were produced by Cline’s own CLI, not by third-party developers building on the SDK. Whether independent implementations achieve comparable results on Terminal Benchmark 2.0 remains unverified. A cautious engineering lead would also note that multi-LLM provider support — spanning Anthropic, OpenAI, Google, AWS Bedrock, Mistral, and OpenAI-compatible endpoints — introduces configuration surface area that can silently degrade performance if provider-specific behaviors are not accounted for in the agent loop.

For this to matter in 12 months, independent developers building on @cline/sdk would need to demonstrate that the portability guarantees hold in production multi-surface deployments — not just in Cline’s own CLI and Kanban implementations.

⚖️ AIUniverse Verdict

✅ Promising. The stateless agent loop architecture is a real and verifiable mechanism for cross-surface session portability, and the 4.8-point Terminal Benchmark 2.0 gap over Anthropic’s published Claude Code score on claude-opus-4.7 is a concrete data point — but whether third-party developers can replicate those gains building on the SDK, rather than using Cline’s own CLI, is the open question that determines whether this becomes infrastructure or a footnote.

🎯 What This Means For You

Founders & Startups: Founders can now leverage a robust, open-source agent runtime to accelerate the development and deployment of cross-platform AI agents, reducing initial engineering overhead for complex agentic features.

Developers: Developers gain the ability to embed stateless agent loops in various environments, from browsers to serverless functions, with a pluggable architecture for custom tools and LLM providers — install the full SDK with npm install @cline/sdk, the CLI globally with npm i -g @cline, or add SDK awareness to an existing coding agent with npx skills add cline/sdk-skill.

Enterprise & Mid-Market: Enterprises can integrate a standardized agent runtime to build consistent AI-powered workflows across developer tools and command-line interfaces, with the option to install only @cline/llms and @cline/shared for a lightweight LLM proxy layer that avoids the full Node.js 22 dependency.

General Users: Users may experience more persistent AI assistance as agent sessions can follow them across different applications and devices without interruption — because the session state is no longer tied to a single UI process.

⚡ TL;DR

What happened: Cline extracted its internal agent harness into @cline/sdk, an open-source TypeScript SDK with a stateless agent loop that lets sessions survive UI restarts and move across surfaces.
Why it matters: Cline CLI running claude-opus-4.7 scored 74.2% on Terminal Benchmark 2.0 versus Anthropic’s published 69.4% for Claude Code — suggesting the runtime layer, not just the model, drives task-completion performance.
What to do: Run npm install @cline/sdk to evaluate the full stack, or npx skills add cline/sdk-skill to add SDK awareness to Claude Code, Codex, or Cline. Then run your own agent workflows against Terminal Benchmark 2.0 at tbench.ai to see whether the performance gains hold, and move a session between surfaces to check the portability claims in your environment.

📖 Key Terms

@cline/sdk: The open-source TypeScript SDK Cline extracted from its IDE extension, composed of four independently installable packages that together provide a portable, stateless agent runtime.
Agent loop: In this context, the stateless execution cycle inside @cline/agents that handles iteration, tool orchestration, and event emission without owning session storage — the property that makes sessions portable across surfaces.
Terminal Benchmark 2.0: A third-party benchmark tracked at tbench.ai that measures how effectively an AI coding agent completes terminal-based tasks; the scores cited here compare Cline CLI against Anthropic’s own published Claude Code results.
pass@1: A benchmark scoring convention where a model or agent is credited only if it solves a task correctly on its first attempt, without retries — the metric underlying the Terminal Benchmark 2.0 percentages in this article.

📎 Sources

Analysis based on reporting by MarkTechPost.

By AI Universe
