
Stagehand vs Playwright AI vs Browser Use: AI Web Automation in 2026

TL;DR

Stagehand (by Browserbase) is the best developer experience for AI web automation — a clean TypeScript API built on Playwright that lets you describe what to do in plain English. browser-use is the Python-first agentic browser control library powering many open-source AI agents. Playwright with @playwright/test + LLM assistance is the structured approach — use Playwright's reliable locators, add LLM-powered code generation for stubborn selectors, but keep the execution deterministic. In 2026, use Stagehand for prototyping and internal tooling; use structured Playwright + AI for production automation that must be reliable.

Key Takeaways

Stagehand GitHub stars: ~8k (Feb 2026) — explosive growth since its open-source release in late 2024
browser-use GitHub stars: ~35k — the dominant Python-first AI browser library; JS port in progress
AI web automation costs real money — each act() call in Stagehand consumes ~1–3k tokens, costing $0.003–$0.01 per action
Reliability gap is real — AI-powered automation is 85–95% reliable vs 99%+ for Playwright selectors
Stagehand supports computer use models (Anthropic Claude Sonnet, GPT-4o) for screenshot-based interaction
All three support CDP (Chrome DevTools Protocol) — they all control Chromium/Chrome under the hood
The sweet spot: use AI to generate locators, Playwright to execute — best of both worlds

Why AI Web Automation?

Traditional web automation (Playwright, Puppeteer, Selenium) depends on precise CSS selectors or ARIA roles and labels. Maintenance is expensive because UI changes routinely break those selectors. AI-powered automation changes the equation: you describe what to do rather than how to find elements.

The tradeoff: AI automation is slower, more expensive, and less reliable than deterministic automation. It excels at:

One-off tasks where writing precise selectors isn't worth it
Rapidly changing UIs where maintaining selectors is a maintenance burden
Natural language scripts for non-engineers to describe automation tasks
AI agents that need to navigate arbitrary websites

Stagehand: The TypeScript-First AI Browser

Stagehand wraps Playwright with AI superpowers. It adds three key methods: act() for performing actions, extract() for extracting structured data, and observe() for understanding the current page state.

Installation

npm install @browserbasehq/stagehand zod
# Requires an API key for OpenAI or Anthropic

Basic Usage

import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod";

const stagehand = new Stagehand({
  env: "LOCAL",           // or "BROWSERBASE" for cloud
  modelName: "gpt-4o",   // or "claude-3-5-sonnet-20241022"
  modelClientOptions: {
    apiKey: process.env.OPENAI_API_KEY,
  },
  verbose: 1,
  enableCaching: true,   // Cache LLM calls for repeated actions
});

await stagehand.init();
const { page } = stagehand;

// Navigate
await page.goto("https://github.com");

// act() — describe what to do in plain English
await stagehand.act({ action: "Click the Sign in button" });
await stagehand.act({ action: "Fill the username field with 'your-username'" });
await stagehand.act({ action: "Fill the password field with 'your-password'" });
await stagehand.act({ action: "Click the 'Sign in' button to submit the form" });

// extract() — get structured data from the page
const repos = await stagehand.extract({
  instruction: "Extract the list of pinned repositories",
  schema: z.object({
    repositories: z.array(
      z.object({
        name: z.string(),
        description: z.string().nullable(),
        language: z.string().nullable(),
        stars: z.number(),
      })
    ),
  }),
});

console.log(repos.repositories);
// [{ name: "my-repo", description: "...", language: "TypeScript", stars: 42 }]

await stagehand.close();

Computer Use Mode (Screenshot-Based)

const stagehand = new Stagehand({
  env: "LOCAL",
  modelName: "claude-3-5-sonnet-20241022",  // Anthropic's computer use model
  modelClientOptions: { apiKey: process.env.ANTHROPIC_API_KEY },
  // Enable computer use for screenshot-based interaction (bypasses DOM)
  enableComputerUse: true,
});

await stagehand.init();
const { page } = stagehand;  // Playwright-compatible page exposed by Stagehand
await page.goto("https://complex-spa.example.com");

// Works even on canvas-heavy apps, complex custom UIs
await stagehand.act({
  action: "Click the blue 'Create Project' button in the top navigation",
});

Multi-Step Agent Flow

import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod";

async function scrapeLinkedInProfile(url: string) {
  const stagehand = new Stagehand({
    env: "BROWSERBASE",  // Cloud browser with stealth features (harder to detect as a bot)
    browserbaseApiKey: process.env.BROWSERBASE_API_KEY,
    browserbaseProjectId: process.env.BROWSERBASE_PROJECT_ID,
    modelName: "gpt-4o-mini",  // Cheaper model for simple tasks
    modelClientOptions: { apiKey: process.env.OPENAI_API_KEY },
  });

  await stagehand.init();
  const { page } = stagehand;

  await page.goto(url);

  // Wait for the DOM to load before asking the model to read the page
  await page.waitForLoadState("domcontentloaded");

  const profile = await stagehand.extract({
    instruction: "Extract the person's profile information",
    schema: z.object({
      name: z.string(),
      headline: z.string(),
      location: z.string().optional(),
      connections: z.string().optional(),
      about: z.string().optional(),
      experience: z.array(
        z.object({
          title: z.string(),
          company: z.string(),
          duration: z.string().optional(),
        })
      ),
    }),
  });

  await stagehand.close();
  return profile;
}

Stagehand with Browserbase (Cloud)

// Browserbase provides managed browsers with:
// - Stealth mode (harder to detect as bot)
// - Rotating residential proxies
// - Session recording and replay
// - CAPTCHA solving

const stagehand = new Stagehand({
  env: "BROWSERBASE",
  browserbaseApiKey: process.env.BROWSERBASE_API_KEY!,
  browserbaseProjectId: process.env.BROWSERBASE_PROJECT_ID!,
  modelName: "claude-3-5-sonnet-20241022",
  modelClientOptions: { apiKey: process.env.ANTHROPIC_API_KEY! },
  browserbaseSessionCreateParams: {
    projectId: process.env.BROWSERBASE_PROJECT_ID!,
    browserSettings: {
      viewport: { width: 1920, height: 1080 },
    },
  },
});

Playwright with AI-Assisted Automation

Traditional Playwright + AI is about using LLMs to generate selectors and test code, not to execute actions. This keeps execution deterministic while reducing the cost of writing automation.

AI-Powered Locator Generation

import { test, expect, type Page } from "@playwright/test";
import Anthropic from "@anthropic-ai/sdk";

// Utility: ask an LLM to suggest a Playwright locator for a described element.
// Meant for authoring time; review the suggestion before committing it to a test.
async function findSelector(
  page: Page,
  description: string
): Promise<string> {
  // Take a screenshot + get the ARIA snapshot (YAML text; needs Playwright 1.49+)
  const screenshot = await page.screenshot({ type: "png" });
  const ariaSnapshot = await page.locator("body").ariaSnapshot();

  const claude = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
  const response = await claude.messages.create({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 500,
    messages: [
      {
        role: "user",
        content: [
          {
            type: "image",
            source: {
              type: "base64",
              media_type: "image/png",
              data: screenshot.toString("base64"),
            },
          },
          {
            type: "text",
            text: `Given this page, provide the best Playwright locator for: "${description}"

            ARIA tree:
            ${ariaSnapshot}

            Respond with only the locator string, e.g.: page.getByRole('button', { name: 'Submit' })`,
          },
        ],
      },
    ],
  });

  // The response is a list of content blocks; only a text block carries the locator
  const first = response.content[0];
  if (!first || first.type !== "text") {
    throw new Error("Expected a text block in the model response");
  }
  return first.text.trim();
}

// Use in tests
test("submit form", async ({ page }) => {
  await page.goto("https://app.example.com/checkout");

  // At authoring time, findSelector(page, "the checkout submit button") suggested:
  //   page.getByRole('button', { name: 'Complete Order' })
  // The reviewed locator is committed below, so the test itself makes no LLM calls.

  // Execute with Playwright: deterministic, no AI at runtime
  await page.getByRole("button", { name: "Complete Order" }).click();
  await expect(page.getByText("Order confirmed")).toBeVisible();
});

`@playwright/test` with `ai` Fixture (Community Pattern)

// playwright.ai.fixture.ts — community pattern for AI-assisted tests
import { test as base, expect } from "@playwright/test";
import OpenAI from "openai";

type AIFixture = {
  ai: {
    act: (instruction: string) => Promise<void>;
    assert: (condition: string) => Promise<void>;
  };
};

const test = base.extend<AIFixture>({
  ai: async ({ page }, use) => {
    const openai = new OpenAI();

    const act = async (instruction: string) => {
      // Playwright returns a Buffer; encode it to base64 for the data URL below
      const screenshot = (await page.screenshot({ type: "png" })).toString("base64");
      const html = await page.content();

      const response = await openai.chat.completions.create({
        model: "gpt-4o",
        messages: [
          {
            role: "user",
            content: [
              { type: "image_url", image_url: { url: `data:image/png;base64,${screenshot}` } },
              {
                type: "text",
                text: `Given the screenshot, provide Playwright code to: ${instruction}
                       HTML context: ${html.slice(0, 3000)}
                       Respond with valid JavaScript only.`,
              },
            ],
          },
        ],
      });

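      // Execute the generated code in Node with `page` in scope.
      // Caution: this runs unreviewed LLM output, so keep it to sandboxed test environments.
      const code = response.choices[0].message.content ?? "";
      const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;
      await new AsyncFunction("page", code)(page);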
    };

    await use({ act, assert: async () => {} });
  },
});

// Usage
test("user can checkout", async ({ page, ai }) => {
  await page.goto("/checkout");
  await ai.act("Fill in the shipping address form with test data");
  await ai.act("Select the standard shipping option");
  await ai.act("Click the place order button");
  await expect(page.getByText("Order confirmed")).toBeVisible();
});

browser-use: Python-First, JS Coming

browser-use is the dominant open-source AI browser control library, primarily Python. The JavaScript/TypeScript version (browser-use-js) is in early development but gaining traction.

Python Usage (Reference — most popular form)

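# Python example (reference)
import asyncio

from browser_use import Agent
from langchain_openai import ChatOpenAI


async def main():
    agent = Agent(
        task="Find the top 3 TypeScript repositories on GitHub this week",
        llm=ChatOpenAI(model="gpt-4o"),
    )

    # agent.run() is a coroutine, so it has to be awaited inside an event loop
    result = await agent.run()
    print(result)


asyncio.run(main())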

JavaScript Port (browser-use-js — experimental)

// browser-use-js — early stage, API may change
import { Agent } from "browser-use-js";
import { ChatOpenAI } from "@langchain/openai";

const agent = new Agent({
  task: "Go to GitHub and find the trending TypeScript repositories this week",
  llm: new ChatOpenAI({ modelName: "gpt-4o" }),
  browserConfig: {
    headless: false,  // Visual mode for debugging
    chromePath: "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
  },
});

const result = await agent.run();
console.log(result.finalResult());
console.log("Actions taken:", result.actionHistory());

Feature Comparison

Feature	Stagehand	Playwright + AI	browser-use (JS)
Language	TypeScript	TypeScript	TypeScript (port)
Maturity	✅ Stable	✅ Stable	⚠️ Experimental
AI integration	Native (act/extract)	Manual (code generation)	Native (agentic)
Execution mode	AI-powered	Deterministic	AI-powered
Reliability	~85-95%	~99%+	~80-90%
Cost per action	$0.003–$0.01	$0 (code gen only)	$0.003–$0.015
Playwright under hood	✅	✅	Via Playwright
Cloud browser support	✅ Browserbase	✅ (BrowserStack, etc.)	✅
Computer use (screenshot AI)	✅	❌	✅
Structured data extraction	✅ (Zod schema)	Manual	✅
CAPTCHA handling	Via Browserbase	Manual	Manual
GitHub stars	8k	70k (Playwright)	35k (Python)
npm downloads/week	~50k	5M+ (Playwright)	N/A (Python)

Cost Analysis

Running AI web automation at scale requires careful cost management:

Simple action (click, fill form): ~1,000 tokens × $0.000003 = $0.003/action
Complex action (extract data, navigate): ~3,000 tokens × $0.000003 = $0.009/action
Computer use (screenshot): ~2,000 tokens × $0.000003 + vision = $0.01–$0.03/action

100 automations/day × $0.005 avg = $0.50/day = $15/month
1,000 automations/day = $150/month
10,000 automations/day = $1,500/month

Vs. Playwright (deterministic): $0/month in LLM costs (you pay only for browser compute)

For high-volume automation, the cost difference between AI-powered and traditional Playwright is significant. Use AI automation where it provides value (unstable UIs, one-off tasks); use deterministic Playwright everywhere else.
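
The arithmetic above can be reproduced with a small helper. This is a minimal sketch: the flat per-token price of $0.000003 and the $0.005 average come from the figures in this section, while the function names and the 30-day month are assumptions made for illustration.

```typescript
// Flat per-token price used in this section's estimates ($3 per 1M tokens).
const PRICE_PER_TOKEN_USD = 0.000003;

// Cost of one AI action, given how many tokens it consumes.
function costPerAction(tokens: number, pricePerToken: number = PRICE_PER_TOKEN_USD): number {
  return tokens * pricePerToken;
}

// Monthly LLM spend for a given daily volume (assumes a 30-day month).
function monthlyCost(actionsPerDay: number, avgCostPerAction: number, daysPerMonth: number = 30): number {
  return actionsPerDay * avgCostPerAction * daysPerMonth;
}

console.log(costPerAction(1_000).toFixed(3));       // "0.003"
console.log(costPerAction(3_000).toFixed(3));       // "0.009"
console.log(monthlyCost(100, 0.005).toFixed(2));    // "15.00"
console.log(monthlyCost(10_000, 0.005).toFixed(2)); // "1500.00"
```

Swapping in your own token counts and model price gives a quick first estimate before committing to an AI-at-runtime design.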

When to Use Each

Choose Stagehand if:

You're building internal tools where an engineer needs to automate repetitive browser tasks
You need to automate unstable UIs (SPAs that change frequently, or custom Canvas-based apps)
You want structured data extraction with type safety (Zod schemas in extract())
You're deploying AI agents that need to navigate arbitrary websites

Choose Playwright + AI (selector generation) if:

You need production-grade reliability (99%+ success rate)
Cost is a constraint — AI is only used to generate code, not at runtime
You're building a test suite that runs hundreds of times per day
Your team already has Playwright expertise

Choose browser-use if:

You're working in Python primarily and need agentic browser control
You're building an AI agent that needs to research the web as part of a larger pipeline
The JS port is mature enough for your use case (verify current status before using)

Caching and Cost Optimization

AI-powered browser automation at scale requires deliberate caching strategies to control LLM costs. Stagehand's enableCaching: true option caches the LLM's response for a given page state and instruction combination — if the same page layout triggers the same act() call, the cached selector result is reused without an additional LLM API call. The cache key is derived from a hash of the page's accessibility tree and the instruction string, so cache hits occur when the page structure is unchanged and the instruction is identical. Enabling caching in development dramatically reduces costs during iteration, and the cache can be persisted to disk or a database for sharing across team members or CI runs.
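
To make the cache-key idea concrete, here is a minimal in-memory sketch. It is not Stagehand's implementation: the `ActionCache` class, the `CachedAction` shape, and the choice of SHA-256 are all assumptions for illustration. It only shows how hashing the accessibility tree together with the instruction yields a hit when both are unchanged and a miss otherwise.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of what gets cached for one act() call.
interface CachedAction {
  selector: string;
}

class ActionCache {
  private readonly store = new Map<string, CachedAction>();
  hits = 0;
  misses = 0;

  // Key = SHA-256 over the accessibility tree and the instruction.
  // The NUL separator keeps ("ab", "c") from colliding with ("a", "bc").
  static key(accessibilityTree: string, instruction: string): string {
    return createHash("sha256")
      .update(accessibilityTree)
      .update("\u0000")
      .update(instruction)
      .digest("hex");
  }

  // Return the cached result, or run `resolve` (the LLM round trip) and store it.
  async getOrResolve(
    accessibilityTree: string,
    instruction: string,
    resolve: () => Promise<CachedAction>
  ): Promise<CachedAction> {
    const key = ActionCache.key(accessibilityTree, instruction);
    const cached = this.store.get(key);
    if (cached) {
      this.hits++;
      return cached;
    }
    this.misses++;
    const fresh = await resolve();
    this.store.set(key, fresh);
    return fresh;
  }
}
```

Persisting the `Map` contents to disk or a database would give the cross-run sharing described above; any change to the page structure or the instruction text produces a new key and therefore a fresh LLM call.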

For production automation pipelines, a hybrid approach works best: use Stagehand with caching for the navigation and extraction steps that change infrequently (page structure is stable), and fall back to explicit Playwright selectors for the steps that execute most frequently (form submissions, button clicks in a known UI). This hybrid reduces per-run costs to only the LLM calls needed for the dynamic portions of the workflow. The observe() method in Stagehand is specifically designed for this pattern — it inspects the page and returns structured information about actionable elements, which you can then act on using explicit Playwright locators rather than another act() call.

Reliability Patterns for Production Automation

The 85–95% reliability figure for AI-powered automation versus 99%+ for deterministic Playwright represents a meaningful gap that compounds across multi-step workflows. A 10-step Stagehand workflow where each step is 95% reliable has an overall success rate of 0.95^10, approximately 60% — meaning 40% of runs require a retry or fail entirely. For internal tooling where occasional failures are acceptable and a human monitors the output, this is fine. For production pipelines that must complete reliably on a schedule, this failure rate is problematic without a robust retry and alerting strategy.
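
The compounding effect is easy to check numerically. The sketch below assumes every step succeeds independently with the same probability, which is a simplification (failures on the same page tend to be correlated), and the function names are illustrative.

```typescript
// Probability that all `steps` succeed when each succeeds with probability `perStep`.
function workflowSuccessRate(perStep: number, steps: number): number {
  return Math.pow(perStep, steps);
}

// Per-step reliability needed to reach a target end-to-end success rate.
function requiredStepReliability(target: number, steps: number): number {
  return Math.pow(target, 1 / steps);
}

// Per-step success when each step may be attempted up to `attempts` times
// (assumes attempts are independent, which is optimistic).
function withRetries(perStep: number, attempts: number): number {
  return 1 - Math.pow(1 - perStep, attempts);
}

console.log(workflowSuccessRate(0.95, 10).toFixed(3));                 // "0.599"
console.log(workflowSuccessRate(0.99, 10).toFixed(3));                 // "0.904"
console.log(requiredStepReliability(0.99, 10).toFixed(4));             // "0.9990"
console.log(workflowSuccessRate(withRetries(0.95, 3), 10).toFixed(4)); // "0.9988"
```

Under these assumptions, a 10-step workflow needs roughly 99.9% per-step reliability to finish 99% of the time, which is why retries or deterministic fallbacks matter for anything beyond short flows.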

The recommended reliability pattern for production AI automation is to wrap each act() call in a retry loop with progressive fallback: attempt the AI action up to 3 times, and if it consistently fails, fall back to a pre-configured Playwright selector if one is available. Stagehand's page object is a full Playwright Page instance — you can mix AI-powered actions and deterministic Playwright selectors in the same automation script. Teams building critical automation pipelines should instrument each act() call with the success/failure outcome and the page state at time of failure, enabling progressive improvement of the explicit-selector fallbacks as failure patterns emerge from production data.
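
A retry loop with a deterministic fallback might look like the following. This is a sketch under stated assumptions: `actWithFallback` and `StepOutcome` are hypothetical names, and the AI action and fallback are passed in as plain async functions so the helper does not depend on any particular Stagehand or Playwright signature.

```typescript
type Step = () => Promise<void>;

interface StepOutcome {
  succeededVia: "ai" | "fallback";
  aiAttempts: number;
  errors: string[]; // one message per failed AI attempt, useful for instrumentation
}

// Try the AI action up to `maxAttempts` times, then fall back to a
// pre-configured deterministic step if one was provided.
async function actWithFallback(
  aiAction: Step,
  fallback?: Step,
  maxAttempts: number = 3
): Promise<StepOutcome> {
  const errors: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await aiAction();
      return { succeededVia: "ai", aiAttempts: attempt, errors };
    } catch (err) {
      errors.push(err instanceof Error ? err.message : String(err));
    }
  }
  if (fallback) {
    await fallback(); // a fallback failure propagates to the caller
    return { succeededVia: "fallback", aiAttempts: maxAttempts, errors };
  }
  throw new Error(
    `AI action failed ${maxAttempts} times and no fallback is configured: ${errors.join("; ")}`
  );
}

// Usage, with the objects from the earlier Stagehand examples:
//   await actWithFallback(
//     () => stagehand.act({ action: "Click the 'Sign in' button" }),
//     () => page.getByRole("button", { name: "Sign in" }).click()
//   );
```

Logging the returned `StepOutcome` per step gives the success/failure data needed to decide which steps deserve an explicit selector.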

TypeScript and Schema-Driven Extraction

One of Stagehand's most valuable production features is its Zod-schema-driven extract() method, which bridges the gap between LLM outputs (inherently unstructured strings) and TypeScript's type system. By defining a Zod schema for the data you want to extract — structured like an API response type — Stagehand instructs the LLM to return JSON conforming to that schema, then validates and types the response. A failed validation (LLM returned data that doesn't match the schema) triggers a retry with the validation error as additional context, improving success rates on complex extraction tasks. This schema-first extraction pattern eliminates the manual JSON parsing and type assertion code that typically accompanies LLM-based data extraction.
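
The validate-then-retry loop described above can be sketched without any library. This is an illustration of the pattern, not Stagehand's internals: `extractWithValidation`, `askModel`, and `ParseResult` are hypothetical names, and the parser is passed in so that a Zod schema's `safeParse` (mapped to this result shape) or any other validator can be plugged in.

```typescript
// Result shape loosely modeled on a schema validator's safe-parse output.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Ask the model for data, validate it, and retry with the validation error
// as feedback until it passes or attempts run out.
async function extractWithValidation<T>(
  askModel: (feedback?: string) => Promise<unknown>,
  parse: (raw: unknown) => ParseResult<T>,
  maxAttempts: number = 2
): Promise<T> {
  let feedback: string | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await askModel(feedback);
    const result = parse(raw);
    if (result.success) {
      return result.data; // typed as T from here on
    }
    feedback = result.error; // passed to the model on the next attempt
  }
  throw new Error(`Extraction failed validation after ${maxAttempts} attempts: ${feedback}`);
}

// Example validator for a { stars: number } payload.
function parseStars(raw: unknown): ParseResult<{ stars: number }> {
  if (typeof raw === "object" && raw !== null && typeof (raw as { stars?: unknown }).stars === "number") {
    return { success: true, data: { stars: (raw as { stars: number }).stars } };
  }
  return { success: false, error: "stars must be a number" };
}
```

The caller receives a fully typed value or an exception, so downstream code never handles unvalidated model output.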

For teams building data pipelines that extract structured information from web pages at scale — competitive pricing data, product catalog information, job listings — the Zod schema approach provides both type safety and a natural quality gate. Extraction schemas can be versioned alongside code, and schema changes automatically update the TypeScript types for the extracted data throughout the consuming codebase. The Playwright AI selector generation approach, in contrast, does not have a native structured extraction equivalent — it generates imperative Playwright code to locate and read specific elements, which is more brittle than schema-based extraction when page structure changes, and requires manual type annotation of the extracted values.

Methodology

Data sourced from GitHub repositories (star counts as of February 2026), npm weekly download statistics (January 2026), official documentation, and community discussions on Discord and Twitter/X. Cost estimates assume $3 per 1M input tokens and $15 per 1M output tokens as of Q1 2026; this matches Anthropic's list price for Claude 3.5 Sonnet, whereas OpenAI has listed GPT-4o lower ($2.50/$10 per 1M), so check current pricing for the model you use. Reliability metrics from community benchmarks and Stagehand's published evaluation data. browser-use JS port status verified from official GitHub repository.

Related: Playwright vs Cypress vs Puppeteer for traditional E2E testing, or MSW vs Nock vs Axios Mock Adapter for API mocking in tests.
