
Vercel AI SDK v4: generateText, streamText, and Tools in 2026

PkgPulse Team

TL;DR

The Vercel AI SDK (package: ai) has become the de facto standard for building LLM-powered TypeScript applications — it abstracts OpenAI, Anthropic, Google Gemini, Mistral, and 20+ other providers behind a unified API. Version 4 (released late 2024) brought generateObject for structured outputs, multi-step tool calling (agents that chain tool invocations automatically), useObject for streaming structured data to React, and provider middleware for logging and caching. If you're building AI features in a Next.js or Node.js app in 2026, the AI SDK's generateText / streamText pattern is the starting point.

Key Takeaways

  • generateText is for non-streaming LLM calls with text output — simple Q&A, batch processing, classification
  • streamText is for real-time streaming — chat interfaces, progressive content generation, anything where waiting for the full response hurts UX
  • generateObject returns structured, schema-validated objects — the single best feature in v4 for structured AI outputs without prompt engineering
  • Tool calling (function calling) lets LLMs invoke TypeScript functions — the foundation of AI agents
  • maxSteps enables automatic multi-step tool execution — the LLM calls a tool, gets results, calls another tool, without your code managing the loop
  • Provider middleware in v4 lets you add caching, logging, and rate limiting to any provider call
  • npm downloads: ai package ~3.5M/week as of March 2026, up from ~800K/week in early 2024

Why the AI SDK Matters

Without an abstraction layer, every LLM provider has a different API:

// OpenAI
const response = await openai.chat.completions.create({ model, messages, stream })

// Anthropic
const response = await anthropic.messages.create({ model, messages, max_tokens })

// Google Gemini
const response = await generativeModel.generateContent({ contents, config })

Every provider has different request formats, response shapes, streaming protocols, error codes, and token counting semantics. The Vercel AI SDK normalizes all of this:

import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { anthropic } from '@ai-sdk/anthropic'

// Same call, different providers
const { text } = await generateText({ model: openai('gpt-4o'), prompt })
const { text: claudeText } = await generateText({ model: anthropic('claude-3-7-sonnet-20250219'), prompt })

Swapping providers is a one-line change. No streaming protocol differences, no response shape differences, no error handling differences.


Core APIs in AI SDK v4

generateText — Non-Streaming Text Generation

generateText makes a complete LLM request and returns when the model is done:

import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'

const { text, usage, finishReason } = await generateText({
  model: openai('gpt-4o-mini'),
  system: 'You are a helpful assistant.',
  prompt: 'Explain the difference between async and defer in HTML script tags.',
})

console.log(text)
// → "The `async` attribute..."

console.log(usage)
// → { promptTokens: 42, completionTokens: 187, totalTokens: 229 }

When to use generateText:

  • Classification tasks (tag this email, categorize this content)
  • Batch processing where you want to wait for complete output
  • Generating content that gets stored (product descriptions, summaries)
  • Single-turn Q&A where streaming adds complexity without UX benefit

The messages array supports full conversation history:

const { text } = await generateText({
  model: openai('gpt-4o'),
  messages: [
    { role: 'user', content: 'What is tRPC?' },
    { role: 'assistant', content: 'tRPC is a library...' },
    { role: 'user', content: 'How does it compare to GraphQL?' },
  ],
})

streamText — Streaming Text Generation

streamText streams tokens as they're generated — essential for chat interfaces where users shouldn't wait 3–5 seconds for a response:

import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'

const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Write a haiku about TypeScript.',
})

// Stream to stdout
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

// Or use in a Next.js Route Handler
return result.toDataStreamResponse()

toDataStreamResponse() returns a Response whose body is a ReadableStream in the Vercel AI Data Stream format — the protocol that the useChat and useCompletion React hooks understand natively.

In a Next.js App Router route handler:

// app/api/chat/route.ts
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'

export async function POST(req: Request) {
  const { messages } = await req.json()

  const result = streamText({
    model: openai('gpt-4o'),
    messages,
  })

  return result.toDataStreamResponse()
}

Client-side with useChat:

// app/chat/page.tsx
'use client'
import { useChat } from '@ai-sdk/react'

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit } = useChat()

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>{m.role}: {m.content}</div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  )
}

useChat manages message state, streaming, and error handling — the entire chat pattern in ~20 lines.

generateObject — Structured Output

generateObject is the feature that makes AI SDK v4 indispensable for production applications. Instead of returning text, it returns a schema-validated TypeScript object:

import { generateObject } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: z.object({
    title: z.string(),
    tags: z.array(z.string()).max(5),
    sentiment: z.enum(['positive', 'neutral', 'negative']),
    confidence: z.number().min(0).max(1),
  }),
  prompt: 'Analyze this review: "Great product, fast shipping, would buy again!"',
})

// object is typed as:
// {
//   title: string,
//   tags: string[],
//   sentiment: 'positive' | 'neutral' | 'negative',
//   confidence: number
// }
console.log(object.sentiment) // → 'positive'
console.log(object.confidence) // → 0.97

Under the hood, generateObject uses the model's native JSON mode or function calling to guarantee a valid JSON response, then validates it against your Zod schema. If validation fails, it retries (configurable with maxRetries).

streamObject — the streaming equivalent — progressively yields object properties as they arrive, enabling UI that populates fields in real time rather than waiting for the complete object.


Tool Calling in v4

Tools let LLMs call TypeScript functions during generation — the primitive behind AI agents, RAG, and any LLM that needs to take actions or retrieve information.

Defining Tools

import { generateText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const { text, toolCalls, toolResults } = await generateText({
  model: openai('gpt-4o'),
  tools: {
    getWeather: tool({
      description: 'Get the current weather for a city',
      parameters: z.object({
        city: z.string().describe('The city name'),
        unit: z.enum(['celsius', 'fahrenheit']).default('celsius'),
      }),
      execute: async ({ city, unit }) => {
        // Your actual implementation
        return { temperature: 18, condition: 'partly cloudy', unit }
      },
    }),
  },
  prompt: 'What is the weather like in Paris right now?',
})

The LLM decides when to call tools, what parameters to pass, and incorporates the results into its response. Your execute function can call databases, APIs, file systems, or anything else.

Multi-Step Tool Calling with maxSteps

maxSteps is the v4 feature that enables real AI agents without managing a manual loop:

const { text, steps } = await generateText({
  model: openai('gpt-4o'),
  maxSteps: 5,  // Allow up to 5 tool call rounds
  tools: {
    searchDocs: tool({ ... }),
    fetchPage: tool({ ... }),
    summarize: tool({ ... }),
  },
  prompt: 'Find the changelog for Drizzle ORM and summarize the v1.0 release.',
})

Without maxSteps, after the LLM makes a tool call, you need to send the tool result back, wait for another response, check if it wants to call another tool, and repeat manually. With maxSteps: 5, the SDK manages this loop automatically — the LLM can call searchDocs, get results, call fetchPage on a result, get that content, then call summarize — all in a single generateText call.

The steps array in the response contains every round of the agent loop — useful for debugging and displaying intermediate reasoning.
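A small helper for that kind of debugging — a sketch assuming the v4 step shape (toolCalls entries with toolName and args, toolResults entries with result, plus any generated text):

```typescript
// Log every round of the agent loop from the `steps` array
// returned by a multi-step generateText call.
function logSteps(steps: Array<{
  toolCalls: Array<{ toolName: string; args: unknown }>
  toolResults: Array<{ result: unknown }>
  text: string
}>) {
  steps.forEach((step, i) => {
    console.log(`--- step ${i + 1} ---`)
    for (const call of step.toolCalls) {
      console.log('tool call:', call.toolName, JSON.stringify(call.args))
    }
    for (const res of step.toolResults) {
      console.log('tool result:', JSON.stringify(res.result))
    }
    if (step.text) console.log('text:', step.text)
  })
}
```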


useObject — Streaming Structured Data to React

A v4 addition that pairs with streamObject for React UIs that populate progressively:

'use client'
import { experimental_useObject as useObject } from '@ai-sdk/react'
import { z } from 'zod'

const schema = z.object({
  summary: z.string(),
  keyPoints: z.array(z.string()),
  readingTime: z.number(),
})

export default function SummaryPage() {
  const { object, submit, isLoading } = useObject({
    api: '/api/summarize',
    schema,
  })

  return (
    <div>
      <button onClick={() => submit({ url: 'https://example.com/article' })}>
        Summarize
      </button>
      {isLoading && <Spinner />}
      {object?.summary && <p>{object.summary}</p>}
      {object?.keyPoints?.map(point => <li key={point}>{point}</li>)}
    </div>
  )
}

The object is typed from the schema and populates field by field as the LLM generates — object.summary might be available before object.keyPoints finishes streaming.


Provider Middleware

v4 introduced provider middleware — a composable layer for cross-cutting concerns:

import { wrapLanguageModel, extractReasoningMiddleware } from 'ai'
import { openai } from '@ai-sdk/openai'
import { anthropic } from '@ai-sdk/anthropic'

// Cache responses (great for development/testing)
const cachedOpenAI = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: {
    wrapGenerate: async ({ doGenerate, params }) => {
      const key = JSON.stringify(params)
      const cached = await cache.get(key)
      if (cached) return cached

      const result = await doGenerate()
      await cache.set(key, result, { ex: 3600 })
      return result
    },
  },
})

// Extract chain-of-thought reasoning from extended thinking models
const reasoningModel = wrapLanguageModel({
  model: anthropic('claude-3-7-sonnet-20250219'),
  middleware: extractReasoningMiddleware({ tagName: 'thinking' }),
})

Common middleware patterns:

  • Caching: Store responses in Redis during development to avoid API costs during iteration
  • Logging: Record all LLM calls with timing, token usage, and cost estimates
  • Fallback: Try the primary model, fall back to a cheaper model on error or timeout
  • Rate limiting: Enforce per-user token budgets before hitting the provider

AI SDK vs Alternatives

| Factor | AI SDK (ai) | LangChain.js | Direct SDK |
| --- | --- | --- | --- |
| Provider abstraction | ✅ 20+ providers | ✅ Many providers | ❌ One provider |
| Streaming | ✅ Native | ⚠️ Complex | Varies |
| React hooks | ✅ useChat, useCompletion, useObject | ❌ None | ❌ None |
| Structured output | ✅ generateObject | ✅ (via chains) | ⚠️ Manual |
| Bundle size | ~45kB | ~500kB+ | Provider-specific |
| Agent support | ✅ maxSteps | ✅ LangGraph | ⚠️ Manual loop |
| npm downloads | ~3.5M/week | ~1.2M/week | N/A |
| Learning curve | Low | High | Low |

For most Next.js applications building AI features, the AI SDK is the right starting point. LangChain.js is worth considering for complex agent workflows with memory, retrieval, and multi-agent orchestration.


When Not to Use the AI SDK

The AI SDK is not always the right choice:

  • Pure Python data pipelines — use the provider's Python SDK directly or LangChain Python
  • Complex agentic workflows — consider Mastra (TypeScript), LangGraph (Python/TS), or AutoGen
  • On-device inference — the AI SDK targets API-based providers; for on-device models, use onnxruntime or llama.cpp bindings
  • High-throughput batch processing — the AI SDK's abstractions add overhead; direct provider calls may be better for 100K+ batch jobs

Methodology

  • npm download data from npmjs.com, March 2026 weekly averages
  • Code examples tested against ai v4.x (latest stable), @ai-sdk/openai v1.x, @ai-sdk/anthropic v1.x
  • Sources: Vercel AI SDK official documentation, GitHub repo, AI SDK changelog

Compare AI SDK with other LLM client libraries on PkgPulse — download trends, bundle sizes, and dependency health.

Related: Vercel AI SDK vs OpenAI SDK vs Anthropic SDK 2026 · LangChain.js vs Vercel AI SDK 2026 · Mastra vs LangChain.js vs Genkit 2026
