OpenAI Agents SDK vs Mastra vs Genkit: JavaScript AI Agent Frameworks 2026
TL;DR
For JavaScript AI agents in 2026, OpenAI Agents SDK wins for teams already using OpenAI APIs who want the most straightforward tool-calling and handoff patterns. Mastra wins for production TypeScript applications needing database-backed memory, built-in RAG, and framework integrations. Google Genkit wins for multi-provider flexibility and teams on Firebase/GCP. LangChain.js remains an option for complex chains but is increasingly heavy compared to these focused alternatives.
Key Takeaways
- OpenAI Agents SDK (JS) is the official JavaScript port of OpenAI's Python Agents SDK — minimal API, focused on tools and agent handoffs
- Mastra is TypeScript-first with built-in PostgreSQL memory, vector search, and Next.js/Express integrations — the "batteries included" option
- Google Genkit supports 10+ LLM providers (Gemini, OpenAI, Anthropic, Ollama) via plugins — best for provider-agnostic agents
- LangChain.js has 3M+ weekly downloads but its abstraction overhead is increasingly seen as unnecessary for modern tool-calling APIs
- All three are production-usable in 2026 — the "AI agents are too experimental" era is over
- Memory architecture matters most: short-term (context window), long-term (database), and semantic (vector search) serve different needs
What Makes 2026 Different for AI Agents
Two years ago, "AI agents" meant fragile ReAct loops that broke unpredictably. Today, the infrastructure has matured significantly:
- Structured outputs are reliable. GPT-4o, Claude 3.5+, and Gemini 2.0 all support JSON schema-constrained outputs that make tool-calling deterministic.
- Context windows are large enough. 128K–1M token windows mean agents can hold multi-turn conversations with enough context for complex tasks.
- Tool call reliability has improved. Modern LLMs rarely hallucinate tool calls that don't exist or confuse parameter types.
The frameworks have caught up too. The question is no longer "will this work?" but "which abstraction level matches my needs?"
OpenAI Agents SDK (JS): Official and Minimal
npm: @openai/agents | weekly downloads: ~120K | latest: 0.0.x (early 2025 release) | provider: OpenAI-only
The JavaScript port of OpenAI's Python Agents SDK brings the same minimal design philosophy: define tools as functions, define agents that use those tools, and let the SDK handle the loop.
npm install @openai/agents
Basic agent with tools:
import { Agent, tool, run } from '@openai/agents'
import { z } from 'zod'

// Define tools with Zod schemas — automatically validates inputs
const searchTool = tool({
  name: 'search_web',
  description: 'Search the web for current information',
  parameters: z.object({
    query: z.string().describe('Search query'),
    maxResults: z.number().optional().default(5),
  }),
  execute: async ({ query, maxResults }) => {
    // webSearch is a placeholder for your own search client
    const results = await webSearch(query, maxResults)
    return results.map(r => ({ title: r.title, snippet: r.snippet, url: r.url }))
  },
})

const codeExecutorTool = tool({
  name: 'execute_code',
  description: 'Execute JavaScript code in a sandboxed environment',
  parameters: z.object({
    code: z.string().describe('JavaScript code to execute'),
  }),
  execute: async ({ code }) => {
    // runInSandbox is a placeholder for your sandboxing layer
    return await runInSandbox(code)
  },
})
// Define the agent
const researchAgent = new Agent({
  name: 'Research Assistant',
  model: 'gpt-4o',
  instructions: `You are a research assistant. Use the search tool to find
current information and the code executor to analyze data. Always cite sources.`,
  tools: [searchTool, codeExecutorTool],
})

// Run the agent
const result = await run(researchAgent, 'What are the top 5 npm packages by weekly downloads?')
console.log(result.finalOutput)
Multi-agent handoffs — the SDK's most distinctive feature:
import { Agent, handoff, run } from '@openai/agents'

// codingAgent and knowledgeAgent are specialized agents defined elsewhere
const triageAgent = new Agent({
  name: 'Triage',
  instructions: 'Determine if the request is a coding question or a general knowledge question, then hand off.',
  handoffs: [
    handoff(codingAgent, {
      // The tool description tells the LLM when to pick this handoff
      toolDescriptionOverride: 'Hand off when the user asks about programming, code, or technical topics',
    }),
    handoff(knowledgeAgent, {
      toolDescriptionOverride: 'Hand off for all other questions',
    }),
  ],
})

// Triage routes automatically to the right specialized agent
const result = await run(triageAgent, 'How do I sort an array in TypeScript?')
// → Automatically handed off to codingAgent
Guardrails (input/output validation):
import { Agent } from '@openai/agents'
import type { InputGuardrail } from '@openai/agents'

// Guardrails are plain objects; a tripped guardrail halts the run before the model responds
const safetyGuardrail: InputGuardrail = {
  name: 'pii-detector',
  execute: async ({ input }) => {
    // containsPII is a placeholder for your own detector
    if (containsPII(input)) {
      return { tripwireTriggered: true, outputInfo: { message: 'Input contains personal data' } }
    }
    return { tripwireTriggered: false, outputInfo: {} }
  },
}

const agent = new Agent({
  name: 'Assistant',
  model: 'gpt-4o',
  instructions: 'You are a helpful assistant.',
  inputGuardrails: [safetyGuardrail],
})
Limitations: OpenAI-only (no Claude, no Gemini), no built-in memory persistence, no built-in vector search. The SDK is intentionally minimal — you add your own infrastructure.
Mastra: TypeScript-First Production Agents
npm: mastra | weekly downloads: ~18K | latest: 0.x | provider: Multi-model
Mastra is an opinionated TypeScript framework that packages everything you need for production agents: database-backed memory with PostgreSQL, built-in RAG with vector search, workflow orchestration, and integrations with popular frameworks.
npm install mastra @mastra/core
npx mastra init
Agent with persistent memory:
import { Agent } from '@mastra/core/agent'
import { Memory } from '@mastra/memory'
import { openai } from '@ai-sdk/openai'
import { PostgresStore, PgVector } from '@mastra/pg'

// Memory persists across conversations in PostgreSQL
const memory = new Memory({
  storage: new PostgresStore({ connectionString: process.env.DATABASE_URL }),
  vector: new PgVector({ connectionString: process.env.DATABASE_URL }), // vector search
  embedder: openai.embedding('text-embedding-3-small'),
  options: {
    semanticRecall: {
      topK: 5,
      messageRange: { before: 2, after: 2 },
    },
    lastMessages: 10, // keep the 10 most recent messages in context
    workingMemory: { enabled: true }, // structured facts the agent extracts
  },
})

const customerAgent = new Agent({
  name: 'Customer Support Agent',
  instructions: `You are a customer support agent. Use memory to remember
user preferences and past issues. Always check conversation history
before asking for information the user already provided.`,
  model: openai('gpt-4o'),
  memory,
  tools: { lookupOrder, createTicket, checkStatus }, // tools defined elsewhere
})

// Memory is scoped by resourceId (the user) and threadId (the conversation)
const response = await customerAgent.generate(
  'What was the issue with my last order?',
  { threadId: 'support-session-123', resourceId: `user-${userId}` }
)
Built-in RAG (Retrieval-Augmented Generation):
import { Agent } from '@mastra/core/agent'
import { createVectorQueryTool } from '@mastra/rag'
import { PgVector } from '@mastra/pg'
import { openai } from '@ai-sdk/openai'

const vectorStore = new PgVector({ connectionString: process.env.DATABASE_URL })

// Index your documentation (embed is a placeholder for your embedding helper)
await vectorStore.createIndex({ indexName: 'product-docs', dimension: 1536 })
await vectorStore.upsert({
  indexName: 'product-docs',
  vectors: await Promise.all(documentChunks.map((chunk) => embed(chunk.text))),
  metadata: documentChunks.map((chunk) => ({ text: chunk.text, source: chunk.source })),
  ids: documentChunks.map((chunk) => chunk.id),
})

// Agent retrieves relevant docs via a vector query tool
const docAgent = new Agent({
  name: 'Documentation Agent',
  model: openai('gpt-4o-mini'),
  tools: {
    searchDocs: createVectorQueryTool({
      vectorStoreName: 'pgVector', // the name the store is registered under in your Mastra instance
      indexName: 'product-docs',
      model: openai.embedding('text-embedding-3-small'),
    }),
  },
})
Mastra workflows — for multi-step agent pipelines:
import { createWorkflow, createStep } from '@mastra/core/workflows'
import { z } from 'zod'

const text = z.object({ text: z.string() })

const research = createStep({
  id: 'research',
  inputSchema: z.object({ topic: z.string() }),
  outputSchema: text,
  execute: async ({ inputData }) =>
    ({ text: (await researchAgent.generate(`Research the topic: ${inputData.topic}`)).text }),
})

const outline = createStep({
  id: 'outline',
  inputSchema: text,
  outputSchema: text,
  execute: async ({ inputData }) =>
    ({ text: (await outlineAgent.generate(`Create outline based on: ${inputData.text}`)).text }),
})

const write = createStep({
  id: 'write',
  inputSchema: text,
  outputSchema: text,
  execute: async ({ inputData }) =>
    ({ text: (await writerAgent.generate(`Write article from outline: ${inputData.text}`)).text }),
})

// Each step's output feeds the next step's inputData
const contentPipeline = createWorkflow({
  id: 'content-generation',
  inputSchema: z.object({ topic: z.string() }),
  outputSchema: text,
})
  .then(research)
  .then(outline)
  .then(write)
  .commit()
Framework integrations: Mastra has first-class support for Next.js (server actions, API routes), Express, and Hono. The mastra dev command starts a local development server with a visual agent playground.
Self-hosting: Mastra agents can deploy anywhere Node.js runs. The memory layer requires PostgreSQL with the pgvector extension.
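Assuming a standard PostgreSQL instance, enabling the extension is a one-time provisioning step on the database side (the connection string is a placeholder):

```shell
# Enable the pgvector extension on the target database
psql "$DATABASE_URL" -c 'CREATE EXTENSION IF NOT EXISTS vector;'

# Verify it is installed
psql "$DATABASE_URL" -c "SELECT extname FROM pg_extension WHERE extname = 'vector';"
```

Managed providers (Neon, Supabase, RDS) ship pgvector as an installable extension, so the same statement works there.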
Google Genkit: Multi-Provider Flexibility
npm: genkit | weekly downloads: ~35K | latest: 1.x | provider: Multi-model via plugins
Google Genkit is the official Google AI framework for Node.js. Its strength is provider-agnosticism — the same Genkit flow can run against Gemini, GPT-4o, Claude, or a local Ollama model by swapping the plugin.
npm install genkit @genkit-ai/googleai
Basic flow with tools:
import { genkit, z } from 'genkit'
import { googleAI } from '@genkit-ai/googleai'

const ai = genkit({
  plugins: [googleAI()],
  model: 'googleai/gemini-2.0-flash',
})

// Genkit uses "flows" rather than "agents"
export const researchFlow = ai.defineFlow(
  {
    name: 'research-flow',
    inputSchema: z.object({ query: z.string() }),
    outputSchema: z.object({ summary: z.string(), sources: z.array(z.string()) }),
  },
  async (input) => {
    // Define a tool inline (webSearch is a placeholder for your own search client)
    const searchTool = ai.defineTool(
      {
        name: 'search',
        description: 'Search for information',
        inputSchema: z.object({ query: z.string() }),
        outputSchema: z.array(z.object({ title: z.string(), snippet: z.string() })),
      },
      async ({ query }) => webSearch(query)
    )

    const response = await ai.generate({
      tools: [searchTool],
      prompt: `Research: ${input.query}. Provide a summary with cited sources.`,
    })

    return {
      summary: response.text,
      sources: extractSources(response.text), // extractSources is your own parser
    }
  }
)

// Run the flow
const result = await researchFlow({ query: 'TypeScript 5.5 new features' })
Multi-provider agent — swapping models without changing code:
import { genkit } from 'genkit'
import { googleAI } from '@genkit-ai/googleai'
import { openAI } from 'genkitx-openai'
import { anthropic } from 'genkitx-anthropic'

const ai = genkit({
  plugins: [googleAI(), openAI(), anthropic()],
})

// Switch models with a config change
const response = await ai.generate({
  model: process.env.AI_MODEL || 'googleai/gemini-2.0-flash', // or 'openai/gpt-4o' or 'anthropic/claude-opus-4-5'
  prompt: 'Explain async/await in JavaScript',
})
Genkit's development UI: Running genkit start opens a local UI for testing flows, inspecting traces, and managing prompts — similar to Mastra's playground but more polished for Gemini-focused development.
Firebase/Cloud Run integration: Genkit flows deploy natively to Firebase Functions or Cloud Run:
import { onCallGenkit } from 'firebase-functions/https'

// Expose a Genkit flow as a callable Firebase Function
export const myFlow = onCallGenkit(
  {
    enforceAppCheck: true, // reject requests without a valid App Check token
  },
  researchFlow
)
Limitation: Genkit's agent patterns (multi-step tool calling) are less ergonomic than OpenAI Agents SDK's explicit handoff model. It excels at flows with predictable steps, but complex multi-agent orchestration requires more manual implementation.
Framework Comparison
| Feature | OpenAI Agents SDK | Mastra | Genkit |
|---|---|---|---|
| npm downloads/week | ~120K | ~18K | ~35K |
| Provider support | OpenAI only | Multi (via ai-sdk) | Multi (via plugins) |
| Built-in memory | ❌ | ✅ PostgreSQL + vector | ⚠️ Limited |
| Built-in RAG | ❌ | ✅ | ⚠️ Manual |
| Agent handoffs | ✅ Native | Via workflows | ❌ Manual |
| TypeScript first | ✅ | ✅ | ✅ |
| Dev UI | ❌ | ✅ | ✅ |
| Framework integrations | Any | Next.js, Express, Hono | Firebase, Cloud Run |
| Self-hosting | Yes (OpenAI API required) | Yes (requires PostgreSQL) | Yes |
| Learning curve | Low | Medium | Medium |
Memory Architecture: The Key Differentiator
How an agent remembers things determines more of its practical behavior than the choice of model. There are three types:
1. Short-term (context window). All three frameworks support this. It's just the conversation messages passed to the LLM. Fast, no infrastructure needed, limited by context window size (128K–1M tokens depending on model).
2. Long-term (database). Mastra has first-class support; OpenAI Agents SDK and Genkit leave this to you. Implement it with PostgreSQL plus a memory schema that loads relevant conversation history into context.
3. Semantic (vector search). Mastra's Memory with semanticRecall retrieves semantically similar past conversations, not just recent ones. "What was the issue with my order last month?" returns the relevant exchange even if 100 conversations happened in between.
For customer support, coding assistants, and research agents, semantic memory is the difference between a frustrating chatbot and a genuinely useful assistant.
Decision Guide
Choose OpenAI Agents SDK if:
- Your entire stack uses OpenAI models (no plans to switch providers)
- You want the minimal API with explicit handoff patterns
- You'll build your own memory and RAG infrastructure
- You're migrating from the Python Agents SDK and want API parity
Choose Mastra if:
- You need production-ready memory with PostgreSQL out of the box
- You want built-in RAG without integrating separate vector databases
- You're building Next.js applications and want seamless integration
- Multi-model support matters (run GPT-4o now, Claude later without code changes)
Choose Genkit if:
- You're building on Firebase or GCP
- Provider flexibility across 10+ LLMs is a hard requirement
- You want a visual development UI and tracing out of the box
- The Gemini ecosystem is your primary target
Methodology
- npm download data from npmjs.com (March 2026)
- Feature comparison from official documentation: OpenAI Agents SDK docs, Mastra docs (mastra.ai), Google Genkit docs
- Memory architecture analysis based on production pattern documentation
- Provider support verified against each framework's plugin ecosystem
Evaluating AI packages for your JavaScript project? Check out PkgPulse's package comparisons for live npm health scores and download trends across the AI ecosystem.