OpenAI Chat Completions vs Responses API vs Assistants API 2026
TL;DR
OpenAI offers three distinct API surfaces for building AI-powered applications, each targeting a different level of abstraction. Chat Completions is the foundational stateless API — one request, one response, you manage conversation history yourself; it's the most flexible, best understood, and the right default for most production apps. The Responses API (launched early 2025) is OpenAI's new unified API that adds built-in server-side conversation state, multi-turn chaining, and direct tool result handling with a cleaner DX than raw Completions — it's the stated future direction for OpenAI's API surface. The Assistants API is the high-level managed stateful agent API — persistent Threads, file search, Code Interpreter, vector store integration, all server-side; it removes boilerplate but trades control for convenience and tends to cost more overall due to managed-state and tool overhead. For simple chat or inference: Chat Completions. For multi-step agentic flows with conversation state: Responses API. For document Q&A or code execution with minimal code: Assistants API.
Key Takeaways
- Chat Completions is stateless — you send the full message history every request, you own persistence
- Responses API maintains state server-side — previous_response_id chains turns without resending history
- Assistants API manages Threads — persistent conversation objects with a fully OpenAI-managed lifecycle
- Responses API supports built-in tool calls — web search, file search, computer use as first-class tools
- Assistants API has Code Interpreter — runs Python sandboxed, generates charts, processes files
- Chat Completions is cheapest — no state overhead, pay only for tokens in/out
- Responses API supersedes Assistants API patterns — OpenAI has said Assistants will be deprecated once Responses reaches feature parity, converging on Responses as the primary stateful API
API Architecture Overview
Chat Completions Responses API Assistants API
───────────────── ───────────────── ─────────────────
Stateless Stateful Stateful + Managed
You own history Server state chain Server Threads/Runs
Raw tool results Built-in tools Code Interpreter
No file search Built-in file search File Search tool
Standard streaming Streaming events Streaming runs
Cheapest Mid-cost Most expensive
Chat Completions: The Stateless Foundation
Chat Completions is the workhorse API — every OpenAI model is available, every feature is supported, and you have complete control.
Installation
npm install openai
Basic Request
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "What is the capital of France?" },
],
temperature: 0.7,
max_tokens: 500,
});
console.log(response.choices[0].message.content);
// "The capital of France is Paris."
Multi-Turn Conversation (Manual History)
import OpenAI from "openai";
import type { ChatCompletionMessageParam } from "openai/resources";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// You manage the conversation array
const messages: ChatCompletionMessageParam[] = [
{ role: "system", content: "You are a concise coding assistant." },
];
async function chat(userMessage: string): Promise<string> {
// Add user message
messages.push({ role: "user", content: userMessage });
const response = await client.chat.completions.create({
model: "gpt-4o",
messages,
temperature: 0.3,
});
const assistantMessage = response.choices[0].message.content ?? "";
// Add assistant response to history
messages.push({ role: "assistant", content: assistantMessage });
return assistantMessage;
}
// Usage
await chat("How do I reverse a string in JavaScript?");
await chat("Can you show me a one-liner version?"); // Full history sent again
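Because the full array is resent on every turn, token usage grows with conversation length. A common mitigation is trimming older turns before each request — a minimal sketch (the Msg type and the turn-count heuristic are illustrative; production code often trims by token count with a tokenizer instead):

```typescript
// Minimal message shape, mirroring the SDK's ChatCompletionMessageParam.
type Msg = { role: "system" | "user" | "assistant"; content: string };

// Keep the system prompt plus the most recent `maxTurns` user/assistant pairs.
// Trimming by message count is a simplification — trimming by token count
// is more precise but requires a tokenizer.
function trimHistory(messages: Msg[], maxTurns: number): Msg[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  return [...system, ...rest.slice(-maxTurns * 2)];
}
```

Call trimHistory(messages, 10) before each create call to cap payload size. Note that trimmed context is simply lost, so facts worth keeping should be re-stated or summarized into the system prompt.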
Streaming
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Next.js API Route with streaming
export async function POST(req: Request) {
const { messages } = await req.json();
const stream = await client.chat.completions.create({
model: "gpt-4o",
messages,
stream: true,
});
const encoder = new TextEncoder();
const readable = new ReadableStream({
async start(controller) {
for await (const chunk of stream) {
const delta = chunk.choices[0]?.delta?.content;
if (delta) {
controller.enqueue(encoder.encode(`data: ${JSON.stringify({ delta })}\n\n`));
}
}
controller.enqueue(encoder.encode("data: [DONE]\n\n"));
controller.close();
},
});
return new Response(readable, {
headers: {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
},
});
}
Tool Calling (Function Calls)
import OpenAI from "openai";
import type { ChatCompletionTool } from "openai/resources";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const tools: ChatCompletionTool[] = [
{
type: "function",
function: {
name: "get_weather",
description: "Get current weather for a city",
parameters: {
type: "object",
properties: {
city: { type: "string", description: "City name" },
unit: { type: "string", enum: ["celsius", "fahrenheit"] },
},
required: ["city"],
},
},
},
{
type: "function",
function: {
name: "search_web",
description: "Search the web for current information",
parameters: {
type: "object",
properties: {
query: { type: "string" },
},
required: ["query"],
},
},
},
];
// Tool execution dispatcher
async function executeTool(name: string, args: Record<string, unknown>): Promise<string> {
if (name === "get_weather") {
const { city } = args as { city: string; unit?: string };
// Call your weather API
return JSON.stringify({ temp: 22, condition: "sunny", city });
}
if (name === "search_web") {
// Call your search API
return JSON.stringify({ results: ["Result 1", "Result 2"] });
}
return JSON.stringify({ error: "Unknown tool" });
}
// Agentic loop
async function runWithTools(userMessage: string): Promise<string> {
const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
{ role: "user", content: userMessage },
];
while (true) {
const response = await client.chat.completions.create({
model: "gpt-4o",
messages,
tools,
tool_choice: "auto",
});
const choice = response.choices[0];
messages.push(choice.message); // Add assistant message with tool calls
if (choice.finish_reason === "stop") {
return choice.message.content ?? "";
}
if (choice.finish_reason === "tool_calls") {
// Execute all tool calls in parallel
const toolResults = await Promise.all(
(choice.message.tool_calls ?? []).map(async (toolCall) => {
const result = await executeTool(
toolCall.function.name,
JSON.parse(toolCall.function.arguments)
);
return {
role: "tool" as const,
tool_call_id: toolCall.id,
content: result,
};
})
);
messages.push(...toolResults);
}
}
}
const answer = await runWithTools("What's the weather in Tokyo and what's trending on HN today?");
Structured Output
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const ArticleSchema = z.object({
title: z.string(),
summary: z.string(),
tags: z.array(z.string()),
readingTime: z.number(),
keyPoints: z.array(z.string()),
});
const response = await client.beta.chat.completions.parse({
model: "gpt-4o",
messages: [
{
role: "user",
content: "Analyze this article and extract metadata: [article text]",
},
],
response_format: zodResponseFormat(ArticleSchema, "article"),
});
const article = response.choices[0].message.parsed;
// Fully typed: { title: string, summary: string, tags: string[], ... }
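One caveat with the parse helper: when the model refuses a request (for example on safety grounds), parsed is null and the refusal text lands in message.refusal, so it's worth guarding before use. A small sketch — the requireParsed helper is this article's, not part of the SDK:

```typescript
// Throws when the model refused instead of producing parsed output.
// `parsed` and `refusal` both come from response.choices[0].message.
function requireParsed<T>(parsed: T | null, refusal: string | null | undefined): T {
  if (parsed === null) {
    throw new Error(`Model refused to answer: ${refusal ?? "no reason given"}`);
  }
  return parsed;
}

// Usage with the response above:
// const article = requireParsed(
//   response.choices[0].message.parsed,
//   response.choices[0].message.refusal
// );
```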
Responses API: Stateful Multi-Turn
The Responses API (launched 2025) is OpenAI's new primary API — it handles conversation state server-side and includes built-in tools like web search and file search.
Basic Request
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// First turn — no previous_response_id
const response = await client.responses.create({
model: "gpt-4o",
input: "What is the capital of France?",
instructions: "You are a helpful geography assistant.",
});
console.log(response.output_text);
// "The capital of France is Paris."
console.log(response.id); // "resp_01ABC..." — use for next turn
Multi-Turn (Server-Side State)
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Turn 1
const turn1 = await client.responses.create({
model: "gpt-4o",
input: "My name is Sarah and I'm building a React app.",
instructions: "Remember context about the user throughout our conversation.",
});
// Turn 2 — reference previous response, no need to resend history
const turn2 = await client.responses.create({
model: "gpt-4o",
input: "What's a good state management library for my project?",
previous_response_id: turn1.id, // Server looks up prior context
});
// Turn 3
const turn3 = await client.responses.create({
model: "gpt-4o",
input: "Can you show me an example?",
previous_response_id: turn2.id,
});
console.log(turn3.output_text);
// Zustand example tailored to Sarah's React context — no history payload sent
Built-In Web Search
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const response = await client.responses.create({
model: "gpt-4o",
input: "What are the most popular npm packages released in February 2026?",
tools: [{ type: "web_search_preview" }],
});
console.log(response.output_text);
// Response includes web search results synthesized into answer
// Access the raw search results
for (const item of response.output) {
if (item.type === "web_search_call") {
console.log("Searched:", item.status);
}
if (item.type === "message") {
console.log("Answer:", item.content[0].text);
}
}
Built-In File Search
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// First, create a vector store and upload files
const vectorStore = await client.vectorStores.create({
name: "Documentation",
});
await client.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, {
files: [
new File(["Your product documentation content..."], "docs.txt", {
type: "text/plain",
}),
],
});
// Query with file search
const response = await client.responses.create({
model: "gpt-4o",
input: "How do I configure authentication in the product?",
tools: [
{
type: "file_search",
vector_store_ids: [vectorStore.id],
},
],
});
console.log(response.output_text);
// Answer synthesized from uploaded documentation
Streaming with Responses API
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const stream = await client.responses.create({
model: "gpt-4o",
input: "Write a detailed explanation of React's reconciliation algorithm.",
stream: true,
});
for await (const event of stream) {
if (event.type === "response.output_text.delta") {
process.stdout.write(event.delta);
}
if (event.type === "response.completed") {
console.log("\n\nDone. Response ID:", event.response.id);
}
}
Custom Tool Calling
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const response = await client.responses.create({
model: "gpt-4o",
input: "What's the current price of AAPL stock?",
tools: [
{
type: "function",
name: "get_stock_price",
description: "Get the current stock price for a ticker symbol",
parameters: {
type: "object",
properties: {
ticker: { type: "string", description: "Stock ticker symbol" },
},
required: ["ticker"],
},
},
],
tool_choice: "auto",
});
// Handle tool call
for (const item of response.output) {
if (item.type === "function_call") {
const { ticker } = JSON.parse(item.arguments) as { ticker: string };
const price = await fetchStockPrice(ticker); // Your implementation
// Submit tool result and continue
const finalResponse = await client.responses.create({
model: "gpt-4o",
previous_response_id: response.id,
input: [
{
type: "function_call_output",
call_id: item.call_id,
output: JSON.stringify({ price, ticker }),
},
],
});
console.log(finalResponse.output_text);
}
}
async function fetchStockPrice(ticker: string): Promise<number> {
return 185.42; // Placeholder — call your market data provider here
}
Assistants API: Managed Stateful Agents
Assistants API manages Threads, Runs, and built-in tools (Code Interpreter, File Search) fully server-side. Best for document Q&A and code execution use cases with minimal custom infrastructure.
Creating an Assistant
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Create once, reuse by ID
const assistant = await client.beta.assistants.create({
name: "Data Analyst",
instructions:
"You are a data analyst. Use code interpreter to analyze data, create charts, and answer questions about datasets.",
model: "gpt-4o",
tools: [
{ type: "code_interpreter" },
{ type: "file_search" },
],
});
console.log("Assistant ID:", assistant.id);
// "asst_01ABC..." — store this, don't recreate each time
Thread Lifecycle
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const ASSISTANT_ID = "asst_01ABC..."; // From creation
// Create a Thread (persists server-side)
const thread = await client.beta.threads.create();
// Add a message to the Thread
await client.beta.threads.messages.create(thread.id, {
role: "user",
content: "Analyze this sales data and identify trends.",
});
// Run the Assistant on the Thread
const run = await client.beta.threads.runs.createAndPoll(thread.id, {
assistant_id: ASSISTANT_ID,
});
if (run.status === "completed") {
const messages = await client.beta.threads.messages.list(thread.id);
const lastMessage = messages.data[0];
if (lastMessage.content[0].type === "text") {
console.log(lastMessage.content[0].text.value);
}
}
// Next turn — same Thread, conversation continues
await client.beta.threads.messages.create(thread.id, {
role: "user",
content: "Now create a bar chart of the top 5 products.",
});
const run2 = await client.beta.threads.runs.createAndPoll(thread.id, {
assistant_id: ASSISTANT_ID,
});
File Upload and Code Interpreter
import OpenAI from "openai";
import { createReadStream } from "fs";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Upload a CSV file for analysis
const file = await client.files.create({
file: createReadStream("sales-data.csv"),
purpose: "assistants",
});
const ASSISTANT_ID = "asst_01ABC...";
// Create thread with file attachment
const thread = await client.beta.threads.create({
messages: [
{
role: "user",
content: "Analyze this CSV file. What are the top 3 revenue months?",
attachments: [
{
file_id: file.id,
tools: [{ type: "code_interpreter" }],
},
],
},
],
});
const run = await client.beta.threads.runs.createAndPoll(thread.id, {
assistant_id: ASSISTANT_ID,
});
if (run.status === "completed") {
const messages = await client.beta.threads.messages.list(thread.id);
for (const message of messages.data.reverse()) {
if (message.role === "assistant") {
for (const content of message.content) {
if (content.type === "text") {
console.log(content.text.value);
}
// Code Interpreter can generate image files (charts)
if (content.type === "image_file") {
console.log("Chart generated:", content.image_file.file_id);
// Download with client.files.content(content.image_file.file_id)
}
}
}
}
}
File Search (Vector Store)
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Upload documentation files
const vectorStore = await client.vectorStores.create({
name: "Product Docs",
});
await client.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, {
files: [
new File(["Authentication guide..."], "auth.txt", { type: "text/plain" }),
new File(["API reference..."], "api-reference.txt", { type: "text/plain" }),
],
});
// Create assistant with file search
const assistant = await client.beta.assistants.create({
name: "Support Bot",
instructions: "Answer questions using the product documentation. Always cite sources.",
model: "gpt-4o",
tools: [{ type: "file_search" }],
tool_resources: {
file_search: {
vector_store_ids: [vectorStore.id],
},
},
});
// Query
const thread = await client.beta.threads.create();
await client.beta.threads.messages.create(thread.id, {
role: "user",
content: "How do I set up OAuth with your API?",
});
const run = await client.beta.threads.runs.createAndPoll(thread.id, {
assistant_id: assistant.id,
});
const messages = await client.beta.threads.messages.list(thread.id);
const reply = messages.data[0].content[0];
if (reply.type === "text") {
console.log(reply.text.value);
// Includes citations with file names and quote snippets
console.log("Citations:", reply.text.annotations);
}
Streaming Runs
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const ASSISTANT_ID = "asst_01ABC...";
const THREAD_ID = "thread_01ABC..."; // Existing thread
await client.beta.threads.messages.create(THREAD_ID, {
role: "user",
content: "Summarize our conversation so far.",
});
const stream = client.beta.threads.runs.stream(THREAD_ID, {
assistant_id: ASSISTANT_ID,
});
stream
.on("textDelta", (delta) => {
process.stdout.write(delta.value ?? "");
})
.on("toolCallDelta", (delta) => {
if (delta.type === "code_interpreter") {
if (delta.code_interpreter?.input) {
process.stdout.write(delta.code_interpreter.input);
}
}
})
.on("end", () => {
console.log("\nRun complete");
});
await stream.finalRun();
Feature Comparison
| Feature | Chat Completions | Responses API | Assistants API |
|---|---|---|---|
| State management | ❌ You manage | ✅ Server-side chain | ✅ Persistent Threads |
| Conversation history | Manual (resend all) | previous_response_id | Automatic |
| Web search | ❌ Custom tools only | ✅ Built-in | ❌ Custom tools only |
| File search | ❌ Custom only | ✅ Vector stores | ✅ File Search tool |
| Code Interpreter | ❌ | ❌ | ✅ Python sandbox |
| Streaming | ✅ | ✅ | ✅ |
| Structured output | ✅ (response_format) | ✅ | ✅ (response_format on runs) |
| Tool calling | ✅ | ✅ | ✅ |
| All models | ✅ | ✅ | ✅ |
| Cost | Lowest | Mid | Highest |
| Maturity | GA (2023) | GA (2025) | Beta (2023, deprecation announced) |
| Rate limits | Standard | Standard | Thread-scoped |
| Token limit | Context window | Context window | Context window |
| File processing | No | Yes (file search) | Yes (Code Interp + FS) |
| Weekly API usage | Dominant | Growing | Stable |
When to Use Each
Choose Chat Completions if:
- Building a stateless API (classify, summarize, extract) — no multi-turn needed
- You need maximum model flexibility and control
- Custom state storage in your own DB (Redis, Postgres)
- Cost optimization is critical — no overhead from managed state
- Existing codebase built on Chat Completions patterns
- Need response_format with json_schema for guaranteed structure
Choose Responses API if:
- Multi-turn chat where OpenAI managing state is preferable
- Need built-in web search without third-party integration
- Document Q&A using file search with simpler setup than Assistants
- Building new projects — this is OpenAI's current recommended primary API
- Agentic flows where you want easier tool result submission
Choose Assistants API if:
- Document Q&A with many files and automatic retrieval
- Code Interpreter use cases: data analysis, chart generation, math
- Long-lived persistent Threads across multiple user sessions
- Support chatbot or copilot where OpenAI handles full conversation lifecycle
- You want to avoid managing vector stores and file search plumbing yourself
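The decision criteria above compress into a rough routing sketch — the type names and orderings here are this article's heuristics, not an official OpenAI recommendation:

```typescript
type Requirements = {
  multiTurn: boolean;       // conversation state across requests
  codeInterpreter: boolean; // sandboxed Python / chart generation
  builtInSearch: boolean;   // web or file search without custom plumbing
};

// Rough heuristic mirroring the bullets above.
function chooseApi(req: Requirements): "chat_completions" | "responses" | "assistants" {
  if (req.codeInterpreter) return "assistants"; // only surface with a Python sandbox
  if (req.multiTurn || req.builtInSearch) return "responses"; // server state + built-in tools
  return "chat_completions"; // stateless default: cheapest, most control
}
```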
Methodology
Data sourced from OpenAI API documentation (platform.openai.com/docs), OpenAI Responses API announcement (early 2025), npm weekly download statistics for the openai package as of February 2026, OpenAI developer forum discussions, and practical benchmarks comparing API latency and cost across the three surfaces. Pricing data from OpenAI pricing page as of February 2026.
Related: Gemini API vs Claude API vs Mistral API for cross-provider LLM comparisons, or Vercel AI SDK vs LangChain vs LlamaIndex for AI orchestration frameworks.