
OpenAI Chat Completions vs Responses API vs Assistants API 2026

PkgPulse Team


TL;DR

OpenAI offers three distinct API surfaces for building AI-powered applications, each targeting a different level of abstraction. Chat Completions is the foundational stateless API — one request, one response, you manage conversation history yourself; it's the most flexible, best understood, and the right default for most production apps. The Responses API (launched early 2025) is OpenAI's new unified API that adds built-in conversation state, simpler multi-turn chaining, and direct tool result handling with a cleaner developer experience than raw Completions — it's the future direction for OpenAI's API surface. The Assistants API is the high-level managed stateful agent API — persistent Threads, file search, Code Interpreter, and vector store integration, all server-side; it removes boilerplate but trades control for convenience and costs more per token due to managed-state overhead. For simple chat or inference: Chat Completions. For multi-step agentic flows with conversation state: Responses API. For document Q&A with minimal code: Assistants API.

Key Takeaways

  • Chat Completions is stateless — you send the full message history every request, you own persistence
  • Responses API maintains state server-side — previous_response_id chains turns without resending history
  • Assistants API manages Threads — persistent conversation objects with full OpenAI-managed lifecycle
  • Responses API supports built-in tool calls — web search, file search, computer use as first-class tools
  • Assistants API has Code Interpreter — runs Python sandboxed, generates charts, processes files
  • Chat Completions is cheapest — no state overhead, pay only for tokens in/out
  • Responses API supersedes Assistants API patterns — OpenAI is converging on Responses as its primary stateful API
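The cost point above can be made concrete: a stateless API resends the whole transcript on every request, so the payload you manage grows roughly quadratically with turn count, while a server-side chain sends only the newest turn. A back-of-envelope sketch (token counts are hypothetical, and note this models payload size you transmit and manage, not necessarily billed tokens, since chained server-side state may still bill prior context as input):

```typescript
// Rough model: stateless (resend everything) vs stateful (send only the new turn).
// Token numbers are illustrative, not measured.
function cumulativeInputTokens(
  turns: number,
  tokensPerTurn: number,
  stateless: boolean
): number {
  let total = 0;
  for (let t = 1; t <= turns; t++) {
    // Stateless: turn t resends all t messages; stateful: only the newest one.
    total += stateless ? t * tokensPerTurn : tokensPerTurn;
  }
  return total;
}

const stateless = cumulativeInputTokens(10, 200, true); // 200 * (1+2+...+10) = 11000
const stateful = cumulativeInputTokens(10, 200, false); // 200 * 10 = 2000
console.log({ stateless, stateful });
```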

API Architecture Overview

Chat Completions     Responses API        Assistants API
─────────────────    ─────────────────    ─────────────────
Stateless            Stateful             Stateful + Managed
You own history      Server state chain   Server Threads/Runs
Raw tool results     Built-in tools       Code Interpreter
No file search       Built-in file search File Search tool
Standard streaming   Streaming events     Streaming runs
Cheapest             Mid-cost             Most expensive

Chat Completions: The Stateless Foundation

Chat Completions is the workhorse API — every OpenAI model is available, every feature is supported, and you have complete control.

Installation

npm install openai

Basic Request

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
  temperature: 0.7,
  max_tokens: 500,
});

console.log(response.choices[0].message.content);
// "The capital of France is Paris."

Multi-Turn Conversation (Manual History)

import OpenAI from "openai";
import type { ChatCompletionMessageParam } from "openai/resources/chat/completions";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// You manage the conversation array
const messages: ChatCompletionMessageParam[] = [
  { role: "system", content: "You are a concise coding assistant." },
];

async function chat(userMessage: string): Promise<string> {
  // Add user message
  messages.push({ role: "user", content: userMessage });

  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    temperature: 0.3,
  });

  const assistantMessage = response.choices[0].message.content ?? "";

  // Add assistant response to history
  messages.push({ role: "assistant", content: assistantMessage });

  return assistantMessage;
}

// Usage
await chat("How do I reverse a string in JavaScript?");
await chat("Can you show me a one-liner version?"); // Full history sent again
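Because you own the history, you also own keeping it inside the model's context window. A minimal sliding-window trimmer is sketched below; the message shape is simplified to role/content, and a production version would count tokens with a tokenizer rather than counting messages:

```typescript
type Msg = { role: "system" | "user" | "assistant"; content: string };

// Keep the system prompt plus the most recent `maxRecent` messages.
function trimHistory(messages: Msg[], maxRecent: number): Msg[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  return [...system, ...rest.slice(-maxRecent)];
}

const history: Msg[] = [
  { role: "system", content: "You are concise." },
  { role: "user", content: "q1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "q2" },
  { role: "assistant", content: "a2" },
];

console.log(trimHistory(history, 2).map((m) => m.content));
// ["You are concise.", "q2", "a2"]
```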

Streaming

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Next.js API Route with streaming
export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages,
    stream: true,
  });

  const encoder = new TextEncoder();

  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        const delta = chunk.choices[0]?.delta?.content;
        if (delta) {
          controller.enqueue(encoder.encode(`data: ${JSON.stringify({ delta })}\n\n`));
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}

Tool Calling (Function Calls)

import OpenAI from "openai";
import type { ChatCompletionTool } from "openai/resources/chat/completions";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const tools: ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
          unit: { type: "string", enum: ["celsius", "fahrenheit"] },
        },
        required: ["city"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "search_web",
      description: "Search the web for current information",
      parameters: {
        type: "object",
        properties: {
          query: { type: "string" },
        },
        required: ["query"],
      },
    },
  },
];

// Tool execution dispatcher
async function executeTool(name: string, args: Record<string, unknown>): Promise<string> {
  if (name === "get_weather") {
    const { city } = args as { city: string; unit?: string };
    // Call your weather API
    return JSON.stringify({ temp: 22, condition: "sunny", city });
  }
  if (name === "search_web") {
    // Call your search API
    return JSON.stringify({ results: ["Result 1", "Result 2"] });
  }
  return JSON.stringify({ error: "Unknown tool" });
}

// Agentic loop
async function runWithTools(userMessage: string): Promise<string> {
  const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages,
      tools,
      tool_choice: "auto",
    });

    const choice = response.choices[0];
    messages.push(choice.message); // Add assistant message with tool calls

    if (choice.finish_reason === "stop") {
      return choice.message.content ?? "";
    }

    if (choice.finish_reason === "tool_calls") {
      // Execute all tool calls in parallel
      const toolResults = await Promise.all(
        (choice.message.tool_calls ?? []).map(async (toolCall) => {
          const result = await executeTool(
            toolCall.function.name,
            JSON.parse(toolCall.function.arguments)
          );
          return {
            role: "tool" as const,
            tool_call_id: toolCall.id,
            content: result,
          };
        })
      );

      messages.push(...toolResults);
    }
  }
}

const answer = await runWithTools("What's the weather in Tokyo and what's trending on HN today?");

Structured Output

import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const ArticleSchema = z.object({
  title: z.string(),
  summary: z.string(),
  tags: z.array(z.string()),
  readingTime: z.number(),
  keyPoints: z.array(z.string()),
});

const response = await client.beta.chat.completions.parse({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: "Analyze this article and extract metadata: [article text]",
    },
  ],
  response_format: zodResponseFormat(ArticleSchema, "article"),
});

const article = response.choices[0].message.parsed;
// Fully typed: { title: string, summary: string, tags: string[], ... }

Responses API: Stateful Multi-Turn

The Responses API (launched 2025) is OpenAI's new primary API — it handles conversation state server-side and includes built-in tools like web search and file search.

Basic Request

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// First turn — no previous_response_id
const response = await client.responses.create({
  model: "gpt-4o",
  input: "What is the capital of France?",
  instructions: "You are a helpful geography assistant.",
});

console.log(response.output_text);
// "The capital of France is Paris."
console.log(response.id); // "resp_01ABC..." — use for next turn

Multi-Turn (Server-Side State)

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Turn 1
const turn1 = await client.responses.create({
  model: "gpt-4o",
  input: "My name is Sarah and I'm building a React app.",
  instructions: "Remember context about the user throughout our conversation.",
});

// Turn 2 — reference previous response, no need to resend history
const turn2 = await client.responses.create({
  model: "gpt-4o",
  input: "What's a good state management library for my project?",
  previous_response_id: turn1.id, // Server looks up prior context
});

// Turn 3
const turn3 = await client.responses.create({
  model: "gpt-4o",
  input: "Can you show me an example?",
  previous_response_id: turn2.id,
});

console.log(turn3.output_text);
// Zustand example tailored to Sarah's React context — no history payload sent

Built-in Web Search

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.responses.create({
  model: "gpt-4o",
  input: "What are the most popular npm packages released in February 2026?",
  tools: [{ type: "web_search_preview" }],
});

console.log(response.output_text);
// Response includes web search results synthesized into answer

// Access the raw search results
for (const item of response.output) {
  if (item.type === "web_search_call") {
    console.log("Searched:", item.status);
  }
  if (item.type === "message") {
    console.log("Answer:", item.content[0].text);
  }
}

Built-in File Search

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// First, create a vector store and upload files
const vectorStore = await client.vectorStores.create({
  name: "Documentation",
});

await client.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, {
  files: [
    new File(["Your product documentation content..."], "docs.txt", {
      type: "text/plain",
    }),
  ],
});

// Query with file search
const response = await client.responses.create({
  model: "gpt-4o",
  input: "How do I configure authentication in the product?",
  tools: [
    {
      type: "file_search",
      vector_store_ids: [vectorStore.id],
    },
  ],
});

console.log(response.output_text);
// Answer synthesized from uploaded documentation

Streaming with Responses API

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const stream = await client.responses.create({
  model: "gpt-4o",
  input: "Write a detailed explanation of React's reconciliation algorithm.",
  stream: true,
});

for await (const event of stream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta);
  }
  if (event.type === "response.completed") {
    console.log("\n\nDone. Response ID:", event.response.id);
  }
}

Custom Tool Calling

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.responses.create({
  model: "gpt-4o",
  input: "What's the current price of AAPL stock?",
  tools: [
    {
      type: "function",
      name: "get_stock_price",
      description: "Get the current stock price for a ticker symbol",
      parameters: {
        type: "object",
        properties: {
          ticker: { type: "string", description: "Stock ticker symbol" },
        },
        required: ["ticker"],
      },
    },
  ],
  tool_choice: "auto",
});

// Handle tool call
for (const item of response.output) {
  if (item.type === "function_call") {
    const { ticker } = JSON.parse(item.arguments) as { ticker: string };
    const price = await fetchStockPrice(ticker); // Your implementation

    // Submit tool result and continue
    const finalResponse = await client.responses.create({
      model: "gpt-4o",
      previous_response_id: response.id,
      input: [
        {
          type: "function_call_output",
          call_id: item.call_id,
          output: JSON.stringify({ price, ticker }),
        },
      ],
    });

    console.log(finalResponse.output_text);
  }
}

async function fetchStockPrice(ticker: string): Promise<number> {
  // Stub — replace with a real market-data lookup
  return 185.42;
}

Assistants API: Managed Stateful Agents

Assistants API manages Threads, Runs, and built-in tools (Code Interpreter, File Search) fully server-side. Best for document Q&A and code execution use cases with minimal custom infrastructure.

Creating an Assistant

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Create once, reuse by ID
const assistant = await client.beta.assistants.create({
  name: "Data Analyst",
  instructions:
    "You are a data analyst. Use code interpreter to analyze data, create charts, and answer questions about datasets.",
  model: "gpt-4o",
  tools: [
    { type: "code_interpreter" },
    { type: "file_search" },
  ],
});

console.log("Assistant ID:", assistant.id);
// "asst_01ABC..." — store this, don't recreate each time

Thread Lifecycle

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const ASSISTANT_ID = "asst_01ABC..."; // From creation

// Create a Thread (persists server-side)
const thread = await client.beta.threads.create();

// Add a message to the Thread
await client.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Analyze this sales data and identify trends.",
});

// Run the Assistant on the Thread
const run = await client.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: ASSISTANT_ID,
});

if (run.status === "completed") {
  const messages = await client.beta.threads.messages.list(thread.id);
  const lastMessage = messages.data[0];

  if (lastMessage.content[0].type === "text") {
    console.log(lastMessage.content[0].text.value);
  }
}

// Next turn — same Thread, conversation continues
await client.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "Now create a bar chart of the top 5 products.",
});

const run2 = await client.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: ASSISTANT_ID,
});

File Upload and Code Interpreter

import OpenAI from "openai";
import { createReadStream } from "fs";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Upload a CSV file for analysis
const file = await client.files.create({
  file: createReadStream("sales-data.csv"),
  purpose: "assistants",
});

const ASSISTANT_ID = "asst_01ABC...";

// Create thread with file attachment
const thread = await client.beta.threads.create({
  messages: [
    {
      role: "user",
      content: "Analyze this CSV file. What are the top 3 revenue months?",
      attachments: [
        {
          file_id: file.id,
          tools: [{ type: "code_interpreter" }],
        },
      ],
    },
  ],
});

const run = await client.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: ASSISTANT_ID,
});

if (run.status === "completed") {
  const messages = await client.beta.threads.messages.list(thread.id);

  for (const message of messages.data.reverse()) {
    if (message.role === "assistant") {
      for (const content of message.content) {
        if (content.type === "text") {
          console.log(content.text.value);
        }
        // Code Interpreter can generate image files (charts)
        if (content.type === "image_file") {
          console.log("Chart generated:", content.image_file.file_id);
          // Download with client.files.content(content.image_file.file_id)
        }
      }
    }
  }
}

File Search (Vector Store)

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Upload documentation files
const vectorStore = await client.beta.vectorStores.create({
  name: "Product Docs",
});

await client.beta.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, {
  files: [
    new File(["Authentication guide..."], "auth.txt", { type: "text/plain" }),
    new File(["API reference..."], "api-reference.txt", { type: "text/plain" }),
  ],
});

// Create assistant with file search
const assistant = await client.beta.assistants.create({
  name: "Support Bot",
  instructions: "Answer questions using the product documentation. Always cite sources.",
  model: "gpt-4o",
  tools: [{ type: "file_search" }],
  tool_resources: {
    file_search: {
      vector_store_ids: [vectorStore.id],
    },
  },
});

// Query
const thread = await client.beta.threads.create();
await client.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "How do I set up OAuth with your API?",
});

const run = await client.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
});

const messages = await client.beta.threads.messages.list(thread.id);
const reply = messages.data[0].content[0];

if (reply.type === "text") {
  console.log(reply.text.value);
  // Includes citations with file names and quote snippets
  console.log("Citations:", reply.text.annotations);
}

Streaming Runs

import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const ASSISTANT_ID = "asst_01ABC...";
const THREAD_ID = "thread_01ABC..."; // Existing thread

await client.beta.threads.messages.create(THREAD_ID, {
  role: "user",
  content: "Summarize our conversation so far.",
});

const stream = client.beta.threads.runs.stream(THREAD_ID, {
  assistant_id: ASSISTANT_ID,
});

stream
  .on("textDelta", (delta) => {
    process.stdout.write(delta.value ?? "");
  })
  .on("toolCallDelta", (delta) => {
    if (delta.type === "code_interpreter") {
      if (delta.code_interpreter?.input) {
        process.stdout.write(delta.code_interpreter.input);
      }
    }
  })
  .on("end", () => {
    console.log("\nRun complete");
  });

await stream.finalRun();

Feature Comparison

Feature               Chat Completions       Responses API           Assistants API
────────────────────  ─────────────────────  ──────────────────────  ─────────────────────
State management      ❌ You manage          ✅ Server-side chain    ✅ Persistent Threads
Conversation history  Manual (resend all)    previous_response_id    Automatic
Web search            ❌ Custom tools only   ✅ Built-in             ❌ Custom tools only
File search           ❌ Custom only         ✅ Vector stores        ✅ File Search tool
Code Interpreter      ❌                     ❌                      ✅ Python sandbox
Streaming             ✅                     ✅                      ✅
Structured output     ✅ response_format     ✅                      ⚠️ Via instructions
Tool calling          ✅                     ✅                      ✅
All models            ✅                     ✅                      ⚠️ Subset
Cost                  Lowest                 Mid                     Highest
Maturity              GA (2023)              GA (2025)               Beta (2023)
Rate limits           Standard               Standard                Thread-scoped
Token limit           Context window         Context window          Context window
File processing       No                     Yes (file search)       Yes (Code Interp + FS)
Weekly API usage      Dominant               Growing                 Stable

When to Use Each

Choose Chat Completions if:

  • Building a stateless API (classify, summarize, extract) — no multi-turn needed
  • You need maximum model flexibility and control
  • Custom state storage in your own DB (Redis, Postgres)
  • Cost optimization is critical — no overhead from managed state
  • Existing codebase built on Chat Completions patterns
  • Need response_format: json_schema for guaranteed structure

Choose Responses API if:

  • Multi-turn chat where OpenAI managing state is preferable
  • Need built-in web search without third-party integration
  • Document Q&A using file search with simpler setup than Assistants
  • Building new projects — this is OpenAI's current recommended primary API
  • Agentic flows where you want easier tool result submission

Choose Assistants API if:

  • Document Q&A with many files and automatic retrieval
  • Code Interpreter use cases: data analysis, chart generation, math
  • Long-lived persistent Threads across multiple user sessions
  • Support chatbot or copilot where OpenAI handles full conversation lifecycle
  • You want to avoid managing vector stores and file search plumbing yourself
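The decision rules above can be condensed into a small helper. This is purely illustrative — it encodes this article's recommendations as code, and the requirement names are made up for the example:

```typescript
type Requirements = {
  multiTurn: boolean;
  needsCodeInterpreter: boolean;
  needsBuiltInWebSearch: boolean;
  manyDocuments: boolean;
};

// Encodes the "When to Use Each" guidance: Assistants for Code Interpreter or
// heavy document workloads, Responses for stateful or built-in-tool flows,
// Chat Completions otherwise.
function chooseApi(req: Requirements): "chat-completions" | "responses" | "assistants" {
  if (req.needsCodeInterpreter || req.manyDocuments) return "assistants";
  if (req.multiTurn || req.needsBuiltInWebSearch) return "responses";
  return "chat-completions";
}

console.log(
  chooseApi({
    multiTurn: false,
    needsCodeInterpreter: false,
    needsBuiltInWebSearch: false,
    manyDocuments: false,
  })
);
// "chat-completions"
```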

Methodology

Data sourced from OpenAI API documentation (platform.openai.com/docs), OpenAI Responses API announcement (early 2025), npm weekly download statistics for the openai package as of February 2026, OpenAI developer forum discussions, and practical benchmarks comparing API latency and cost across the three surfaces. Pricing data from OpenAI pricing page as of February 2026.


Related: Gemini API vs Claude API vs Mistral API for cross-provider LLM comparisons, or Vercel AI SDK vs LangChain vs LlamaIndex for AI orchestration frameworks.
