
Gemini API vs Claude API vs Mistral API: LLM Comparison 2026

PkgPulse Team


TL;DR

The LLM API landscape in 2026 has matured into distinct tiers with meaningfully different strengths.

Google Gemini API leads on multimodal capability and raw context window size: Gemini 1.5 Pro supports a 2M-token context, handles video, audio, and images natively, and has a generous free tier via Google AI Studio. Gemini Flash is the fastest and cheapest option for high-throughput applications.

Anthropic Claude API leads on reasoning quality, instruction-following, and safe outputs: Claude 3.7 Sonnet performs best on complex analysis, code generation, and nuanced writing, and its extended thinking mode exposes visible step-by-step reasoning for hard problems.

Mistral API is the open-weight leader: strong reasoning, function calling, and multimodal capability at pricing well below OpenAI and Claude, with models also available for self-hosting via Ollama or vLLM.

In short: for multimodal apps processing images, video, or audio, pick Gemini. For complex reasoning and code generation, pick Claude. For cost-efficient production workloads or open-weight models, pick Mistral.

Key Takeaways

  • Gemini 1.5 Pro: 2M token context — entire codebases, full books, hour-long video
  • Claude 3.7 Sonnet extended thinking — visible reasoning traces for complex problems
  • Mistral Large: 128k context — function calling and JSON mode at lower cost than Claude/Gemini Pro
  • Gemini Flash: cheapest at scale — $0.075/1M input tokens (Gemini 1.5 Flash)
  • Claude has tool use + computer use — agents can control browsers and desktops
  • Mistral has open-weight versions — run Mistral 7B / Mixtral 8x7B locally for free
  • All three support function/tool calling — structured JSON output from model decisions

Model and Pricing Quick Reference

Cost-optimized (high throughput):
  Gemini 1.5 Flash      → $0.075/1M in + $0.30/1M out
  Mistral Small         → $0.20/1M in + $0.60/1M out
  Claude Haiku 3.5      → $0.80/1M in + $4.00/1M out

Quality-optimized (complex tasks):
  Gemini 1.5 Pro        → $1.25/1M in + $5.00/1M out
  Claude 3.7 Sonnet     → $3.00/1M in + $15.00/1M out
  Mistral Large 2       → $2.00/1M in + $6.00/1M out

Context window:
  Gemini 1.5 Pro        → 2,000,000 tokens
  Claude 3.7 Sonnet     → 200,000 tokens
  Mistral Large 2       → 128,000 tokens

Free tier:
  Gemini                → 15 req/min (AI Studio)
  Claude                → None (paid only)
  Mistral               → Limited trial credits
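Per-request cost is just tokens times rate, so the table above is easy to sanity-check with a small helper. The model IDs below are informal labels for this sketch, not official API model names:

```typescript
// Pricing in USD per 1M tokens, copied from the quick reference above
const PRICING = {
  "gemini-1.5-flash": { input: 0.075, output: 0.3 },
  "mistral-small": { input: 0.2, output: 0.6 },
  "claude-haiku-3.5": { input: 0.8, output: 4.0 },
  "gemini-1.5-pro": { input: 1.25, output: 5.0 },
  "claude-3.7-sonnet": { input: 3.0, output: 15.0 },
  "mistral-large-2": { input: 2.0, output: 6.0 },
} as const;

type ModelId = keyof typeof PRICING;

// Estimated USD cost for a given token workload
function estimateCost(model: ModelId, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model];
  return (inputTokens / 1_000_000) * p.input + (outputTokens / 1_000_000) * p.output;
}

// 10M input + 2M output tokens/day on Gemini Flash costs about $1.35
console.log(estimateCost("gemini-1.5-flash", 10_000_000, 2_000_000)); // ≈ 1.35
```

The same workload on Claude 3.7 Sonnet runs roughly $60/day, which is why routing bulk traffic to a cheap model and reserving the quality tier for hard requests is the standard cost lever.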

Google Gemini API

Gemini offers the largest context window in the industry and native multimodal support — audio, video, images, and text in a single API call.

Installation

npm install @google/generative-ai
# Or using the newer Vertex AI SDK:
npm install @google-cloud/vertexai

Basic Text Generation

import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

async function generateText(prompt: string): Promise<string> {
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

  const result = await model.generateContent(prompt);
  const response = result.response;
  return response.text();
}

// Usage
const summary = await generateText(
  "Summarize the key principles of functional programming in 3 bullet points."
);
console.log(summary);

Streaming

async function streamText(prompt: string): Promise<void> {
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

  const result = await model.generateContentStream(prompt);

  process.stdout.write("Response: ");
  for await (const chunk of result.stream) {
    const text = chunk.text();
    process.stdout.write(text);
  }
  console.log();
}

Multimodal: Image Understanding

import { GoogleGenerativeAI } from "@google/generative-ai";
import * as fs from "fs";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

async function analyzeImage(imagePath: string, question: string): Promise<string> {
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro" });

  const imageData = fs.readFileSync(imagePath);
  const base64Image = imageData.toString("base64");
  const mimeType = "image/jpeg"; // or "image/png", "image/webp"

  const result = await model.generateContent([
    {
      inlineData: {
        data: base64Image,
        mimeType,
      },
    },
    question,
  ]);

  return result.response.text();
}

// Analyze a product photo
const description = await analyzeImage(
  "./product.jpg",
  "What is this product? List its key features visible in the image."
);

Multimodal: File Upload (Video, Audio, PDF)

import { GoogleAIFileManager } from "@google/generative-ai/server";
import { GoogleGenerativeAI } from "@google/generative-ai";

const fileManager = new GoogleAIFileManager(process.env.GEMINI_API_KEY!);
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

async function transcribeAudio(audioPath: string): Promise<string> {
  // Upload audio file (persists for 48 hours)
  const uploadResult = await fileManager.uploadFile(audioPath, {
    mimeType: "audio/mp3",
    displayName: "Meeting recording",
  });

  const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro" });

  const result = await model.generateContent([
    {
      fileData: {
        mimeType: uploadResult.file.mimeType,
        fileUri: uploadResult.file.uri,
      },
    },
    "Please transcribe this audio recording and identify the main topics discussed.",
  ]);

  return result.response.text();
}

async function analyzeVideo(videoPath: string): Promise<string> {
  const uploadResult = await fileManager.uploadFile(videoPath, {
    mimeType: "video/mp4",
    displayName: "Product demo",
  });

  const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro" });

  const result = await model.generateContent([
    {
      fileData: {
        mimeType: uploadResult.file.mimeType,
        fileUri: uploadResult.file.uri,
      },
    },
    "Summarize the key moments in this video and create a timeline.",
  ]);

  return result.response.text();
}

Function Calling (Tool Use)

import { GoogleGenerativeAI, FunctionDeclarationSchemaType } from "@google/generative-ai";

const tools = [
  {
    functionDeclarations: [
      {
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: {
          type: FunctionDeclarationSchemaType.OBJECT,
          properties: {
            location: {
              type: FunctionDeclarationSchemaType.STRING,
              description: "City name or coordinates",
            },
            unit: {
              type: FunctionDeclarationSchemaType.STRING,
              enum: ["celsius", "fahrenheit"],
            },
          },
          required: ["location"],
        },
      },
    ],
  },
];

async function chatWithTools(userMessage: string): Promise<string> {
  const model = genAI.getGenerativeModel({
    model: "gemini-1.5-pro",
    tools,
  });

  // Use a chat session so the model's function call stays in history
  const chat = model.startChat();
  const result = await chat.sendMessage(userMessage);
  const response = result.response;

  // Check if model wants to call a function
  const functionCall = response.candidates?.[0]?.content.parts.find(
    (part) => part.functionCall
  )?.functionCall;

  if (functionCall) {
    const { name, args } = functionCall;

    // Execute the actual function
    let functionResult: object;
    if (name === "get_weather") {
      functionResult = await fetchWeather(args as { location: string; unit?: string });
    } else {
      functionResult = { error: "Unknown function" };
    }

    // Send the function result back on the same chat session,
    // so the model sees its own function call followed by the result
    const finalResult = await chat.sendMessage([
      { functionResponse: { name, response: functionResult } },
    ]);

    return finalResult.response.text();
  }

  return response.text();
}
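The example above assumes a `fetchWeather` helper. A minimal offline placeholder, so the tool loop can be exercised end to end; swap the body for a call to your actual weather provider:

```typescript
// Placeholder implementation of the weather tool used by chatWithTools.
// Returns canned data so the flow can be tested without a live API.
async function fetchWeather(args: { location: string; unit?: string }): Promise<object> {
  const unit = args.unit ?? "celsius";
  return {
    location: args.location,
    temperature: unit === "celsius" ? 21 : 70,
    unit,
    conditions: "partly cloudy",
  };
}
```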

Chat with History

async function createChat() {
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

  const chat = model.startChat({
    history: [
      {
        role: "user",
        parts: [{ text: "You are a helpful coding assistant. Be concise." }],
      },
      {
        role: "model",
        parts: [{ text: "Understood! I'll provide concise coding help." }],
      },
    ],
    generationConfig: {
      maxOutputTokens: 1000,
      temperature: 0.7,
    },
  });

  // Subsequent messages maintain context
  const response1 = await chat.sendMessage("Write a TypeScript function to debounce a function");
  console.log(response1.response.text());

  const response2 = await chat.sendMessage("Now add a cancel method to it");
  console.log(response2.response.text()); // Knows what "it" refers to
}

Anthropic Claude API

Claude excels at reasoning, instruction-following, and complex code tasks. Claude 3.7 Sonnet introduces extended thinking for hard problems.

Installation

npm install @anthropic-ai/sdk

Basic Generation

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!,
});

async function generateText(prompt: string): Promise<string> {
  const message = await anthropic.messages.create({
    model: "claude-3-7-sonnet-20250219",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });

  const textContent = message.content.find(
    (c): c is Anthropic.TextBlock => c.type === "text"
  );
  return textContent?.text ?? "";
}

Streaming

async function streamResponse(prompt: string): Promise<void> {
  const stream = anthropic.messages.stream({
    model: "claude-3-7-sonnet-20250219",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });

  stream.on("text", (text) => {
    process.stdout.write(text);
  });

  const finalMessage = await stream.finalMessage();
  console.log("\nFinished. Stop reason:", finalMessage.stop_reason);
}

Extended Thinking (Claude 3.7 Sonnet)

async function solveWithThinking(problem: string): Promise<{
  thinking: string;
  answer: string;
}> {
  const response = await anthropic.messages.create({
    model: "claude-3-7-sonnet-20250219",
    max_tokens: 16000,
    thinking: {
      type: "enabled",
      budget_tokens: 10000,  // Max tokens for thinking step
    },
    messages: [{ role: "user", content: problem }],
  });

  let thinking = "";
  let answer = "";

  for (const block of response.content) {
    if (block.type === "thinking") {
      thinking = block.thinking;
    } else if (block.type === "text") {
      answer = block.text;
    }
  }

  return { thinking, answer };
}

// For complex math/logic problems
const { thinking, answer } = await solveWithThinking(
  "A store sells 3 types of items: A costs $4, B costs $7, C costs $10. " +
  "A customer spends exactly $101 buying 17 items total. " +
  "Find every combination of items they could have bought."
);
console.log("Reasoning:", thinking.slice(0, 500) + "...");
console.log("Answer:", answer);

Vision (Image Analysis)

import * as fs from "fs";

async function analyzeImage(imagePath: string, question: string): Promise<string> {
  const imageData = fs.readFileSync(imagePath).toString("base64");
  const mimeType = "image/jpeg"; // image/png, image/gif, image/webp

  const response = await anthropic.messages.create({
    model: "claude-3-7-sonnet-20250219",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: [
          {
            type: "image",
            source: {
              type: "base64",
              media_type: mimeType,
              data: imageData,
            },
          },
          {
            type: "text",
            text: question,
          },
        ],
      },
    ],
  });

  const textContent = response.content.find(
    (c): c is Anthropic.TextBlock => c.type === "text"
  );
  return textContent?.text ?? "";
}

Tool Use (Function Calling)

const tools: Anthropic.Tool[] = [
  {
    name: "search_database",
    description: "Search product database by name, category, or price range",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "Search query" },
        category: { type: "string", enum: ["electronics", "clothing", "books", "food"] },
        max_price: { type: "number", description: "Maximum price in USD" },
      },
      required: ["query"],
    },
  },
];

async function agentChat(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  // Agentic loop — model may call tools multiple times
  while (true) {
    const response = await anthropic.messages.create({
      model: "claude-3-7-sonnet-20250219",
      max_tokens: 1024,
      tools,
      messages,
    });

    if (response.stop_reason === "tool_use") {
      // Add assistant's response (including tool_use blocks)
      messages.push({ role: "assistant", content: response.content });

      // Execute each tool call
      const toolResults: Anthropic.ToolResultBlockParam[] = [];
      for (const block of response.content) {
        if (block.type === "tool_use") {
          const result = await executeToolCall(block.name, block.input as Record<string, unknown>);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: JSON.stringify(result),
          });
        }
      }

      // Add tool results and continue the loop
      messages.push({ role: "user", content: toolResults });
      continue;
    }

    // end_turn (or max_tokens, etc.): return whatever text we have,
    // rather than looping forever on a non-tool stop reason
    const textContent = response.content.find(
      (c): c is Anthropic.TextBlock => c.type === "text"
    );
    return textContent?.text ?? "";
  }
}
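The agentic loop assumes an `executeToolCall` dispatcher. A minimal offline sketch that routes tool names to implementations; the canned search result is a placeholder for your real database query:

```typescript
// Placeholder dispatcher for the agentic loop above.
// Route each tool name to its real implementation here.
async function executeToolCall(
  name: string,
  input: Record<string, unknown>
): Promise<object> {
  switch (name) {
    case "search_database":
      // Canned result so the loop can be tested without a live database
      return {
        query: input.query,
        results: [{ name: "USB-C Hub", category: "electronics", price: 39.99 }],
      };
    default:
      return { error: `Unknown tool: ${name}` };
  }
}
```

Returning an error object for unknown tools (rather than throwing) lets the model see the failure and recover in its next turn.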

Mistral API

Mistral offers strong reasoning at competitive pricing — and uniquely, open-weight versions that can run locally.

Installation

npm install @mistralai/mistralai

Basic Generation

import { Mistral } from "@mistralai/mistralai";

const client = new Mistral({
  apiKey: process.env.MISTRAL_API_KEY!,
});

async function generateText(prompt: string): Promise<string> {
  const response = await client.chat.complete({
    model: "mistral-large-latest",
    messages: [{ role: "user", content: prompt }],
  });

  return (response.choices?.[0]?.message.content as string) ?? "";
}

Streaming

async function streamText(prompt: string): Promise<void> {
  const stream = await client.chat.stream({
    model: "mistral-small-latest",  // Cost-efficient streaming
    messages: [{ role: "user", content: prompt }],
  });

  for await (const event of stream) {
    const delta = event.data.choices[0]?.delta.content;
    if (delta) process.stdout.write(delta as string);
  }
  console.log();
}

Vision (Pixtral)

import * as fs from "fs";

async function analyzeImage(imagePath: string, question: string): Promise<string> {
  const imageData = fs.readFileSync(imagePath).toString("base64");

  const response = await client.chat.complete({
    model: "pixtral-large-latest",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "image_url",
            imageUrl: {
              url: `data:image/jpeg;base64,${imageData}`,
            },
          },
          {
            type: "text",
            text: question,
          },
        ],
      },
    ],
  });

  return (response.choices?.[0]?.message.content as string) ?? "";
}

Function Calling

const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_stock_price",
      description: "Get current stock price for a ticker symbol",
      parameters: {
        type: "object",
        properties: {
          ticker: {
            type: "string",
            description: "Stock ticker symbol (e.g., AAPL, MSFT)",
          },
        },
        required: ["ticker"],
      },
    },
  },
];

async function financialAdvisor(question: string): Promise<string> {
  const messages: any[] = [{ role: "user", content: question }];

  const response = await client.chat.complete({
    model: "mistral-large-latest",
    tools,
    toolChoice: "auto",
    messages,
  });

  const choice = response.choices?.[0];

  // The TypeScript SDK uses camelCase: finishReason, toolCalls, toolCallId
  if (choice?.finishReason === "tool_calls") {
    messages.push({
      role: "assistant",
      content: choice.message.content,
      toolCalls: choice.message.toolCalls,
    });

    for (const toolCall of choice.message.toolCalls ?? []) {
      // Arguments may arrive as a JSON string or an already-parsed object
      const rawArgs = toolCall.function.arguments;
      const args = typeof rawArgs === "string" ? JSON.parse(rawArgs) : rawArgs;
      const result = await fetchStockPrice(args.ticker);

      messages.push({
        role: "tool",
        toolCallId: toolCall.id,
        content: JSON.stringify(result),
      });
    }

    const finalResponse = await client.chat.complete({
      model: "mistral-large-latest",
      tools,
      messages,
    });
    return (finalResponse.choices?.[0]?.message.content as string) ?? "";
  }

  return (choice?.message.content as string) ?? "";
}
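As with the other providers, `fetchStockPrice` is assumed. A deterministic placeholder so the tool-calling flow can be run offline; replace the body with a call to your market-data provider:

```typescript
// Placeholder market-data lookup used by financialAdvisor.
// Returns a canned quote so the flow can be exercised without a live feed.
async function fetchStockPrice(ticker: string): Promise<object> {
  return {
    ticker: ticker.toUpperCase(),
    price: 123.45,
    currency: "USD",
    asOf: new Date().toISOString(),
  };
}
```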

JSON Mode

async function extractStructuredData(text: string): Promise<Record<string, unknown>> {
  const response = await client.chat.complete({
    model: "mistral-small-latest",
    responseFormat: { type: "json_object" },
    messages: [
      {
        role: "system",
        content: "Extract product information and return valid JSON with fields: name, price, category, inStock.",
      },
      { role: "user", content: text },
    ],
  });

  const content = (response.choices?.[0]?.message.content as string) ?? "{}";
  return JSON.parse(content);
}

// Returns: { name: "Widget Pro", price: 49.99, category: "tools", inStock: true }

Self-Hosted with Ollama (Free)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull Mistral models
ollama pull mistral:7b
ollama pull mixtral:8x7b  # Mixtral 8x7B MoE

# Run locally on port 11434
ollama serve

Then call the local server through its OpenAI-compatible API:

// Use Mistral locally via OpenAI-compatible API
import OpenAI from "openai";

const localClient = new OpenAI({
  baseURL: "http://localhost:11434/v1",
  apiKey: "ollama",  // Required but ignored
});

async function generateLocally(prompt: string): Promise<string> {
  const response = await localClient.chat.completions.create({
    model: "mistral:7b",
    messages: [{ role: "user", content: prompt }],
  });

  return response.choices[0].message.content ?? "";
}

Feature Comparison

Feature               Gemini 1.5 Pro     Claude 3.7 Sonnet   Mistral Large 2
Context window        2M tokens          200k tokens         128k tokens
Input pricing         $1.25/1M           $3.00/1M            $2.00/1M
Output pricing        $5.00/1M           $15.00/1M           $6.00/1M
Free tier             ✅ (AI Studio)      ❌                   ❌ (trial credits)
Video understanding   ✅ Native           ❌                   ❌
Audio understanding   ✅ Native           ❌                   ❌
Image understanding   ✅                  ✅ (Claude 3.7)      ✅ (Pixtral)
Extended thinking     ❌                  ✅                   ❌
Function calling      ✅                  ✅                   ✅
JSON mode             ✅                  Via prompting        ✅
Open-weight version   ❌                  ❌                   ✅ (7B, 8x7B)
Self-hosting          ❌                  ❌                   ✅ via Ollama/vLLM
Reasoning quality     High               Highest             High
Code generation       Excellent          Excellent           Very Good

When to Use Each

Choose Gemini API if:

  • Processing video, audio, or large documents (2M token context is unique)
  • Cost-sensitive high-throughput applications (Flash model is cheapest)
  • Already in the Google Cloud ecosystem (Vertex AI, Firebase)
  • Free tier prototyping (AI Studio has the most generous free tier)
  • Multimodal inputs across image/video/audio in a single request

Choose Claude API if:

  • Complex reasoning tasks where quality is paramount (code review, analysis, planning)
  • Extended thinking mode for hard math, logic, or multi-step reasoning
  • Safety-critical applications where instruction-following and refusal rates matter
  • Long-form writing where consistency and coherence over 100k+ tokens matters
  • Agentic workflows with complex tool use chains

Choose Mistral API if:

  • Cost efficiency at production scale (Large 2 is cheaper than Claude Sonnet)
  • Open-weight models needed for self-hosting, data privacy, or compliance
  • EU data processing requirements (Mistral is European, GDPR-first)
  • Simple function calling and JSON extraction tasks (Small model is very affordable)
  • Local development without API costs (Ollama + Mistral 7B)
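The decision criteria above can be condensed into a small routing helper. The trait flags and the trait-to-model mapping here are illustrative defaults distilled from this article, not an official recommendation:

```typescript
interface WorkloadTraits {
  multimodal?: boolean;     // video/audio/image inputs
  hardReasoning?: boolean;  // math, planning, code review
  selfHosted?: boolean;     // data must stay on-prem
  costSensitive?: boolean;  // high-throughput, budget-bound
}

// Priority order: hard constraints (self-hosting) first,
// then capability needs, then cost, then a balanced default.
function pickModel(t: WorkloadTraits): string {
  if (t.selfHosted) return "mistral:7b (via Ollama)";
  if (t.multimodal) return "gemini-1.5-pro";
  if (t.hardReasoning) return "claude-3-7-sonnet-20250219";
  if (t.costSensitive) return "gemini-1.5-flash";
  return "mistral-large-latest"; // balanced default
}
```

In production this kind of router usually sits in front of a multi-provider abstraction, so a single flag flip can redirect traffic when pricing or model quality shifts.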

Methodology

Data sourced from Google AI Studio documentation (ai.google.dev), Anthropic API documentation (docs.anthropic.com), Mistral AI documentation (docs.mistral.ai), pricing pages as of February 2026, context window and benchmark data from official model cards, and community discussions from Hugging Face forums, r/LocalLLaMA, and the AI Builders Discord.


Related: Vercel AI SDK vs OpenAI SDK vs Anthropic SDK for the TypeScript libraries that abstract over multiple LLM providers including Gemini, Claude, and Mistral, or LangChain vs LlamaIndex vs Haystack for RAG frameworks that integrate with all three APIs.
