DataLoader vs p-batch vs graphql-batch: Batching & Deduplication (2026)

PkgPulse Team

TL;DR

DataLoader (from Meta/Facebook) is the standard solution for the N+1 problem in GraphQL — it batches multiple individual loads within a single event loop tick into one bulk request, and deduplicates repeated loads of the same key. p-batch is a generic promise-based batching utility that works outside GraphQL contexts. graphql-batch is a Ruby gem, not a JS library — in the JavaScript ecosystem the equivalent is DataLoader itself, sometimes combined with graphql-dataloader for easier integration. In 2026: use DataLoader in any GraphQL resolver, always create a new DataLoader instance per request (for per-user caching), and combine it with Prisma's findMany for maximum efficiency.

Key Takeaways

  • dataloader: ~4M weekly downloads — Meta's N+1 solution, per-tick batching, built-in deduplication + in-memory cache
  • p-batch: ~500K weekly downloads — generic batching, not GraphQL-specific, useful for REST or non-resolver contexts
  • N+1 problem: 1 query for a list of 100 posts + 100 author queries (one per post) = 101 queries → DataLoader: 1 batch query for all authors
  • DataLoader batches within a single event loop tick — multiple loader.load(id) calls get batched automatically
  • Always create DataLoader per request (not global) — the cache should not persist across users
  • DataLoader's built-in cache prevents duplicate fetches within the same request

The N+1 Problem

// Without DataLoader — N+1 queries:
const query = `
  query {
    posts {        # 1 query to get 100 posts
      title
      author {     # 100 queries — one per post author!
        name
      }
    }
  }
`

// Resolver that causes N+1:
const resolvers = {
  Post: {
    author: async (post) => {
      // Called once PER POST — 100 individual DB queries!
      return db.user.findUnique({ where: { id: post.authorId } })
    },
  },
}

// Database log:
// SELECT * FROM posts;                              -- 1 query
// SELECT * FROM users WHERE id = 1;                -- then 100 queries, one per post
// SELECT * FROM users WHERE id = 2;
// SELECT * FROM users WHERE id = 1;  ← duplicate!
// ...
// Total: 101 queries for a single GraphQL request
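
The query explosion is easy to reproduce with a toy in-memory store. The `findUnique`/`findMany` functions below are hypothetical stand-ins for the database calls (not Prisma), instrumented to count round trips:

```typescript
// Hypothetical in-memory stand-ins for db.user.findUnique / db.user.findMany,
// instrumented to count round trips.
const authors = new Map<number, string>([[1, "Ada"], [2, "Grace"]])
let queries = 0

async function findUnique(id: number): Promise<string | null> {
  queries++  // One round trip per call
  return authors.get(id) ?? null
}

async function findMany(ids: number[]): Promise<(string | null)[]> {
  queries++  // One round trip regardless of how many ids
  return ids.map((id) => authors.get(id) ?? null)
}

async function compareQueryCounts(): Promise<{ naive: number; batched: number }> {
  const posts = [{ authorId: 1 }, { authorId: 2 }, { authorId: 1 }]

  // N+1 style: one query per post
  queries = 0
  for (const post of posts) await findUnique(post.authorId)
  const naive = queries

  // Batched style: one bulk query over the unique author ids
  queries = 0
  await findMany([...new Set(posts.map((p) => p.authorId))])
  const batched = queries

  return { naive, batched }  // { naive: 3, batched: 1 }
}
```

Three posts already cost three round trips the naive way; batching collapses them to one, and the gap widens linearly with result-set size.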

DataLoader

DataLoader — batch and cache async data fetching:

How it works

Event loop tick:
  1. post.author resolver calls loader.load(userId) for user 1
  2. post.author resolver calls loader.load(userId) for user 2
  3. post.author resolver calls loader.load(userId) for user 1 (duplicate!)
  ...all within the same tick...
  4. End of tick: DataLoader collects all unique keys: [1, 2, 3, ...]
  5. Calls batchFn([1, 2, 3, ...]) → single DB query
  6. Resolves each individual promise with its result
  7. Duplicate keys get the same cached result

Basic usage

import DataLoader from "dataloader"
import { PrismaClient } from "@prisma/client"

const db = new PrismaClient()

// Define a batch function — receives array of keys, returns array of values in same order:
const userLoader = new DataLoader<number, User | null>(async (userIds) => {
  // One bulk query for all IDs:
  const users = await db.user.findMany({
    where: { id: { in: [...userIds] } },
  })

  // CRITICAL: Return results in the SAME ORDER as the input keys
  // (DataLoader requires 1:1 correspondence: keys[i] → results[i])
  const userMap = new Map(users.map((u) => [u.id, u]))
  return userIds.map((id) => userMap.get(id) ?? null)
})

// Load individual items — issue the loads in the same tick (e.g. via
// Promise.all) so they batch; awaiting each load sequentially would
// dispatch a separate one-key batch per call:
const [user1, user2, user1again] = await Promise.all([
  userLoader.load(1),    // Doesn't query yet
  userLoader.load(2),    // Doesn't query yet
  userLoader.load(1),    // Deduped — same promise as the first load(1)
])

// End of tick → one batch query: SELECT WHERE id IN (1, 2)
// user1again === user1 (same reference — served from the cache)
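
The order-restoring map step is worth extracting into a reusable helper, since every loader repeats it. `alignToKeys` is a hypothetical name, not part of the dataloader API:

```typescript
// Hypothetical helper: map unordered query rows back onto the input key order,
// filling gaps with null — the contract every DataLoader batch function must meet.
function alignToKeys<K, V>(
  keys: readonly K[],
  rows: V[],
  getKey: (row: V) => K,
): (V | null)[] {
  const byKey = new Map(rows.map((row) => [getKey(row), row]))
  return keys.map((key) => byKey.get(key) ?? null)
}
```

A batch function then shrinks to `(ids) => alignToKeys(ids, await db.user.findMany(...), (u) => u.id)`, and missing rows come back as `null` rather than shifting every later result out of position.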

Per-request context pattern (correct way)

// IMPORTANT: Create new DataLoader per request — not global!
// If global: user A's request can see user B's cached data (security issue)

// context.ts — create loaders per request:
interface Context {
  db: PrismaClient
  loaders: {
    user: DataLoader<number, User | null>
    post: DataLoader<number, Post | null>
    comment: DataLoader<number, Comment[]>
  }
}

function createContext(db: PrismaClient): Context {
  return {
    db,
    loaders: {
      user: new DataLoader<number, User | null>(async (ids) => {
        const users = await db.user.findMany({
          where: { id: { in: [...ids] } },
        })
        const map = new Map(users.map((u) => [u.id, u]))
        return ids.map((id) => map.get(id) ?? null)
      }),

      post: new DataLoader<number, Post | null>(async (ids) => {
        const posts = await db.post.findMany({
          where: { id: { in: [...ids] } },
        })
        const map = new Map(posts.map((p) => [p.id, p]))
        return ids.map((id) => map.get(id) ?? null)
      }),

      // One-to-many loader (comments for a post):
      comment: new DataLoader<number, Comment[]>(async (postIds) => {
        const comments = await db.comment.findMany({
          where: { postId: { in: [...postIds] } },
        })
        // Group by postId:
        const byPost = new Map<number, Comment[]>()
        for (const comment of comments) {
          const list = byPost.get(comment.postId) ?? []
          list.push(comment)
          byPost.set(comment.postId, list)
        }
        return postIds.map((id) => byPost.get(id) ?? [])
      }),
    },
  }
}

Using loaders in resolvers

// resolvers.ts — resolvers use context.loaders, not direct DB calls:
const resolvers = {
  Query: {
    posts: async (_: unknown, __: unknown, ctx: Context) => {
      return ctx.db.post.findMany({ take: 100 })
    },
  },

  Post: {
    // This resolver is called 100 times for 100 posts:
    author: async (post: Post, _: unknown, ctx: Context) => {
      // DataLoader batches all 100 calls into 1 query!
      return ctx.loaders.user.load(post.authorId)
    },

    comments: async (post: Post, _: unknown, ctx: Context) => {
      // Also batched — all comment lookups for all posts in one query:
      return ctx.loaders.comment.load(post.id)
    },
  },

  Comment: {
    author: async (comment: Comment, _: unknown, ctx: Context) => {
      // Deduplication: multiple comments by same author share cache:
      return ctx.loaders.user.load(comment.authorId)
    },
  },
}

Integration with graphql-yoga / Apollo Server

import { createSchema, createYoga } from "graphql-yoga"
import { createServer } from "http"

const yoga = createYoga({
  schema,
  context: async () => {
    // New loaders per request — no data leakage between requests:
    return createContext(db)
  },
})

// Apollo Server v4 equivalent:
import { ApolloServer } from "@apollo/server"
import { expressMiddleware } from "@apollo/server/express4"

const server = new ApolloServer({ typeDefs, resolvers })

app.use("/graphql", expressMiddleware(server, {
  context: async ({ req }) => createContext(db),
}))

Cache control

import DataLoader from "dataloader"

const userLoader = new DataLoader<number, User | null>(batchFn, {
  // Disable cache if you want fresh data every call within same request:
  cache: false,

  // Custom cache map (e.g., LRU for long-lived loaders):
  // cacheMap: new LRUCache({ max: 1000 }),

  // Max batch size (avoid huge IN clauses):
  maxBatchSize: 100,

  // Batch scheduler — by default the batch dispatches once the current
  // microtask queue drains; a custom fn widens the batching window:
  batchScheduleFn: (callback) => setTimeout(callback, 10),  // 10ms window
})

// Manual cache operations:
userLoader.prime(1, existingUser)  // Seed cache without calling batchFn
userLoader.clear(1)                 // Invalidate specific key
userLoader.clearAll()               // Clear entire cache

p-batch

p-batch — generic promise batching:

Basic usage

import pBatch from "p-batch"

// Generic batch processor — works for any async operation:
const processInBatches = pBatch(
  async (items: number[]) => {
    // Process all items at once:
    const results = await db.user.findMany({ where: { id: { in: items } } })
    const map = new Map(results.map((r) => [r.id, r]))
    return items.map((id) => map.get(id) ?? null)
  },
  { maxBatchSize: 50 }
)

// Individual calls get batched:
const [user1, user2, user3] = await Promise.all([
  processInBatches(1),
  processInBatches(2),
  processInBatches(3),
])

Compared to DataLoader

// DataLoader: automatic tick-based batching + deduplication + cache
// p-batch: configurable batching (time window or size), no built-in cache

// p-batch is useful for non-GraphQL batching:
// - Rate-limited API calls
// - Bulk event processing
// - Batching writes (not just reads)

// Example: batch API calls to avoid rate limiting:
const fetchUserFromApi = pBatch(
  async (userIds: string[]) => {
    const response = await fetch(`/api/users?ids=${userIds.join(",")}`)
    const data = await response.json()
    return userIds.map((id) => data.users.find((u: User) => u.id === id) ?? null)
  },
  {
    maxBatchSize: 50,
    maxWait: 50,  // Wait up to 50ms to accumulate a batch
  }
)
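
The time-window strategy itself needs no library. A minimal sketch, assuming p-batch-style semantics (start a timer on the first queued call, flush everything queued when it fires):

```typescript
// Toy time-window batcher: collect calls for up to waitMs, then flush one batch.
function timeBatch<K, V>(
  batchFn: (keys: K[]) => Promise<V[]>,
  waitMs: number,
): (key: K) => Promise<V> {
  let queue: { key: K; resolve: (value: V) => void }[] = []
  let timer: ReturnType<typeof setTimeout> | null = null

  async function flush(): Promise<void> {
    const batch = queue
    queue = []
    timer = null
    const results = await batchFn(batch.map((entry) => entry.key))
    batch.forEach((entry, i) => entry.resolve(results[i]))
  }

  return (key) =>
    new Promise<V>((resolve) => {
      queue.push({ key, resolve })
      // First call in a window starts the timer; later calls ride along:
      if (!timer) timer = setTimeout(flush, waitMs)
    })
}
```

Unlike tick-based batching, this trades up to `waitMs` of added latency for larger batches — useful when callers are spread across ticks, as with incoming HTTP events.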

Feature Comparison

Feature                          DataLoader      p-batch
Built-in cache (deduplication)   ✅              ❌
Tick-based auto-batching         ✅              ❌ (time-based)
Max batch size                   ✅              ✅
GraphQL-specific                 Designed for    Agnostic
TypeScript types                 ✅              ✅
prime() / clear() cache API      ✅              ❌
Weekly downloads                 ~4M             ~500K
Custom cache map                 ✅              ❌

When to Use Each

Choose DataLoader if:

  • Building GraphQL resolvers and want to solve the N+1 problem
  • Need automatic per-tick batching with no extra setup
  • Want built-in deduplication so duplicate keys use cached results
  • Working with any GraphQL server (Apollo, graphql-yoga, mercurius, etc.)

Choose p-batch if:

  • Batching outside of GraphQL (REST API clients, event processing)
  • Need time-window batching rather than event-loop-tick batching
  • Batching writes or side effects (not just reads)

DataLoader best practices:

// ✅ DO: Create per request (inside your server's context factory)
const context = createContext(db)

// ❌ DON'T: Create globally
const globalLoader = new DataLoader(batchFn)  // Cache persists across users!

// ✅ DO: Return results in same order as keys
const batchFn = async (ids) => {
  const results = await db.findMany({ id: { in: ids } })
  const map = new Map(results.map(r => [r.id, r]))
  return ids.map(id => map.get(id) ?? null)  // Ordered!
}

// ❌ DON'T: Return unsorted results
const badBatchFn = async (ids) => {
  return db.findMany({ id: { in: ids } })  // DB may return in different order!
}

// ✅ DO: Use prime() to seed cache from list queries
const posts = await db.post.findMany({ take: 100 })
for (const post of posts) {
  ctx.loaders.post.prime(post.id, post)  // A later loader.load(post.id) hits the cache — no re-fetch
}

Methodology

Download data from npm registry (weekly average, February 2026). Feature comparison based on dataloader v2.x and p-batch v3.x.

Compare GraphQL and API packages on PkgPulse →
