metascraper vs open-graph-scraper vs unfurl.js: URL Metadata Extraction in Node.js (2026)

PkgPulse Team

TL;DR

metascraper is the most complete metadata extraction library — rule-based, handles Open Graph, Twitter Cards, JSON-LD, oEmbed, and falls back gracefully through multiple sources. open-graph-scraper is the most popular and straightforward — focused on Open Graph tags with optional Twitter Card support, easy to use. unfurl.js is a full oEmbed + Open Graph extractor with TypeScript types out of the box. For link preview features (like Slack/Discord): metascraper. For simple OG tag extraction: open-graph-scraper. For oEmbed support (YouTube, Twitter embed codes): unfurl.js.

Key Takeaways

  • metascraper: ~90K weekly downloads — rule-based, 30+ rules, Open Graph + JSON-LD + oEmbed + fallbacks
  • open-graph-scraper: ~200K weekly downloads — most popular, focused on OG tags, easy API
  • unfurl.js: ~20K weekly downloads — TypeScript-first, oEmbed + Open Graph + Twitter Cards
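  • All three work on the page's HTML: metascraper expects you to fetch it yourself, open-graph-scraper fetches it or accepts pre-fetched HTML, and unfurl.js fetches from a URL only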
  • Rate limiting matters — implement caching to avoid hammering external sites
  • For production: cache results in Redis or DB — don't re-scrape on every request
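
The caching advice above can be sketched as a small in-memory cache that stores the pending promise per URL, so concurrent requests for the same link share one scrape. This is an illustration only: PreviewCache, the fetcher callback, and the injectable clock are hypothetical names, and a production setup would more likely sit on Redis or a database as noted above.

```typescript
// Hypothetical helper: caches scrape results per URL and shares in-flight requests.
type Fetcher<T> = (url: string) => Promise<T>

class PreviewCache<T> {
  private entries = new Map<string, { expires: number; value: Promise<T> }>()
  private fetcher: Fetcher<T>
  private ttlMs: number
  private now: () => number

  // `now` is injectable so expiry can be tested without waiting.
  constructor(fetcher: Fetcher<T>, ttlMs: number, now: () => number = Date.now) {
    this.fetcher = fetcher
    this.ttlMs = ttlMs
    this.now = now
  }

  get(url: string): Promise<T> {
    const hit = this.entries.get(url)
    if (hit && hit.expires > this.now()) return hit.value

    // Store the promise itself so concurrent callers share one fetch.
    const value = this.fetcher(url)
    this.entries.set(url, { expires: this.now() + this.ttlMs, value })

    // Drop failed lookups so the next request can retry.
    value.catch(() => {
      if (this.entries.get(url)?.value === value) this.entries.delete(url)
    })
    return value
  }
}
```

Any of the three libraries could be wrapped as the fetcher, for example by passing a function that calls the scraper and returns its result.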

What Metadata Gets Extracted

Open Graph (og:*) tags — set by web publishers:
  og:title      "React vs Vue in 2026"
  og:description "A data-driven comparison of download trends..."
  og:image      "https://pkgpulse.com/og/react-vs-vue.png"
  og:url        "https://pkgpulse.com/compare/react-vs-vue"
  og:type       "article" | "website" | "video" | etc.

Twitter Card tags (twitter:*):
  twitter:card  "summary_large_image"
  twitter:title "React vs Vue"
  twitter:image "https://..."

JSON-LD (structured data):
  { "@type": "Article", "headline": "...", "image": [...] }

oEmbed — embed codes for YouTube, Twitter, Instagram, etc.:
  { type: "video", html: "<iframe>...</iframe>", title: "..." }

Fallbacks (when og:* not set):
  <title> tag, <meta name="description">, first <img> on page
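
The priority order above (Open Graph, then Twitter Cards, then plain HTML) can be expressed as a first-non-empty lookup. The sketch below assumes the tags have already been flattened into a key/value map; TagMap, firstOf, resolvePreview, and the "first-img" key are hypothetical names for illustration, not part of any of the three libraries.

```typescript
// Hypothetical input shape: meta tags flattened into a key/value map.
type TagMap = Record<string, string | undefined>

interface Preview {
  title: string | null
  description: string | null
  image: string | null
}

// Return the first non-empty value among the given keys, in priority order.
function firstOf(tags: TagMap, keys: string[]): string | null {
  for (const key of keys) {
    const value = tags[key]?.trim()
    if (value) return value
  }
  return null
}

function resolvePreview(tags: TagMap): Preview {
  return {
    title: firstOf(tags, ["og:title", "twitter:title", "title"]),
    description: firstOf(tags, ["og:description", "twitter:description", "description"]),
    image: firstOf(tags, ["og:image", "twitter:image", "first-img"]),
  }
}

// No og:* tags here, so the Twitter title and the meta description are used.
const preview = resolvePreview({
  "twitter:title": "React vs Vue",
  title: "React vs Vue | PkgPulse",
  description: "A data-driven comparison",
})
// preview.title === "React vs Vue", preview.image === null
```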

metascraper

metascraper — rule-based metadata extraction:

Setup (rule-based plugins)

import got from "got"
import metascraper from "metascraper"
import metascraperTitle from "metascraper-title"
import metascraperDescription from "metascraper-description"
import metascraperImage from "metascraper-image"
import metascraperUrl from "metascraper-url"
import metascraperAuthor from "metascraper-author"
import metascraperDate from "metascraper-date"

// Compose your scraper with the rules you need:
const scraper = metascraper([
  metascraperTitle(),       // og:title, twitter:title, <title>
  metascraperDescription(), // og:description, meta[description], first paragraph
  metascraperImage(),       // og:image, twitter:image, first img
  metascraperUrl(),         // og:url, canonical link, href
  metascraperAuthor(),      // JSON-LD author, meta[author]
  metascraperDate(),        // og:published_time, JSON-LD datePublished
])

// Fetch and extract:
async function extractMetadata(url: string) {
  const { body: html, url: finalUrl } = await got(url)
  const metadata = await scraper({ html, url: finalUrl })

  return metadata
}

const meta = await extractMetadata("https://pkgpulse.com/blog/react-vs-vue")
// {
//   title: "React vs Vue in 2026: A Data-Driven Comparison",
//   description: "Compare React and Vue download trends, health scores...",
//   image: "https://pkgpulse.com/og/react-vs-vue.png",
//   url: "https://pkgpulse.com/blog/react-vs-vue",
//   author: "PkgPulse Team",
//   date: "2026-03-01T00:00:00.000Z",
// }

Available rules

// metascraper plugins — install only what you need:
// npm install metascraper-title metascraper-description metascraper-image

import metascraperTitle from "metascraper-title"
import metascraperDescription from "metascraper-description"
import metascraperImage from "metascraper-image"
import metascraperUrl from "metascraper-url"
import metascraperAuthor from "metascraper-author"
import metascraperDate from "metascraper-date"
import metascraperPublisher from "metascraper-publisher"  // Site name
import metascraperReadability from "metascraper-readability"  // Article content
import metascraperLang from "metascraper-lang"  // Page language
import metascraperVideo from "metascraper-video"  // og:video, video elements
import metascraperAudio from "metascraper-audio"  // og:audio
import metascraperIframe from "metascraper-iframe"  // oEmbed iframe
import metascraperTwitter from "metascraper-twitter"  // Twitter-specific
import metascraperYoutube from "metascraper-youtube"  // YouTube oEmbed
import metascraperSpotify from "metascraper-spotify"  // Spotify oEmbed

Custom rule

import metascraper from "metascraper"

// Write a custom rule for site-specific metadata:
const metascraperPkgPulse = () => ({
  packageName: [
    // Rule 1: Try custom meta tag first:
    ({ htmlDom: $ }) => $("meta[name='pkg:name']").attr("content"),
    // Rule 2: Fall back to og:title parsing:
    ({ htmlDom: $ }) => {
      const title = $("meta[property='og:title']").attr("content")
      return title?.match(/^(\S+) vs/)?.[1]  // Extract first package name
    },
  ],
})

const scraper = metascraper([
  metascraperPkgPulse(),
  // ... other rules
])
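
Title-parsing fallbacks like the second rule are easy to get subtly wrong, so it may help to pull the parsing out into a pure function that can be tested without a DOM. A minimal sketch, where packageNameFromTitle is a hypothetical helper mirroring the regex used above:

```typescript
// Hypothetical helper mirroring the og:title fallback in the custom rule.
function packageNameFromTitle(title: string | undefined): string | undefined {
  // Capture the first whitespace-free token that precedes " vs".
  return title?.match(/^(\S+) vs/)?.[1]
}

packageNameFromTitle("React vs Vue in 2026")   // "React"
packageNameFromTitle("@scope/pkg vs other")    // "@scope/pkg"
packageNameFromTitle("npm Package Health")     // undefined (no " vs" pattern)
```

Returning undefined when nothing matches keeps the rule consistent with the optional chaining in the original, so a miss simply yields no value.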

open-graph-scraper

open-graph-scraper — straightforward OG extraction:

Basic usage

import ogs from "open-graph-scraper"

// Simple API: one function. As of v6 the promise rejects on failure
// (the rejection carries `error: true` and details in `result`), so use try/catch:
try {
  const { result } = await ogs({ url: "https://pkgpulse.com" })

  console.log(result.ogTitle)        // "PkgPulse — npm Package Health"
  console.log(result.ogDescription)  // "Compare npm packages..."
  console.log(result.ogImage)        // [{ url: "https://...", type: "image/png" }]
  console.log(result.ogUrl)          // "https://pkgpulse.com"
  console.log(result.ogSiteName)     // "PkgPulse"
  console.log(result.twitterCard)    // "summary_large_image"
  console.log(result.twitterTitle)   // "PkgPulse"
} catch (err) {
  console.error("Failed to scrape:", err)
}

With custom options

import ogs from "open-graph-scraper"

// Fetch a URL with custom request options:
const { result } = await ogs({
  url: "https://example.com",
  fetchOptions: {
    headers: {
      "User-Agent": "MyApp/1.0 LinkPreviewBot",
      Accept: "text/html",
    },
    signal: AbortSignal.timeout(5000),  // 5 second timeout
  },
})

// Or parse pre-fetched HTML (no network request).
// Pass either `url` or `html`, not both: v6 rejects the combination.
const { result: fromHtml } = await ogs({ html: "<html>...</html>" })

Link preview API (Express)

import express from "express"
import ogs from "open-graph-scraper"

const app = express()

// Rate limiting + caching are essential here:
import NodeCache from "node-cache"
const cache = new NodeCache({ stdTTL: 3600 })  // 1 hour cache

app.get("/api/link-preview", async (req, res) => {
  const url = req.query.url as string

  if (!url) {
    return res.status(400).json({ error: "url required" })
  }

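  // Validate URL and allow only http(s):
  let target: URL
  try { target = new URL(url) } catch {
    return res.status(400).json({ error: "Invalid URL" })
  }
  if (target.protocol !== "http:" && target.protocol !== "https:") {
    return res.status(400).json({ error: "Only http(s) URLs are supported" })
  }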

  // Check cache:
  const cached = cache.get(url)
  if (cached) return res.json(cached)

  try {
    const { result } = await ogs({
      url,
      fetchOptions: {
        headers: { "User-Agent": "LinkPreviewBot/1.0" },
        signal: AbortSignal.timeout(8000),
      },
    })

    const preview = {
      title: result.ogTitle ?? result.twitterTitle ?? null,
      description: result.ogDescription ?? result.twitterDescription ?? null,
      image: result.ogImage?.[0]?.url ?? result.twitterImage?.[0]?.url ?? null,
      url: result.ogUrl ?? url,
      siteName: result.ogSiteName ?? null,
    }

    cache.set(url, preview)
    res.json(preview)
  } catch (err) {
    res.status(500).json({ error: "Failed to fetch preview" })
  }
})
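
The endpoint above covers caching but not rate limiting. One simple option is a fixed-window counter keyed by client IP or by target hostname. The sketch below is an assumption-laden illustration rather than a recommended production limiter: FixedWindowLimiter and its parameters are hypothetical, state is kept in process memory, and a multi-instance deployment would need shared storage.

```typescript
// Hypothetical fixed-window limiter keyed by client IP or target hostname.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>()
  private limit: number
  private windowMs: number
  private now: () => number

  // `now` is injectable so window expiry can be tested deterministically.
  constructor(limit: number, windowMs: number, now: () => number = Date.now) {
    this.limit = limit
    this.windowMs = windowMs
    this.now = now
  }

  // Returns true if the call is allowed, false if the key is over its limit.
  allow(key: string): boolean {
    const t = this.now()
    const current = this.windows.get(key)

    // Start a fresh window when none exists or the old one has elapsed.
    if (!current || t - current.start >= this.windowMs) {
      this.windows.set(key, { start: t, count: 1 })
      return true
    }
    if (current.count >= this.limit) return false
    current.count++
    return true
  }
}
```

In the handler, a check such as calling allow() with the target hostname before scraping, and responding with HTTP 429 when it returns false, would keep a burst of cache misses from hammering a single site.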

unfurl.js

unfurl.js — TypeScript-first, oEmbed + Open Graph:

Basic usage

import { unfurl } from "unfurl.js"

// Returns strongly-typed metadata:
const metadata = await unfurl("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
// {
//   title: "Rick Astley - Never Gonna Give You Up",
//   description: "...",
//   open_graph: { title: "...", images: [...] },
//   twitter_card: { title: "..." },
//   oEmbed: {
//     type: "video",
//     title: "Rick Astley - Never Gonna Give You Up",
//     html: "<iframe width='459' height='344' src='https://www.youtube.com/embed/...'...",
//     width: 459,
//     height: 344,
//   },
// }

// oEmbed gives you the embed iframe HTML — useful for YouTube, Twitter, Instagram, etc.

// For regular websites (no oEmbed):
const pkgMeta = await unfurl("https://pkgpulse.com")
// {
//   title: "PkgPulse — npm Package Health",
//   open_graph: { title: "...", description: "...", images: [...] },
//   description: "...",
// }

TypeScript types

import { unfurl, Metadata } from "unfurl.js"

// Strongly typed return value:
const meta: Metadata = await unfurl("https://pkgpulse.com/blog/react-vs-vue")

// Access typed fields:
const ogImages: Array<{ url: string; width?: number; height?: number }> =
  meta.open_graph?.images ?? []

const oEmbedHtml: string | undefined = meta.oEmbed?.html

// Type narrowing:
if (meta.oEmbed?.type === "video") {
  console.log("Video embed:", meta.oEmbed.html)
  console.log("Video width:", meta.oEmbed.width)
}
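
Building on the narrowing above, a link preview component usually has to choose between rendering the provider's embed HTML and falling back to a plain card. The sketch below uses a local UnfurlLike interface that covers only the fields shown in this section; it is not unfurl.js's full Metadata type, and toRendered is a hypothetical helper.

```typescript
// Minimal local shape covering only the fields used here (an assumption,
// not the library's complete type).
interface UnfurlLike {
  title?: string
  open_graph?: { images?: Array<{ url: string }> }
  oEmbed?: { type: string; html?: string }
}

type Rendered =
  | { kind: "embed"; html: string }
  | { kind: "card"; title: string; image: string | null }

function toRendered(meta: UnfurlLike, fallbackTitle: string): Rendered {
  const embed = meta.oEmbed
  // Prefer provider embed HTML for video and rich content.
  if (embed?.html && (embed.type === "video" || embed.type === "rich")) {
    return { kind: "embed", html: embed.html }
  }
  // Otherwise fall back to a card built from the title and first OG image.
  return {
    kind: "card",
    title: meta.title ?? fallbackTitle,
    image: meta.open_graph?.images?.[0]?.url ?? null,
  }
}
```

Embed HTML comes from a third party, so it is worth treating it as untrusted markup (for example, rendering it in a sandboxed iframe) rather than injecting it directly.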

Feature Comparison

Feature | metascraper | open-graph-scraper | unfurl.js
Open Graph | ✅ | ✅ | ✅
Twitter Cards | ✅ | ✅ | ✅
JSON-LD | ✅ | ❌ | ❌
oEmbed | ⚠️ Limited | ❌ | ✅ Excellent
Custom rules | ✅ | ❌ | ❌
Readability (content) | ✅ (plugin) | ❌ | ❌
TypeScript | ✅ | ✅ | ✅ (best)
Modularity | ✅ Per-plugin | ❌ Monolithic | ❌
Bundle size | Modular | ~200KB | ~150KB
ESM | ✅ | ✅ | ✅
Pre-fetched HTML | ✅ | ✅ | ❌ (URL only)

When to Use Each

Choose metascraper if:

  • Building a full link preview system (like Slack or Notion)
  • You need fallback chains: try og:* tags, then JSON-LD, then raw HTML
  • You need custom rules for specific sites
  • Want modular installation (only include rules you use)

Choose open-graph-scraper if:

  • Simple OG tag extraction is all you need
  • Quick setup with minimal configuration
  • You want the most popular option, with the most Stack Overflow answers and examples

Choose unfurl.js if:

  • TypeScript types are important — unfurl has the best typings
  • YouTube/Twitter/Vimeo oEmbed support (embed codes) is needed
  • Simplest API with good oEmbed handling

Methodology

Download data from npm registry (weekly average, February 2026). Feature comparison based on metascraper v5.x, open-graph-scraper v6.x, and unfurl.js v2.x.

Compare web scraping and metadata packages on PkgPulse →
