metascraper vs open-graph-scraper vs unfurl.js: URL Metadata Extraction in Node.js (2026)
TL;DR
metascraper is the most complete metadata extraction library — rule-based, handles Open Graph, Twitter Cards, JSON-LD, oEmbed, and falls back gracefully through multiple sources. open-graph-scraper is the most popular and straightforward — focused on Open Graph tags with optional Twitter Card support, easy to use. unfurl.js is a full oEmbed + Open Graph extractor with TypeScript types out of the box. For link preview features (like Slack/Discord): metascraper. For simple OG tag extraction: open-graph-scraper. For oEmbed support (YouTube, Twitter embed codes): unfurl.js.
Key Takeaways
- metascraper: ~90K weekly downloads — rule-based, 30+ rules, Open Graph + JSON-LD + oEmbed + fallbacks
- open-graph-scraper: ~200K weekly downloads — most popular, focused on OG tags, easy API
- unfurl.js: ~20K weekly downloads — TypeScript-first, oEmbed + Open Graph + Twitter Cards
- All three require fetching the page HTML first (or accept pre-fetched HTML)
- Rate limiting matters — implement caching to avoid hammering external sites
- For production: cache results in Redis or DB — don't re-scrape on every request
What Metadata Gets Extracted
Open Graph (og:*) tags — set by web publishers:
og:title "React vs Vue in 2026"
og:description "A data-driven comparison of download trends..."
og:image "https://pkgpulse.com/og/react-vs-vue.png"
og:url "https://pkgpulse.com/compare/react-vs-vue"
og:type "article" | "website" | "video" | etc.
Twitter Card tags (twitter:*):
twitter:card "summary_large_image"
twitter:title "React vs Vue"
twitter:image "https://..."
JSON-LD (structured data):
{ "@type": "Article", "headline": "...", "image": [...] }
oEmbed — embed codes for YouTube, Twitter, Instagram, etc.:
{ type: "video", html: "<iframe>...</iframe>", title: "..." }
Fallbacks (when og:* not set):
<title> tag, <meta name="description">, first <img> on page
metascraper
metascraper — rule-based metadata extraction:
Setup (rule-based plugins)
import got from "got"
import metascraper from "metascraper"
import metascraperTitle from "metascraper-title"
import metascraperDescription from "metascraper-description"
import metascraperImage from "metascraper-image"
import metascraperUrl from "metascraper-url"
import metascraperAuthor from "metascraper-author"
import metascraperDate from "metascraper-date"
// Compose your scraper with the rules you need:
const scraper = metascraper([
metascraperTitle(), // og:title, twitter:title, <title>
metascraperDescription(), // og:description, meta[description], first paragraph
metascraperImage(), // og:image, twitter:image, first img
metascraperUrl(), // og:url, canonical link, href
metascraperAuthor(), // JSON-LD author, meta[author]
metascraperDate(), // og:published_time, JSON-LD datePublished
])
// Fetch and extract:
async function extractMetadata(url: string) {
const { body: html, url: finalUrl } = await got(url)
const metadata = await scraper({ html, url: finalUrl })
return metadata
}
const meta = await extractMetadata("https://pkgpulse.com/blog/react-vs-vue")
// {
// title: "React vs Vue in 2026: A Data-Driven Comparison",
// description: "Compare React and Vue download trends, health scores...",
// image: "https://pkgpulse.com/og/react-vs-vue.png",
// url: "https://pkgpulse.com/blog/react-vs-vue",
// author: "PkgPulse Team",
// date: "2026-03-01T00:00:00.000Z",
// }
Available rules
// metascraper plugins — install only what you need:
// npm install metascraper-title metascraper-description metascraper-image
import metascraperTitle from "metascraper-title"
import metascraperDescription from "metascraper-description"
import metascraperImage from "metascraper-image"
import metascraperUrl from "metascraper-url"
import metascraperAuthor from "metascraper-author"
import metascraperDate from "metascraper-date"
import metascraperPublisher from "metascraper-publisher" // Site name
import metascraperReadability from "metascraper-readability" // Article content
import metascraperLang from "metascraper-lang" // Page language
import metascraperVideo from "metascraper-video" // og:video, video elements
import metascraperAudio from "metascraper-audio" // og:audio
import metascraperIframe from "metascraper-iframe" // oEmbed iframe
import metascraperTwitter from "metascraper-twitter" // Twitter-specific
import metascraperYoutube from "metascraper-youtube" // YouTube oEmbed
import metascraperSpotify from "metascraper-spotify" // Spotify oEmbed
Custom rule
import metascraper from "metascraper"
// Write a custom rule for site-specific metadata:
const metascraperPkgPulse = () => ({
packageName: [
// Rule 1: Try custom meta tag first:
({ htmlDom: $ }) => $("meta[name='pkg:name']").attr("content"),
// Rule 2: Fall back to og:title parsing:
({ htmlDom: $ }) => {
const title = $("meta[property='og:title']").attr("content")
return title?.match(/^(\S+) vs/)?.[1] // Extract first package name
},
],
})
const scraper = metascraper([
metascraperPkgPulse(),
// ... other rules
])
open-graph-scraper
open-graph-scraper — straightforward OG extraction:
Basic usage
import ogs from "open-graph-scraper"
// Simple API — one function:
const { result, error } = await ogs({ url: "https://pkgpulse.com" })
if (error) {
console.error("Failed to scrape:", error)
} else {
console.log(result.ogTitle) // "PkgPulse — npm Package Health"
console.log(result.ogDescription) // "Compare npm packages..."
console.log(result.ogImage) // [{ url: "https://...", type: "image/png" }]
console.log(result.ogUrl) // "https://pkgpulse.com"
console.log(result.ogSiteName) // "PkgPulse"
console.log(result.twitterCard) // "summary_large_image"
console.log(result.twitterTitle) // "PkgPulse"
}
With custom options
import ogs from "open-graph-scraper"
const { result } = await ogs({
url: "https://example.com",
// Custom fetch options:
fetchOptions: {
headers: {
"User-Agent": "MyApp/1.0 LinkPreviewBot",
Accept: "text/html",
},
signal: AbortSignal.timeout(5000), // 5 second timeout
},
// Pass pre-fetched HTML (no network request):
html: "<html>...</html>",
// When html is provided, url is still required for relative URL resolution
})
Link preview API endpoint
import express from "express"
import ogs from "open-graph-scraper"
const app = express()
// Rate limiting + caching are essential here:
import NodeCache from "node-cache"
const cache = new NodeCache({ stdTTL: 3600 }) // 1 hour cache
app.get("/api/link-preview", async (req, res) => {
const url = req.query.url as string
if (!url) {
return res.status(400).json({ error: "url required" })
}
// Validate URL:
try { new URL(url) } catch {
return res.status(400).json({ error: "Invalid URL" })
}
// Check cache:
const cached = cache.get(url)
if (cached) return res.json(cached)
try {
const { result } = await ogs({
url,
fetchOptions: {
headers: { "User-Agent": "LinkPreviewBot/1.0" },
signal: AbortSignal.timeout(8000),
},
})
const preview = {
title: result.ogTitle ?? result.twitterTitle ?? null,
description: result.ogDescription ?? result.twitterDescription ?? null,
image: result.ogImage?.[0]?.url ?? result.twitterImage?.[0]?.url ?? null,
url: result.ogUrl ?? url,
siteName: result.ogSiteName ?? null,
}
cache.set(url, preview)
res.json(preview)
} catch (err) {
res.status(500).json({ error: "Failed to fetch preview" })
}
})
unfurl.js
unfurl.js — TypeScript-first, oEmbed + Open Graph:
Basic usage
import { unfurl } from "unfurl.js"
// Returns strongly-typed metadata:
const metadata = await unfurl("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
// {
// title: "Rick Astley - Never Gonna Give You Up",
// description: "...",
// open_graph: { title: "...", images: [...] },
// twitter_card: { title: "..." },
// oEmbed: {
// type: "video",
// title: "Rick Astley - Never Gonna Give You Up",
// html: "<iframe width='459' height='344' src='https://www.youtube.com/embed/...'...",
// width: 459,
// height: 344,
// },
// }
// oEmbed gives you the embed iframe HTML — useful for YouTube, Twitter, Instagram, etc.
// For regular websites (no oEmbed):
const pkgMeta = await unfurl("https://pkgpulse.com")
// {
// title: "PkgPulse — npm Package Health",
// open_graph: { title: "...", description: "...", images: [...] },
// description: "...",
// }
TypeScript types
import { unfurl, Metadata } from "unfurl.js"
// Strongly typed return value:
const meta: Metadata = await unfurl("https://pkgpulse.com/blog/react-vs-vue")
// Access typed fields:
const ogImages: Array<{ url: string; width?: number; height?: number }> =
meta.open_graph?.images ?? []
const oEmbedHtml: string | undefined = meta.oEmbed?.html
// Type narrowing:
if (meta.oEmbed?.type === "video") {
console.log("Video embed:", meta.oEmbed.html)
console.log("Video width:", meta.oEmbed.width)
}
Feature Comparison
| Feature | metascraper | open-graph-scraper | unfurl.js |
|---|---|---|---|
| Open Graph | ✅ | ✅ | ✅ |
| Twitter Cards | ✅ | ✅ | ✅ |
| JSON-LD | ✅ | ✅ | ✅ |
| oEmbed | ✅ | ⚠️ Limited | ✅ Excellent |
| Custom rules | ✅ | ❌ | ❌ |
| Readability (content) | ✅ (plugin) | ❌ | ❌ |
| TypeScript | ✅ | ✅ | ✅ (best) |
| Modularity | ✅ Per-plugin | ❌ Monolithic | ❌ |
| Bundle size | Modular | ~200KB | ~150KB |
| ESM | ✅ | ✅ | ✅ |
| Pre-fetched HTML | ✅ | ✅ | ❌ (URL only) |
When to Use Each
Choose metascraper if:
- Building a full link preview system (like Slack or Notion)
- You need fallback chains — try og:, then JSON-LD, then raw HTML
- Custom rule needed for specific sites
- Want modular installation (only include rules you use)
Choose open-graph-scraper if:
- Simple OG tag extraction is all you need
- Quick setup with minimal configuration
- Most popular → most Stack Overflow answers and examples
Choose unfurl.js if:
- TypeScript types are important — unfurl has the best typings
- YouTube/Twitter/Vimeo oEmbed support (embed codes) is needed
- Simplest API with good oEmbed handling
Methodology
Download data from npm registry (weekly average, February 2026). Feature comparison based on metascraper v5.x, open-graph-scraper v6.x, and unfurl.js v2.x.