How Health Scores Help You Choose Packages in 2026
TL;DR
GitHub stars are vanity metrics. Health scores are actionable. A package with 50K GitHub stars might have 0 commits in 2 years, 200 open issues, and declining downloads. A package with 5K stars might have weekly releases, 10 active maintainers, and 2M weekly downloads. Health scores aggregate the signals that actually predict maintenance quality and longevity — so you make package decisions on data, not hype.
Key Takeaways
- GitHub stars correlate only weakly with quality — stars measure popularity at a point in time
- Downloads + trend — the strongest signal of continued adoption
- Last publish date — packages not updated in 18+ months carry risk
- Issue velocity — fast issue response = active maintainer
- TypeScript coverage — now a first-class quality signal
- Dependency count — high transitive deps = higher attack surface + bundle size
Why GitHub Stars Mislead
Real examples of star count vs health disconnect:
Package A: 52,000 stars
Last commit: 18 months ago
Open issues: 847 (no responses for 6 months)
Weekly downloads: 120K (declining 40% YoY)
TypeScript support: None (@types/packageA missing)
Verdict: DO NOT USE for new projects
Package B: 3,200 stars
Last commit: 3 days ago
Open issues: 12 (average response: 2 hours)
Weekly downloads: 2.4M (growing 80% YoY)
TypeScript support: Built-in, strict
Verdict: Excellent choice
Stars reflect: discovery, viral blog posts, trending periods
Stars don't reflect: maintenance activity, community responsiveness, longevity
The GitHub stars problem has become more acute as star purchasing has become widespread. GitHub star marketplaces — services that sell "organic-looking" star increments from real accounts — exist and are actively used by package authors trying to gain credibility. A package at 20,000 stars may have 5,000 purchased stars buried in the total, providing false social proof to developers who treat stars as a quality signal.
Even without manipulation, the viral nature of star accumulation creates distortion. A package featured in a widely-shared "awesome list" or trending on GitHub for a week accumulates thousands of stars from developers who bookmarked it out of curiosity rather than adoption intent. Those stars persist indefinitely, making the package look perpetually popular even years after the community has moved on.
The pragmatic response: use downloads and download trends as the primary adoption signal. Downloads represent npm installs from CI/CD pipelines, production builds, and developer workstations. They're harder to game (purchasing fake npm downloads is far more expensive and less effective than purchasing fake GitHub stars), and they reflect actual usage rather than bookmarked intent. A package with 5,000 stars and 2 million weekly downloads is doing something real. A package with 50,000 stars and 80,000 weekly downloads should raise questions about where those stars came from.
What Health Scores Measure
Dimension 1: Maintenance Activity
Signals:
- Days since last npm publish (lower = better)
- Commit frequency (average per month)
- Issue close rate (% of issues resolved in 30 days)
- PR merge rate and time
Scoring rubric:
Published < 30 days: 30/30 points
Published 30-90 days: 25/30
Published 90-365 days: 15/30
Published 1-2 years: 5/30
Published > 2 years: 0/30
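The rubric above can be sketched as a simple lookup. The article only lists the bands, so treating each boundary as inclusive of its lower edge is an assumption:

```typescript
// Map days-since-last-publish to the 0-30 maintenance points above.
// Band boundaries come from the rubric; edge handling (which side a
// boundary day falls on) is an assumption not stated in the article.
function maintenanceScore(daysSincePublish: number): number {
  if (daysSincePublish < 30) return 30;   // published < 30 days ago
  if (daysSincePublish <= 90) return 25;  // 30-90 days
  if (daysSincePublish <= 365) return 15; // 90-365 days
  if (daysSincePublish <= 730) return 5;  // 1-2 years
  return 0;                               // > 2 years
}
```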
Why this matters:
- Security vulnerabilities get patched
- Breaking changes in dependencies get fixed
- New Node.js/TypeScript versions get support
Dimension 2: Adoption and Growth
Signals:
- Weekly download count (absolute)
- Download trend (3-month and 12-month growth rate)
- Dependents count (packages that depend on it)
Why downloads beat stars:
- Stars can be bought/gamed (GitHub star purchasing is a real market)
- Downloads represent actual usage in CI/CD, not just bookmarks
- Download trend reveals momentum (growing = healthy ecosystem interest)
- Dependents count reveals lock-in to the ecosystem
Interpreting download trends:
> +50% YoY: Rapidly growing, strong momentum
+10% to +50%: Healthy growth
-10% to +10%: Stable
-10% to -30%: Slight decline — watch
Below -30% YoY: Significant decline — evaluate alternatives
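The bands above can be expressed as a classifier over the year-over-year growth rate (0.5 meaning +50%). The short band labels and the exact boundary handling are my assumptions:

```typescript
// Classify a YoY download growth rate into the trend bands above.
// Input is a fraction: 0.5 = +50%, -0.3 = -30%.
function trendBand(yoyGrowth: number): string {
  if (yoyGrowth > 0.5) return "rapid growth";        // > +50%
  if (yoyGrowth >= 0.1) return "healthy growth";     // +10% to +50%
  if (yoyGrowth > -0.1) return "stable";             // -10% to +10%
  if (yoyGrowth >= -0.3) return "slight decline";    // -10% to -30%
  return "significant decline";                      // below -30%
}
```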
Dimension 3: TypeScript Support
Scoring levels (0-20 points):
20 pts: Written in TypeScript, types bundled
15 pts: Bundled .d.ts files (not written in TS but types included)
10 pts: DefinitelyTyped @types/ package available
5 pts: Types available but outdated
0 pts: No types, any usage everywhere
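The five support levels map naturally to a lookup table. The level identifiers below are names I've chosen; only the point values come from the rubric:

```typescript
// TypeScript-support scoring levels from the rubric above (0-20 pts).
type TsSupport =
  | "written-in-ts"     // written in TypeScript, types bundled
  | "bundled-dts"       // .d.ts files shipped, source not in TS
  | "definitely-typed"  // @types/ package on DefinitelyTyped
  | "outdated-types"    // types exist but lag the API
  | "no-types";         // no types at all

const TS_POINTS: Record<TsSupport, number> = {
  "written-in-ts": 20,
  "bundled-dts": 15,
  "definitely-typed": 10,
  "outdated-types": 5,
  "no-types": 0,
};
```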
Why this matters in 2026:
- 83% of new projects use TypeScript
- Packages without types require manual type declarations
- Type accuracy predicts API correctness (if types are wrong, API docs often are too)
Dimension 4: Security Posture
Signals:
- Known CVE count (from npm audit database)
- Time to patch past CVEs
- Dependency count (attack surface)
- Package provenance (verified build chain)
Scoring:
0 known CVEs: 30/30
1-2 CVEs, patched: 20/30
1-2 CVEs, unpatched: 10/30
3+ CVEs: 0/30
Transitive dependency count:
< 5 deps: 20/20
5-20 deps: 15/20
20-50 deps: 10/20
50+ deps: 5/20
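The two security rubrics above can be sketched as a pair of functions. Reading "1-2 CVEs" literally and treating band edges as inclusive are assumptions on my part:

```typescript
// CVE rubric above: 0-30 points based on known CVE count and whether
// past CVEs were patched.
function cveScore(cveCount: number, allPatched: boolean): number {
  if (cveCount === 0) return 30;
  if (cveCount <= 2) return allPatched ? 20 : 10;
  return 0; // 3+ CVEs
}

// Transitive-dependency rubric above: 0-20 points, fewer deps = less
// attack surface.
function depScore(transitiveDeps: number): number {
  if (transitiveDeps < 5) return 20;
  if (transitiveDeps <= 20) return 15;
  if (transitiveDeps <= 50) return 10;
  return 5; // 50+ deps
}
```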
Dimension 5: Bundle Efficiency
Signals:
- Gzipped bundle size
- Tree-shaking support (sideEffects: false in package.json)
- ESM support (better tree-shaking for bundlers)
- Dependency weight (deps bundle size)
Scoring:
< 5KB gzip: 20/20
5-25KB: 15/20
25-100KB: 10/20
100-500KB: 5/20
> 500KB: 0/20
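The bundle-size rubric follows the same shape, with gzipped size in kilobytes as input. Edge handling is again an assumption:

```typescript
// Gzipped-size rubric above: 0-20 points, smaller bundles score higher.
function bundleScore(gzipKB: number): number {
  if (gzipKB < 5) return 20;    // < 5KB
  if (gzipKB <= 25) return 15;  // 5-25KB
  if (gzipKB <= 100) return 10; // 25-100KB
  if (gzipKB <= 500) return 5;  // 100-500KB
  return 0;                     // > 500KB
}
```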
The five dimensions described above interact in ways that raw individual metrics don't capture. A package can have excellent maintenance activity (publishing weekly) but poor security posture (many transitive dependencies with known vulnerabilities). A package can have massive download counts but deteriorating TypeScript support as a maintainer shifts focus. The composite score reflects these interactions — which is why a single-dimension evaluation ("it has 10 million weekly downloads, must be fine") frequently leads to poor decisions that only become apparent when maintaining a production application a year later.
The weighting of these dimensions is not arbitrary. Maintenance activity carries the highest weight because unmaintained packages create compounding risk: today's stable unmaintained package becomes tomorrow's security liability when a CVE is discovered in one of its dependencies and there's no maintainer to update it. Security posture is weighted heavily because the consequences of a supply chain compromise are disproportionately severe compared to most other package quality issues. Adoption and growth together indicate ecosystem confidence — an accurate signal that's hard for a single package author to fake over time.
Bundle efficiency and TypeScript support are weighted as secondary signals because poor performance on these dimensions doesn't prevent a package from being adopted safely — it just means choosing that package comes with DX costs the team should be aware of. These dimensions often reveal where a package was designed rather than whether it was designed well.
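The article does not publish an exact weighting formula, so the sketch below makes the simplest possible assumption: sum the raw dimension points (30 + 25 + 20 + 30 + 20 = 125) and normalize to a 0-100 scale. The real composite evidently weights dimensions differently, as the surrounding text describes, so this will not exactly reproduce the example scores later in the article:

```typescript
// Hypothetical composite: normalize raw dimension points to 0-100.
// The interface shape and the equal-weight normalization are
// illustrative assumptions, not the published PkgPulse formula.
interface DimensionScores {
  maintenance: number; // 0-30
  adoption: number;    // 0-25
  typescript: number;  // 0-20
  security: number;    // 0-30
  bundle: number;      // 0-20
}

function composite(d: DimensionScores): number {
  const max = 30 + 25 + 20 + 30 + 20; // 125 raw points available
  const raw =
    d.maintenance + d.adoption + d.typescript + d.security + d.bundle;
  return Math.round((raw / max) * 100);
}
```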
Real-World Health Score Examples
High Health Score (Drizzle ORM: 92/100)
Drizzle ORM health breakdown:
Maintenance: 29/30 (published 5 days ago)
Adoption: 25/25 (2M downloads, +400% YoY)
TypeScript: 20/20 (written in TypeScript, excellent inference)
Security: 28/30 (0 CVEs, 12 transitive deps)
Bundle efficiency: 17/20 (3KB runtime, ESM support)
Composite: 92/100
Interpretation: Excellent choice for new projects.
Active development, strong growth, TypeScript-first.
Declining Health Score (TypeORM: 52/100)
TypeORM health breakdown:
Maintenance: 15/30 (last published 4 months ago, slow issue response)
Adoption: 18/25 (3M downloads but -15% YoY)
TypeScript: 15/20 (TypeScript support but legacy decorator approach)
Security: 14/30 (2 unpatched moderate CVEs, high dep count)
Bundle efficiency: 10/20 (larger bundle, CJS-first)
Composite: 52/100
Interpretation: Still works, but declining. For new projects,
consider Prisma or Drizzle instead.
The Drizzle and TypeORM examples are instructive precisely because they show a package in ascent and a package in decline occupying the same conceptual space (TypeScript ORM). In 2020, TypeORM had a score that would look similar to Drizzle's today — it was the actively maintained, growing option in its category. The decline didn't happen suddenly; it happened incrementally as maintenance slowed, as Drizzle's type inference proved superior, and as the community gradually recommended Drizzle to new projects while keeping TypeORM for existing ones. Health scores are a snapshot that reveals the current state, but the trajectory — what the score looked like three or six months ago — tells the more important story about where the package is headed.
The practical corollary: when a package score is borderline (55-70), look at the historical trend rather than the current number. A package at 62 that was at 80 six months ago is in a different situation from one that has held at 62 for two years. The declining package needs attention: either an investigation into what changed (a change in maintainership, a major version release that created a transient issue backlog) or a migration plan if the decline signals genuine abandonment. The stable package has found its equilibrium and may continue at that quality level indefinitely without representing any new risk to teams already using it.
Packages backed by companies or foundations have structurally different health score trajectories than individually maintained projects. React, maintained by Meta, and Vite, backed by Evan You's team with broad commercial sponsorship, have structural support that individual contributors lack. When evaluating packages for long-lived production applications, organizational backing is worth understanding separately from the score: a well-backed package at 75 may be safer than an individually maintained package at 85.
Similarly, packages with multiple active maintainers have more resilient health scores than single-maintainer projects, all else equal. A solo maintainer who becomes unavailable for three months (illness, a new job, personal circumstances) causes a dramatic drop in the maintenance dimension. A five-person maintainer team has redundancy that absorbs individual availability fluctuations. For business-critical dependencies, maintainer count is worth examining explicitly beyond the composite score.
How to Read Health Scores in Context
Health scores are useful signals, not absolute verdicts:
High score but wrong choice:
- styled-components: 75/100 (active, popular) but RSC-incompatible
→ High health score, but wrong choice for Next.js 15 app router
Low score but acceptable:
- An internal analytics package: 40/100 (slow updates)
→ Slow updates are fine if the API is stable and you don't need new features
Score trajectories matter:
- Package at 80/100 but declining 5 points/quarter → better alternatives exist
- Package at 60/100 but improving rapidly → promising, monitor it
Building Your Own Package Evaluation Process
# 5-minute package evaluation checklist:
1. Check PkgPulse health score
→ pkgpulse.com/compare/[package-a]-vs-[package-b]
2. Check download trend
→ npmtrends.com for 12-month view
3. Check last publish date
→ npm view [package] time.modified
4. Check TypeScript support
→ npmjs.com/package/[package] → sidebar shows "TypeScript"
5. Check bundle size
→ bundlephobia.com/package/[package]
6. Scan GitHub issues
→ Filter by "no response" label — are issues being ignored?
7. Check dependents
→ npmjs.com/package/[package] → "Used by N packages"
→ High dependents = slower to abandon
# Total time: 5-10 minutes
# Catches 90% of bad package choices
The 5-minute checklist above is the right approach for routine dependency evaluation. It covers the most common failure modes: unmaintained packages, security issues, poor TypeScript support, bloated bundles. What it doesn't capture are the subtler signals that distinguish excellent packages from merely acceptable ones. For mission-critical dependencies — the database client, the authentication library, the core framework — a more thorough evaluation is warranted.
The extended evaluation adds: reviewing the changelog for evidence of thoughtful API design and breaking change management (frequent breaking changes without migration guides are a red flag); examining issue and PR response patterns for specific categories of issues like security reports, documentation gaps, and TypeScript accuracy problems; checking whether the package has a clear governance model (individual vs organizational backing) and whether commercial support is available if needed; and verifying that the package's version support policy aligns with your project's Node.js version requirements.
Most packages don't need this level of evaluation. A utility library that joins strings with a delimiter does not need a governance model assessment. But for the handful of dependencies in any project that represent foundational decisions — the ones where migration would be painful and the ones that handle security-sensitive operations — the extra 20 minutes of evaluation prevents problems that would cost orders of magnitude more time to fix later.
The Alternative Data Sources Behind Health Scores
Download counts and GitHub stars are the first things developers check, but health scores draw on a deeper stack of signals that most package browsers don't surface. Understanding what feeds into the score helps you interpret it correctly.
Issue resolution velocity is one of the highest-signal inputs: the ratio of issues closed to issues opened over the last 90 days. A ratio above 1.0 means the maintainer is closing faster than issues are being filed — a strong sign of responsive stewardship. A declining ratio over multiple quarters often signals that the project is scaling beyond what the current maintainer team can handle, even if overall activity looks healthy on the surface.
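The closed-to-opened ratio described above is straightforward to compute; the trailing-90-day window comes from the text, while the handling of a window with zero new issues is my assumption:

```typescript
// Issue-resolution velocity: issues closed divided by issues opened
// over a trailing 90-day window. A ratio above 1.0 means the backlog
// is shrinking; a declining ratio across quarters signals strain.
function issueVelocity(closedLast90d: number, openedLast90d: number): number {
  // Assumption: with nothing filed, any closing activity is treated
  // as unbounded velocity rather than a division-by-zero error.
  if (openedLast90d === 0) return Infinity;
  return closedLast90d / openedLast90d;
}
```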
Security audit history provides another dimension that raw CVE counts miss. The metric that matters is how quickly a maintainer patches CVEs once they're reported. Packages that address critical vulnerabilities within 48 hours score significantly higher than packages that sit on unpatched advisories for months. The speed of response is a proxy for the maintainer's security posture and operational capacity — it answers the question: "If a zero-day appears in this package, how long will my production app be exposed?"
TypeScript coverage quality is increasingly treated as a first-class signal rather than a binary "has types / no types" flag. Some packages technically include .d.ts files that are so permissive — overuse of any, missing generics, incomplete overloads — that they provide almost no safety benefit. Scores that account for type accuracy are more predictive of developer experience than those that only check type presence.
Dependency graph cleanliness is the supply chain signal: packages with zero to two runtime dependencies represent contained attack surfaces. Every transitive dependency is an additional potential vulnerability vector. A utility library pulling in 47 transitive packages for a single function is a fundamentally different security and maintenance commitment than one with no dependencies at all.
Finally, download velocity — not absolute download count — distinguishes momentum from inertia. A package at 500K downloads per week growing 15% month-over-month is genuinely healthier than a package at 5M downloads per week declining 5% per month. The former is gaining ecosystem trust; the latter is living on accumulated adoption that may be quietly eroding as the ecosystem moves on.
Calibrating Health Scores by Package Category
A health score is most useful when interpreted relative to package category and use case. Baseline expectations vary significantly across the ecosystem, and comparing a score in isolation without understanding what's normal for that category leads to miscalibrated decisions.
Build tools — Vite, esbuild, Rollup, Turbopack — typically score 85–95. They're core infrastructure with commercial backing, large contributor communities, and strong incentives to stay current with Node.js and browser targets. When a build tool scores 78, that's worth investigating. When a utility package scores 78, that may be entirely acceptable.
Utility packages with bounded scope — clsx, ms, klona, nanoid — tend to score well precisely because their maintenance burden is low. A stable API that hasn't needed changes in two years isn't a red flag for these packages; it's a sign the scope is right-sized.
ORMs and authentication libraries score more variably because their complexity creates sustained maintenance burden and broad security surface. An ORM that integrates with multiple databases, handles migrations, and exposes query APIs has far more moving parts than a date formatting utility. Expect wider score ranges in these categories and calibrate accordingly.
The practical thresholds: a score of 70 or above indicates a well-maintained package safe for production adoption. The 50–70 range warrants investigation before adopting — check the last release date, open CVEs, and whether download velocity is declining. Below 50 is a yellow flag for general adoption and a red flag for anything security-sensitive.
Relative comparison is often more informative than absolute scores. When two alternatives score 90 versus 75, the 15-point gap is making a clear statement about relative maintenance quality that's more meaningful than either number in isolation.
One important caveat: health scores are backward-looking. They measure what has happened, not what will happen. A package just acquired by a company known for strong open-source stewardship might score low today based on historical neglect, but trajectory matters. When scores are trending upward — new maintainers added, release cadence increasing, CVE response time shortening — a currently modest score may understate future quality.
Building a Health Score Monitoring Practice
Checking a health score at the point of installing a new dependency is valuable, but the full benefit comes from making health monitoring a repeatable practice built into your team's workflow at multiple checkpoints.
The quarterly dependency review is the highest-leverage habit. Once per quarter, run a health score check on all direct dependencies. Flag any that have declined by 15 or more points since the last review or fallen below 60. For packages that have declined significantly, investigate the cause: abandoned maintainer, unpatched security issue, ecosystem migration underway? This review typically takes 30–60 minutes for a project with 20–40 direct dependencies — a worthwhile investment against the alternative of discovering a deprecated package during an emergency.
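The quarterly review's flagging rule (declined by 15+ points since last review, or below 60) can be sketched directly. The snapshot shape is hypothetical; the thresholds come from the paragraph above:

```typescript
// Flag direct dependencies for quarterly review: a 15+ point decline
// since the last review, or an absolute score below 60.
interface DepSnapshot {
  name: string;
  previousScore: number; // score at last quarterly review
  currentScore: number;  // score now
}

function flagForReview(deps: DepSnapshot[]): string[] {
  return deps
    .filter(
      (d) => d.previousScore - d.currentScore >= 15 || d.currentScore < 60
    )
    .map((d) => d.name);
}
```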
Integrating into code review adds friction at the right moment. In PR templates, add a checkbox: "If this PR adds a new npm dependency, I've checked its health score and it is above 70." That simple gate creates a forcing function that surfaces marginal packages for discussion before they're committed to the project's dependency manifest. This creates a natural pause proportional to the significance of the decision. Adding a dependency is a commitment that deserves a 2-minute check before it's merged.
For mission-critical projects, a CI integration provides automated enforcement. A custom CI step that fetches health scores for direct dependencies and fails the build if any score falls below a configurable threshold (typically 60–70) prevents low-quality dependencies from being introduced without intentional override. The override should require an explicit acknowledgment in a config file with a documented reason — this creates an audit trail of deliberate exceptions that the team can revisit during the next dependency review cycle.
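A minimal sketch of such a CI gate, assuming scores arrive as a plain package-to-score map from whatever health-score source the team uses. The `GateConfig` shape, including the documented-override map, is hypothetical:

```typescript
// CI gate: fail when any direct dependency scores below the threshold,
// unless the package has a documented override (the audit trail the
// paragraph above describes).
interface GateConfig {
  threshold: number;                 // typically 60-70
  overrides: Record<string, string>; // package name -> documented reason
}

function gate(
  scores: Record<string, number>,
  cfg: GateConfig
): { pass: boolean; failures: string[] } {
  const failures = Object.entries(scores)
    .filter(([pkg, score]) => score < cfg.threshold && !(pkg in cfg.overrides))
    .map(([pkg]) => pkg);
  return { pass: failures.length === 0, failures };
}
```

In a real pipeline this function would run in a custom CI step and set a nonzero exit code when `pass` is false.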
One underappreciated pattern in health score monitoring: packages often signal decline through the security dimension before other dimensions degrade. Maintainer attention tends to fall off in a specific sequence — first new feature development slows, then response time on issues increases, then CVE patches become delayed. By the time a package's maintenance dimension score drops significantly, the security posture dimension may have already been declining for months. Teams that monitor security specifically — using tools like npm audit in CI and watching the security dimension of health scores separately from the composite — catch the signal earlier than those who only monitor the composite number.
The compounding value of consistent monitoring is catching problems early, when options are plentiful. A package that scores 72 today but has been declining for three quarters is likely to score 55 in a year. Migrating away from a package when you have time to evaluate alternatives and run a measured migration is a fundamentally different experience than doing it under pressure when a deprecation notice or security advisory forces the issue.
See detailed health scores and download trends for any npm package comparison on PkgPulse.
See also: React vs Vue, How to Evaluate npm Package Health Before Installing, and The Average Lifespan of an npm Package.
See the live comparison
View react vs. vue on PkgPulse →