TL;DR: DuckDB is the embedded analytical database — runs in-process, reads Parquet/CSV/JSON directly, SQL on local files with zero infrastructure. ClickHouse is the column-oriented OLAP powerhouse — handles billions of rows, real-time ingestion, materialized views, and sub-second aggregation queries at scale. QuestDB is the time-series specialist — SQL with time-series extensions, high-throughput ingestion via InfluxDB Line Protocol, and optimized for ordered data. In 2026: DuckDB for local analytics and embedded OLAP, ClickHouse for large-scale analytical workloads, QuestDB for time-series and IoT data.
Key Takeaways
- DuckDB: Embedded (in-process), zero infrastructure. Reads Parquet, CSV, JSON, Arrow natively. SQL with window functions, CTEs, lateral joins. Best for local analytics, data engineering pipelines, and embedding analytics in applications
- ClickHouse: Distributed column store, petabyte-scale. MergeTree engine family, materialized views, real-time aggregation. Best for dashboards, log analytics, and any workload with billions of rows
- QuestDB: Time-series optimized, column-based. SQL with SAMPLE BY, LATEST ON, and ASOF JOIN. InfluxDB Line Protocol ingestion. Best for metrics, financial data, IoT, and any time-ordered dataset
DuckDB — Embedded Analytical Database
DuckDB runs in-process with zero infrastructure — think SQLite for analytics. It reads Parquet, CSV, and JSON files directly and executes complex OLAP queries locally.
Basic Setup and Queries
import duckdb from "duckdb";
// In-memory database
const db = new duckdb.Database(":memory:");
const conn = db.connect();
// Query Parquet files directly — no import step
conn.all(
`SELECT
product_category,
COUNT(*) as order_count,
SUM(amount) as total_revenue,
AVG(amount) as avg_order_value
FROM read_parquet('s3://analytics-bucket/orders/*.parquet')
WHERE order_date >= '2026-01-01'
GROUP BY product_category
ORDER BY total_revenue DESC
LIMIT 10`,
(err, rows) => {
console.table(rows);
}
);
// Query CSV files with automatic type detection
conn.all(
`SELECT * FROM read_csv_auto('data/users.csv') LIMIT 5`,
(err, rows) => console.table(rows)
);
// Query JSON files
conn.all(
`SELECT * FROM read_json_auto('data/events.json') LIMIT 5`,
(err, rows) => console.table(rows)
);
Advanced Analytics
// Window functions for running totals and rankings
conn.all(`
WITH daily_sales AS (
SELECT
date_trunc('day', order_date) as day,
product_category,
SUM(amount) as daily_revenue
FROM read_parquet('orders.parquet')
GROUP BY 1, 2
)
SELECT
day,
product_category,
daily_revenue,
SUM(daily_revenue) OVER (
PARTITION BY product_category
ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as cumulative_revenue,
RANK() OVER (
PARTITION BY day
ORDER BY daily_revenue DESC
) as daily_rank
FROM daily_sales
ORDER BY day DESC, daily_rank
`, (err, rows) => console.table(rows));
// Pivot tables
conn.all(`
PIVOT (
SELECT
date_part('month', order_date) as month,
product_category,
SUM(amount) as revenue
FROM read_parquet('orders.parquet')
WHERE order_date >= '2026-01-01'
GROUP BY 1, 2
)
ON product_category
USING SUM(revenue)
ORDER BY month
`, (err, rows) => console.table(rows));
Persistent Database and Exports
// Persistent database on disk
const db = new duckdb.Database("analytics.duckdb");
const conn = db.connect();
// Create tables from external sources
conn.run(`
CREATE TABLE orders AS
SELECT * FROM read_parquet('s3://bucket/orders/*.parquet')
`);
// Create indexes for frequently filtered columns
conn.run(`CREATE INDEX idx_orders_date ON orders(order_date)`);
// Export query results to Parquet
conn.run(`
COPY (
SELECT product_category, COUNT(*) as cnt, SUM(amount) as total
FROM orders
GROUP BY product_category
) TO 'summary.parquet' (FORMAT PARQUET)
`);
// Export to CSV
conn.run(`
COPY orders TO 'orders_backup.csv' (FORMAT CSV, HEADER)
`);
// Attach multiple databases
conn.run(`ATTACH 'users.duckdb' AS users_db`);
conn.all(`
SELECT o.*, u.name, u.email
FROM orders o
JOIN users_db.users u ON o.user_id = u.id
`, (err, rows) => console.table(rows));
S3 and Remote Data
// Configure S3 access
conn.run(`
SET s3_region = 'us-east-1';
SET s3_access_key_id = '${process.env.AWS_ACCESS_KEY_ID}';
SET s3_secret_access_key = '${process.env.AWS_SECRET_ACCESS_KEY}';
`);
// Query S3 Parquet files with glob patterns
conn.all(`
SELECT
date_trunc('hour', timestamp) as hour,
COUNT(*) as request_count,
AVG(response_time_ms) as avg_latency,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY response_time_ms) as p99
FROM read_parquet('s3://logs-bucket/2026/03/**/*.parquet')
WHERE status_code >= 500
GROUP BY 1
ORDER BY 1 DESC
`, (err, rows) => console.table(rows));
// Query remote HTTP files
conn.all(`
SELECT * FROM read_csv_auto(
'https://data.example.com/public/dataset.csv'
) LIMIT 100
`, (err, rows) => console.table(rows));
ClickHouse — Column-Oriented OLAP at Scale
ClickHouse is the high-performance column-oriented database for real-time analytical queries across billions of rows.
Node.js Client Setup
import { createClient } from "@clickhouse/client";
const client = createClient({
url: "http://localhost:8123",
username: "default",
password: process.env.CLICKHOUSE_PASSWORD,
database: "analytics",
});
// Create a table with MergeTree engine
await client.command({
query: `
CREATE TABLE IF NOT EXISTS events (
event_id UUID DEFAULT generateUUIDv4(),
timestamp DateTime64(3),
user_id UInt64,
event_type LowCardinality(String),
properties String, -- JSON stored as String
amount Decimal64(2),
country LowCardinality(String)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (event_type, timestamp, user_id)
TTL timestamp + INTERVAL 1 YEAR
`,
});
High-Throughput Ingestion
// Batch insert — millions of rows per second
await client.insert({
table: "events",
values: events.map((e) => ({
timestamp: e.timestamp,
user_id: e.userId,
event_type: e.type,
properties: JSON.stringify(e.properties),
amount: e.amount,
country: e.country,
})),
format: "JSONEachRow",
});
// Stream insert for large datasets — @clickhouse/client accepts a Readable
// in object mode as the insert values
import { Readable } from "stream";
const rowStream = new Readable({ objectMode: true, read() {} });
const pendingInsert = client.insert({
table: "events",
values: rowStream,
format: "JSONEachRow",
});
for await (const batch of eventBatches) {
for (const row of batch) rowStream.push(row);
}
rowStream.push(null); // Signal end of input
await pendingInsert;
// Async insert for fire-and-forget (higher throughput)
await client.insert({
table: "events",
values: events,
format: "JSONEachRow",
clickhouse_settings: {
async_insert: 1,
wait_for_async_insert: 0, // Don't wait for the server-side flush
},
});
Analytical Queries
// Real-time dashboard query — sub-second on billions of rows
const result = await client.query({
query: `
SELECT
toStartOfHour(timestamp) AS hour,
event_type,
count() AS event_count,
uniq(user_id) AS unique_users,
sum(amount) AS total_amount,
quantile(0.99)(amount) AS p99_amount
FROM events
WHERE timestamp >= now() - INTERVAL 24 HOUR
GROUP BY hour, event_type
ORDER BY hour DESC, event_count DESC
`,
format: "JSONEachRow",
});
const rows = await result.json();
console.table(rows);
// Funnel analysis
const funnel = await client.query({
query: `
SELECT
level,
count() AS users,
round(users / first_value(users) OVER (ORDER BY level) * 100, 2) AS conversion_pct
FROM (
SELECT
user_id,
windowFunnel(86400)(
timestamp,
event_type = 'page_view',
event_type = 'add_to_cart',
event_type = 'checkout',
event_type = 'purchase'
) AS level
FROM events
WHERE timestamp >= now() - INTERVAL 7 DAY
GROUP BY user_id
)
GROUP BY level
ORDER BY level
`,
format: "JSONEachRow",
});
Materialized Views
// Real-time aggregation with materialized views
await client.command({
query: `
-- Target table for aggregated data
CREATE TABLE IF NOT EXISTS hourly_metrics (
hour DateTime,
event_type LowCardinality(String),
country LowCardinality(String),
event_count AggregateFunction(count),
unique_users AggregateFunction(uniq, UInt64),
total_amount AggregateFunction(sum, Decimal64(2))
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (hour, event_type, country)
`,
});
await client.command({
query: `
-- Materialized view auto-aggregates on insert
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_hourly_metrics
TO hourly_metrics
AS SELECT
toStartOfHour(timestamp) AS hour,
event_type,
country,
countState() AS event_count,
uniqState(user_id) AS unique_users,
sumState(amount) AS total_amount
FROM events
GROUP BY hour, event_type, country
`,
});
// Query the materialized view — instant results
const metrics = await client.query({
query: `
SELECT
hour,
event_type,
countMerge(event_count) AS events,
uniqMerge(unique_users) AS users,
sumMerge(total_amount) AS revenue
FROM hourly_metrics
WHERE hour >= now() - INTERVAL 7 DAY
GROUP BY hour, event_type
ORDER BY hour DESC
`,
format: "JSONEachRow",
});
QuestDB — Time-Series Database
QuestDB is optimized for time-series data with SQL extensions for time-based queries, high-throughput ingestion, and columnar storage.
Ingestion via InfluxDB Line Protocol
import net from "net";
// InfluxDB Line Protocol over TCP — fastest ingestion path
const socket = net.connect({ host: "localhost", port: 9009 });
// Send metrics in ILP format
function sendMetric(
measurement: string,
tags: Record<string, string>,
fields: Record<string, number>,
timestamp: bigint // nanoseconds
) {
const tagStr = Object.entries(tags)
.map(([k, v]) => `${k}=${v}`)
.join(",");
const fieldStr = Object.entries(fields)
.map(([k, v]) => `${k}=${v}`)
.join(",");
socket.write(`${measurement},${tagStr} ${fieldStr} ${timestamp}\n`);
}
// Send API metrics
sendMetric(
"api_requests",
{ endpoint: "/api/users", method: "GET", status: "200" },
{ duration_ms: 45.2, response_bytes: 1024 },
BigInt(Date.now()) * 1000000n // Convert ms to ns
);
// Batch sending for high throughput
const batch = metrics
.map(
(m) =>
`api_requests,endpoint=${m.endpoint},method=${m.method} ` +
`duration_ms=${m.duration},status_code=${m.status} ` +
`${BigInt(m.timestamp) * 1000000n}`
)
.join("\n");
socket.write(batch + "\n");
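One caveat with the raw-socket approach above: ILP treats commas, spaces, and equals signs as delimiters, so tag keys and values containing them must be backslash-escaped — the sendMetric helper above does not do this. A minimal escaping helper (the function name is illustrative):

```typescript
// Backslash-escape the characters ILP treats as delimiters in tag values.
// (String *field* values additionally need to be wrapped in double quotes.)
function escapeIlpTag(value: string): string {
  return value.replace(/([,= ])/g, "\\$1");
}

// "us east,1" becomes the ILP-safe "us\ east\,1"
const region = escapeIlpTag("us east,1");
```

Skipping this escaping silently corrupts rows: an unescaped space in a tag value shifts everything after it into the field section of the line.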
REST API Queries
// QuestDB REST API for SQL queries
async function queryQuestDB(sql: string) {
const response = await fetch(
`http://localhost:9000/exec?query=${encodeURIComponent(sql)}`,
{ method: "GET" }
);
return response.json();
}
// Basic aggregation
const hourly = await queryQuestDB(`
SELECT
timestamp,
endpoint,
count() as request_count,
avg(duration_ms) as avg_latency,
max(duration_ms) as max_latency
FROM api_requests
WHERE timestamp > dateadd('d', -1, now())
SAMPLE BY 1h
ALIGN TO CALENDAR
`);
console.table(hourly.dataset);
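The /exec endpoint returns column metadata and row values separately ({ columns: [...], dataset: [...] }), which is why the snippet above reads hourly.dataset. A small helper — names here are illustrative — to zip the two into keyed row objects:

```typescript
// Shape of a QuestDB /exec response (fields beyond these two omitted)
interface QuestDBExecResponse {
  columns: { name: string; type: string }[];
  dataset: unknown[][];
}

// Pair each positional value in a row with its column name
function rowsToObjects(res: QuestDBExecResponse): Record<string, unknown>[] {
  return res.dataset.map((row) =>
    Object.fromEntries(res.columns.map((col, i) => [col.name, row[i]]))
  );
}
```

With this, console.table(rowsToObjects(hourly)) prints named columns instead of positional arrays.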
Time-Series SQL Extensions
// SAMPLE BY — time-based downsampling (QuestDB's killer feature)
const downsampled = await queryQuestDB(`
SELECT
timestamp,
endpoint,
count() as requests,
avg(duration_ms) as avg_ms,
approx_percentile(duration_ms, 0.99) as p99_ms
FROM api_requests
WHERE timestamp > dateadd('h', -24, now())
SAMPLE BY 15m
FILL(NULL)
ALIGN TO CALENDAR
`);
// LATEST ON — get the most recent value per series
const latest = await queryQuestDB(`
SELECT * FROM sensor_readings
LATEST ON timestamp PARTITION BY sensor_id
WHERE timestamp > dateadd('h', -1, now())
`);
// ASOF JOIN — join time-series by closest timestamp
const correlated = await queryQuestDB(`
SELECT
t.timestamp,
t.symbol,
t.price as trade_price,
q.bid,
q.ask,
t.price - q.bid as spread
FROM trades t
ASOF JOIN quotes q ON (t.symbol = q.symbol)
WHERE t.timestamp > dateadd('h', -1, now())
`);
// LT JOIN — join where right timestamp is strictly less than left
const withPrevious = await queryQuestDB(`
SELECT
a.timestamp,
a.value as current_value,
b.value as previous_value,
a.value - b.value as delta
FROM metrics a
LT JOIN metrics b ON (a.sensor_id = b.sensor_id)
WHERE a.timestamp > dateadd('d', -7, now())
`);
Table Creation and Partitioning
// Create a partitioned table optimized for time-series
await queryQuestDB(`
CREATE TABLE IF NOT EXISTS api_requests (
timestamp TIMESTAMP,
endpoint SYMBOL CAPACITY 256 INDEX,
method SYMBOL CAPACITY 8,
status_code SHORT,
duration_ms DOUBLE,
response_bytes LONG,
user_id LONG,
region SYMBOL CAPACITY 32
) TIMESTAMP(timestamp) PARTITION BY DAY WAL
DEDUP UPSERT KEYS(timestamp, endpoint, user_id)
`);
// SYMBOL type = dictionary-encoded strings (like ClickHouse LowCardinality)
// WAL = Write-Ahead Log for concurrent readers/writers
// PARTITION BY DAY = one partition per day for efficient time range queries
// DEDUP UPSERT = handle duplicate events
// Detach old partitions for archival
await queryQuestDB(`
ALTER TABLE api_requests DETACH PARTITION
WHERE timestamp < dateadd('M', -6, now())
`);
Grafana Integration
// QuestDB has native Grafana support via PostgreSQL wire protocol
// Grafana data source config (via provisioning YAML or the UI):
// type: postgres
// host: questdb:8812
// database: qdb
// user: admin
// Dashboard query with Grafana variables
const grafanaQuery = `
SELECT
timestamp as time,
endpoint,
avg(duration_ms) as avg_latency,
count() as requests
FROM api_requests
WHERE $__timeFilter(timestamp)
AND endpoint = '$endpoint'
SAMPLE BY $__interval
FILL(NULL)
`;
// SAMPLE BY handles the time bucketing, so select the designated
// timestamp directly as Grafana's time column — no $__timeGroup needed
Feature Comparison
| Feature | DuckDB | ClickHouse | QuestDB |
|---|---|---|---|
| Architecture | Embedded (in-process) | Distributed server | Server (single-node + cloud) |
| Storage | Columnar | Columnar (MergeTree) | Columnar (time-partitioned) |
| Deployment | Library / no server | Self-hosted or cloud | Self-hosted or cloud |
| Scale | Single machine (TBs) | Petabytes (distributed) | Single node (TBs) |
| Ingestion | File reads (Parquet/CSV) | Batch + streaming | ILP + REST + PostgreSQL |
| Query Language | SQL (PostgreSQL-compatible) | SQL (ClickHouse dialect) | SQL (PostgreSQL + extensions) |
| Time-Series | Standard SQL | DateTime functions | SAMPLE BY, LATEST ON, ASOF JOIN |
| Materialized Views | ❌ | ✅ (real-time) | ❌ |
| Window Functions | ✅ (full) | ✅ (full) | ✅ (basic) |
| JSON Support | ✅ (native) | ✅ (JSON functions) | Limited |
| S3/Remote Files | ✅ (native) | ✅ (s3 table function) | ❌ |
| Node.js Client | duckdb npm | @clickhouse/client | REST API / pg driver |
| Concurrency | Single-writer | High concurrency | WAL (concurrent R/W) |
| Replication | ❌ | ✅ (built-in) | Cloud only |
| License | MIT | Apache 2.0 | Apache 2.0 |
| Ideal Workload | Ad-hoc analytics, ETL | Dashboards, log analytics | Metrics, IoT, financial |
When to Use Each
Choose DuckDB if:
- You need analytics with zero infrastructure (embedded in your app)
- You're querying Parquet, CSV, or JSON files directly from S3 or local disk
- Your data fits on a single machine (up to low TBs)
- You're building data pipelines or ETL workflows
- You want SQLite-like simplicity for analytical queries
Choose ClickHouse if:
- You have billions of rows and need sub-second query performance
- Real-time dashboards and monitoring are your primary use case
- You need materialized views for automatic pre-aggregation
- You require distributed architecture across multiple nodes
- Funnel analysis, cohort queries, and complex aggregations are important
Choose QuestDB if:
- Your data is time-series (metrics, IoT sensors, financial ticks)
- You need SAMPLE BY for natural time-based downsampling
- ASOF JOIN and LATEST ON are critical for time-series correlation
- High ingestion throughput via InfluxDB Line Protocol matters
- You want SQL (not InfluxQL or PromQL) for time-series queries
Production Deployment and Infrastructure Considerations
Running ClickHouse in production requires meaningful infrastructure investment. The typical starting point is a three-node cluster with 32GB RAM each for workloads in the tens of billions of rows — smaller setups work but lose ClickHouse's strongest characteristic, which is distributed query execution across replicas. ClickHouse Cloud (the managed offering) abstracts away node management but introduces egress costs that can surprise teams doing data export or cross-region replication at scale. QuestDB's production deployment is simpler because it runs as a single binary with no external dependencies, but this also means it lacks built-in replication — the cloud product (QuestDB Cloud) handles replication, while self-hosted deployments rely on filesystem snapshots for backup. DuckDB's production story is intentionally minimal: it runs in-process with the application, which means database availability is tied to application availability. For persistent data, DuckDB writes to a single .duckdb file with a WAL for crash safety, but it has no replication or high-availability primitive by design. Teams that outgrow single-machine DuckDB typically move their data to cloud object storage (S3 or GCS as Parquet files) and continue querying it remotely.
Security and Access Control
The three databases take fundamentally different approaches to security. ClickHouse has the most mature RBAC system — users, roles, row-level security, and column-level access control are all first-class features, making it suitable for multi-tenant analytics where different business units should see different data slices. IP-based allow lists and SSL mutual TLS are available for network security. QuestDB's security model is simpler: authentication is handled at the HTTP API and PostgreSQL wire protocol level, with user management added relatively recently. For sensitive environments, teams typically put QuestDB behind a reverse proxy that handles authentication. DuckDB, running embedded, inherits the security of the process it runs in — there is no network listener and no authentication layer by default, which is secure for single-user local analytics but requires careful thought when embedding in a multi-tenant application. For shared DuckDB access, the MotherDuck cloud extension provides access control at the cloud layer.
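As a sketch of what ClickHouse's RBAC primitives look like in practice — the role, policy, and user names below are illustrative, not from the source:

```sql
-- Role scoped to one business unit
CREATE ROLE IF NOT EXISTS analytics_eu;
GRANT SELECT ON analytics.events TO analytics_eu;

-- Row-level security: members of the role only see EU rows
CREATE ROW POLICY IF NOT EXISTS eu_only ON analytics.events
  FOR SELECT USING country IN ('DE', 'FR', 'NL') TO analytics_eu;

-- Bind a user to the role
CREATE USER IF NOT EXISTS eu_analyst IDENTIFIED BY 'changeme';
GRANT analytics_eu TO eu_analyst;
```

Queries from eu_analyst then transparently see only the policy-matched rows — no application-side filtering required.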
TypeScript and Node.js Integration Patterns
Node.js integration quality varies significantly. The official @clickhouse/client is actively maintained by the ClickHouse team, ships with TypeScript types, and supports both streaming and batch query patterns — it is genuinely production-quality. The DuckDB Node.js client is functional but still uses a callback-based API at its core; the duckdb-async wrapper on npm provides promise wrappers, and a newer official WASM build (@duckdb/duckdb-wasm) enables DuckDB in browser environments. QuestDB lacks a dedicated Node.js client and relies on its PostgreSQL wire protocol (using the pg package) or HTTP REST API. The REST API is simple and sufficient for most use cases, but it lacks prepared statement support and connection pooling in the traditional sense. For type-safe SQL queries in Node.js across all three databases, tools like Kysely and Drizzle are beginning to add support, though coverage is strongest for the PostgreSQL wire protocol databases.
Ecosystem Context and Competitive Landscape
The analytical database space has moved remarkably fast since 2023. DuckDB went from a research project to the backbone of the "local-first analytics" movement, with embedded use in data science notebooks, Observable, and even browser applications via its WASM build. ClickHouse has benefited from the observability space's explosive growth — companies building internal developer portals, monitoring systems, and log analytics have found ClickHouse's materialized views and query performance unmatched for their specific needs. QuestDB competes in the time-series segment against InfluxDB, TimescaleDB, and newer entrants like GreptimeDB and TDengine. Its SQL-native approach differentiates it from InfluxQL and Flux-based systems, and the growing adoption of InfluxDB Line Protocol as a universal time-series ingestion standard benefits QuestDB since it speaks ILP natively. The broader trend is that analytical workloads that previously went to a dedicated data warehouse (Snowflake, BigQuery) are increasingly handled by embedded or single-binary systems for latency and cost reasons.
Choosing the Right Tool for Local vs. Distributed Workloads
The sharpest distinction between these three databases is the local-vs-distributed axis. DuckDB is explicitly designed to be embedded — it has no network listener, no authentication layer, and no replication. This is not a limitation but a deliberate design choice that makes it suitable for embedding analytics directly in application code without infrastructure overhead. ClickHouse and QuestDB both operate as server processes with network endpoints, authentication, and the operational overhead that implies. Before evaluating which server-side analytical database to deploy, teams should honestly assess whether DuckDB running against Parquet files on object storage covers their needs. Many analytics workloads that teams assume require a dedicated OLAP database can be handled by DuckDB reading S3 Parquet files, especially when combined with a data transformation pipeline that pre-aggregates data into Parquet snapshots. The infrastructure cost savings and operational simplicity of this approach are substantial compared to running a ClickHouse cluster.
Methodology
Feature comparison based on DuckDB v1.x, ClickHouse 24.x, and QuestDB 8.x documentation as of March 2026. Benchmarks reference official published results. Code examples use official Node.js clients and REST APIs. Evaluated on: query performance characteristics, ingestion methods, time-series capabilities, scalability, and developer experience with Node.js.
See also: How to Set Up Drizzle ORM with Next.js (2026 Guide), How to Migrate from Mongoose to Prisma, and Mongoose vs Prisma in 2026: MongoDB vs SQL-First