DuckDB vs ClickHouse vs QuestDB: Analytical Databases Compared (2026)
TL;DR: DuckDB is the embedded analytical database — runs in-process, reads Parquet/CSV/JSON directly, SQL on local files with zero infrastructure. ClickHouse is the column-oriented OLAP powerhouse — handles billions of rows, real-time ingestion, materialized views, and sub-second aggregation queries at scale. QuestDB is the time-series specialist — SQL with time-series extensions, high-throughput ingestion via InfluxDB Line Protocol, and optimized for ordered data. In 2026: DuckDB for local analytics and embedded OLAP, ClickHouse for large-scale analytical workloads, QuestDB for time-series and IoT data.
Key Takeaways
- DuckDB: Embedded (in-process), zero infrastructure. Reads Parquet, CSV, JSON, Arrow natively. SQL with window functions, CTEs, lateral joins. Best for local analytics, data engineering pipelines, and embedding analytics in applications
- ClickHouse: Distributed column store, petabyte-scale. MergeTree engine family, materialized views, real-time aggregation. Best for dashboards, log analytics, and any workload with billions of rows
- QuestDB: Time-series optimized, column-based. SQL with SAMPLE BY, LATEST ON, ASOF JOIN. InfluxDB Line Protocol ingestion. Best for metrics, financial data, IoT, and any time-ordered dataset
DuckDB — Embedded Analytical Database
DuckDB runs in-process with zero infrastructure — think SQLite for analytics. It reads Parquet, CSV, and JSON files directly and executes complex OLAP queries locally.
Basic Setup and Queries
import duckdb from "duckdb";
// In-memory database
const db = new duckdb.Database(":memory:");
const conn = db.connect();
// Query Parquet files directly — no import step
conn.all(
`SELECT
product_category,
COUNT(*) as order_count,
SUM(amount) as total_revenue,
AVG(amount) as avg_order_value
FROM read_parquet('s3://analytics-bucket/orders/*.parquet')
WHERE order_date >= '2026-01-01'
GROUP BY product_category
ORDER BY total_revenue DESC
LIMIT 10`,
(err, rows) => {
console.table(rows);
}
);
// Query CSV files with automatic type detection
conn.all(
`SELECT * FROM read_csv_auto('data/users.csv') LIMIT 5`,
(err, rows) => console.table(rows)
);
// Query JSON files
conn.all(
`SELECT * FROM read_json_auto('data/events.json') LIMIT 5`,
(err, rows) => console.table(rows)
);
Advanced Analytics
// Window functions for running totals and rankings
conn.all(`
WITH daily_sales AS (
SELECT
date_trunc('day', order_date) as day,
product_category,
SUM(amount) as daily_revenue
FROM read_parquet('orders.parquet')
GROUP BY 1, 2
)
SELECT
day,
product_category,
daily_revenue,
SUM(daily_revenue) OVER (
PARTITION BY product_category
ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as cumulative_revenue,
RANK() OVER (
PARTITION BY day
ORDER BY daily_revenue DESC
) as daily_rank
FROM daily_sales
ORDER BY day DESC, daily_rank
`, (err, rows) => console.table(rows));
// Pivot tables
conn.all(`
PIVOT (
SELECT
date_part('month', order_date) as month,
product_category,
SUM(amount) as revenue
FROM read_parquet('orders.parquet')
WHERE order_date >= '2026-01-01'
GROUP BY 1, 2
)
ON product_category
USING SUM(revenue)
ORDER BY month
`, (err, rows) => console.table(rows));
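The cumulative window above is easy to sanity-check in plain JavaScript: a per-partition running sum over day-ordered rows mirrors `SUM(...) OVER (PARTITION BY ... ORDER BY ... ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)`. The sample rows are invented:

```javascript
// Cumulative revenue per category, mirroring the SQL window frame above
function cumulative(rows) {
  const totals = new Map(); // running sum per partition (category)
  return rows
    .slice()
    .sort((a, b) => a.day.localeCompare(b.day)) // ORDER BY day
    .map((r) => {
      const sum = (totals.get(r.category) ?? 0) + r.revenue;
      totals.set(r.category, sum);
      return { ...r, cumulative_revenue: sum };
    });
}

const out = cumulative([
  { day: "2026-01-01", category: "books", revenue: 100 },
  { day: "2026-01-02", category: "books", revenue: 50 },
  { day: "2026-01-02", category: "toys", revenue: 30 },
]);
console.log(out.map((r) => r.cumulative_revenue)); // [ 100, 150, 30 ]
```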
Persistent Database and Exports
// Persistent database on disk (a separate script — same API as the in-memory example)
const db = new duckdb.Database("analytics.duckdb");
const conn = db.connect();
// Create tables from external sources
conn.run(`
CREATE TABLE orders AS
SELECT * FROM read_parquet('s3://bucket/orders/*.parquet')
`);
// Optional ART index — speeds up selective point lookups; large analytical
// scans rely on DuckDB's built-in zone maps instead
conn.run(`CREATE INDEX idx_orders_date ON orders(order_date)`);
// Export query results to Parquet
conn.run(`
COPY (
SELECT product_category, COUNT(*) as cnt, SUM(amount) as total
FROM orders
GROUP BY product_category
) TO 'summary.parquet' (FORMAT PARQUET)
`);
// Export to CSV
conn.run(`
COPY orders TO 'orders_backup.csv' (FORMAT CSV, HEADER)
`);
// Attach multiple databases
conn.run(`ATTACH 'users.duckdb' AS users_db`);
conn.all(`
SELECT o.*, u.name, u.email
FROM orders o
JOIN users_db.users u ON o.user_id = u.id
`, (err, rows) => console.table(rows));
S3 and Remote Data
// Configure S3 access (needs the httpfs extension; recent DuckDB versions autoload it)
conn.run(`INSTALL httpfs; LOAD httpfs;`);
conn.run(`
SET s3_region = 'us-east-1';
SET s3_access_key_id = '${process.env.AWS_ACCESS_KEY_ID}';
SET s3_secret_access_key = '${process.env.AWS_SECRET_ACCESS_KEY}';
`);
// Query S3 Parquet files with glob patterns
conn.all(`
SELECT
date_trunc('hour', timestamp) as hour,
COUNT(*) as request_count,
AVG(response_time_ms) as avg_latency,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY response_time_ms) as p99
FROM read_parquet('s3://logs-bucket/2026/03/**/*.parquet')
WHERE status_code >= 500
GROUP BY 1
ORDER BY 1 DESC
`, (err, rows) => console.table(rows));
// Query remote HTTP files
conn.all(`
SELECT * FROM read_csv_auto(
'https://data.example.com/public/dataset.csv'
) LIMIT 100
`, (err, rows) => console.table(rows));
ClickHouse — Column-Oriented OLAP at Scale
ClickHouse is the high-performance column-oriented database for real-time analytical queries across billions of rows.
Node.js Client Setup
import { createClient } from "@clickhouse/client";
const client = createClient({
url: "http://localhost:8123",
username: "default",
password: process.env.CLICKHOUSE_PASSWORD,
database: "analytics",
});
// Create a table with MergeTree engine
await client.command({
query: `
CREATE TABLE IF NOT EXISTS events (
event_id UUID DEFAULT generateUUIDv4(),
timestamp DateTime64(3),
user_id UInt64,
event_type LowCardinality(String),
properties String, -- JSON stored as String
amount Decimal64(2),
country LowCardinality(String)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (event_type, timestamp, user_id)
TTL timestamp + INTERVAL 1 YEAR
`,
});
High-Throughput Ingestion
// Batch insert — millions of rows per second
await client.insert({
table: "events",
values: events.map((e) => ({
timestamp: e.timestamp,
user_id: e.userId,
event_type: e.type,
properties: JSON.stringify(e.properties),
amount: e.amount,
country: e.country,
})),
format: "JSONEachRow",
});
// Stream insert for large datasets — pass an object-mode Readable as `values`
import { Readable } from "stream";
const stream = new Readable({ objectMode: true, read() {} });
const insertDone = client.insert({
table: "events",
values: stream,
format: "JSONEachRow",
});
for await (const batch of eventBatches) {
for (const row of batch) stream.push(row);
}
stream.push(null); // signal end-of-stream; the insert then completes
await insertDone;
// Async insert for fire-and-forget (higher throughput for small, frequent writes)
await client.insert({
table: "events",
values: events,
format: "JSONEachRow",
clickhouse_settings: {
async_insert: 1,
wait_for_async_insert: 0, // don't wait for the server-side flush
},
});
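For very large arrays, sending fixed-size chunks in successive inserts keeps request bodies bounded and avoids creating oversized parts. A small chunking helper — the batch size of 100,000 in the usage note is an arbitrary starting point, not an official recommendation:

```javascript
// Split rows into fixed-size chunks for successive insert() calls
function chunk(rows, size) {
  const out = [];
  for (let i = 0; i < rows.length; i += size) {
    out.push(rows.slice(i, i + size));
  }
  return out;
}

console.log(chunk([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

Usage sketch: `for (const batch of chunk(events, 100000)) await client.insert({ table: "events", values: batch, format: "JSONEachRow" });`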
Analytical Queries
// Real-time dashboard query — sub-second on billions of rows
const result = await client.query({
query: `
SELECT
toStartOfHour(timestamp) AS hour,
event_type,
count() AS event_count,
uniq(user_id) AS unique_users,
sum(amount) AS total_amount,
quantile(0.99)(amount) AS p99_amount
FROM events
WHERE timestamp >= now() - INTERVAL 24 HOUR
GROUP BY hour, event_type
ORDER BY hour DESC, event_count DESC
`,
format: "JSONEachRow",
});
const rows = await result.json();
console.table(rows);
// Funnel analysis — windowFunnel returns the deepest step each user reached,
// so each row counts users who stopped at exactly that level
const funnel = await client.query({
query: `
SELECT
level,
count() AS users,
round(users / first_value(users) OVER (ORDER BY level) * 100, 2) AS conversion_pct
FROM (
SELECT
user_id,
windowFunnel(86400)(
timestamp,
event_type = 'page_view',
event_type = 'add_to_cart',
event_type = 'checkout',
event_type = 'purchase'
) AS level
FROM events
WHERE timestamp >= now() - INTERVAL 7 DAY
GROUP BY user_id
)
GROUP BY level
ORDER BY level
`,
format: "JSONEachRow",
});
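Because `windowFunnel` reports the deepest step each user reached, the query above returns exact-level counts; "users at step n" is the sum over levels ≥ n. Converting exact-level counts into cumulative step counts is simple client-side (the sample numbers are invented):

```javascript
// exactLevels[i] = users whose deepest step was i (index 0 = reached no step)
function funnelSteps(exactLevels) {
  const steps = [];
  let running = 0;
  // walk from the deepest level down, accumulating "reached at least this step"
  for (let level = exactLevels.length - 1; level >= 1; level--) {
    running += exactLevels[level] ?? 0;
    steps[level - 1] = running;
  }
  return steps;
}

// 40 users stopped at step 1, 30 at step 2, 20 at step 3, 10 completed step 4
console.log(funnelSteps([0, 40, 30, 20, 10])); // [ 100, 60, 30, 10 ]
```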
Materialized Views
// Real-time aggregation with materialized views
await client.command({
query: `
-- Target table for aggregated data
CREATE TABLE IF NOT EXISTS hourly_metrics (
hour DateTime,
event_type LowCardinality(String),
country LowCardinality(String),
event_count AggregateFunction(count, UInt64),
unique_users AggregateFunction(uniq, UInt64),
total_amount AggregateFunction(sum, Decimal64(2))
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (hour, event_type, country)
`,
});
await client.command({
query: `
-- Materialized view auto-aggregates on insert
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_hourly_metrics
TO hourly_metrics
AS SELECT
toStartOfHour(timestamp) AS hour,
event_type,
country,
countState() AS event_count,
uniqState(user_id) AS unique_users,
sumState(amount) AS total_amount
FROM events
GROUP BY hour, event_type, country
`,
});
// Query the materialized view — instant results
const metrics = await client.query({
query: `
SELECT
hour,
event_type,
countMerge(event_count) AS events,
uniqMerge(unique_users) AS users,
sumMerge(total_amount) AS revenue
FROM hourly_metrics
WHERE hour >= now() - INTERVAL 7 DAY
GROUP BY hour, event_type
ORDER BY hour DESC
`,
format: "JSONEachRow",
});
QuestDB — Time-Series Database
QuestDB is optimized for time-series data with SQL extensions for time-based queries, high-throughput ingestion, and columnar storage.
Ingestion via InfluxDB Line Protocol
import net from "net";
// InfluxDB Line Protocol over TCP — fastest ingestion path (the official
// @questdb/nodejs-client wraps this protocol with auth and escaping built in)
const socket = net.connect({ host: "localhost", port: 9009 });
// Send metrics in ILP format
function sendMetric(
measurement: string,
tags: Record<string, string>,
fields: Record<string, number>,
timestamp: bigint // nanoseconds
) {
const tagStr = Object.entries(tags)
.map(([k, v]) => `${k}=${v}`)
.join(",");
const fieldStr = Object.entries(fields)
.map(([k, v]) => `${k}=${v}`)
.join(",");
socket.write(`${measurement},${tagStr} ${fieldStr} ${timestamp}\n`);
}
// Send API metrics
sendMetric(
"api_requests",
{ endpoint: "/api/users", method: "GET", status: "200" },
{ duration_ms: 45.2, response_bytes: 1024 },
BigInt(Date.now()) * 1000000n // Convert ms to ns
);
// Batch sending for high throughput
const batch = metrics
.map(
(m) =>
`api_requests,endpoint=${m.endpoint},method=${m.method} ` +
`duration_ms=${m.duration},status_code=${m.status} ` +
`${BigInt(m.timestamp) * 1000000n}`
)
.join("\n");
socket.write(batch + "\n");
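The sketches above write tag values verbatim, but ILP requires escaping commas, spaces, and equals signs in tag keys and values, or lines will be rejected or mis-parsed. A hedged escaping helper covering the common cases (not the full ILP spec):

```javascript
// Escape ILP tag keys/values: commas, spaces, and '=' must be backslash-escaped
function escapeTag(value) {
  return String(value).replace(/([,= ])/g, "\\$1");
}

// Build one ILP line: measurement,tags fields timestamp(ns)
function ilpLine(measurement, tags, fields, tsNanos) {
  const tagStr = Object.entries(tags)
    .map(([k, v]) => `${escapeTag(k)}=${escapeTag(v)}`)
    .join(",");
  const fieldStr = Object.entries(fields)
    .map(([k, v]) => `${k}=${v}`)
    .join(",");
  return `${measurement},${tagStr} ${fieldStr} ${tsNanos}`;
}

console.log(ilpLine("api_requests", { endpoint: "/api/users list" }, { duration_ms: 45.2 }, 1n));
// api_requests,endpoint=/api/users\ list duration_ms=45.2 1
```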
REST API Queries
// QuestDB REST API for SQL queries
async function queryQuestDB(sql: string) {
const response = await fetch(
`http://localhost:9000/exec?query=${encodeURIComponent(sql)}`
);
if (!response.ok) {
throw new Error(`QuestDB query failed: ${response.status} ${await response.text()}`);
}
return response.json();
}
// Basic aggregation
const hourly = await queryQuestDB(`
SELECT
timestamp,
endpoint,
count() as request_count,
avg(duration_ms) as avg_latency,
max(duration_ms) as max_latency
FROM api_requests
WHERE timestamp > dateadd('d', -1, now())
SAMPLE BY 1h
ALIGN TO CALENDAR
`);
console.table(hourly.dataset);
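The `/exec` endpoint returns `columns` metadata plus `dataset` as arrays of values, so `console.table(hourly.dataset)` prints positional columns. Zipping the two into named row objects makes results easier to work with (the response shape matches QuestDB's documented format; the sample data is invented):

```javascript
// Zip QuestDB /exec `columns` + `dataset` arrays into named row objects
function toRows(result) {
  const names = result.columns.map((c) => c.name);
  return result.dataset.map((values) =>
    Object.fromEntries(values.map((v, i) => [names[i], v]))
  );
}

const sample = {
  columns: [{ name: "endpoint" }, { name: "request_count" }],
  dataset: [["/api/users", 120], ["/api/orders", 80]],
};
console.log(toRows(sample)[0]); // { endpoint: '/api/users', request_count: 120 }
```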
Time-Series SQL Extensions
// SAMPLE BY — time-based downsampling (QuestDB's killer feature)
const downsampled = await queryQuestDB(`
SELECT
timestamp,
endpoint,
count() as requests,
avg(duration_ms) as avg_ms,
approx_percentile(duration_ms, 0.99) as p99_ms
FROM api_requests
WHERE timestamp > dateadd('h', -24, now())
SAMPLE BY 15m
FILL(NULL)
ALIGN TO CALENDAR
`);
// LATEST ON — get the most recent value per series
const latest = await queryQuestDB(`
SELECT * FROM sensor_readings
LATEST ON timestamp PARTITION BY sensor_id
WHERE timestamp > dateadd('h', -1, now())
`);
// ASOF JOIN — join time-series by closest timestamp
const correlated = await queryQuestDB(`
SELECT
t.timestamp,
t.symbol,
t.price as trade_price,
q.bid,
q.ask,
t.price - q.bid as spread
FROM trades t
ASOF JOIN quotes q ON (t.symbol = q.symbol)
WHERE t.timestamp > dateadd('h', -1, now())
`);
// LT JOIN — join where right timestamp is strictly less than left
const withPrevious = await queryQuestDB(`
SELECT
a.timestamp,
a.value as current_value,
b.value as previous_value,
a.value - b.value as delta
FROM metrics a
LT JOIN metrics b ON (a.sensor_id = b.sensor_id)
WHERE a.timestamp > dateadd('d', -7, now())
`);
Table Creation and Partitioning
// Create a partitioned table optimized for time-series
await queryQuestDB(`
CREATE TABLE IF NOT EXISTS api_requests (
timestamp TIMESTAMP,
endpoint SYMBOL CAPACITY 256 INDEX,
method SYMBOL CAPACITY 8,
status_code SHORT,
duration_ms DOUBLE,
response_bytes LONG,
user_id LONG,
region SYMBOL CAPACITY 32
) TIMESTAMP(timestamp) PARTITION BY DAY WAL
DEDUP UPSERT KEYS(timestamp, endpoint, user_id)
`);
// SYMBOL type = dictionary-encoded strings (like ClickHouse LowCardinality)
// WAL = Write-Ahead Log for concurrent readers/writers
// PARTITION BY DAY = one partition per day for efficient time range queries
// DEDUP UPSERT = handle duplicate events
// Detach old partitions for archival
await queryQuestDB(`
ALTER TABLE api_requests DETACH PARTITION
WHERE timestamp < dateadd('M', -6, now())
`);
Grafana Integration
// QuestDB speaks the PostgreSQL wire protocol, so Grafana's built-in
// Postgres data source works natively. Data source settings:
// type: postgres
// host: questdb:8812
// database: qdb
// user: admin
// Dashboard query with Grafana macros — SAMPLE BY handles the time bucketing,
// so no $__timeGroup is needed
const grafanaQuery = `
SELECT
timestamp as time,
endpoint,
avg(duration_ms) as avg_latency,
count() as requests
FROM api_requests
WHERE $__timeFilter(timestamp)
AND endpoint = '$endpoint'
SAMPLE BY $__interval
FILL(NULL)
`;
Feature Comparison
| Feature | DuckDB | ClickHouse | QuestDB |
|---|---|---|---|
| Architecture | Embedded (in-process) | Distributed server | Server (single-node + cloud) |
| Storage | Columnar | Columnar (MergeTree) | Columnar (time-partitioned) |
| Deployment | Library / no server | Self-hosted or cloud | Self-hosted or cloud |
| Scale | Single machine (TBs) | Petabytes (distributed) | Single node (TBs) |
| Ingestion | File reads (Parquet/CSV) | Batch + streaming | ILP + REST + PostgreSQL |
| Query Language | SQL (PostgreSQL-compatible) | SQL (ClickHouse dialect) | SQL (PostgreSQL + extensions) |
| Time-Series | Standard SQL | DateTime functions | SAMPLE BY, LATEST ON, ASOF JOIN |
| Materialized Views | ❌ | ✅ (real-time) | ❌ |
| Window Functions | ✅ (full) | ✅ (full) | ✅ (basic) |
| JSON Support | ✅ (native) | ✅ (JSON functions) | Limited |
| S3/Remote Files | ✅ (native) | ✅ (s3 table function) | ❌ |
| Node.js Client | duckdb npm | @clickhouse/client | REST API / pg driver |
| Concurrency | Single-writer | High concurrency | WAL (concurrent R/W) |
| Replication | ❌ | ✅ (built-in) | Cloud only |
| License | MIT | Apache 2.0 | Apache 2.0 |
| Ideal Workload | Ad-hoc analytics, ETL | Dashboards, log analytics | Metrics, IoT, financial |
When to Use Each
Choose DuckDB if:
- You need analytics with zero infrastructure (embedded in your app)
- You're querying Parquet, CSV, or JSON files directly from S3 or local disk
- Your data fits on a single machine (up to low TBs)
- You're building data pipelines or ETL workflows
- You want SQLite-like simplicity for analytical queries
Choose ClickHouse if:
- You have billions of rows and need sub-second query performance
- Real-time dashboards and monitoring are your primary use case
- You need materialized views for automatic pre-aggregation
- You require distributed architecture across multiple nodes
- Funnel analysis, cohort queries, and complex aggregations are important
Choose QuestDB if:
- Your data is time-series (metrics, IoT sensors, financial ticks)
- You need SAMPLE BY for natural time-based downsampling
- ASOF JOIN and LATEST ON are critical for time-series correlation
- High ingestion throughput via InfluxDB Line Protocol matters
- You want SQL (not InfluxQL or PromQL) for time-series queries
Methodology
Feature comparison based on DuckDB v1.x, ClickHouse 24.x, and QuestDB 8.x documentation as of March 2026. Benchmarks reference official published results. Code examples use official Node.js clients and REST APIs. Evaluated on: query performance characteristics, ingestion methods, time-series capabilities, scalability, and developer experience with Node.js.