PgBouncer vs Odyssey vs pgcat: Choosing a PostgreSQL Connection Pooler

A detailed comparison of PgBouncer, Odyssey, and pgcat for PostgreSQL connection pooling — architecture differences, threading models, pool modes, TLS support, and decision criteria for each workload type.

JusDB Team
September 19, 2023

A nationwide energy provider came to JusDB after their PostgreSQL metrics table — storing readings from 120,000 smart meters every 15 seconds — had grown past 800 million rows. Queries that once completed in 40ms were now timing out at 30 seconds. Their DBA had tried adding a composite index on (meter_id, recorded_at) and partitioning by month using declarative partitioning, but VACUUM could not keep up with the write rate and the planner kept choosing sequential scans on the latest partition. INSERT throughput had degraded so badly that their ingestion pipeline was dropping readings during peak load windows.

The fix was not more indexing or more RAM. The fix was TimescaleDB. Within 48 hours of migrating to hypertables with automatic chunking, compression policies, and continuous aggregates, query latency on rolling 7-day windows dropped from 28 seconds to under 200 milliseconds — with no application code changes beyond the connection string and a single DDL command.

This guide covers exactly how TimescaleDB works, how to set it up on an existing PostgreSQL installation, and when to choose it over InfluxDB, Prometheus, or raw PostgreSQL partitioning.

TL;DR
  • TimescaleDB is a PostgreSQL extension that turns regular tables into hypertables — automatically partitioned time-series stores that behave like normal SQL tables from the application's perspective.
  • Hypertables partition data into fixed-time chunks behind the scenes; time-bounded queries skip irrelevant chunks through chunk exclusion, and scans over the remaining chunks can run in parallel.
  • Native columnar compression can reduce storage by 90–97% on cold chunks, and add_compression_policy() automates the schedule without DBA intervention.
  • Retention policies (add_retention_policy()) replace manual partition-drop scripts; continuous aggregates replace expensive GROUP BY time queries with pre-computed materialised views that refresh incrementally.
  • TimescaleDB remains 100% SQL — every ORM, BI tool, and PostgreSQL driver that already works against your database continues to work without modification.
  • For pure no-SQL write-heavy workloads with no relational joins, InfluxDB or Prometheus may be simpler; for anything that needs SQL, JOINs, transactions, or existing PostgreSQL tooling, TimescaleDB wins.

Background

Time-series data has a deceptively simple structure: a timestamp, one or more measurement values, and some metadata tags. The problem is scale and access pattern. Regular PostgreSQL tables accumulate billions of rows, while most queries target only the most recent slice of data: the last hour, day, or week. A B-tree index can still locate the last 15 minutes on a 2-billion-row table quickly, but the index itself grows far beyond available memory, so inserts and VACUUM passes keep paying for years of historical data that almost no query reads.

TimescaleDB, developed by Timescale Inc. and first released in 2017, solves this with a transparent partitioning layer built directly into PostgreSQL's extension API. It ships as a standard PostgreSQL extension — installed via CREATE EXTENSION, no separate process, no separate port, no separate replication stream. Your existing connection pool, backups, monitoring agents, and read replicas all continue to work unchanged.

The extension's core primitive is the hypertable: a table that looks like a regular PostgreSQL table but is physically backed by a collection of time-ordered child tables called chunks. Each chunk holds a configurable time interval of data. When you query a hypertable, the query planner uses chunk exclusion to skip every chunk outside the time range being queried — effectively turning every time-bounded query into a partition-pruned scan over a tiny fraction of total data.

Tip

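TimescaleDB's core, including hypertables and time_bucket(), is open source under Apache 2.0. Compression, continuous aggregates, and automated policies such as add_retention_policy() ship in the TimescaleDB Community Edition under the Timescale License, which is free to self-host but restricts offering the software as a managed database service. Tiered storage to object storage (S3) is a Timescale Cloud feature, and multi-node distributed hypertables have been removed from recent releases. Check the license of the build your platform provides before relying on the Community features.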

How TimescaleDB Works

Hypertables and Automatic Chunk Management

When you call create_hypertable() on an existing table, TimescaleDB registers it as a hypertable in its internal catalog and takes over row routing through its own planner and executor hooks. Every INSERT is transparently redirected to the correct time-range chunk. If the chunk for the incoming timestamp does not yet exist, TimescaleDB creates it automatically. Each chunk is a real PostgreSQL table with its own indexes, autovacuum settings, and storage parameters.

The default chunk_time_interval is 7 days. For high-velocity workloads (100K+ rows/second), setting it to 1 day keeps chunks small enough that a single-day query hits only one chunk. For low-velocity workloads (thousands of rows per day), 30 days or even 90 days prevents chunk proliferation overhead.
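
As a rough sketch of how the interval can be inspected and changed, assuming the meter_readings hypertable created later in this guide already exists. Note that set_chunk_time_interval() only affects chunks created after the call; existing chunks keep their original width.

```sql
-- Show the current chunk interval for the hypertable's time dimension
SELECT hypertable_name, column_name, time_interval
FROM timescaledb_information.dimensions
WHERE hypertable_name = 'meter_readings';

-- Switch to 1-day chunks; applies only to chunks created from now on
SELECT set_chunk_time_interval('meter_readings', INTERVAL '1 day');
```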

Parallel Query Across Chunks

PostgreSQL's parallel query infrastructure treats each hypertable chunk as a separate relation, so queries spanning multiple chunks can distribute work across parallel workers. A 30-day rolling aggregate that would need one large scan on a monolithic table can instead be split into per-chunk scans that run concurrently, up to the limit set by max_parallel_workers_per_gather, which improves throughput on multi-core hardware.
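
One way to check whether a query actually gets a parallel plan is to look at its EXPLAIN output. The sketch below assumes the meter_readings hypertable from later in this guide; the worker count of 4 is an illustrative assumption, and the planner may still choose a serial plan when the table is small.

```sql
-- Allow up to 4 workers per Gather node for this session
-- (assumed value; size it to the cores you can spare)
SET max_parallel_workers_per_gather = 4;

-- Inspect the plan without executing the query
EXPLAIN (COSTS OFF)
SELECT meter_id, AVG(kwh) AS avg_kwh
FROM meter_readings
WHERE recorded_at >= NOW() - INTERVAL '30 days'
GROUP BY meter_id;
-- A parallel plan contains a Gather (or Gather Merge) node above the per-chunk scans
```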

Warning

Chunk exclusion only fires when the time predicate references the partitioning column directly. Wrapping recorded_at in a function call — for example, WHERE date_trunc('day', recorded_at) = '2025-01-01' — prevents the planner from using chunk exclusion and causes a full hypertable scan. Always use range predicates: WHERE recorded_at >= '2025-01-01' AND recorded_at < '2025-01-02'.
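
The difference is easiest to see by comparing the two plans side by side. This is a sketch that assumes the meter_readings hypertable from the setup section and at least a few days of data; the exact plan text varies by version.

```sql
-- Function-wrapped predicate: expect the plan to list every chunk
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM meter_readings
WHERE date_trunc('day', recorded_at) = '2025-01-01';

-- Range predicate on the partitioning column: expect only the
-- chunk(s) covering 2025-01-01 to appear in the plan
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM meter_readings
WHERE recorded_at >= '2025-01-01'
  AND recorded_at <  '2025-01-02';
```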

Setting Up Hypertables

Step 1: Install the Extension

On Ubuntu/Debian with the Timescale APT repository:

sql
-- After installing the OS package (apt install timescaledb-2-postgresql-16),
-- add to postgresql.conf:
--   shared_preload_libraries = 'timescaledb'
-- Then restart PostgreSQL and enable in your database:

CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'timescaledb';
--  extname    | extversion
-- ------------+------------
--  timescaledb | 2.17.0

Important

timescaledb must be listed in shared_preload_libraries in postgresql.conf before the extension can be created. Forgetting this step makes CREATE EXTENSION fail with an error saying the extension must be preloaded. Restart PostgreSQL after adding it; a reload is not sufficient.
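
If you prefer to check and set the preload list from a SQL session rather than editing postgresql.conf by hand, something like the following should work on a self-managed server. It is a sketch, not a required step: it needs superuser rights, and managed services such as RDS expose this setting through their own parameter tooling instead.

```sql
-- Inspect the current value; an empty result means nothing is preloaded
SHOW shared_preload_libraries;

-- Caution: this REPLACES the whole list. If other libraries are already
-- preloaded, list them all as separate quoted items, for example:
--   ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements', 'timescaledb';
ALTER SYSTEM SET shared_preload_libraries = 'timescaledb';

-- After restarting PostgreSQL, confirm the extension package is visible
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'timescaledb';
```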

Step 2: Create a Regular Table and Convert It

sql
-- Create the base table first (standard PostgreSQL DDL)
CREATE TABLE meter_readings (
    recorded_at   TIMESTAMPTZ NOT NULL,
    meter_id      BIGINT      NOT NULL,
    kwh           DOUBLE PRECISION,
    voltage       DOUBLE PRECISION,
    current_amps  DOUBLE PRECISION,
    location_id   INT
);

-- Convert to a hypertable, partitioned by recorded_at, 1-day chunks
SELECT create_hypertable(
    'meter_readings',
    'recorded_at',
    chunk_time_interval => INTERVAL '1 day'
);

-- Add standard indexes — TimescaleDB creates them on each chunk automatically
CREATE INDEX ON meter_readings (meter_id, recorded_at DESC);
CREATE INDEX ON meter_readings (location_id, recorded_at DESC);
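
To have something to inspect in the next step, the table can be filled with synthetic readings. The sketch below is illustrative only: three meters, a single location_id of 42, and random values are all assumptions, not data from the case described above.

```sql
-- Synthetic data: 3 meters, one reading every 15 seconds over the last 3 days
INSERT INTO meter_readings
    (recorded_at, meter_id, kwh, voltage, current_amps, location_id)
SELECT
    ts,
    m,
    random() * 0.05,        -- made-up energy value
    230 + random() * 10,    -- made-up voltage around 230 V
    random() * 20,          -- made-up current
    42
FROM generate_series(NOW() - INTERVAL '3 days', NOW(), INTERVAL '15 seconds') AS ts,
     generate_series(1, 3) AS m;

-- Sanity check
SELECT COUNT(*) FROM meter_readings;
```

If the table already contains rows before conversion, create_hypertable() has to be called with migrate_data => true, and the migration can take a long time on large tables.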

Step 3: Verify Chunk Creation

After inserting data, inspect the chunks TimescaleDB has created:

sql
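-- List chunks for a hypertable with their time ranges, sizes, and compression state.
-- Sizes come from chunks_detailed_size(); the chunks view itself has no size column.
SELECT
    c.chunk_schema,
    c.chunk_name,
    c.range_start,
    c.range_end,
    pg_size_pretty(s.total_bytes) AS total_size,
    c.is_compressed
FROM timescaledb_information.chunks AS c
JOIN chunks_detailed_size('meter_readings') AS s
    ON s.chunk_schema = c.chunk_schema
   AND s.chunk_name   = c.chunk_name
WHERE c.hypertable_name = 'meter_readings'
ORDER BY c.range_start DESC
LIMIT 10;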
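
For a quicker look, show_chunks() returns just the chunk names and accepts a time filter. A minimal sketch, assuming the same meter_readings hypertable:

```sql
-- All chunks of the hypertable
SELECT show_chunks('meter_readings');

-- Only chunks whose data is entirely older than 7 days
SELECT show_chunks('meter_readings', older_than => INTERVAL '7 days');
```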

Time-Series Query Functions

TimescaleDB ships a set of SQL functions designed for time-series analytics. The most important is time_bucket(), which rounds timestamps to arbitrary bucket widths — the equivalent of date_trunc() but with support for any interval:

sql
-- Average kWh per meter per 15-minute bucket, last 24 hours
SELECT
    time_bucket('15 minutes', recorded_at) AS bucket,
    meter_id,
    AVG(kwh)        AS avg_kwh,
    MAX(kwh)        AS max_kwh,
    first(voltage, recorded_at) AS opening_voltage,
    last(voltage, recorded_at)  AS closing_voltage
FROM meter_readings
WHERE recorded_at >= NOW() - INTERVAL '24 hours'
GROUP BY bucket, meter_id
ORDER BY bucket DESC, meter_id;

The first(value, time) and last(value, time) functions return the value associated with the earliest or latest timestamp in a group — essential for OHLC-style queries without a subquery. The histogram() aggregate builds a frequency distribution in a single pass.
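
As an illustration of histogram(), the query below builds a voltage distribution per meter. The bounds (220 to 250) and the bucket count (6) are assumptions chosen for a nominal 230 V supply; the returned array has two extra buckets, one for values below the minimum and one for values above the maximum.

```sql
-- Distribution of voltage readings per meter over the last 24 hours
SELECT
    meter_id,
    histogram(voltage, 220.0, 250.0, 6) AS voltage_histogram
FROM meter_readings
WHERE recorded_at >= NOW() - INTERVAL '24 hours'
GROUP BY meter_id
ORDER BY meter_id;
```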

Compression and Retention Policies

Columnar Compression

TimescaleDB's native columnar compression converts cold chunks from row format to a column-oriented layout. Columns of the same data type stored contiguously compress far better than row-interleaved data — typical compression ratios are 10:1 to 20:1 for numeric time-series, with some IoT datasets achieving 30:1.

sql
-- Enable compression on the hypertable, segmenting by meter_id and ordering
-- rows by recorded_at within each segment; a good orderby improves both
-- compression ratio and range-query performance on that column
ALTER TABLE meter_readings
    SET (
        timescaledb.compress,
        timescaledb.compress_segmentby = 'meter_id',
        timescaledb.compress_orderby   = 'recorded_at DESC'
    );

-- Automate: a background policy job compresses chunks once they are older than 7 days
SELECT add_compression_policy('meter_readings', INTERVAL '7 days');

-- Check current compression stats
SELECT
    pg_size_pretty(before_compression_total_bytes) AS uncompressed,
    pg_size_pretty(after_compression_total_bytes)  AS compressed,
    ROUND(
        (1 - after_compression_total_bytes::numeric /
             NULLIF(before_compression_total_bytes, 0)) * 100, 1
    ) AS compression_pct
FROM hypertable_compression_stats('meter_readings');

Tip

Set compress_segmentby to the column you most commonly filter on (usually a device or entity ID). This groups all rows for one meter within a chunk into the same compressed segments, so a query for a specific meter can skip the segments that belong to other meters. Avoid segmenting by a column that is nearly unique per row, such as a UUID event ID: it produces many tiny segments and reduces compression effectiveness.
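
When testing settings, it can help to compress chunks by hand instead of waiting for the policy job. The sketch below assumes compression has already been enabled on meter_readings with the ALTER TABLE statement above; the 7-day cutoff simply mirrors the policy.

```sql
-- Compress every chunk older than 7 days right now;
-- if_not_compressed => true skips chunks that are already compressed
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('meter_readings', older_than => INTERVAL '7 days') AS c;

-- Per-chunk results
SELECT
    chunk_name,
    compression_status,
    pg_size_pretty(before_compression_total_bytes) AS before_size,
    pg_size_pretty(after_compression_total_bytes)  AS after_size
FROM chunk_compression_stats('meter_readings')
ORDER BY chunk_name;
```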

Retention Policies

Removing old data in plain PostgreSQL requires a scheduled job that either runs DELETE, with the attendant VACUUM overhead, or drops old partitions by hand. TimescaleDB's retention policies drop entire chunks at once. No VACUUM is required, because each expired chunk is removed as a whole table rather than deleted row by row:

sql
-- Drop raw data older than 90 days (chunks are dropped in full, no VACUUM needed)
SELECT add_retention_policy('meter_readings', INTERVAL '90 days');

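-- View all scheduled jobs (compression + retention appear here).
-- Run status lives in job_stats, so join it to the jobs view.
SELECT
    j.job_id,
    j.proc_name,
    j.schedule_interval,
    j.next_start,
    s.last_run_status
FROM timescaledb_information.jobs AS j
LEFT JOIN timescaledb_information.job_stats AS s USING (job_id)
ORDER BY j.next_start;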
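
The policy can also be supplemented by hand. The following is a sketch under the assumption that the retention policy above has been created on meter_readings; the 12-hour interval is an arbitrary example. drop_chunks() is destructive and cannot be undone.

```sql
-- One-off cleanup: drops the same chunks the policy would drop on its next run
SELECT drop_chunks('meter_readings', older_than => INTERVAL '90 days');

-- Change how often the retention job runs, looking up its job_id by name
SELECT alter_job(job_id, schedule_interval => INTERVAL '12 hours')
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention'
  AND hypertable_name = 'meter_readings';
```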

Continuous Aggregates

The most common performance bottleneck in time-series PostgreSQL is a dashboard query like "give me hourly averages for all meters over the last 30 days." On a hypertable with 2 billion rows, even with compression and chunk exclusion, scanning 30 days of 15-second raw readings to compute hourly averages is expensive at dashboard refresh frequency.

Continuous aggregates materialise the result of a time_bucket() aggregate query into a separate storage layer and refresh only the portions that have changed since the last refresh:

sql
-- Create a continuous aggregate: hourly rollups per meter
CREATE MATERIALIZED VIEW meter_readings_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', recorded_at) AS bucket,
    meter_id,
    location_id,
    AVG(kwh)             AS avg_kwh,
    SUM(kwh)             AS total_kwh,
    MAX(kwh)             AS max_kwh,
    MIN(kwh)             AS min_kwh,
    COUNT(*)             AS reading_count
FROM meter_readings
GROUP BY bucket, meter_id, location_id
WITH NO DATA;  -- populate on first refresh, not at creation time

-- Automate incremental refresh every hour, refreshing the last 2 hours of data
SELECT add_continuous_aggregate_policy(
    'meter_readings_hourly',
    start_offset => INTERVAL '2 hours',
    end_offset   => INTERVAL '0',
    schedule_interval => INTERVAL '1 hour'
);

-- Query the aggregate by name: queries against meter_readings itself are not
-- rewritten to use it, so dashboards must select from the view
SELECT
    bucket,
    meter_id,
    avg_kwh,
    total_kwh
FROM meter_readings_hourly
WHERE
    bucket >= NOW() - INTERVAL '30 days'
    AND location_id = 42
ORDER BY bucket DESC;

Warning

Continuous aggregates restrict the underlying query: the GROUP BY must include the time_bucket() expression, and constructs such as window functions and subqueries are rejected in most 2.x releases. The exact list of supported constructs has changed between versions, so check the documentation for the release you run. An unsupported construct produces a clear error at CREATE MATERIALIZED VIEW time, but if you design your aggregate schema without testing, you may discover the constraint only after writing a complex query. Validate the aggregate definition against a small dataset first.
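
Because the view was created WITH NO DATA and the policy only covers a recent window, history usually has to be backfilled once. The sketch below assumes the meter_readings_hourly aggregate defined above; the date range is illustrative, and refresh_continuous_aggregate() must be run outside an explicit transaction block.

```sql
-- Backfill a fixed window once (window bounds are illustrative)
CALL refresh_continuous_aggregate('meter_readings_hourly', '2025-01-01', '2025-02-01');

-- Optional: real-time aggregation, which combines materialised buckets with
-- raw rows that have not been materialised yet
ALTER MATERIALIZED VIEW meter_readings_hourly
    SET (timescaledb.materialized_only = false);
```

Real-time aggregation trades some query speed for freshness, since the most recent buckets are computed from raw data at query time.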

Hierarchical Continuous Aggregates

TimescaleDB 2.9+ supports building continuous aggregates on top of other continuous aggregates. The daily rollup can be built from the hourly aggregate rather than re-scanning raw data:

sql
-- Daily rollup built on top of the hourly aggregate (no raw table re-scan)
CREATE MATERIALIZED VIEW meter_readings_daily
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', bucket) AS day_bucket,
    meter_id,
    location_id,
    SUM(total_kwh)   AS total_kwh,
    MAX(max_kwh)     AS peak_kwh,
    SUM(reading_count) AS reading_count
FROM meter_readings_hourly
GROUP BY day_bucket, meter_id, location_id
WITH NO DATA;

SELECT add_continuous_aggregate_policy(
    'meter_readings_daily',
    start_offset      => INTERVAL '2 days',
    end_offset        => INTERVAL '0',
    schedule_interval => INTERVAL '1 day'
);
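
To confirm that both levels exist and to read from the daily rollup, queries along these lines should work. This is a sketch that assumes both aggregates above have been created and refreshed; location_id 42 is the same illustrative value used earlier.

```sql
-- List continuous aggregates and the hypertables that back them
SELECT
    view_name,
    hypertable_name,
    materialized_only,
    materialization_hypertable_name
FROM timescaledb_information.continuous_aggregates
ORDER BY view_name;

-- Daily consumption for one location over the last 90 days
SELECT
    day_bucket,
    SUM(total_kwh) AS site_kwh
FROM meter_readings_daily
WHERE day_bucket >= NOW() - INTERVAL '90 days'
  AND location_id = 42
GROUP BY day_bucket
ORDER BY day_bucket;
```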

TimescaleDB vs InfluxDB vs Prometheus

Choosing a time-series backend is not purely a performance question — operational complexity, query flexibility, and ecosystem fit matter just as much. Here is an honest comparison against the most common alternatives:

| Criteria | TimescaleDB | InfluxDB v3 | Prometheus | QuestDB | PostgreSQL (partitioned) |
| --- | --- | --- | --- | --- | --- |
| Query language | Full SQL (PostgreSQL dialect) | SQL (Apache Arrow Flight SQL) + Flux (legacy) | PromQL only | SQL (PostgreSQL-compatible subset) | Full SQL |
| Write throughput | 300K–1M rows/s per node (tuned) | 1M–10M points/s (columnar, Arrow-native) | ~1M samples/s per instance | 4M+ rows/s (optimised columnar) | 50K–200K rows/s (dependent on indexes) |
| SQL JOINs & transactions | Full ACID, any JOIN type, row-level security | Limited SQL joins; no multi-statement transactions | No SQL; no joins | JOINs supported; limited transaction support | Full ACID, any JOIN type |
| Compression | Native columnar, 10–30x on numeric data | Apache Parquet-based, 10–50x | Custom chunk encoding, ~4–8x | Columnar with LZ4/Zstd, 10–20x | PostgreSQL TOAST (row-level), 2–5x |
| Ecosystem / integrations | All PostgreSQL clients, ORMs, BI tools, Grafana, dbt | Telegraf, Grafana, custom SDKs; narrower ecosystem | Grafana, Alertmanager; purpose-built monitoring ecosystem | Growing; Grafana, Kafka connect | All PostgreSQL tools; no time-series-specific extras |
| Operational model | PostgreSQL extension: same ops, same backups, same DBA | Separate service; separate backup/restore procedures | Separate service; stateful TSDB storage on local disk | Separate service; separate deployment and ops | Same as TimescaleDB minus time-series primitives |
| Best fit | IoT, fintech, observability requiring SQL + existing Postgres infra | High-velocity telemetry with no relational joins required | Application metrics and alerting (Kubernetes, microservices) | Ultra-high throughput financial tick data, no joins needed | Low-velocity time-series mixed with relational data, modest scale |

The practical rule: if your team already runs PostgreSQL and your time-series data needs to be joined against users, devices, contracts, or any other relational entity — TimescaleDB eliminates an entire database tier. If you are building a pure observability pipeline with no relational joins, InfluxDB or Prometheus may be operationally simpler.
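
To make the join argument concrete, here is a sketch that combines the hourly rollup with relational metadata in a single query. The meters table, its columns, and the tariff grouping are hypothetical and not part of the schema defined earlier; meter_readings_hourly is the continuous aggregate from the previous section.

```sql
-- Hypothetical relational table holding per-meter metadata
CREATE TABLE IF NOT EXISTS meters (
    meter_id      BIGINT PRIMARY KEY,
    customer_name TEXT NOT NULL,
    tariff        TEXT NOT NULL
);

-- Daily consumption per tariff over the last 7 days:
-- time-series rollup joined to relational data in one statement
SELECT
    time_bucket('1 day', h.bucket) AS day_bucket,
    m.tariff,
    SUM(h.total_kwh)               AS total_kwh
FROM meter_readings_hourly AS h
JOIN meters AS m USING (meter_id)
WHERE h.bucket >= NOW() - INTERVAL '7 days'
GROUP BY day_bucket, m.tariff
ORDER BY day_bucket, m.tariff;
```

With a separate time-series store, the same report would typically need an export step or an application-side join.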

Important

Prometheus is not a general-purpose time-series database. It is designed specifically for short-term metrics storage with alerting — the default retention is 15 days and it does not support ad-hoc SQL queries or joins. Teams that try to use Prometheus as a long-term analytics store consistently hit scaling walls. For long-term metrics retention with analytics, use Thanos or Cortex on top of Prometheus, or migrate the metrics pipeline to TimescaleDB.

Key Takeaways
  • Install TimescaleDB as a PostgreSQL extension — CREATE EXTENSION timescaledb after adding it to shared_preload_libraries — to get access to hypertables with zero application changes.
  • Choose chunk_time_interval based on your write rate: 1 day for high-velocity ingestion (100K+ rows/s), 7–30 days for moderate workloads; chunk size directly controls query pruning efficiency.
  • Enable columnar compression with compress_segmentby set to your primary filter column (device ID, sensor ID) to achieve 10–20x storage reduction on cold data without sacrificing query performance on hot data.
  • Use add_retention_policy() instead of scheduled DELETE jobs — chunk drops are instantaneous and require no VACUUM, keeping autovacuum resources available for hot data.
  • Build continuous aggregates for any rollup query that runs more than once per minute; hierarchical aggregates let daily views refresh from hourly views without re-reading raw rows.
  • TimescaleDB is the strongest choice when time-series data needs to coexist with relational data, SQL joins, row-level security, or existing PostgreSQL BI tooling, because those capabilities come from PostgreSQL itself rather than from a separate query layer.

Working with JusDB on TimescaleDB

JusDB manages TimescaleDB deployments for engineering teams handling IoT telemetry, financial tick data, application observability, and operational metrics at scale. Our DBAs have migrated production PostgreSQL tables holding billions of rows to hypertables without downtime, tuned chunk intervals and compression policies for specific ingestion rates, and designed continuous aggregate hierarchies that reduced dashboard query times from minutes to milliseconds.

Common engagements include migration from plain PostgreSQL or InfluxDB to TimescaleDB, retention and compression policy design, continuous aggregate architecture for analytics workloads, and capacity planning for time-series workloads on RDS, Aurora, or self-managed PostgreSQL. We handle the setup, the monitoring, and the on-call response — so your team focuses on the data, not the database infrastructure.
