TimescaleDB 2.11 vs InfluxDB 2.7 for Industrial Sensor Storage
TL;DR — InfluxDB 2.7 is faster to bootstrap for pure metrics; TimescaleDB 2.11 wins anywhere you need joins, transactional writes, or SQL the rest of your stack already speaks / Hypertables with native compression hit 95%+ ratios on typical sensor payloads / Continuous aggregates beat Flux tasks operationally because they’re SQL views you can reason about.
“Should we use InfluxDB or TimescaleDB” is the second most common question I get after “which broker should we use,” and the answer is the same shape: it depends on what surrounds the database, not the database itself. If your data team already runs Postgres for everything, dragging in InfluxDB is going to cost you joins and tooling. If your data is purely metric-shaped and your team has never written SQL, InfluxDB is genuinely easier.
I run both in production right now, for different customers. This post is the decision framework, plus the schema patterns that actually work in 2023. For context on how data arrives, see my earlier post on the Telegraf to InfluxDB pipeline.
Mental Model: What Are They Really?
InfluxDB 2.7 is a purpose-built time-series engine. Tags are indexed, fields are not, the storage format is TSM, and queries are Flux (with a SQL preview in 3.0 alpha). It doesn’t pretend to be a relational database. The win is operational simplicity and write throughput for narrow schemas.
TimescaleDB 2.11 is PostgreSQL with an extension that adds hypertables — partitioned tables that look like a single table to the user but are physically chunked by time. You get all of Postgres (JOINs, foreign keys, JSONB, GIS, partial indexes) plus native columnar compression and continuous aggregates. The win is that it’s still Postgres.
When InfluxDB Wins
- Write throughput on narrow schemas. A 30k pts/sec sensor stream into a single bucket is comfortable on a 4-core InfluxDB. Same load on Postgres needs careful tuning even with Timescale.
- Operator simplicity. One binary, one config file, a built-in UI. No connection pool tuning, no autovacuum drama.
- Telegraf out of the box. The line-protocol path from Telegraf is the smoothest in the industry.
- Pure metrics with no relational neighbors. If “what’s the 95th percentile temperature at line 3 last hour” is the dominant query shape, InfluxDB is honestly fine.
When TimescaleDB Wins
- You need joins with non-time-series data. Devices have owners, sites, work orders, maintenance schedules. In Influx you push that into tags (cardinality explosion) or join in application code. In Timescale you
JOIN. - Transactional ingest. Writing a sensor reading and updating a device’s last-seen timestamp atomically is trivial in Postgres and impossible in Influx.
- Complex aggregations. Window functions, lateral joins, recursive CTEs. SQL has 40 years of expressiveness Flux doesn’t try to match.
- The team already runs Postgres. Backup, monitoring, connection pooling, schema migrations — you reuse the stack.
Schema That Actually Holds Up
Here’s the Timescale schema I deploy for typical industrial sensor data:
-- PostgreSQL 15, TimescaleDB 2.11
CREATE TABLE devices (
device_id TEXT PRIMARY KEY,
site TEXT NOT NULL,
line TEXT NOT NULL,
machine_type TEXT NOT NULL,
installed_at TIMESTAMPTZ NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}'::jsonb
);
CREATE INDEX idx_devices_site_line ON devices (site, line);
CREATE TABLE sensor_readings (
ts TIMESTAMPTZ NOT NULL,
device_id TEXT NOT NULL REFERENCES devices(device_id),
metric TEXT NOT NULL,
value DOUBLE PRECISION NOT NULL,
quality SMALLINT NOT NULL DEFAULT 192,
PRIMARY KEY (device_id, metric, ts)
);
SELECT create_hypertable(
'sensor_readings',
'ts',
chunk_time_interval => INTERVAL '1 day',
if_not_exists => TRUE
);
-- Time-first index for "recent readings for a device" queries
CREATE INDEX idx_readings_device_ts ON sensor_readings (device_id, ts DESC);
CREATE INDEX idx_readings_metric_ts ON sensor_readings (metric, ts DESC);
A few choices worth defending:
- Composite PK on
(device_id, metric, ts). This is the natural uniqueness, prevents duplicate writes from retried ingest, and is the natural cluster key for compression. chunk_time_interval => 1 day. For 30k pts/sec that’s about 2.6 billion rows per chunk uncompressed, which sounds insane but compresses by 95%+ once you enable compression. Smaller chunks make planning faster but compression less effective.quality SMALLINT. OPC UA quality code, 192 is the “good” default. Always carry quality; aggregations should filter on it.
Compression policy:
ALTER TABLE sensor_readings SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'device_id, metric',
timescaledb.compress_orderby = 'ts DESC'
);
SELECT add_compression_policy('sensor_readings', INTERVAL '7 days');
The compress_segmentby choice is what gives us 20:1 ratios. Same device, same metric, consecutive timestamps — values delta-encode beautifully. The TimescaleDB compression docs explain the encoding internals.
Continuous Aggregates vs Flux Tasks
Continuous aggregates are the feature I miss most when I’m in Influx-land. They’re materialized views that incrementally refresh as new data arrives:
CREATE MATERIALIZED VIEW sensor_readings_1m
WITH (timescaledb.continuous) AS
SELECT
time_bucket(INTERVAL '1 minute', ts) AS bucket,
device_id,
metric,
AVG(value) AS avg_value,
MIN(value) AS min_value,
MAX(value) AS max_value,
COUNT(*) AS sample_count
FROM sensor_readings
WHERE quality >= 192
GROUP BY bucket, device_id, metric
WITH NO DATA;
SELECT add_continuous_aggregate_policy(
'sensor_readings_1m',
start_offset => INTERVAL '2 hours',
end_offset => INTERVAL '1 minute',
schedule_interval => INTERVAL '1 minute'
);
What I get for free:
- The view is queryable like any table.
SELECT ... FROM sensor_readings_1m WHERE device_id = ...works. - Real-time aggregation: queries against the view automatically combine materialized buckets with not-yet-materialized recent data.
- It’s a SQL view. Code review is the same as any schema change.
Flux tasks in InfluxDB cover the same use case, but they’re a separate task system with its own failure modes (silent task failures, no transactional guarantee, deployment via Influx CLI). For an ops team that already understands Postgres scheduling and replication, continuous aggregates are operationally lighter.
A Real Query: Worst Sensor Per Line This Hour
InfluxDB 2.7 Flux:
from(bucket: "telemetry_1m")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> group(columns: ["line", "device"])
|> max()
|> group(columns: ["line"])
|> top(n: 1, columns: ["_value"])
TimescaleDB:
WITH ranked AS (
SELECT
d.line,
r.device_id,
MAX(r.max_value) AS peak_temp,
ROW_NUMBER() OVER (
PARTITION BY d.line
ORDER BY MAX(r.max_value) DESC
) AS rk
FROM sensor_readings_1m r
JOIN devices d USING (device_id)
WHERE r.bucket >= NOW() - INTERVAL '1 hour'
AND r.metric = 'temperature'
GROUP BY d.line, r.device_id
)
SELECT line, device_id, peak_temp
FROM ranked
WHERE rk = 1;
The Flux version is shorter. The SQL version composes with the rest of your data — JOIN devices gives you machine_type, installed_at, anything in the metadata JSONB. That’s the trade.
Common Pitfalls
- Putting Timescale on a generic Postgres tuning baseline. Default
shared_buffers,work_mem, andmax_wal_sizeare too small. Usetimescaledb-tuneor follow the docs —work_memin particular needs raising because continuous-aggregate refresh queries hash a lot. - InfluxDB cardinality drift. Adding
firmware_versionas a tag during a fleet rollout will double your series count overnight. Cardinality is a one-way ratchet in InfluxDB; plan tags up front and reject high-cardinality additions in code review. - Compression before the data settles. Don’t compress chunks under a day old. Late-arriving data has to decompress the chunk to write, which is painfully expensive. The 7-day policy above leaves headroom.
- Foreign keys on the hypertable to a regular table. Works, but every insert checks the FK and you’ll see it in p99. Either denormalize the device fields onto the hypertable for hot lookups, or drop the FK and enforce it at the application layer.
- InfluxDB 3.0 alpha in production. The Parquet-backed storage engine is exciting but it’s alpha for a reason. Don’t ship it for IIoT in 2023 unless you’re prepared to migrate again next year.
Wrapping Up
There’s no clean winner — there’s a fit. InfluxDB is a sharper tool for narrow metric workloads; TimescaleDB is the safer general-purpose choice when sensor data is one entity among many in your data model. Pick based on the shape of your queries six months from now, not the demo. Next post I’ll get into Kafka as the durable buffer between MQTT and whichever of these stores you picked.