Building a Unified Developer Productivity Dashboard With n8n, Postgres, and Metabase
TL;DR — Build the data model first; the dashboard is the easy part. / Measure flow, not volume — cycle time and review latency tell you something; commits-per-day doesn’t. / Make the dashboard read-only and self-serve, never tied to performance review, or you’ll get exactly the gaming you’d expect.
This is the last post in this month’s series on automation for engineering productivity. The other eight posts have been about plumbing — wiring n8n to Jira, Linear, GitHub, Slack, OpenAI, and so on. This post is about what to do with all that plumbing once it exists: build something that surfaces signal you can actually act on.
I’ve built three of these in the last six years. The first two were what I’d now call vanity dashboards. They tracked commits, PRs opened, tickets closed. Pretty graphs, no behavior change. The third one — the one I’m describing here — is built around four metrics and gets used. The reason for the difference is mostly in what gets measured and how it gets surfaced, not in the technology.
The technical stack is straightforward: n8n for ingestion (covered in detail in the self-hosting post earlier this month), Postgres for the warehouse, and Metabase for the visualization layer. None of it is glamorous; all of it works.
A side note on the Node version: I’m running this dashboard on Node 20.1 LTS rather than 18 — Node 20 went GA on April 18, 2023, and the node:test built-in plus the stable fetch API have been genuinely useful for the small ingestion scripts.
What to measure (and what to ignore)
The biggest mistake in dev productivity dashboards is measuring things that are easy to measure. Commit count, lines of code, PRs per week. These are all proxies for typing volume, not for engineering output. They get gamed instantly.
The four metrics that have held up across teams I’ve worked with:
- Cycle time per ticket. Time from “in progress” to “done.” Tells you how long real work actually takes.
- PR review latency. Time from PR opened to first review. Tells you whether review is a bottleneck.
- Backlog age distribution. How many tickets are sitting in the backlog, by age bucket. Tells you whether tickets are getting dropped on the floor.
- Deploy frequency. How often your service ships to production. The DORA classic, and it correlates with almost everything else worth caring about.
What I deliberately don’t measure on this dashboard:
- Individual contributor metrics. Anything that names a person and gives them a number on a graph is going to get gamed and is going to hurt morale. Team-level only.
- Story points anything. Story points are negotiated guesses. Aggregating them doesn’t make them more accurate.
- Commit frequency. Already covered. Skip it.
- Lines of code. I shouldn’t have to say it but I keep seeing it on dashboards. No.
The principle is simple: measure flow through the system, not work units within it. The Accelerate book and the DORA research are the canonical reference here.
The data model
The warehouse schema is denormalized enough to make Metabase queries fast and normalized enough that bad data doesn’t propagate through the whole thing.
CREATE TABLE dim_team (
team_id TEXT PRIMARY KEY,
team_name TEXT NOT NULL,
product_area TEXT NOT NULL
);
CREATE TABLE dim_engineer (
engineer_id TEXT PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
github_login TEXT,
jira_account_id TEXT,
linear_user_id TEXT,
team_id TEXT REFERENCES dim_team(team_id),
active BOOLEAN NOT NULL DEFAULT true
);
CREATE TABLE fact_ticket_event (
id BIGSERIAL PRIMARY KEY,
source TEXT NOT NULL, -- 'jira' or 'linear'
external_id TEXT NOT NULL,
ticket_key TEXT NOT NULL,
event_type TEXT NOT NULL, -- created, state_changed, assigned, etc.
from_state TEXT,
to_state TEXT,
occurred_at TIMESTAMPTZ NOT NULL,
actor_engineer_id TEXT REFERENCES dim_engineer(engineer_id),
team_id TEXT REFERENCES dim_team(team_id),
payload JSONB NOT NULL,
UNIQUE (source, external_id, event_type, occurred_at)
);
CREATE TABLE fact_pr_event (
id BIGSERIAL PRIMARY KEY,
repo TEXT NOT NULL,
pr_number INT NOT NULL,
event_type TEXT NOT NULL, -- opened, reviewed, merged, closed
actor_engineer_id TEXT REFERENCES dim_engineer(engineer_id),
occurred_at TIMESTAMPTZ NOT NULL,
payload JSONB NOT NULL,
UNIQUE (repo, pr_number, event_type, occurred_at, actor_engineer_id)
);
CREATE TABLE fact_deploy (
id BIGSERIAL PRIMARY KEY,
service TEXT NOT NULL,
environment TEXT NOT NULL,
deployed_at TIMESTAMPTZ NOT NULL,
commit_sha TEXT NOT NULL,
succeeded BOOLEAN NOT NULL,
duration_seconds INT,
team_id TEXT REFERENCES dim_team(team_id),
UNIQUE (service, environment, commit_sha, deployed_at)
);
CREATE INDEX idx_ticket_event_team_time ON fact_ticket_event (team_id, occurred_at DESC);
CREATE INDEX idx_pr_event_repo_time ON fact_pr_event (repo, occurred_at DESC);
CREATE INDEX idx_deploy_team_time ON fact_deploy (team_id, deployed_at DESC);
The fact tables store events, not states. Anything stateful — “current status of this ticket,” “time spent in review” — is derived at query time from the event stream. The reason is that derived metrics change over time as we redefine them, but event streams don’t. Re-deriving from an event log is cheap; recovering events from an aggregate is impossible.
Ingestion via n8n
Each fact table has at least one n8n workflow feeding it. The pattern is consistent: a webhook handler for real-time events plus a nightly reconciliation that re-pulls the last 30 days from the source API and upserts.
For Jira events:
// n8n Function node, fed by Jira webhook -> signature verify -> this
const issueEvent = $input.item.json;
const eventType = mapJiraWebhookEvent(issueEvent.webhookEvent);
return {
source: 'jira',
external_id: issueEvent.issue.id,
ticket_key: issueEvent.issue.key,
event_type: eventType,
from_state: issueEvent.changelog?.items?.find(i => i.field === 'status')?.fromString,
to_state: issueEvent.changelog?.items?.find(i => i.field === 'status')?.toString,
occurred_at: issueEvent.timestamp,
actor_email: issueEvent.user?.emailAddress,
payload: issueEvent,
};
The output goes into a Postgres node configured to INSERT ... ON CONFLICT DO NOTHING on the unique key. Idempotency is built in.
For GitHub PR events, the workflow is similar but the webhook is the GitHub side covered in the GitHub Actions orchestration post. The fact that I’m signing both directions of those flows means the dashboard data is also trustworthy by extension.
The nightly reconciliation is a scheduled n8n workflow that hits each source API with a “since=24h ago” filter and replays any events that didn’t come through via webhook. Belt-and-braces, and it’s caught real gaps maybe four or five times in the last year.
The four queries
Each metric reduces to a SQL query against the fact tables. I’ll show one in full.
Cycle time per ticket, by team, last 30 days
WITH state_transitions AS (
SELECT
ticket_key,
team_id,
occurred_at,
to_state,
LAG(occurred_at) OVER (PARTITION BY ticket_key ORDER BY occurred_at) AS prev_at,
LAG(to_state) OVER (PARTITION BY ticket_key ORDER BY occurred_at) AS prev_state
FROM fact_ticket_event
WHERE event_type = 'state_changed'
AND occurred_at > NOW() - INTERVAL '60 days'
),
in_progress_starts AS (
SELECT ticket_key, team_id, occurred_at AS started_at
FROM state_transitions
WHERE to_state IN ('In Progress', 'In Development')
),
done_ends AS (
SELECT ticket_key, occurred_at AS done_at
FROM state_transitions
WHERE to_state IN ('Done', 'Resolved', 'Closed')
)
SELECT
s.team_id,
date_trunc('week', d.done_at) AS week,
COUNT(*) AS tickets_completed,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM (d.done_at - s.started_at)) / 86400.0) AS median_cycle_days,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY EXTRACT(EPOCH FROM (d.done_at - s.started_at)) / 86400.0) AS p90_cycle_days
FROM in_progress_starts s
JOIN done_ends d ON s.ticket_key = d.ticket_key AND d.done_at > s.started_at
WHERE d.done_at > NOW() - INTERVAL '30 days'
GROUP BY s.team_id, week
ORDER BY week DESC, s.team_id;
The p90 is more useful than the median for cycle time. The median tells you whether typical work is fast; the p90 tells you whether you have a long tail of stuck tickets, which is usually the more important signal.
I won’t reproduce the other three queries in full, but the structure is the same: window functions over the event stream, grouped by team and week, surfaced as percentiles rather than means. Percentiles handle outliers better, and engineering work distribution is always heavy-tailed.
Surfacing the data
Metabase 0.46 is the visualization layer. A handful of pragmatic choices:
- One dashboard per team, plus an org-wide overview. Teams care about their own numbers; leadership cares about the rollup.
- Trend lines, not point-in-time numbers. “Cycle time this week” is meaningless. “Cycle time over the last 12 weeks” tells a story.
- Annotations on the graphs. When the team kicks off a new initiative, mark it on the chart. When the cycle time spikes, you want to know whether it correlates with the holiday week or the migration project.
- No traffic-light “is this team healthy” scores. They infantilize the engineers and they’re always wrong at the edges. Let people read the trends.
The dashboard is open to everyone. Anyone in the org can see any team’s numbers. The transparency is part of the deal — if a metric is good enough to look at, it’s good enough to share.
What it’s not for
I’ll repeat this because it matters. The dashboard is for the teams to look at their own flow and notice when something’s off. It is not for performance reviews. It is not used to compare engineers. It is not used to set quotas.
The moment a productivity dashboard becomes input to performance management, every metric on it gets gamed and the data quality collapses. I’ve seen this happen and it’s painful to walk back. Set the cultural expectations up front and revisit them every quarter.
Common Pitfalls
- State name proliferation. Different teams configure their Jira workflows differently. “In Progress” might be “In Development” or “Active” or “Started.” Normalize aggressively at ingestion time, not at query time.
- Bot accounts polluting the data. Automated PRs from Dependabot, Renovate, or your own automation will skew PR metrics. Tag them with an
is_botflag in the engineer dimension and filter them out by default. - Time zone bugs in cycle-time math. Subtracting two timestamps gives you a duration, but only if both are properly TZ-aware. Postgres
TIMESTAMPTZhandles this;TIMESTAMPdoesn’t. Use the former everywhere. - Metabase auto-refresh hammering Postgres. A dashboard with 10 panels and a 1-minute refresh is 600 queries an hour. Materialize the heavy queries into a
dashboard_*set of tables refreshed every 15 minutes; point Metabase at those. - DORA metric definitions. The DORA team is fairly specific about how to measure deploy frequency, lead time for changes, change failure rate, and time to restore. Read the DORA documentation before you implement; my definitions in this post are aligned but not identical.
What’s Next
This wraps up May’s series on automating engineering productivity. The pattern that’s emerged across the nine posts is consistent: pick boring infrastructure, draw clear lines between orchestration and execution, version your projections, instrument everything, and keep humans in the loop for any decision that’s irreversible.
The dashboard is the visible artifact. The real value is everything underneath — the n8n workflows, the data model, the integrations to Jira and Linear and GitHub, the Slack approvals, the LLM-assisted triage. None of it is novel in isolation. Wired together, it’s the difference between a team that knows where its work is and a team that doesn’t.