Loki for Logs, Self-Hosted Aggregation
TL;DR — Loki indexes log labels (not content). 10× cheaper storage than Elasticsearch. LogQL feels like PromQL. Self-host as one Docker container for small setups. Doesn’t replace log analytics platforms; replaces grep-on-files perfectly.
After Alertmanager, the logs pillar. Loki is Grafana’s logs solution. Different shape from Elasticsearch — sometimes a better fit, sometimes not.
How Loki differs from ELK
Elasticsearch indexes everything. Every word in every log line is searchable. Powerful; expensive in storage + RAM.
Loki indexes only labels. {app="api", level="error"} is the searchable key. The log line content is stored compressed but not indexed. Queries grep through compressed chunks.
The trade-off:
- ELK answers “find all logs containing user_id=42” instantly
- Loki answers “show me error-level logs from the api app, last hour, containing user_id=42” — fast by label, then grep the matching chunks
For most operational use, the second pattern is what you actually do. Loki wins on cost.
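The two-step pattern (match labels first, then scan content) can be sketched with a toy in-memory model. This illustrates the idea only and is not how Loki is implemented: the names `ToyLoki`, `push`, and `query` are hypothetical, and real Loki stores compressed chunks rather than Python lists.

```python
from collections import defaultdict


class ToyLoki:
    """Toy model: index label sets only, scan line content at query time."""

    def __init__(self):
        # The "index": one entry per unique label set (one stream each).
        self.streams = defaultdict(list)

    def push(self, labels, line):
        # Only the labels form the index key; the line itself is never indexed.
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector, contains=""):
        wanted = set(selector.items())
        hits = []
        for key, lines in self.streams.items():
            # Step 1: cheap label match against the small index.
            if wanted <= key:
                # Step 2: brute-force scan ("grep") of the matching streams only.
                hits.extend(line for line in lines if contains in line)
        return hits


store = ToyLoki()
store.push({"app": "api", "level": "error"}, "timeout user_id=42")
store.push({"app": "api", "level": "info"}, "login ok user_id=42")
store.push({"app": "web", "level": "error"}, "render failed user_id=7")

matches = store.query({"app": "api", "level": "error"}, contains="user_id=42")
print(matches)
```

Only one of the three streams is scanned for the substring; the other two are ruled out by labels alone, which is where the cost savings come from.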
Storage model
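Logs are stored as compressed chunks (gzip / snappy). Each stream (unique label set) gets its own chunks, cut by time and size: a chunk is flushed when it reaches max_chunk_age (1 hour in the config below), when the stream has been idle for chunk_idle_period, or when it fills up.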
Index is small — just label combinations → chunk locations. The actual data sits in object storage (S3, GCS, MinIO) or local filesystem.
For our factory project: ~5 GB/day of raw logs, stored at ~600 MB/day after compression. Loki running on a $20/month VM handles it.
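As a rough sizing check, the numbers above can be turned into a compression ratio and a disk estimate. This is back-of-the-envelope arithmetic only: it assumes the 30-day retention used later in this post and ignores index, WAL, and filesystem overhead.

```python
RAW_GB_PER_DAY = 5.0      # raw log volume quoted above
STORED_GB_PER_DAY = 0.6   # ~600 MB/day after compression, quoted above
RETENTION_DAYS = 30       # assumption: same as retention_period: 720h

compression_ratio = RAW_GB_PER_DAY / STORED_GB_PER_DAY
disk_needed_gb = STORED_GB_PER_DAY * RETENTION_DAYS

print(f"compression ratio: {compression_ratio:.1f}x")
print(f"chunk storage at {RETENTION_DAYS} days: {disk_needed_gb:.0f} GB")
```

That works out to roughly 8x compression and about 18 GB of chunks on disk at steady state.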
Compose setup
services:
  loki:
    image: grafana/loki:2.6.1
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
      - loki-data:/loki
    ports: ["3100:3100"]
  promtail:
    image: grafana/promtail:2.6.1
    volumes:
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock
      - ./promtail-config.yaml:/etc/promtail/config.yml
    command: -config.file=/etc/promtail/config.yml
volumes:
  loki-data:  # named volume used by the loki service; Compose rejects undeclared volumes
Basic loki-config.yaml:
auth_enabled: false
server:
  http_listen_port: 3100
ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  chunk_idle_period: 1h
  max_chunk_age: 1h
schema_config:
  configs:
    - from: 2022-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v12
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
    shared_store: filesystem
  filesystem:
    directory: /loki/chunks
limits_config:
  retention_period: 720h # 30 days
This config is single-instance. For HA, run ingesters, queriers, and index gateways as separate components. Most teams don’t need that.
Promtail config
Promtail tails files (or container logs) and ships to Loki:
server:
  http_listen_port: 9080
positions:
  # /var/log is mounted read-only, so keep the positions file somewhere writable
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: containers
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
      - source_labels: ['__meta_docker_container_log_stream']
        target_label: 'stream'
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          __path__: /var/log/*.log
Promtail discovers Docker containers, attaches container and stream labels, ships logs.
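What Promtail sends to the push URL is a JSON body that groups log lines by label set. A minimal sketch of building such a body by hand follows; the helper name `build_push_payload` and the sample log lines are made up for illustration, and nothing is sent over the network here.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON body in the shape accepted by POST /loki/api/v1/push."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each value is [timestamp, line]; the timestamp is a string holding
    # nanoseconds since the Unix epoch. Lines are spaced 1 ns apart here
    # only to keep them ordered in this example.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


payload = build_push_payload(
    {"container": "api", "stream": "stdout"},
    ["GET /health 200", "GET /orders 500"],
    now_ns=1_700_000_000_000_000_000,
)
body = json.dumps(payload)
print(body)
```

Posting `body` with a `Content-Type: application/json` header to the URL in the clients section would create (or append to) the stream identified by those two labels.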
LogQL
LogQL has two parts: log queries (return log lines) and metric queries (compute over logs).
Log queries:
{container="api"} # all logs from container "api"
{container="api"} |= "error" # logs containing "error"
{container="api"} != "healthcheck" # logs NOT containing "healthcheck"
{container="api"} |~ "user_id=\\d+" # regex match
{container="api"} |= "error" | json # parse JSON, expose fields
{container="api"} | json | level="error" # filter on parsed field
|=, !=, |~, !~ are line filters. | json, | logfmt parse structured logs.
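Outside Grafana, log queries like these can be sent to Loki's HTTP API at /loki/api/v1/query_range. The sketch below only builds the request URL, to show how the LogQL string has to be encoded. The helper name `query_range_url` is hypothetical, and the host and port are assumptions based on the Compose port mapping above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def query_range_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a GET URL for Loki's query_range endpoint."""
    params = {
        "query": logql,           # the LogQL expression, URL-encoded below
        "start": str(start_ns),   # nanoseconds since the Unix epoch
        "end": str(end_ns),
        "limit": str(limit),      # maximum number of log lines to return
        "direction": "backward",  # newest entries first
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


url = query_range_url(
    "http://localhost:3100",
    '{container="api"} |= "error"',
    start_ns=1_700_000_000_000_000_000,
    end_ns=1_700_003_600_000_000_000,
)
print(url)

# Round-trip check: decoding the URL gives back the original query string.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

The braces, quotes, and pipe characters all need escaping, which is why the query goes through `urlencode` rather than being pasted into the URL directly.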
Metric queries:
rate({container="api"} |= "error"[5m]) # errors/sec
sum by (path) (count_over_time({container="api"} | json | __error__="" [5m]))
# count of cleanly parsed lines per path over 5m (path is a field extracted by | json)
Behaves like PromQL but operates on log streams.
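To make the range functions concrete, here is a small emulation of count_over_time and rate over a list of entry timestamps. It is a sketch of the arithmetic only: the timestamps are made up, and treating the window as open on the left and closed on the right is an assumption of this example.

```python
def count_over_time(timestamps, now, window_s):
    """Count entries whose timestamp falls in (now - window_s, now]."""
    return sum(1 for ts in timestamps if now - window_s < ts <= now)


def rate(timestamps, now, window_s):
    """Entries per second over the window: count divided by window length."""
    return count_over_time(timestamps, now, window_s) / window_s


# Seconds at which lines matching |= "error" arrived (hypothetical data).
error_times = [120, 310, 355, 400, 590, 600]
now = 600
window = 300  # corresponds to [5m]

count = count_over_time(error_times, now, window)
per_second = rate(error_times, now, window)
print(count, per_second)
```

The entry at t=120 is outside the 5-minute window, so five entries are counted and the rate is five divided by 300 seconds.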
Cardinality matters here too
Loki indexes labels. High-cardinality labels → many small streams → many chunks → expensive.
Bad labels:
- user_id: millions of users
- request_id: unique per request
- pod_name from a deployment that scales (transient names)
Good labels:
- container / service (bounded by deployment count)
- level (info, warn, error, debug)
- env (prod, staging)
- cluster (bounded by infra)
Keep total active streams under ~10K for small Loki. ~100K for distributed.
For per-request data, put it in the log line and grep with |= or extract with | json. Don’t make it a label.
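The effect of one bad label is easy to see with arithmetic: the worst-case stream count is the product of the number of distinct values of each label. The value counts below are hypothetical, chosen only to show the scale.

```python
from math import prod


def max_streams(label_value_counts):
    """Upper bound on stream count: product of distinct values per label."""
    return prod(label_value_counts.values())


# Hypothetical deployment: 20 containers, 4 levels, 2 envs, 3 clusters.
good = {"container": 20, "level": 4, "env": 2, "cluster": 3}
# Same labels plus a user_id label with a million distinct values.
bad = dict(good, user_id=1_000_000)

print(max_streams(good))  # 480
print(max_streams(bad))   # 480000000
```

The bounded label set stays far below the ~10K guideline, while adding user_id pushes the upper bound into the hundreds of millions. Not every combination occurs in practice, but the direction is clear.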
Querying in Grafana
Configure Loki as a data source. Then in Explore mode or dashboards:
{container="api", level="error"} | json | line_format "{{.timestamp}} {{.message}}"
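Returns matching log lines with formatted output. The level label in the selector only matches if something attaches it at ingestion; the Promtail config above sets just container and stream, so with that config use {container="api"} | json | level="error" instead.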
For dashboards, log panels show recent matches. Combine with metrics:
- Panel 1: error rate (Prometheus)
- Panel 2: recent error logs (Loki) for the same time range
Click an error spike → see the actual errors below.
Retention
limits_config:
  retention_period: 720h # 30 days
Plus a compactor:
compactor:
  working_directory: /loki/compactor
  shared_store: filesystem
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
Logs older than 30 days are deleted automatically. Tune to your compliance + cost needs.
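For very long retention at low cost, keep chunks in S3 and use lifecycle rules to move older objects to a cheaper storage class. Stick to classes that allow immediate reads; objects in archive classes that need a restore first (such as Glacier Flexible Retrieval) cannot be queried until they are restored.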
Common Pitfalls
High-cardinality labels. Stream count explodes; query performance dies. Strict label discipline.
Indexing every word. Wrong tool. Use ELK if you need full-text search.
No retention. Disk fills.
Single Loki for huge volume. Past ~50 GB/day, switch to distributed mode (separate ingester/querier/store).
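Forgetting | json. Structured logs without parsing = substring matching against raw JSON strings. Brittle, and you can’t filter or aggregate on fields. Keep a cheap line filter (|=) in front of | json so the parser runs on fewer lines.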
Promtail mounted at / reading everything. Specific paths only.
Wrapping Up
Loki = label-indexed logs + LogQL + Grafana. Cheap, fast at the operational level. Monday: Promtail pipelines for log parsing.