February 27, 2026 · 9 min read · by Muhammad Amal programming

TL;DR — An autonomous agent without guardrails is an unbounded liability / enforce token budgets, policy gates on tool calls, and OpenTelemetry traces on every run / the budget and the policy must be hard limits the agent cannot reason its way around.

The first time an agent in our CI loop ran up a four-figure API bill overnight, the cause was mundane. A flaky test made the agent retry, the retry produced a slightly different diff, which triggered another review, which spawned another agent. No infinite loop in the code — just a feedback cycle nobody had bounded. The agent did exactly what it was told. Nobody had told it to stop.

That’s the recurring shape of agentic DevOps failures. Not malice, not hallucination — absence of limits. An agent that can open pull requests, run shell commands, and call an LLM in a loop is a powerful tool and an unbounded one. The job of guardrails is to make the bounds explicit, external, and impossible for the agent to talk its way past.

This post covers three layers of agentic DevOps guardrails I now treat as non-negotiable before any agent touches production: hard token budgets, policy gates on tool calls, and OpenTelemetry tracing so you can actually see what the thing did. If you’re building the agents themselves, the autonomous engineering squad post covers the orchestration; this is the safety layer that wraps it.

The principle, guardrails live outside the agent

The single mistake that makes guardrails useless: implementing them as instructions in the prompt. “Do not spend more than 50,000 tokens” in a system prompt is a suggestion, not a limit. The agent might respect it, might miscount, might decide the task justifies more. A real guardrail is code the agent runs inside but cannot modify — a wrapper that counts tokens and raises, a policy function that inspects a tool call and refuses it, a tracer that records regardless.

Everything below is enforcement code, not prompt text.

Layer one, hard token budgets

A token budget is a counter with a ceiling. Every model call passes through a tracker that adds usage and raises the moment the ceiling is crossed. The agent’s loop catches that exception and shuts down gracefully.

# guardrails/budget.py
from dataclasses import dataclass, field
from threading import Lock


class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses its token ceiling."""


@dataclass
class TokenBudget:
    max_input_tokens: int
    max_output_tokens: int
    max_usd: float
    # Pricing per million tokens for the model in use, early 2026.
    input_price_per_mtok: float = 3.00
    output_price_per_mtok: float = 15.00

    _input_used: int = field(default=0, init=False)
    _output_used: int = field(default=0, init=False)
    _lock: Lock = field(default_factory=Lock, init=False)

    def charge(self, input_tokens: int, output_tokens: int) -> None:
        with self._lock:
            self._input_used += input_tokens
            self._output_used += output_tokens
            self._check()

    def _check(self) -> None:
        if self._input_used > self.max_input_tokens:
            raise BudgetExceeded(
                f"input tokens {self._input_used} > {self.max_input_tokens}"
            )
        if self._output_used > self.max_output_tokens:
            raise BudgetExceeded(
                f"output tokens {self._output_used} > {self.max_output_tokens}"
            )
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(
                f"spend ${self.spent_usd:.2f} > ${self.max_usd:.2f}"
            )

    @property
    def spent_usd(self) -> float:
        return (
            self._input_used / 1_000_000 * self.input_price_per_mtok
            + self._output_used / 1_000_000 * self.output_price_per_mtok
        )

    def remaining_usd(self) -> float:
        return max(0.0, self.max_usd - self.spent_usd)

The Lock matters if your agent fans out tool calls concurrently — two threads charging the same budget without it will race past the ceiling. Now wrap the model client so charging is automatic and no call path can skip it.

# guardrails/budgeted_client.py
from anthropic import Anthropic
from guardrails.budget import BudgetExceeded, TokenBudget


class BudgetedClient:
    """Anthropic client that charges a TokenBudget on every call."""

    def __init__(self, budget: TokenBudget):
        self._client = Anthropic()
        self._budget = budget

    def create(self, **kwargs):
        # Budget check BEFORE the call: refuse if already drained.
        if self._budget.remaining_usd() <= 0:
            raise BudgetExceeded("budget drained before call")

        resp = self._client.messages.create(**kwargs)
        self._budget.charge(
            resp.usage.input_tokens,
            resp.usage.output_tokens,
        )
        return resp

The agent only ever sees BudgetedClient. There’s no path to the raw client, so there’s no path around the budget. That’s what “outside the agent” means in practice — the limit is structural.
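To make the earlier point about the Lock concrete, here is a minimal sketch of concurrent charging. MiniBudget is a hypothetical, stripped-down stand-in for TokenBudget (one counter, one ceiling, and a boolean return instead of an exception), so treat it as an illustration of the locking pattern rather than the class above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MiniBudget:
    """Hypothetical stand-in for TokenBudget: one counter, one ceiling."""

    def __init__(self, ceiling: int):
        self.ceiling = ceiling
        self.used = 0
        self._lock = threading.Lock()

    def charge(self, tokens: int) -> bool:
        """Add tokens; return False once the ceiling has been crossed."""
        with self._lock:  # the read-modify-write and the check happen together
            self.used += tokens
            return self.used <= self.ceiling


def run_fanout(calls: int, tokens_per_call: int, ceiling: int):
    """Charge the same budget from several worker threads at once."""
    budget = MiniBudget(ceiling)
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(
            pool.map(lambda _: budget.charge(tokens_per_call), range(calls))
        )
    return budget.used, results.count(False)


used, rejected = run_fanout(calls=20, tokens_per_call=100, ceiling=1000)
print(used, rejected)  # 2000 10
```

With the lock in place, every charge is counted and exactly the charges that push the total past the ceiling are rejected. Note what the lock does not do: a charge that arrives after the ceiling is still recorded, because the tokens were already spent by the time usage is known.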

Layer two, policy gates on tool calls

Token budgets cap cost. Policy gates cap blast radius. Before any tool runs — shell command, file write, PR creation — its arguments pass through a gate that returns allow, deny, or escalate. The gate is a pure function over the tool name and arguments. No model in the decision path.

# guardrails/policy.py
import re
import shlex
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"   # needs human approval


@dataclass
class PolicyResult:
    decision: Decision
    reason: str


# Commands an agent must never run, no exceptions.
_BLOCKED_SHELL = (
    r"\brm\s+-rf\b", r"\bgit\s+push\b.*--force", r"\bcurl\b.*\|\s*sh",
    r"\bsudo\b", r"\bchmod\s+777", r":\(\)\s*\{", r"\bdd\s+if=",
)
# Path prefixes an agent may write under. Anything else escalates.
_WRITABLE_PREFIXES = ("src/", "tests/", "docs/")
# Files that always require human sign-off to change.
_PROTECTED = (".github/workflows/", "Dockerfile", "pyproject.toml",
              "requirements.txt", ".env")


def evaluate_tool_call(tool: str, args: dict) -> PolicyResult:
    if tool == "run_shell":
        return _shell_policy(args.get("command", ""))
    if tool == "write_file":
        return _write_policy(args.get("path", ""))
    if tool == "create_pull_request":
        return _pr_policy(args)
    # Unknown tools are denied by default — fail closed.
    return PolicyResult(Decision.DENY, f"unknown tool: {tool}")


def _shell_policy(command: str) -> PolicyResult:
    for pattern in _BLOCKED_SHELL:
        if re.search(pattern, command):
            return PolicyResult(Decision.DENY, f"blocked pattern: {pattern}")
    try:
        shlex.split(command)   # reject unparseable commands
    except ValueError as exc:
        return PolicyResult(Decision.DENY, f"unparseable command: {exc}")
    return PolicyResult(Decision.ALLOW, "shell command within policy")


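def _write_policy(path: str) -> PolicyResult:
    # Strip only a literal leading "./". str.lstrip("./") strips characters,
    # which mangles ".env" to "env" and turns "/src/x" into "src/x".
    norm = path
    while norm.startswith("./"):
        norm = norm[2:]
    # Absolute paths and ".." segments can leave the writable zone.
    if norm.startswith("/") or ".." in norm.split("/"):
        return PolicyResult(Decision.ESCALATE, f"path escapes repo root: {norm}")
    if any(norm.startswith(p) for p in _PROTECTED):
        return PolicyResult(Decision.ESCALATE, f"protected path: {norm}")
    if any(norm.startswith(p) for p in _WRITABLE_PREFIXES):
        return PolicyResult(Decision.ALLOW, "path in writable zone")
    return PolicyResult(Decision.ESCALATE, f"path outside writable zone: {norm}")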


def _pr_policy(args: dict) -> PolicyResult:
    files_changed = args.get("files_changed", 0)
    # A PR touching dozens of files is almost never what you want.
    if files_changed > 20:
        return PolicyResult(
            Decision.ESCALATE, f"large PR: {files_changed} files"
        )
    return PolicyResult(Decision.ALLOW, "PR within size policy")

Two design choices to defend. The gate fails closed — an unknown tool is denied, not allowed. And it has an ESCALATE state distinct from DENY, because most risky actions aren’t forbidden, they just need a human. Writing to pyproject.toml isn’t wrong; it’s a decision that wants a person.
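Path normalization in a write gate deserves its own check, because prefix matching is only as good as the string it matches against. The sketch below shows two things: what str.lstrip actually does with a "./" argument, and one possible normalizer built on posixpath.normpath. The helper name normalize_repo_path is hypothetical and not part of the policy module above.

```python
import posixpath
from typing import Optional


def normalize_repo_path(path: str) -> Optional[str]:
    """Return a repo-relative path, or None if it escapes the repo root."""
    if path.startswith("/"):
        return None  # absolute paths are never repo-relative
    norm = posixpath.normpath(path)  # collapses "./", "a/../b", and "//"
    if norm == ".." or norm.startswith("../"):
        return None  # climbs above the repo root
    return norm


# str.lstrip takes a set of characters, not a prefix:
print(".env".lstrip("./"))        # env
print("/src/app.py".lstrip("./"))  # src/app.py

# The normalizer keeps dotfiles intact and resolves traversal first:
print(normalize_repo_path("./.env"))                           # .env
print(normalize_repo_path("src/../.github/workflows/ci.yml"))  # .github/workflows/ci.yml
print(normalize_repo_path("../secrets.txt"))                   # None
```

A gate that treats None as escalate (or deny) and runs its prefix checks on the normalized string closes the gap where a path starts with src/ but resolves somewhere else.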

Wrap the agent’s tool dispatch so every call goes through the gate:

# guardrails/guarded_executor.py
from guardrails.policy import evaluate_tool_call, Decision


class PolicyViolation(RuntimeError):
    pass


class GuardedExecutor:
    def __init__(self, tools: dict, escalation_handler):
        self._tools = tools
        self._escalate = escalation_handler

    def execute(self, tool: str, args: dict):
        result = evaluate_tool_call(tool, args)

        if result.decision is Decision.DENY:
            raise PolicyViolation(f"denied {tool}: {result.reason}")

        if result.decision is Decision.ESCALATE:
            approved = self._escalate(tool, args, result.reason)
            if not approved:
                raise PolicyViolation(
                    f"escalation rejected for {tool}: {result.reason}"
                )

        return self._tools[tool](**args)

Layer three, OpenTelemetry traces

Budgets and gates keep the agent safe. Tracing keeps it debuggable. When an agent makes a decision you didn’t expect, you need the full causal chain — which prompt, which tool calls, which token counts, in what order. OpenTelemetry spans give you that, and they export to whatever backend you already run.

# guardrails/tracing.py
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
    OTLPSpanExporter,
)
from opentelemetry.sdk.resources import Resource

_provider = TracerProvider(
    resource=Resource.create({"service.name": "agentic-devops"})
)
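_provider.add_span_processor(
    # insecure=True assumes a local collector without TLS;
    # drop it when exporting to a TLS endpoint.
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="localhost:4317", insecure=True)
    )
)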
trace.set_tracer_provider(_provider)
tracer = trace.get_tracer("agentic-devops")

Now instrument the agent loop. Each iteration is a span; each model call and tool call is a child span carrying its cost and decision as attributes.

# guardrails/traced_agent.py
from opentelemetry.trace import Status, StatusCode
from guardrails.tracing import tracer
from guardrails.budget import BudgetExceeded
from guardrails.guarded_executor import PolicyViolation


def run_agent_step(agent, step: int, budgeted_client, executor):
    with tracer.start_as_current_span("agent.step") as span:
        span.set_attribute("agent.step_index", step)
        try:
            with tracer.start_as_current_span("agent.model_call") as mspan:
                resp = budgeted_client.create(
                    model="claude-sonnet-4-5-20250929",
                    max_tokens=2000,
                    messages=agent.messages,
                )
                mspan.set_attribute("llm.input_tokens", resp.usage.input_tokens)
                mspan.set_attribute("llm.output_tokens",
                                    resp.usage.output_tokens)
                mspan.set_attribute("llm.budget_spent_usd",
                                    round(budgeted_client._budget.spent_usd, 4))

            for call in agent.extract_tool_calls(resp):
                with tracer.start_as_current_span("agent.tool_call") as tspan:
                    tspan.set_attribute("tool.name", call.name)
                    out = executor.execute(call.name, call.args)
                    tspan.set_attribute("tool.status", "ok")
                    agent.record_result(call, out)

        except BudgetExceeded as exc:
            span.set_status(Status(StatusCode.ERROR, "budget exceeded"))
            span.set_attribute("agent.halt_reason", "budget")
            span.record_exception(exc)
            raise
        except PolicyViolation as exc:
            span.set_status(Status(StatusCode.ERROR, "policy violation"))
            span.set_attribute("agent.halt_reason", "policy")
            span.record_exception(exc)
            raise

Recording agent.halt_reason as a span attribute means you can query “how often do agents halt on budget vs policy” straight from your tracing backend. That single metric tells you whether your budgets are too tight or your agents too greedy. The OpenTelemetry Python docs cover exporter configuration for production backends.
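As a rough illustration of that query, the sketch below tallies halt reasons over span records that have already been exported. The record shape (a dict with "name" and "attributes" keys) is an assumption made for this example; a real tracing backend would answer the same question through its own query language.

```python
from collections import Counter
from typing import Iterable, Mapping


def halt_reason_rates(spans: Iterable[Mapping]) -> dict:
    """Fraction of agent.step spans per halt reason ("none" means no halt)."""
    steps = [s for s in spans if s.get("name") == "agent.step"]
    counts = Counter(
        s.get("attributes", {}).get("agent.halt_reason", "none") for s in steps
    )
    total = len(steps)
    return {reason: n / total for reason, n in counts.items()} if total else {}


# Illustrative records only, not real run data.
sample = [
    {"name": "agent.step", "attributes": {"agent.step_index": 1}},
    {"name": "agent.model_call", "attributes": {"llm.input_tokens": 1200}},
    {"name": "agent.step", "attributes": {"agent.halt_reason": "budget"}},
    {"name": "agent.step", "attributes": {"agent.halt_reason": "policy"}},
    {"name": "agent.step", "attributes": {"agent.step_index": 2}},
]
print(halt_reason_rates(sample))
```

A rising share of budget halts suggests the ceilings are too low for the task; a rising share of policy halts suggests the agent keeps reaching for actions it is not allowed to take.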

Tying it together as a CI gate

In CI, these three layers become a single guarded entrypoint. The budget is set per run, the executor enforces policy, the tracer exports, and the job exits non-zero on any guardrail trip so the pipeline fails loudly.

# run_guarded_agent.py
import sys
from guardrails.budget import TokenBudget, BudgetExceeded
from guardrails.budgeted_client import BudgetedClient
from guardrails.guarded_executor import GuardedExecutor, PolicyViolation
from guardrails.traced_agent import run_agent_step
from guardrails.tracing import _provider

# load_tools() and build_agent() are project-specific and not shown in this post.
MAX_STEPS = 24


def human_escalation(tool, args, reason) -> bool:
    # In CI there is no human, so escalations fail closed.
    print(f"::error::escalation required for {tool}: {reason}")
    return False


def main() -> int:
    budget = TokenBudget(
        max_input_tokens=400_000,
        max_output_tokens=80_000,
        max_usd=5.00,
    )
    client = BudgetedClient(budget)
    executor = GuardedExecutor(tools=load_tools(),
                               escalation_handler=human_escalation)
    agent = build_agent()

    try:
        for step in range(1, MAX_STEPS + 1):
            run_agent_step(agent, step, client, executor)
            if agent.is_done():
                break
        else:
            # Step cap hit without agent.is_done(): fail, don't report success.
            print(f"::error::halted on step limit: {MAX_STEPS}")
            return 4
    except BudgetExceeded as exc:
        print(f"::error::halted on budget: {exc}")
        return 2
    except PolicyViolation as exc:
        print(f"::error::halted on policy: {exc}")
        return 3
    finally:
        # Flush buffered spans before the process exits.
        _provider.shutdown()

    print(f"agent finished, spent ${budget.spent_usd:.2f}")
    return 0


if __name__ == "__main__":
    sys.exit(main())

In CI there’s no human to approve an escalation, so human_escalation returns False — escalations fail closed. An agent that needs to touch a protected file in an unattended run should stop, not guess.
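It is also worth checking which of the three ceilings in main() binds first. The sketch below repeats the spend formula from TokenBudget as a standalone function (spend_usd is a hypothetical helper name) and evaluates it at the token caps, using the default prices from the budget class.

```python
def spend_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float = 3.00,
    output_price_per_mtok: float = 15.00,
) -> float:
    """Same arithmetic as TokenBudget.spent_usd, as a pure function."""
    return (
        input_tokens / 1_000_000 * input_price_per_mtok
        + output_tokens / 1_000_000 * output_price_per_mtok
    )


# Spend if both token ceilings from main() are reached exactly.
at_token_caps = spend_usd(400_000, 80_000)
print(f"${at_token_caps:.2f}")  # $2.40
```

At those prices the token ceilings stop the run at roughly $2.40, well short of the $5.00 dollar ceiling. Under these assumptions the dollar limit acts as a backstop: it only becomes the binding constraint if the prices are raised or the token caps are loosened.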

Common Pitfalls

Guardrails as prompt instructions. “Stay under budget” in a system prompt is not a control. The agent must run inside enforcement code it cannot edit.

Counting tokens after the fact. If you tally usage at the end of the run, the overspend already happened. Charge per call and raise immediately.

No ESCALATE state. A binary allow/deny forces you to either over-restrict the agent or hand it dangerous capability. Most risky actions want a human, not a hard no.

Failing open on unknown tools. A policy gate that allows anything it doesn’t recognize is no gate. Default to deny.

Tracing without cost attributes. A span that records “model call happened” but not how many tokens it cost can’t answer the question you’ll actually ask.

Troubleshooting

Symptom: agent blows the budget despite the TokenBudget. Cause: a code path reaches the raw Anthropic client instead of BudgetedClient. Fix: grep for Anthropic( and messages.create — there should be exactly one, inside BudgetedClient.

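Symptom: budget exceeded under concurrency but each call looks small. Cause: parallel tool calls charging without the lock, or several in-flight calls that each passed the pre-call check before any of them had charged. Fix: confirm TokenBudget._lock wraps charge (it does by default, so check nobody bypassed it), and limit fan-out so the worst-case overshoot stays at a few calls' worth of tokens.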

Symptom: legitimate file writes get escalated and the CI run fails. Cause: the target path isn’t under _WRITABLE_PREFIXES. Fix: add the prefix if the path is genuinely safe, or restructure so the agent writes inside the allowed zone.

Symptom: no spans appear in the tracing backend. Cause: BatchSpanProcessor buffers and the process exits before flush. Fix: call _provider.shutdown() in a finally block so the buffer drains.

Symptom: agent halts on policy but the action looked safe. Cause: a _BLOCKED_SHELL regex is too broad and matched an innocent command. Fix: tighten the pattern with word boundaries and re-test against your real command corpus.
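Re-testing patterns against a command corpus can be as small as the sketch below. The patterns are copied from _BLOCKED_SHELL; the corpus is illustrative rather than real run data, and the helper names is_blocked and mismatches are hypothetical.

```python
import re

# Copied from _BLOCKED_SHELL in guardrails/policy.py.
BLOCKED_SHELL = (
    r"\brm\s+-rf\b", r"\bgit\s+push\b.*--force", r"\bcurl\b.*\|\s*sh",
    r"\bsudo\b", r"\bchmod\s+777", r":\(\)\s*\{", r"\bdd\s+if=",
)


def is_blocked(command: str) -> bool:
    return any(re.search(p, command) for p in BLOCKED_SHELL)


# (command, should_be_blocked) pairs.
CORPUS = [
    ("pytest -q tests/", False),
    ("rm -rf build/", True),
    ("rm -fr build/", True),  # same effect, different flag order
    ("git push origin main --force", True),
    ("echo 'do not use sudo here'", False),
]


def mismatches(corpus):
    """Return the corpus entries where the patterns disagree with intent."""
    return [(cmd, want) for cmd, want in corpus if is_blocked(cmd) != want]


for cmd, want in mismatches(CORPUS):
    kind = "missed" if want else "false positive"
    print(f"{kind}: {cmd}")
# missed: rm -fr build/
# false positive: echo 'do not use sudo here'
```

Both kinds of mismatch show up even in this tiny corpus, which is the usual argument for treating a regex denylist as one layer among several rather than the whole shell policy.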

Wrapping Up

Agentic DevOps is only safe to run unattended when the limits live outside the agent — a budget it can’t uncount, a policy gate it can’t argue with, traces it can’t suppress. Build those three layers before you give an agent write access to anything that matters. Next, feed the halt_reason attribute into a dashboard and tune budgets against real run data instead of guesses.
