Trust Builds Through Repetition, Not Demos
From supervised to autonomous: trust builds gradually through repetition.
Nobody goes from manual coding to fully autonomous agents overnight. Trust doesn't work that way.
In my last post, I wrote about how composed workflows compound. But you can't compose workflows until you have reliable individual workflows. And you can't have reliable workflows until you build trust through repetition.
The teams successfully deploying Continuous AI workflows all follow a similar path. They start small, build confidence through repetition, and gradually increase autonomy as they learn what works.
Here's what that actually looks like.
Phase 1: Supervised Agents
You start by running an agent locally where you can watch it work. The Continue CLI in TUI mode is built for this. You give it a task, and you watch every step.
"Fix this Sentry error" becomes a learning experience. You see what context it pulls. You see how it reasons through the problem. You see where it gets stuck. You intervene when needed.
This feels slow at first. You're tempted to just fix it yourself. But you're not optimizing for this one task. You're learning what the agent needs to succeed: better prompts, clearer rules, the right tools connected via MCP servers.
After a few runs, patterns emerge. The agent consistently handles certain types of errors well. It struggles with others. You start to see where the boundaries are.
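A lightweight way to make those patterns visible is to jot down each supervised run. Here's a minimal sketch; the file name and fields are my own assumptions, not part of any Continue tooling:

```typescript
// Minimal sketch of a supervised-run log, so patterns become visible after a few runs.
// All names here are hypothetical; adapt to however you track your own runs.
import { appendFileSync } from "node:fs";

interface SupervisedRun {
  task: string;              // e.g. "Fix Sentry error in the checkout service"
  errorType?: string;        // category you assign while watching, e.g. "flaky-test"
  intervened: boolean;       // did you have to step in?
  notes: string;             // what the agent needed: better prompt, clearer rules, a missing tool
  timestamp: string;
}

export function logRun(run: Omit<SupervisedRun, "timestamp">): void {
  const entry: SupervisedRun = { ...run, timestamp: new Date().toISOString() };
  // One JSON line per run keeps it trivial to grep or load later.
  appendFileSync("supervised-runs.jsonl", JSON.stringify(entry) + "\n");
}

// After a handful of entries, grouping by errorType shows where the agent
// succeeds consistently and where it still needs you.
```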
Phase 2: Parallel Automation
Something shifts once you trust an agent to run for five minutes without constant supervision. You realize you don't need to watch it work.
So you spin up a second agent with the Continue CLI to work on something else in parallel. Then a third. Then a fourth.
You're not coding anymore. You're orchestrating. Three agents updating documentation. Two agents fixing test failures. One agent investigating a performance regression.
This is where velocity changes. You're no longer limited by typing speed. You're limited by how well you can define tasks and review results.
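A rough sketch of what that fan-out can look like, assuming your agent CLI can run non-interactively with a prompt (the command and flag below are placeholders, not a documented interface):

```typescript
// Sketch of fanning out independent tasks to parallel agent runs.
// Assumption: your agent CLI accepts a prompt and runs non-interactively;
// "cn -p" below is a placeholder -- substitute your actual invocation.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

const tasks = [
  "Update the API reference for the /users endpoint",
  "Fix the flaky timeout in checkout.test.ts",
  "Investigate the p95 latency regression in the search service",
];

async function main() {
  const results = await Promise.allSettled(
    tasks.map((task) => run("cn", ["-p", task])) // placeholder command + flag
  );
  results.forEach((result, i) => {
    const status = result.status === "fulfilled" ? "done, ready for review" : "failed, needs a look";
    console.log(`${tasks[i]}: ${status}`);
  });
}

main();
```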
Phase 3: Event-Driven Workflows
Once you're comfortable running multiple agents in parallel, the next realization hits: you don't need to manually start these agents.
New Sentry error comes in? Trigger an agent automatically. GitHub issue gets labeled "bug"? Agent spins up. PostHog feature flag hits 80% rollout? Agent updates the docs.
This is where it becomes Continuous AI. The workflows run without you. You define the trigger, the agent does the work, you review the outcome.
We're seeing teams set up triggers for:
- Sentry errors above a threshold (>100 occurrences in an hour)
- GitHub issues with specific labels ("good first issue", "bug", "tech-debt")
- Failed CI builds on main branch
- Outdated documentation detected by tests
- Customer support tickets mentioning specific keywords
- PostHog insights showing feature adoption patterns
Start with triggers that fire 2-5 times per day. Enough to build confidence, not so much you're drowning in reviews.
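As a concrete sketch, here's roughly what the first trigger above might look like as a small webhook receiver that kicks off an agent run. The payload fields and the agent invocation are assumptions, not a specific Sentry or Continue integration:

```typescript
// Minimal sketch of an event-driven trigger: a webhook receiver that starts
// an agent run when an error alert comes in. The payload shape and the agent
// invocation are assumptions; wire it to your actual alerting and CLI.
import { createServer } from "node:http";
import { execFile } from "node:child_process";

createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    const event = JSON.parse(body);   // e.g. an error alert forwarded by your monitoring tool
    if (event.occurrences > 100) {    // hypothetical field: only act above the threshold
      const prompt = `Investigate and propose a fix for: ${event.title}`;
      execFile("cn", ["-p", prompt], () => {}); // placeholder agent invocation
    }
    res.writeHead(204).end();
  });
}).listen(3000);
```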
Phase 4: Composed Workflows
Here's where it gets interesting. You realize the output of one workflow can trigger the next workflow.
Customer feedback in Slack → doc search → drafted response → human review → posted answer.
Sentry error → investigation → GitHub issue → PR → review → merge → changelog → Slack summary.
PostHog feature at 80% rollout → update docs → update marketing copy → draft release notes → post to changelog.
You're managing a system of workflows that feed each other. The workflows handle repetition. You handle judgment calls.
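A sketch of what one of those chains might look like in code, with every step stubbed out and a human review gate in the middle. Each function stands in for a workflow you've already made reliable on its own:

```typescript
// Sketch of composing workflows: each step's output feeds the next, with a
// human review gate before anything ships. Every function below is a stub.
type Investigation = { summary: string; suspectedCause: string };

async function investigateError(errorId: string): Promise<Investigation> {
  return { summary: `Investigation of ${errorId}`, suspectedCause: "tbd" }; // stub
}
async function openGitHubIssue(report: Investigation): Promise<string> {
  return `https://github.com/example/repo/issues/1?title=${encodeURIComponent(report.summary)}`; // stub
}
async function draftFixPr(issueUrl: string): Promise<string> {
  console.log(`Drafting fix for ${issueUrl}`);
  return "https://github.com/example/repo/pull/2"; // stub: returns the PR URL
}
async function requestHumanReview(prUrl: string): Promise<boolean> {
  console.log(`Awaiting review of ${prUrl}`);
  return true; // stub: in practice, block until a person approves
}
async function postChangelogAndSlackSummary(prUrl: string): Promise<void> {
  console.log(`Published changelog + Slack summary for ${prUrl}`); // stub
}

export async function sentryToChangelog(errorId: string): Promise<void> {
  const investigation = await investigateError(errorId);
  const issueUrl = await openGitHubIssue(investigation);
  const prUrl = await draftFixPr(issueUrl);

  // The judgment call stays with a human; the workflows handle the repetition.
  if (await requestHumanReview(prUrl)) {
    await postChangelogAndSlackSummary(prUrl);
  }
}
```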
How Trust Actually Builds
The pattern we see at Continue: trust builds through repetition on low-stakes tasks.
You don't start by letting an agent refactor your auth system. You start by letting it update documentation after a code change. Or fix a linting error. Or update test fixtures.
Tasks where the worst case is "I have to fix this myself" rather than "I just shipped a security vulnerability."
As the agent succeeds on these small tasks, you give it slightly more complex ones. Fix this test failure. Implement this small feature. Update these API clients.
After a few weeks, you're letting agents handle hour-long tasks. After a few months, half-day tasks. The review gets faster because you've learned what to look for.
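One way to make that escalation explicit is a simple allowlist of task types per trust level, something like this hypothetical ladder:

```typescript
// Hypothetical allowlist making the escalation explicit: which task types an
// agent may take on at each trust level. Expand a level only after repeated
// clean runs at the level below.
const taskLadder = {
  low: ["update docs after a code change", "fix linting errors", "update test fixtures"],
  medium: ["fix a failing test", "implement a small feature", "update API clients"],
  high: ["half-day implementation tasks with a clear spec"],
} as const;

export function allowedTasks(level: keyof typeof taskLadder): readonly string[] {
  return taskLadder[level];
}
```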
Measuring Success: Intervention Rates
The key metric for Continuous AI workflows is intervention rate: how often do humans need to step in?
I've written before about how intervention rates are the new build times. Just like teams obsessed over build times in the CI/CD era, you need to track when and why you're interrupting autonomous workflows.
Different phases have different intervention rate profiles:
- 50%+ intervention rate: Still in Phase 1, needs more supervision
- 20-50% intervention rate: Phase 2, good for parallel work but not event-driven yet
- 5-20% intervention rate: Phase 3, ready for event-driven triggers
- <5% intervention rate: Phase 4, reliable enough to compose with other workflows
The goal isn't zero interventions. Some tasks require human judgment. The goal is predictable interventions where you know when and why you'll need to step in.
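Tracking this doesn't need much: count interventions over recent runs and compare against the thresholds above. A minimal sketch, with the run records left as an assumption about whatever you already log:

```typescript
// Sketch: compute the intervention rate over recent runs and map it to the
// phase thresholds above. The run records are hypothetical -- feed it whatever
// you already log about your workflow runs.
interface WorkflowRun {
  intervened: boolean;
}

export function interventionRate(runs: WorkflowRun[]): number {
  if (runs.length === 0) return 1; // no history yet: treat as fully supervised
  return runs.filter((r) => r.intervened).length / runs.length;
}

export function readiness(rate: number): string {
  if (rate >= 0.5) return "Phase 1: keep supervising";
  if (rate >= 0.2) return "Phase 2: parallel work, not event-driven yet";
  if (rate >= 0.05) return "Phase 3: ready for event-driven triggers";
  return "Phase 4: reliable enough to compose";
}

// e.g. readiness(interventionRate(lastFiftyRuns))
```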
What Slows Teams Down
The biggest mistake: trying to skip phases. Teams try to set up event-driven workflows for complex tasks immediately. The agents fail unpredictably. Trust breaks. They give up.
The successful path is boring. Pick one simple task. Run it supervised until it works. Add another. Build up reliable workflows. Then connect them.
The other mistake: not measuring intervention rates. Without tracking, you can't tell if you're improving. You end up with workflows that feel autonomous but need constant babysitting.
Where To Start
Pick one manual workflow you do at least weekly. Something concrete and well-defined:
- Resolve Sentry errors for a specific service
- Update documentation when API endpoints change
- Fix linting errors in PRs
- Triage customer support tickets to the right team
Run it with the Continue CLI in TUI mode. Watch it work. Intervene when it gets stuck. Learn what it needs to succeed.
Do this five times. By the fifth time, you'll know if this workflow can become autonomous or if it needs more human judgment than you thought.
If it works, deploy it with an event-driven trigger. If it doesn't, pick a simpler workflow and try again.
The teams building competitive advantages with Continuous AI didn't start with complex systems. They started with one workflow, made it reliable, and built from there.
That's how trust builds. That's how autonomous systems emerge.