How to run multiple AI coding agents in parallel

Run multiple AI coding agents simultaneously on different features. Learn why Docker isolation is essential for parallel execution and how Trimo orchestrates concurrent agent workflows.

To run multiple AI coding agents in parallel, you need three things: Docker isolation so agents don't conflict with each other, branch management so code changes don't collide, and a monitoring layer so you can see what every agent is doing at a glance. Without all three, parallel execution breaks down — agents overwrite each other's files, push to the same branch, or fail silently while you're watching a different terminal window.

Tools like Claude Code and Codex can run unattended on a single task. They'll read your codebase, write code, run tests, and commit results. But running multiple instances of these agents simultaneously — each working on a different feature — requires orchestration that the agents themselves don't provide. You need something to manage the containers, assign branches, relay output, and prevent resource exhaustion.

Trimo provides this orchestration out of the box. Each task runs as an independent pipeline with its own Docker container, Git branch, and prompt history. The daemon handles resource monitoring, and the dashboard shows all pipelines in a single view. But even if you roll your own setup, the principles in this guide apply.


Why run agents in parallel?

Throughput. A single autonomous coding agent takes 5-30 minutes per task, depending on complexity. If you have 5 features to build, serial execution means 25 minutes to 2.5 hours of waiting — and you're blocked on each task before you can dispatch the next one.

Parallel execution means all 5 run simultaneously. Your total wall-clock time drops from the sum of all tasks to the duration of the longest single task. Five 15-minute features finish in 15 minutes instead of 75.
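
The arithmetic above can be sketched directly — serial time is the sum of task durations, parallel time is the duration of the longest one:

```shell
# Wall-clock math from the paragraph above: five 15-minute tasks.
durations=(15 15 15 15 15)
serial=0
parallel=0
for d in "${durations[@]}"; do
  serial=$((serial + d))              # serial: tasks run back to back
  if [ "$d" -gt "$parallel" ]; then   # parallel: longest single task
    parallel=$d
  fi
done
echo "serial: ${serial}m  parallel: ${parallel}m"   # serial: 75m  parallel: 15m
```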

This changes the developer's role. Instead of writing code, you're dispatching and reviewing. You write clear prompts, dispatch agents to separate branches, and review the output as it arrives. One developer can realistically manage 3-5 concurrent agent tasks, reviewing diffs and writing corrective prompts as needed. The bottleneck shifts from typing speed to the clarity of your specifications.

This is especially powerful for tasks that are independent of each other: new API endpoints, UI components, test suites, migration scripts, documentation. Any work that doesn't require coordination between features is a candidate for parallel execution.


The orchestration challenge

Running a single agent is straightforward. Running multiple agents at the same time introduces coordination problems that compound as you add more concurrent tasks.

Branch conflicts

Every agent working on the same repository needs its own branch. If two agents commit to the same branch, you get merge conflicts, overwritten changes, or corrupted Git history. With a single agent, you can manage this manually. With five agents, you need automated branch creation, naming conventions, and push isolation.
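
A minimal sketch of automated branch naming (the `feat/` prefix and the slug rules are assumed conventions, not a fixed standard):

```shell
# Derive a deterministic, collision-free branch name from a task
# description. Hypothetical helper; adapt the prefix to your conventions.
slugify() {
  echo "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/^-//;s/-$//'
}
branch_for() {
  echo "feat/$(slugify "$1")"
}

branch_for "Add JWT auth"   # feat/add-jwt-auth
```

Each agent then runs `git checkout -b "$(branch_for "$TASK")"` before it starts, and pushes only to that branch.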

Resource contention

Each agent consumes CPU, memory, and API tokens. Claude Code typically uses 500MB-1GB of RAM per instance. Running four agents on a 16GB machine is fine. Running eight might cause out-of-memory errors, swap thrashing, or container kills — and you won't know which agent died unless you're watching all of them.

Monitoring at scale

With a single agent, you can watch its output in a terminal. With three or more running concurrently, you need a way to see status at a glance: which agent is actively working, which finished successfully, which hit an error and needs intervention. Tabbing between terminal windows doesn't scale.

Git workflow automation

Each agent needs its own branch, automatic commits (so work isn't lost if the agent crashes), automatic pushes (so the code is available for review), and eventually a pull request. Managing this manually for one agent is tedious. For five agents, it's a full-time job.


Docker isolation — the foundation

Docker containers are the non-negotiable foundation for running agents in parallel. Each agent must run in its own container. Here's what isolation gives you:

  • Isolated filesystem. Each agent has its own copy of the repository. Agent A can delete files, rewrite modules, or restructure directories without affecting Agent B's working copy. There are no file-level conflicts during execution.
  • Independent network. Agents running dev servers, test suites, or database connections don't compete for ports. Container-internal port 3000 for Agent A is completely separate from Agent B's port 3000.
  • Separate process space. If an agent's process crashes, hangs, or consumes excessive resources, it only affects its own container. Other agents continue running normally. You can kill and restart one container without touching the rest.
  • Clean environments. Each container starts from the same base image with the same dependencies. No state leaks between runs. No "it worked on the last agent's container" problems.
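
Most of the isolation properties above fall out of a few docker run flags. The function below assembles the command as a dry run so the shape is easy to inspect; the image name and resource limits are illustrative:

```shell
# Build (but don't execute) the docker run command for one isolated agent.
# Note there is no -p flag: each container's internal port 3000 stays
# private, so dev servers in different agents never collide.
agent_cmd() {  # $1 = index, $2 = branch
  echo docker run -d --name "agent-$1" \
    --memory 1g --cpus 2 \
    -e "BRANCH=$2" \
    your-agent-image:latest
}

agent_cmd 0 feat/jwt-auth
```

The `--memory` and `--cpus` caps also contain the resource-contention failure mode: a runaway agent hits its own ceiling instead of starving its neighbors.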

Without container isolation, parallel agents on the same codebase will corrupt each other's state. Two agents writing to the same src/app.ts simultaneously produces garbage. Two agents running npm install concurrently can corrupt node_modules. These aren't edge cases — they're guaranteed failures.

Trimo's local-execution Docker sandbox model was designed around this requirement. Every pipeline gets its own container built from a hardened base image that includes Git safety wrappers, automatic commit and push, and a communication channel back to the daemon.


Manual approach — Docker + shell scripts

You can build a parallel agent setup yourself. Here's what the minimum viable version looks like:

Step 1: Create a Dockerfile

Start with a base image that has your language runtime, Git, and the AI coding tool installed. The Dockerfile needs to clone your repository, install dependencies, and set up the agent.

FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y git nodejs npm ca-certificates
# Install Claude Code (or swap in your agent of choice)
RUN npm install -g @anthropic-ai/claude-code
# Clone the repository and check out the work branch at container start
# (via an entrypoint script), so one image serves every agent
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

Step 2: Write a launch script

A shell script that starts N containers, each with a different branch name and prompt:

#!/bin/bash
TASKS=("Add JWT auth" "Build product API" "Create admin layout")
BRANCHES=("feat/jwt-auth" "feat/product-api" "feat/admin-layout")

for i in "${!TASKS[@]}"; do
  docker run -d \
    --name "agent-$i" \
    -e BRANCH="${BRANCHES[$i]}" \
    -e PROMPT="${TASKS[$i]}" \
    your-agent-image:latest
done
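
The script above assumes your-agent-image knows what to do with BRANCH and PROMPT. Here is a hedged sketch of that container-side logic, wrapped in a function so it can be read without side effects — REPO_URL and the presence of the claude CLI are assumptions about what the image provides:

```shell
# Hypothetical container-side logic for your-agent-image. Expects
# REPO_URL, BRANCH, PROMPT (and the agent's API key) via docker run -e.
run_agent() {
  git clone "$REPO_URL" /work && cd /work || return 1
  git checkout -b "$BRANCH"          # per-agent branch isolation
  claude -p "$PROMPT"                # Claude Code in non-interactive mode
  git add -A                         # persist whatever was produced...
  git commit -m "agent: $PROMPT" || true
  git push -u origin "$BRANCH"       # ...and make it reviewable
}
# An entrypoint.sh would simply call run_agent.
```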

Step 3: Monitor manually

Check on each container with docker logs agent-0, docker logs agent-1, etc. Parse through terminal output to figure out if the agent is still working, finished, or errored.
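
Even the manual version is less painful with a one-shot status sweep. This sketch assumes containers are named agent-* as in the launch script:

```shell
# Print each agent container's state plus its last log line.
agent_status() {
  local name
  for name in $(docker ps -a --filter 'name=agent-' --format '{{.Names}}'); do
    printf '%s: %s\n' "$name" "$(docker inspect -f '{{.State.Status}}' "$name")"
    docker logs --tail 1 "$name" 2>&1 | sed 's/^/  /'
  done
}
```

Run it periodically (or under `watch`) instead of tabbing through terminals — but note this is still polling, which is exactly the limitation described below.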

Where this breaks down

This approach works for 2-3 agents if you're patient. But it has real limitations:

  • No real-time monitoring. You're polling logs manually. You don't know an agent errored until you check.
  • No resource awareness. If your machine runs low on memory, containers get OOM-killed with no warning and no recovery.
  • No git automation. You need to handle branch creation, commit, push, and PR creation per agent — either in the Dockerfile or in another script.
  • No continuity. If an agent finishes and you want to send a follow-up prompt ("now add input validation"), you need to start a new container, re-clone, checkout the branch, and feed the new prompt. The previous conversation context is gone.
  • No unified view. You can't see all agents' status in one place without building a dashboard yourself.

Managed approach — Trimo

Trimo is built specifically for the dispatch-and-review workflow with parallel agents. The core abstraction is the pipeline: each feature or task gets its own pipeline, and each pipeline is an independent agent execution with its own Docker container, Git branch, and prompt history.

Pipeline system

Each pipeline tracks a single line of work. You create a pipeline with a prompt, and Trimo handles the rest: container creation, branch setup, agent launch, output streaming, and commit/push. Multiple pipelines run concurrently, each fully isolated from the others.

# Dispatch three pipelines from the CLI
trimo run create --prompt "Add JWT authentication middleware"
trimo run create --prompt "Create REST API for product catalog"
trimo run create --prompt "Build admin dashboard layout with sidebar nav"

Resource awareness

The Trimo daemon monitors CPU and memory on your machine. If resources are tight, it queues new pipelines instead of launching them immediately. This prevents OOM kills and ensures running agents have enough resources to complete their work. On macOS, the daemon accounts for the difference between "truly free" and "available" memory — a distinction that trips up naive monitoring.
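
The queue-or-launch decision can be sketched as a simple headroom check. The 2 GB reserve and per-agent estimate below are illustrative numbers, not Trimo's actual policy:

```shell
# Launch only if available memory covers the new agent plus a 2 GB
# system reserve. All values in MB.
can_launch() {  # $1 = available MB, $2 = per-agent estimate MB
  [ "$1" -ge $(( $2 + 2048 )) ]
}

# Linux example: available=$(free -m | awk '/^Mem:/ {print $7}')
if can_launch 6000 1024; then echo "launch"; else echo "queue"; fi
```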

Git automation

Every pipeline gets automatic branch creation, auto-commit on file changes, and auto-push to the remote. The agent doesn't need to remember to commit or push — the infrastructure handles it. Git safety wrappers prevent destructive operations like force-push or branch deletion, so agents can't accidentally damage the repository.
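
To illustrate the safety-wrapper idea (this is not Trimo's actual implementation), a function that shadows git can refuse destructive flags before they reach the real binary:

```shell
# Coarse illustration of a git safety wrapper: block force-push and
# branch deletion by intercepting the flags.
git() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --force|--force-with-lease|-D|--delete)
        echo "git wrapper: blocked destructive flag: $arg" >&2
        return 1 ;;
    esac
  done
  command git "$@"   # anything else passes through untouched
}
```

A production wrapper would live in the container image itself (for example, a script earlier on PATH) so the agent cannot simply bypass a shell function.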

Dashboard

The Trimo dashboard shows all pipelines at a glance. Each pipeline displays its current status (working, idle, needs review, complete), the latest commit diff, streaming agent output, and controls for follow-up prompts. You can review one agent's output, write a corrective prompt, and switch to the next — all in one view.

Pipeline continuity

When an agent finishes a task and you want to iterate — "now add error handling to the auth middleware" — you send a follow-up prompt on the same pipeline. Trimo launches a new run in the same branch, with the system prompt updated to reflect the current state of the code. The agent picks up where the previous run left off. This continuity is what makes the dispatch-and-review workflow practical: you're not starting from scratch on every iteration.


A practical example

Suppose you're building a web application and have three independent features to implement. Here's how parallel execution works in practice with Trimo:

Dispatch

You create three pipelines, each with a clear, scoped prompt:

  • Pipeline 1: "Add user authentication with JWT. Create login/register endpoints, middleware for protected routes, and tests. Use bcrypt for password hashing and jsonwebtoken for tokens."
  • Pipeline 2: "Create a REST API for the product catalog. CRUD endpoints for products with pagination, filtering by category, and search by name. Include validation and tests."
  • Pipeline 3: "Build the admin dashboard layout. Sidebar navigation with collapsible sections, top header with user menu, and a responsive grid for the main content area. Use the existing design system components."

Execution

Each pipeline runs in its own Docker container on its own Git branch (trimo/pipeline-1, trimo/pipeline-2, trimo/pipeline-3). The agents work simultaneously — reading the codebase, writing code, running tests, committing changes. The dashboard shows real-time output from all three.

Review

Pipeline 3 finishes first (UI scaffolding is fast). You review the diff — the layout looks right but the sidebar needs a logout button. You send a follow-up: "Add a logout button at the bottom of the sidebar that calls POST /api/auth/logout." The pipeline launches a new run on the same branch.

Pipeline 1 finishes next. The auth implementation is solid but missing rate limiting on the login endpoint. Another follow-up prompt. Pipeline 2 is still running — the product API has more surface area.

Merge

Once each pipeline's output passes your review, you merge the branches. Because each agent worked on independent features in isolated containers, the branches merge cleanly. Total wall-clock time: about 20 minutes for work that would have taken over an hour serially.


Resource requirements

How many agents you can run in parallel depends on your hardware. Here are practical guidelines based on Claude Code's resource usage:

System RAM    Comfortable concurrency    Maximum concurrency
8 GB          1-2 agents                 2 agents
16 GB         3-4 agents                 5 agents
32 GB         6-8 agents                 10 agents
64 GB+        10+ agents                 Limited by CPU

Each Claude Code agent instance uses approximately 500MB-1GB of RAM for the Node.js process, plus additional memory for any tools it runs (compilers, test runners, dev servers). CPU usage is bursty — agents spend most of their time waiting for API responses, with brief spikes when running builds or tests.

The practical bottleneck is usually API throughput, not local resources. Each agent makes its own API calls to Claude, and those calls are rate-limited per API key. With a standard API key, 3-4 concurrent agents is the sweet spot where local resources and API rate limits align.

The Trimo daemon monitors these resources continuously. If available memory drops below a safe threshold, new pipeline launches are queued until resources free up. This prevents the worst-case scenario: an OOM kill that wipes out an agent's uncommitted work.


Frequently asked questions

How many agents can I run in parallel?

It depends on your hardware. Each Claude Code agent uses approximately 500MB-1GB of RAM. A machine with 16GB RAM can comfortably run 3-4 concurrent agents. With 32GB or more, you can run 6-8 agents simultaneously. Trimo's free tier includes 2 parallel pipelines, with higher tiers supporting more. The Trimo daemon also monitors resources in real time and queues new runs if your machine is overloaded, so you won't accidentally OOM-kill running agents.

Do parallel agents conflict with each other?

Not when using Docker isolation. Each agent runs in its own container with its own filesystem, Git branch, and process space. They cannot see or modify each other's work. Without isolation — for example, running multiple agents in separate terminal windows on the same checkout — you will get file conflicts, corrupted node_modules, and Git merge disasters. Docker containers eliminate this entire class of problems.

What agents does Trimo support for parallel execution?

Trimo currently supports Claude Code for parallel agent execution. Codex support is on the roadmap. The architecture is agent-agnostic — the Docker isolation, branch management, and monitoring layer work independently of which agent runs inside the container. The base image can be extended to support any agent that runs inside a Docker container.

Can I run parallel agents without Trimo?

Yes. You can set up Docker containers manually, write shell scripts to manage branches, and monitor agents via terminal windows or docker logs. This works for 2-3 agents but becomes difficult to manage at scale. You lose real-time monitoring, Git workflow automation, resource awareness, and pipeline continuity. If you're evaluating whether to build or buy, the manual approach in this article gives you a starting point.

Does parallel execution use more API tokens?

Yes. Each agent consumes tokens independently — reading files, reasoning about code, and generating output all cost tokens. Running 3 agents in parallel uses roughly 3x the tokens of a single agent. The benefit is wall-clock time: 3x the tokens but 1/3 the wait. Trimo never charges for or marks up API tokens. You use your own Anthropic API key directly, and you pay Anthropic's standard rates.