NAX: Multi-Model Agent Workflows on Netlify

June 29, 2026 | Tags: ai, agents, automation, dev-tools

One agent is useful.

Three agents arguing with each other is more useful.

NAX, the Netlify Agent Executor, is a CLI and workflow layer for Netlify Agent Runners: define a repeatable flow, submit the agent work to Netlify, wait for results, save the artifacts, and hand the final output back to a person.

The code is public at github.com/netlify-labs/nax.

The docs live at netlify-agent-executor.netlify.app.

I already wrote about using Netlify Agent Runners to launch Codex, Claude, and Gemini from GitHub issues. That flow is the remote control.

NAX is the playbook.

Instead of asking one model to do one job, NAX lets me define a workflow like:

Ask Claude, Gemini, and Codex to independently review the same codebase change.
Feed each model the other models' findings.
Ask them to challenge, confirm, or reject the findings.
Summarize only the issues with enough agreement or enough evidence.
Save the handoff artifact so a human can decide what happens next.
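
The "enough agreement or enough evidence" filter in step four can be sketched in a few lines. This is a minimal illustration, not NAX's implementation: the `Finding` shape, the `keep_for_summary` name, and both thresholds are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One reviewed issue. Hypothetical shape, not NAX's artifact schema."""
    title: str
    raised_by: str
    confirmed_by: set = field(default_factory=set)
    rejected_by: set = field(default_factory=set)
    evidence: list = field(default_factory=list)


def keep_for_summary(finding, min_support=2, min_evidence=1):
    """Keep a finding with enough agreement OR enough concrete evidence."""
    # Supporters: the model that raised it plus any confirmers,
    # minus any model that rejected it during cross-review.
    supporters = ({finding.raised_by} | finding.confirmed_by) - finding.rejected_by
    return len(supporters) >= min_support or len(finding.evidence) >= min_evidence


findings = [
    Finding("Unvalidated webhook signature", "claude", confirmed_by={"gemini", "codex"}),
    Finding("Possible N+1 query", "gemini", evidence=["loop issues one query per row"]),
    Finding("Rename helper variable", "codex", rejected_by={"claude", "gemini"}),
]
kept = [f.title for f in findings if keep_for_summary(f)]
```

A finding raised by one model survives if it carries evidence, which is the difference between this and a plain majority vote.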

That pattern matters because a lot of agent work fails at the same boring point: the first answer sounds plausible, but nobody cross-examines it.

The council pattern

I think of NAX as a council of models.

The value comes from difference, not magic.

Codex, Claude, and Gemini have different strengths, different blind spots, and different failure modes. When they independently inspect the same problem, the overlap is useful. When one model finds something the others missed, the disagreement is useful too.

The workflow is not "vote and trust the majority." That is too shallow.

The useful loop is:

flowchart TD
  Input[Task, repo, PR, or issue] --> Fanout[Run selected models independently]
  Fanout --> Claude[Claude pass]
  Fanout --> Gemini[Gemini pass]
  Fanout --> Codex[Codex pass]
  Claude --> CrossCheck[Cross-review findings]
  Gemini --> CrossCheck
  Codex --> CrossCheck
  CrossCheck --> Consensus[Consensus summary]
  Consensus --> Artifact[Saved artifacts in .nax]
  Consensus --> Human[Human review or approval]
  Human --> PR[Optional PR or follow-up run]

The point is to preserve independent judgment before the models influence each other.

If you show every model the same prior answer too early, you get agreement theater. They converge around the first confident response. NAX keeps the first pass separate, then uses later steps for cross-review.
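
Here is a small sketch of that ordering: an isolated first pass, then a cross-review where each model only sees the other models' output. `run_agent` is a local stand-in that returns a canned string, and `council` is a hypothetical name; neither is part of NAX or the Netlify API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(model, prompt):
    """Stand-in for a remote agent call; returns a canned string."""
    return f"[{model}] findings for: {prompt}"


def council(task, models):
    # Pass 1: every model sees only the task, never another model's answer.
    with ThreadPoolExecutor() as pool:
        first = dict(zip(models, pool.map(lambda m: run_agent(m, task), models)))

    # Pass 2: each model receives the OTHER models' first-pass output.
    cross = {}
    for model in models:
        others = "\n".join(text for name, text in first.items() if name != model)
        cross[model] = run_agent(model, f"Challenge, confirm, or reject:\n{others}")

    return {"review": first, "cross_check": cross}


result = council("review PR", ["claude", "gemini", "codex"])
```

The first-pass dictionary is complete before any cross-review prompt is built, so no model can anchor on another model's answer during its own review.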

What NAX runs

NAX workflows are directories with a flow.yml file and prompt files. The flow defines the steps, agents, submit mode, prior-step inputs, and wait behavior.

The bundled review workflow has roughly this shape:

id: review
title: Review
description: Review, cross-review, and synthesize findings with multiple Netlify agents.

defaults:
  transport: auto
  agents: [claude, gemini, codex]

steps:
  - id: review
    prompt: prompts/1_review.md
    action: issue
    submit: new-run
    agents: [claude, gemini, codex]
    waitFor: agent-results

  - id: cross-check
    prompt: prompts/2_cross-review.md
    action: comment
    submit: follow-up
    agents: [claude, gemini, codex]
    input:
      - step: review
        results: all
    waitFor: agent-results

  - id: synthesize
    prompt: prompts/3_summarize-consensus.md
    action: issue
    submit: new-run
    agents: [codex]
    input:
      - step: review
        results: all
      - step: cross-check
        results: all
    waitFor: agent-results

The prompt files can change, but the process stays stable.

You can swap the task from code review to security review, performance review, documentation work, test generation, or idea ranking without rebuilding the orchestration from scratch.
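
To make the prior-step inputs concrete, here is a rough sketch of how a runner could walk the steps in order and hand each one the results it asks for. The step list mirrors the ids from the review flow, but `run_flow`, the `run_step` callback, and the flattened `input` lists are assumptions for illustration, not NAX internals.

```python
FLOW_STEPS = [
    {"id": "review", "agents": ["claude", "gemini", "codex"], "input": []},
    {"id": "cross-check", "agents": ["claude", "gemini", "codex"], "input": ["review"]},
    {"id": "synthesize", "agents": ["codex"], "input": ["review", "cross-check"]},
]


def run_flow(steps, run_step):
    """Run steps in order, passing each one the prior results it declares."""
    results = {}
    for step in steps:
        prior = {}
        for ref in step["input"]:
            if ref not in results:
                # An input must name a step that has already finished.
                raise KeyError(f"{step['id']} wants results from unknown step {ref!r}")
            prior[ref] = results[ref]
        results[step["id"]] = {
            agent: run_step(step["id"], agent, prior) for agent in step["agents"]
        }
    return results


def echo_step(step_id, agent, prior):
    # Stub agent: reports which prior steps it was given.
    return f"{agent} ran {step_id} with {sorted(prior)}"


outputs = run_flow(FLOW_STEPS, echo_step)
```

Swapping the prompts changes what each step asks for; the loop that threads results forward stays the same.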

Why run this

NAX is useful because it turns agent work from a one-off prompt into a repeatable workflow.

The practical win is structure around work I already ask agents to do:

Run the same review playbook more than once.
Keep the independent model passes separate before cross-review.
Save the intermediate prompts and outputs as artifacts.
Make model choice explicit per run.
Keep approval explicit before code changes or merge decisions.

I care about the reusable process and the evidence I can inspect later.

Why use a remote agent service

The CLI can run locally, and CI/CD can trigger a workflow, but the agent job itself runs remotely in Netlify Agent Runners.

Netlify Agent Runners provide that execution environment. The agent has project context and can do real work against the repo without me keeping a terminal open.

NAX has two transport paths:

netlify-api, where my machine or the local dashboard orchestrates the workflow and submits Agent Runner jobs through the Netlify API.
github-actions, where GitHub Actions hosts the orchestration and uploads the .nax artifacts.

Both paths send the agent work to Netlify. The difference is where orchestration, logs, resume state, and artifact upload live.

CLI first

The CLI is the fastest way to start:

nax run review

Or with a narrower model set when I want to save tokens:

nax run review --models claude,codex

For local orchestration and dashboard event streaming:

nax run review --transport netlify-api

Model toggling matters.

Sometimes I want the full council. Sometimes two models are enough for a cheap confirmation pass. Model choice belongs at runtime, not only in the workflow file.
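
One way to picture a runtime override meeting per-step agents is below. This is a sketch only: `effective_agents`, the `KNOWN_MODELS` set, and the fallback rule for single-agent steps are assumptions for illustration, and NAX may resolve `--models` differently.

```python
KNOWN_MODELS = {"claude", "gemini", "codex"}


def effective_agents(step_agents, override=None):
    """Narrow a step's agents to the models requested at runtime."""
    if not override:
        return list(step_agents)

    unknown = [m for m in override if m not in KNOWN_MODELS]
    if unknown:
        raise ValueError(f"unknown models: {unknown}")

    narrowed = [a for a in step_agents if a in override]
    # Assumed fallback: if the override would leave a step with no agent
    # (for example a codex-only synthesis step), keep the step's own agents.
    return narrowed or list(step_agents)


full = effective_agents(["claude", "gemini", "codex"])
cheap = effective_agents(["claude", "gemini", "codex"], ["claude", "codex"])
synth = effective_agents(["codex"], ["claude", "gemini"])
```

The workflow file keeps the default council, and the override only trims it for one run.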

GitHub as the trigger surface

The GitHub integration is where this becomes practical for a team.

Use the NAX GitHub Action when a named workflow should run from a PR, a manual workflow dispatch, or another GitHub Actions trigger:

name: Run NAX

on:
  workflow_dispatch:
    inputs:
      flow:
        default: review
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  nax:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: netlify-labs/nax@v1.0.2
        with:
          flow: ${{ github.event.inputs.flow || 'review' }}
          repo: ${{ github.repository }}
          branch: ${{ github.head_ref || github.ref_name }}
          netlify-auth-token: ${{ secrets.NETLIFY_AUTH_TOKEN }}
          netlify-site-id: ${{ secrets.NETLIFY_SITE_ID }}
          github-token: ${{ github.token }}

The action installs netlify-agent-executor, runs nax run <flow> --transport netlify-api, prints the latest summary, and can upload .nax artifacts.

Direct @netlify comments are a separate path through netlify-labs/agent-runner-action. I use those for ad hoc single Agent Runner sessions, not for named NAX workflows.

The NAX Action flow is straightforward:

A PR opens, a PR updates, or a human starts workflow_dispatch.
GitHub Actions starts the selected NAX workflow.
NAX submits each configured step to Netlify Agent Runners.
The workflow summary and Agent Runner artifacts are saved under .nax.
A follow-up GitHub Actions step can post the summary back to a PR, issue, Slack channel, or release note.

Human review still owns the merge decision, but the expensive review and synthesis work happens before a person has to read anything.

The dashboard

The CLI is for running workflows.

The dashboard is for inspecting them.

nax dashboard review

The dashboard is local. It opens from the CLI and gives me a browser workbench for workflow graphs, dry runs, live events, run details, and follow-up handoffs.

The dashboard gives me:

Workflow list and graph canvas.
Dry runs before creating Agent Runner jobs.
Recent runs from .nax/workflows.
Run details with prompts, outputs, and artifacts.
Follow-up controls for sending selected artifacts to the next agent.

A dry run catches obvious mistakes before spending a real run: missing prompt files, wrong step inputs, bad model overrides, or a synthesis step that has no prior results.
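
Those checks are cheap to express. The sketch below covers three of them (missing prompt files, unknown models, and inputs that point at a step that has not run yet). `dry_run`, the `prompt_exists` hook, and the message wording are hypothetical, not the dashboard's actual validation code.

```python
from pathlib import Path

KNOWN_MODELS = {"claude", "gemini", "codex"}


def dry_run(steps, prompt_exists=lambda p: Path(p).is_file()):
    """Return a list of problems found without submitting any agent job."""
    problems = []
    seen = set()
    for step in steps:
        sid = step["id"]
        if not prompt_exists(step["prompt"]):
            problems.append(f"{sid}: missing prompt file {step['prompt']}")
        for agent in step.get("agents", []):
            if agent not in KNOWN_MODELS:
                problems.append(f"{sid}: unknown model {agent}")
        for ref in step.get("input", []):
            # Inputs may only reference steps that appear earlier in the flow.
            if ref["step"] not in seen:
                problems.append(f"{sid}: input references unknown step {ref['step']}")
        seen.add(sid)
    return problems


steps = [
    {"id": "review", "prompt": "prompts/1_review.md", "agents": ["claude"]},
    {
        "id": "synthesize",
        "prompt": "prompts/3_missing.md",
        "agents": ["codex", "gpt"],
        "input": [{"step": "cross-review", "results": "all"}],
    },
]
problems = dry_run(steps, prompt_exists=lambda p: p == "prompts/1_review.md")
```

An empty list means the flow is at least structurally sound; anything else is a problem worth fixing before paying for a run.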

Artifacts are the product

NAX saves workflow output into .nax.

That directory is the audit trail.

For each run, I want to keep:

The workflow config.
The normalized task input.
Every model prompt.
Every model response.
Cross-review results.
Final summaries.
Agent Runner and session summaries.
Token and cost metadata when available.

The common handoff paths are:

.nax/workflows/{workflow-run-id}/artifacts/summary.md
.nax/agent-runners/{runner-id}/summary.md
.nax/agent-sessions/{session-id}/summary.md
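
If a follow-up script needs those locations, small path helpers are enough. The directory layout comes from the paths above; the helper names and the example ids are made up for illustration.

```python
from pathlib import PurePosixPath

NAX_ROOT = PurePosixPath(".nax")


def workflow_summary(run_id):
    """Summary for a whole workflow run."""
    return NAX_ROOT / "workflows" / run_id / "artifacts" / "summary.md"


def runner_summary(runner_id):
    """Summary for a single Agent Runner."""
    return NAX_ROOT / "agent-runners" / runner_id / "summary.md"


def session_summary(session_id):
    """Summary for a single agent session."""
    return NAX_ROOT / "agent-sessions" / session_id / "summary.md"


# Example ids are placeholders, not real NAX identifiers.
paths = [
    str(workflow_summary("run-123")),
    str(runner_summary("runner-abc")),
    str(session_summary("session-xyz")),
]
```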

Agent workflows are easier to trust when the intermediate outputs are inspectable. If the final summary says all three models agreed, I want the evidence. If one model disagreed, I want to know why.

The artifact trail also makes workflows improvable. You can compare runs, tune prompts, remove useless steps, and identify which model combinations actually help for a given task.

Bundled workflows

NAX ships with bundled workflows for common repo work:

Code review.
Security review.
Performance review.
Analytics and SEO audits.
Accessibility and mobile responsiveness audits.
Documentation improvements.
Unit and E2E test generation.
Error handling.
UX copy polish.
Idea ranking and next-task selection.

Where NAX is useful

I use it first for code review, but repeatability is the real win.

One-off prompts are fine until you find yourself pasting the same instructions into three tools every week. NAX turns the useful prompt into a reusable workflow.

The workflows I keep reaching for:

Code review: independently inspect a diff, cross-check findings, and post the final review.
Security review: focus on auth, billing, webhooks, data exposure, dependency risk, and deployment configuration.
Performance review: find likely bottlenecks, measurement gaps, and safe optimization targets.
Documentation: compare docs to the codebase, synthesize a focused update plan, and let Codex implement the chosen slice.
Tests: identify missing unit or E2E coverage, synthesize a first test plan, and implement the safest slice.
Do next: ask multiple models what to work on next, then synthesize one ranked recommendation.

The common thread is structured disagreement before action.

The real lesson

The first wave of coding agents made it easy to ask one model to do one thing.

The next useful layer is orchestration:

Which model should run?
In what order?
With what context?
What artifacts should be passed forward?
What should be checked by another model?
What requires human approval?
What should be saved so the workflow can improve?

NAX gives me a repeatable way to run agentic workflows where Claude, Gemini, and Codex can do independent work, challenge each other, produce an inspectable artifact trail, and hand the final decision back to a person.

For serious AI-assisted development, I want fast agents, explicit workflows, preserved evidence, and a human still holding the merge button.

David Wells

Builder of things

Software architect & product wrangler