mech.app
Dev Tools

GitHub Copilot CLI Custom Agents: How Terminal Workflows Replace One-Off Prompts

Custom agents in GitHub Copilot CLI encode stack context and team conventions into repeatable terminal workflows with auditable command sequences.

Source: github.blog

GitHub Copilot CLI now supports custom agents that turn ad-hoc terminal commands into repeatable, reviewable workflows. Instead of prompting “deploy staging” and hoping the LLM remembers your stack, you define an agent that knows your Kubernetes namespace, your Terraform modules, and your team’s approval gates. The agent persists those decisions and generates auditable command sequences every time.

This is the infrastructure pattern that separates one-shot prompt tools from workflow automation. Custom agents encode context once and replay it with guardrails.

What Custom Agents Actually Do

A custom agent in GitHub Copilot CLI is a named workflow definition that combines:

  • Stack context: Language versions, framework conventions, dependency managers.
  • Project structure: Monorepo layout, service boundaries, config file locations.
  • Team conventions: Branch naming, commit message format, deployment approval steps.
  • Command templates: Parameterized shell sequences with validation hooks.

When you invoke a custom agent, it generates a command plan before execution. You review the plan, approve it, and the CLI runs the sequence. Each step logs its output, so you can trace failures back to the specific command that broke.

The agent does not run in a sandbox by default. It generates shell commands that execute in your current working directory with your current user permissions. The safety boundary is the approval gate, not process isolation.

Persistence and Storage Model

GitHub stores custom agent definitions in three places, depending on scope:

Storage Location                 | Scope        | Use Case
.github/copilot/agents/          | Repository   | Team workflows that ship with the codebase
~/.config/github-copilot/agents/ | User         | Personal shortcuts and local tooling
GitHub Cloud Registry            | Organization | Shared agents across repos, versioned centrally

Repository-scoped agents travel with the code. When you clone a repo, you inherit its custom agents. This makes onboarding faster because new engineers get the same deployment commands, test runners, and debugging workflows as the rest of the team.

User-scoped agents stay local. You can define a quick-deploy agent that skips approval gates for your personal dev environment without forcing that shortcut on your teammates.

Organization-scoped agents live in GitHub’s cloud registry. They require authentication and support versioning. If your security team publishes a secure-deploy agent that enforces compliance checks, every repo in the org can reference it by name and version.
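
With three scopes, the same agent name can exist in more than one place, so some lookup order has to apply. The source does not state one. The sketch below assumes repository scope wins over user scope and leaves the organization registry out because it needs a network call. The function name resolve_agent and the .yml extension are illustrative, not the CLI's documented behavior.

```python
from pathlib import Path
from typing import Optional

# Directory names come from the storage table; the lookup order is an assumption.
REPO_AGENTS = Path(".github/copilot/agents")
USER_AGENTS = Path(".config/github-copilot/agents")


def resolve_agent(name: str, repo_root: Path, home: Path) -> Optional[Path]:
    """Return the first matching definition, checking repository scope before user scope."""
    candidates = [
        repo_root / REPO_AGENTS / f"{name}.yml",
        home / USER_AGENTS / f"{name}.yml",
    ]
    for path in candidates:
        if path.is_file():
            return path
    # No local definition found; a real CLI might query the organization registry here.
    return None
```

Putting repository scope first keeps team-reviewed definitions from being silently shadowed by a personal shortcut with the same name. The opposite order would be just as easy to implement.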

Context Indexing and Retrieval

Custom agents access project context without re-scanning the filesystem on every invocation. GitHub Copilot CLI maintains a local index that tracks:

  • Dependency manifests (package.json, requirements.txt, go.mod).
  • CI/CD config files (.github/workflows/, .gitlab-ci.yml, Jenkinsfile).
  • Infrastructure-as-code definitions (terraform/, k8s/, docker-compose.yml).
  • Service discovery metadata (Consul, etcd, service mesh configs).

The index updates when you run gh copilot sync or when the CLI detects file changes in watched directories. The sync operation is incremental. It diffs the filesystem against the last known state and updates only the changed entries.
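
An incremental sync of this kind can be pictured as comparing two snapshots of the watched files. The sketch below is one way to do that with content hashes. It does not reproduce the CLI's actual index format, and the helper names snapshot and diff are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def snapshot(root: Path, patterns: List[str]) -> Dict[str, str]:
    """Map each watched file (path relative to root) to a SHA-256 digest of its contents."""
    state = {}
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file():
                rel = path.relative_to(root).as_posix()
                state[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return state


def diff(old: Dict[str, str], new: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (changed_or_added, removed) paths between two snapshots."""
    changed = sorted(p for p, digest in new.items() if old.get(p) != digest)
    removed = sorted(p for p in old if p not in new)
    return changed, removed
```

Only the paths returned by diff would need re-parsing, which is what keeps a sync cheap on a large repository.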

Agents query this index using a structured API. An agent definition includes a context block that specifies which files and metadata it needs. The CLI resolves those queries against the index and injects the results into the agent’s prompt context.

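name: deploy-staging
# Files and metadata the CLI resolves from the local index before generating commands.
context:
  - type: file
    path: terraform/staging/main.tf
  - type: dependency
    manifest: package.json
  - type: service
    name: api-gateway
# -chdir replaces the positional directory argument, which Terraform removed in 0.15.
# The plan file is written to, and read from, terraform/staging.
commands:
  - validate: terraform -chdir=terraform/staging validate
  - plan: terraform -chdir=terraform/staging plan -out=staging.tfplan
  - apply: terraform -chdir=terraform/staging apply staging.tfplan
# Block execution until the user approves the generated plan.
approval: required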

The context block tells the CLI to load the Terraform config, parse the Node.js dependencies, and look up the api-gateway service definition before generating commands. Command templates can reference these values through variable interpolation, although the example above uses literal paths and does not show the syntax.
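
The source does not show what interpolation looks like, so the following is a sketch under an assumed ${name} placeholder syntax rather than the CLI's own. It uses Python's string.Template and shell-quotes every value, so a context value cannot smuggle extra shell syntax into the command. The function name render and the keys tf_dir and env are hypothetical.

```python
import shlex
from string import Template


def render(template: str, context: dict) -> str:
    """Substitute ${name} placeholders, shell-quoting every value first."""
    quoted = {key: shlex.quote(str(value)) for key, value in context.items()}
    # substitute() raises KeyError if the template names a key the context lacks.
    return Template(template).substitute(quoted)


command = render(
    "terraform -chdir=${tf_dir} plan -out=${env}.tfplan",
    {"tf_dir": "terraform/staging", "env": "staging"},
)
```

Failing on a missing key matches the context resolution failure described later in the article: it is safer to abort than to run a command with an empty argument.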

Execution Flow and Approval Gates

When you invoke a custom agent, the CLI follows this sequence:

  1. Load agent definition: Resolve the agent name to a YAML definition in one of the three storage locations.
  2. Fetch context: Query the local index for the files and metadata specified in the context block.
  3. Generate command plan: Send the agent definition and context to the Copilot LLM, which returns a sequence of shell commands.
  4. Display plan: Print the command sequence to the terminal with syntax highlighting and risk annotations.
  5. Wait for approval: If the definition sets approval: required, block until the user types yes or y.
  6. Execute commands: Run each command in order, streaming stdout and stderr to the terminal.
  7. Log results: Write the full command history and output to ~/.config/github-copilot/logs/.

The approval gate is the only safety boundary. If you skip approval by setting approval: optional, the agent runs immediately. This is useful for read-only agents (log queries, status checks) but dangerous for agents that mutate state.

Commands run with your shell’s environment variables, PATH, and file permissions. The CLI does not validate command safety at runtime, so a destructive command will execute once it is approved. The LLM is trained to avoid destructive patterns, but that is a behavioral guardrail, not a technical control.
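
Steps 4 through 6 of the sequence above can be sketched in a few lines, which also makes it plain how thin the approval gate is. This is an illustration under stated assumptions, not the CLI's implementation: run_plan and its parameters are hypothetical, and the plan is passed in as a list of strings rather than produced by an LLM.

```python
import subprocess
from typing import Callable, List, Tuple


def run_plan(
    commands: List[str],
    approval_required: bool,
    ask: Callable[[str], str] = input,
) -> List[Tuple[str, int]]:
    """Show the plan, wait for approval if required, then run commands in order."""
    for i, cmd in enumerate(commands, start=1):
        print(f"{i}. {cmd}")
    if approval_required:
        answer = ask("Run this plan? [y/N] ").strip().lower()
        if answer not in {"y", "yes"}:
            return []  # nothing was executed
    results = []
    for cmd in commands:
        # shell=True runs the command unsandboxed, with the caller's own permissions.
        code = subprocess.run(cmd, shell=True).returncode
        results.append((cmd, code))
        if code != 0:
            break  # stop the workflow at the first failing command
    return results
```

Nothing between the approval check and subprocess.run inspects the command itself, which is the gap the article describes.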

Failure Modes and Observability

Custom agents fail in predictable ways:

  • Context resolution failure: The agent references a file or service that does not exist. The CLI aborts before generating commands.
  • LLM timeout: The Copilot API does not respond within 30 seconds. The CLI retries once, then fails.
  • Command execution failure: A shell command exits with a non-zero status. The CLI stops the workflow and logs the error.
  • Approval timeout: The user does not respond to the approval prompt within 5 minutes. The CLI cancels the workflow.
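
The LLM timeout rule in the list above (one retry, then fail) is a small bounded-retry loop. A minimal sketch, assuming the request is any callable that raises TimeoutError when the deadline passes. The name call_with_retry is hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(request: Callable[[float], T], timeout_s: float = 30.0, retries: int = 1) -> T:
    """Call request(timeout_s); on TimeoutError retry up to `retries` times, then re-raise."""
    attempt = 0
    while True:
        try:
            return request(timeout_s)
        except TimeoutError:
            if attempt >= retries:
                raise  # retries exhausted: surface the failure to the caller
            attempt += 1
```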

Each failure writes a structured log entry to ~/.config/github-copilot/logs/. The log includes the agent name, the command that failed, the exit code, and the full stderr output. You can replay failed workflows by running gh copilot replay <log-id>.
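
To make the log entry concrete, here is one plausible layout: one JSON object per line, carrying the four fields named above plus an id and a timestamp. The file name failures.jsonl, the field names, and both helper functions are assumptions for illustration. The source does not document the real log format.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def write_failure(log_dir: Path, agent: str, command: str, exit_code: int, stderr: str) -> str:
    """Append one JSON object describing the failure and return its id."""
    entry = {
        "id": uuid.uuid4().hex,
        "time": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "command": command,
        "exit_code": exit_code,
        "stderr": stderr,
    }
    log_dir.mkdir(parents=True, exist_ok=True)
    with (log_dir / "failures.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry["id"]


def find_entry(log_dir: Path, log_id: str) -> dict:
    """Look up a logged failure by id, the lookup a replay step would need."""
    with (log_dir / "failures.jsonl").open(encoding="utf-8") as fh:
        for line in fh:
            entry = json.loads(line)
            if entry["id"] == log_id:
                return entry
    raise KeyError(log_id)
```

An append-only, line-oriented file is easy to grep during an incident and cheap to ship to a log collector later.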

Observability is local by default. GitHub does not collect telemetry on custom agent usage unless you opt in. If you enable telemetry, the CLI sends anonymized metrics (agent name, command count, success rate) to GitHub’s analytics pipeline. It does not send command arguments or output.

Security Boundaries and Trust Model

Custom agents inherit the security posture of the GitHub Copilot CLI. The CLI authenticates to GitHub using OAuth and stores tokens in the system keychain. Agents run with the same permissions as the user who invoked them.

Repository-scoped agents are a supply chain risk. If an attacker compromises a repo and modifies .github/copilot/agents/deploy.yml, they can inject malicious commands into your deployment workflow. The CLI does not verify agent definitions against a signature or checksum.
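
Because the CLI does not check definitions itself, a team could add its own check outside the CLI, for example in a pre-invoke wrapper that compares the file against a digest pinned somewhere the repository cannot overwrite. The sketch below shows only the comparison. The function name verify_agent and the idea of a pinned digest are assumptions, not a Copilot feature.

```python
import hashlib
import hmac
from pathlib import Path


def verify_agent(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's SHA-256 digest matches the pinned value."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(digest, expected_sha256.lower())
```

A digest check detects tampering but says nothing about who wrote the original file, so it complements review rather than replacing it.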

Organization-scoped agents mitigate this risk by centralizing control. Your security team can publish signed agents to the cloud registry and enforce a policy that blocks local agent definitions. This requires GitHub Enterprise and the Copilot Business plan.

User-scoped agents are a privilege escalation risk. If an agent runs sudo commands without approval gates, it can bypass OS-level access controls. The CLI does not restrict which commands an agent can generate.

Technical Verdict

Use GitHub Copilot CLI custom agents if your team runs the same multi-step workflows more than five times per week, your team has three or more engineers who need consistent tooling, or you need auditable command history for compliance or post-incident review. The local index and approval gates make sense when workflows depend on project-specific context that changes frequently (service names, environment variables, feature flags).

Avoid custom agents if you need deterministic execution and cannot tolerate LLM variability in command generation. Skip them if your workflows are simple enough to script in Bash or Make without the overhead of context indexing. Do not use them in ephemeral CI/CD environments where the CLI cannot maintain a persistent local index, or in security-critical operations unless you can enforce organization-scoped agents with centralized review and signing. If your security policy prohibits unapproved command execution or requires sandboxed environments, the approval gate alone is not sufficient protection.

The real value is in codifying team conventions so new engineers do not have to ask “how do I deploy this?” every time. The trust model assumes the LLM will not generate destructive commands, but the CLI does not enforce this at runtime. You are trading control for convenience.