
Copilot Guardian

The CI Healer That Knows When AI Is Wrong

When GitHub Actions fails, Guardian analyzes logs with multi-hypothesis reasoning, proposes risk-stratified patches, and blocks unsafe AI output before it touches your code.

Production-ready | 90-second setup | Full audit trail

License: MIT | CI | Version: v0.2.7 | npm
Copilot CLI Challenge | TypeScript | MCP | Real Output Examples

Challenge Submission | Demo | Why This Matters | Real Evidence | 90s Test | Quick Start | Docs


Quick Test (90 seconds):

npx copilot-guardian@latest run --repo YOUR/REPO --last-failed --fast

Why This Is a Copilot CLI Challenge Submission

Guardian demonstrates five advanced Copilot usage patterns under real CI failures:

  1. Multi-hypothesis reasoning: generate three competing theories with explicit confidence and evidence.
  2. Risk-stratified generation: synthesize Conservative/Balanced/Aggressive patches at different risk levels.
  3. Deterministic fail-closed guardrails: override AI decisions when bypass patterns or malformed outputs are detected.
  4. MCP-enriched context: use Model Context Protocol for deeper repository context.
  5. Complete artifact trail: export analysis.json, raw Copilot responses, and patch index for auditability.
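Pattern 1 above (multi-hypothesis reasoning) can be pictured as a small data shape plus a deterministic ranking rule. The TypeScript sketch below is illustrative only: the `Hypothesis` interface and `selectHypothesis` helper are assumptions for exposition, not Guardian's actual types or API.

```typescript
// Illustrative shape for a competing failure theory (not Guardian's real types).
interface Hypothesis {
  id: string;
  theory: string;     // what might have broken
  confidence: number; // 0..1, model-reported
  evidence: string[]; // log lines supporting the theory
}

// Pick the leading hypothesis: highest confidence wins,
// ties broken by the amount of supporting evidence.
function selectHypothesis(candidates: Hypothesis[]): Hypothesis {
  return [...candidates].sort(
    (a, b) =>
      b.confidence - a.confidence || b.evidence.length - a.evidence.length
  )[0];
}
```

The point of keeping the ranking rule deterministic is that the model only proposes theories; the selection among them is reproducible and auditable.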

Runtime clarification:

  • Guardian is a terminal CLI tool built with @github/copilot-sdk as the primary runtime.
  • gh copilot CLI is available as an optional fallback for local reproducibility.
  • Copilot interactions are logged to .raw.txt files for traceability.

Real proof:

  • Conservative patch was auto-blocked because Copilot returned malformed JSON.
  • See Real Output Showcase.

Demo (GIF)

Judge Quick Test Demo

Runtime: 3m43s | Profile: --fast --max-log-chars 20000

What happens in this run:

  1. Analyze a real CI failure with multi-hypothesis reasoning.
  2. Generate three strategies: Conservative, Balanced, Aggressive.
  3. Run independent quality review for each strategy.
  4. Block malformed AI output with fail-closed guardrails.
  5. Export raw artifacts for full auditability.

Browse the exact files from this demo:


Why This Matters

The problem:

  • AI-assisted CI fixes can silently introduce security bypasses or malformed outputs.
  • A green-looking suggestion is not proof of safe behavior.
  • Most tools generate one patch and hope it works.

Guardian's approach:

  • Generate multiple hypotheses and patch strategies instead of one guess.
  • Validate every candidate with deterministic controls.
  • Fail closed (NO_GO) when output is malformed or risky.

Real case from the demo above:

  1. Copilot returned malformed quality JSON for Conservative review.
  2. Deterministic guard detected parse failure.
  3. Guardian blocked with NO_GO verdict.
  4. Broken output was never auto-applied.

This is fail-closed engineering: trust AI, but verify with hard rules.
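The fail-closed rule can be sketched in a few lines of TypeScript. This is a minimal illustration of the principle, assuming a hypothetical `failClosedReview` helper; it is not Guardian's actual implementation.

```typescript
type Verdict = "GO" | "NO_GO";

// Fail-closed sketch: if the model's quality-review JSON does not parse,
// or carries an unknown verdict, the patch is blocked, never passed through.
function failClosedReview(rawModelOutput: string): { verdict: Verdict; reasons: string[] } {
  try {
    const review = JSON.parse(rawModelOutput);
    if (review.verdict !== "GO" && review.verdict !== "NO_GO") {
      return { verdict: "NO_GO", reasons: ["Unknown verdict in model output"] };
    }
    return { verdict: review.verdict, reasons: review.reasons ?? [] };
  } catch {
    // Malformed JSON: default to blocking, never to applying.
    return { verdict: "NO_GO", reasons: ["Parse error: malformed model response"] };
  }
}
```

The design choice is that every failure mode of the model maps to NO_GO; only a well-formed, explicit GO lets a patch proceed.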


Real Output Showcase

This is not a mock. Files below are unmodified outputs from actual Guardian runs.

Fail-Closed in Action: When AI Gets It Wrong

Conservative strategy was automatically rejected due to malformed Copilot JSON:

{
  "verdict": "NO_GO",
  "risk_level": "high",
  "slop_score": 1.0,
  "reasons": [
    "Parse error: Unbalanced JSON object in Copilot response - missing closing brace"
  ],
  "suggested_adjustments": [
    "Re-run quality review with stricter JSON-only response",
    "Inspect copilot.quality.*.raw.txt for malformed output"
  ]
}

What happened:

  1. AI generated a patch for src/engine/github.ts.
  2. Quality review response was incomplete JSON.
  3. Deterministic guard caught the parse error.
  4. System blocked the patch (fail-closed enforcement).
  5. User was protected from applying broken code.
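The verdict object shown above can be typed directly. The interface below mirrors the JSON fields exactly; the `isBlocked` helper is an illustrative addition, not Guardian's actual API.

```typescript
// Mirrors the quality_review JSON fields shown above.
interface QualityReview {
  verdict: "GO" | "NO_GO";
  risk_level: "low" | "medium" | "high";
  slop_score: number; // 0 = clean output, 1 = fully rejected
  reasons: string[];
  suggested_adjustments: string[];
}

// Illustrative helper: a patch is applied only when nothing in the review objects.
function isBlocked(review: QualityReview): boolean {
  return review.verdict === "NO_GO" || review.risk_level === "high";
}
```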

Source files:

Patch Quality Spectrum

| Strategy | Target File | Risk | Verdict | Slop | What It Does |
|---|---|---|---|---|---|
| Conservative | src/engine/github.ts | HIGH | NO_GO | 1.0 | Blocked (malformed AI output) |
| Balanced | src/engine/github.ts | LOW | GO | 0.08 | Null-safe fallback |
| Aggressive | tests/quality_guard_regression_matrix.test.ts | LOW | GO | 0.08 | Test update |

Source files:

Complete Artifact Trail

examples/real-output/standard/
  patch_options.json
  fix.conservative.patch
  fix.balanced.patch
  fix.aggressive.patch
  quality_review.*.json
  copilot.*.raw.txt

Browse all files:

Additional evidence:

Key insight: Guardian does not trust AI blindly; deterministic checks can override model output.


Judge Quick Test (90 seconds)

Prerequisites:

  • gh auth status succeeds.
  • GitHub Copilot is enabled for your account/session.

npx copilot-guardian@latest run \
  --repo flamehaven01/copilot-guardian \
  --last-failed \
  --show-options \
  --fast \
  --max-log-chars 20000

Expected output:

  1. Structured diagnosis in analysis.json.
  2. Strategy index in patch_options.json.
  3. Safety verdicts in quality_review.*.json.

For extended trace mode (slower), add --show-reasoning.


Quick Start

Prerequisites

  • Node.js 18+
  • GitHub CLI (gh) authenticated
  • GitHub Copilot subscription (SDK access)

Installation

# Global install
npm i -g copilot-guardian@latest

# Or run without install
npx copilot-guardian@latest --help

Package:

Core Commands

# Stable demo profile
copilot-guardian run \
  --repo owner/repo \
  --last-failed \
  --show-options \
  --fast \
  --max-log-chars 20000

# Analysis only
copilot-guardian analyze \
  --repo owner/repo \
  --run-id <run_id> \
  --fast

# Evaluate multiple failed runs
copilot-guardian eval \
  --repo owner/repo \
  --failed-limit 5 \
  --fast

# Interactive follow-up
copilot-guardian debug \
  --repo owner/repo \
  --last-failed

How It Works

Full architecture:

graph TB
    A[GitHub Actions Failure] --> B[Guardian CLI]
    B --> C[Context Fetch]
    C --> D[Multi-Hypothesis Analysis]
    D --> E[Copilot SDK]
    E --> F[Patch Strategies]
    F --> G[Deterministic Quality Guard]
    G --> H{GO?}
    H -->|NO_GO| I[Reject and Re-diagnose]
    H -->|GO| J[Patch Candidate]
    J --> K[Safe Branch PR or Auto-Heal]

Key Modules

| Layer | Module | Purpose |
|---|---|---|
| Detection | src/engine/github.ts | Collect failure context |
| Intelligence | src/engine/analyze.ts | Multi-hypothesis diagnosis |
| Decision | src/engine/patch_options.ts | Strategy generation |
| Validation | Deterministic + model review | Slop and bypass control |
| Action | src/engine/auto-apply.ts | Safe branch/PR workflow |

Forced Abstain Policy

Guardian intentionally abstains for non-patchable failure classes:

  • 401/403 authentication failures
  • token permission errors
  • API rate-limit or infrastructure unavailability

When abstaining:

  • abstain.report.json is emitted with classification.
  • Patch generation is skipped.
  • User receives actionable recommendations.
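The abstain decision reduces to a deterministic classification of the failure log before any patch generation starts. The TypeScript sketch below is a hedged illustration: the class names and regex patterns are assumptions, not Guardian's documented rules.

```typescript
// Illustrative non-patchable classes (names are assumptions, not Guardian's).
type AbstainClass = "auth_failure" | "permission_error" | "rate_limit" | null;

// Classify a log excerpt; a non-null result means Guardian should abstain.
function classifyNonPatchable(logExcerpt: string): AbstainClass {
  if (/\b(401|403)\b|authentication failed/i.test(logExcerpt)) return "auth_failure";
  if (/permission denied|insufficient scopes/i.test(logExcerpt)) return "permission_error";
  if (/rate limit|API rate/i.test(logExcerpt)) return "rate_limit";
  return null; // patchable: proceed to patch generation
}
```

Because these classes describe infrastructure and credential problems, no source patch can fix them; abstaining with a report is the only honest output.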

Example:


Output Files

Artifacts are generated under .copilot-guardian/:

| File | Purpose | Example |
|---|---|---|
| analysis.json | Diagnosis + selected hypothesis | demo context |
| reasoning_trace.json | Complete hypothesis trace | demo context |
| patch_options.json | Strategy index + verdicts | view |
| fix.*.patch | Generated patch files | view |
| quality_review.*.json | Per-strategy quality results | view |
| abstain.report.json | Forced abstain classification | view |
| copilot.*.raw.txt | Raw model output snapshots | view |

Tip:


Documentation Links


Try It Now

# Test on your repo (90 seconds)
npx copilot-guardian@latest run \
  --repo YOUR/REPO \
  --last-failed \
  --fast

# Or reproduce this exact demo
npx copilot-guardian@latest run \
  --repo flamehaven01/copilot-guardian \
  --last-failed \
  --fast

Questions?


License

MIT License. See LICENSE.


Copilot Guardian Project by Flamehaven (Yun) for the GitHub Copilot CLI Challenge

Trust is built on receipts, not promises.
