Engineering AI Agent Playbook

GitHub issues pile up. PRs sit without context. Release notes get written at 11pm before a deploy. This playbook describes four workflow variants — issue triage, PR summary, release note drafting, and build failure analysis — that a ProvenanceOne engineering agent can handle. Each variant is independently deployable. Start with issue triage: lowest disruption, fastest feedback, easiest to validate.

Warning: Never configure this agent to make autonomous changes to production systems, merge pull requests without human approval, or deploy code. These actions require explicit human review regardless of agent confidence.

Warning: Do not give the agent access to production secrets, API keys, or deployment credentials. Scope GitHub access to read-only unless writes are specifically needed for a step with approval gates in place.


What this agent does

Four independently deployable workflow variants:

Variant A — Issue triage (webhook trigger: GitHub issue created)

An engineer files an issue. Within seconds, the agent classifies it, assigns labels, creates a linked Jira ticket for P1/P2 issues, and notifies the right team channel — so no issue sits unclassified in the backlog.

  1. Reads issue title, body, labels, and author context via a data step using the GitHub MCP server.
  2. Classifies by type (bug / feature / question) and severity (P1–P4) via an agent step (category: reasoning).
  3. Applies labels, milestone, and team assignment via an action step (GitHub write, trust: medium).
  4. If severity is P1 or P2: a logic step triggers an action step to create a linked Jira ticket.
  5. A notify step posts to the relevant team Slack channel.
  6. A storage step writes the triage decision and run ID to the datastore for calibration tracking.

Variant B — PR summary (webhook trigger: GitHub PR opened)

A PR is opened. The agent reads the diff, generates a summary, identifies risk areas, and posts it as a PR comment — so reviewers arrive with context rather than starting from scratch.

  1. Reads the diff, commit messages, and linked issue via a data step (GitHub MCP, read-only).
  2. Generates a concise summary of changes and rationale via an agent step (category: coding).
  3. Identifies risk areas — changes touching auth, payments, data models, or security-sensitive paths — via a second agent step.
  4. If risk areas are identified: an approval step fires (risk: medium, SLA: 240 minutes, assignee: security-review@).
  5. Posts the summary as a PR comment via an action step (GitHub write scope — pull-requests:write).
  6. Logs the run to the audit datastore.
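
The risk gate in steps 3 and 4 can be sketched as a path check followed by a routing decision. This is a minimal illustration, not the platform's implementation: the `RISK_PATH_HINTS` mapping, the function names, and the returned dictionary shape are all hypothetical, and substring matching will produce some false positives (for example, `author.ts` matches `auth`).

```python
# Hypothetical mapping from path fragments to risk areas; tune to your repo layout.
RISK_PATH_HINTS = {
    "auth": "authentication",
    "token": "authentication",
    "payment": "payments",
    "models/": "data models",
    "migrations/": "data models",
    "security": "security-sensitive",
}


def find_risk_areas(changed_files):
    """Return the sorted, de-duplicated risk areas touched by a list of file paths."""
    areas = set()
    for path in changed_files:
        lowered = path.lower()
        for fragment, area in RISK_PATH_HINTS.items():
            if fragment in lowered:
                areas.add(area)
    return sorted(areas)


def next_step(changed_files):
    """Route to the approval step when risk areas are found, otherwise post directly."""
    areas = find_risk_areas(changed_files)
    if areas:
        # Values mirror step 4 above: risk medium, 240-minute SLA, security-review@.
        return {
            "step": "approval",
            "risk": "medium",
            "sla_minutes": 240,
            "assignee": "security-review@",
            "risk_areas": areas,
        }
    return {"step": "post_comment", "risk_areas": []}


if __name__ == "__main__":
    print(next_step(["auth/session.ts", "middleware/token-validator.ts"]))
    print(next_step(["docs/readme.md"]))
```

A path list is only a coarse first filter; the second agent step still has to judge what the change actually does.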

Variant C — Release notes (manual trigger)

An engineer triggers a release note run. The agent reads merged PRs since the last release, groups them, drafts notes, and routes them for manager review before posting.

  1. Reads merged PRs since the last release tag from GitHub via a data step.
  2. Groups by type — features, fixes, security patches — via a skill step (changelog generation).
  3. Drafts release notes in your standard format via an agent step (category: coding).
  4. Routes to the engineering manager for review via an approval step (risk: low, SLA: 1,440 minutes).
  5. On approval: posts to the #releases Slack channel via an action step.
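
The grouping in step 2 can be sketched as below, assuming each merged PR arrives as a dictionary with `number`, `title`, and `labels` keys. That input shape, the `LABEL_TO_SECTION` mapping, and the function names are assumptions for illustration; the actual changelog generation skill may work differently.

```python
from collections import defaultdict

SECTION_ORDER = ["features", "fixes", "security patches", "other"]

# Hypothetical label-to-section mapping; replace with your repo's label names.
LABEL_TO_SECTION = {
    "feature": "features",
    "enhancement": "features",
    "bug": "fixes",
    "fix": "fixes",
    "security": "security patches",
}


def section_for(pr):
    """Pick one section per PR; security wins over fixes, fixes over features."""
    labels = {label.lower() for label in pr.get("labels", [])}
    for section in ("security patches", "fixes", "features"):
        if any(LABEL_TO_SECTION.get(label) == section for label in labels):
            return section
    return "other"


def draft_notes(prs):
    """Group PRs by section and render a plain-text draft for manager review."""
    grouped = defaultdict(list)
    for pr in prs:
        grouped[section_for(pr)].append(f"- {pr['title']} (#{pr['number']})")
    lines = []
    for section in SECTION_ORDER:
        if grouped[section]:
            lines.append(section.capitalize())
            lines.extend(grouped[section])
            lines.append("")
    return "\n".join(lines).strip()


if __name__ == "__main__":
    # Made-up sample data, not real PRs.
    sample = [
        {"number": 1, "title": "Add export", "labels": ["feature"]},
        {"number": 2, "title": "Patch token leak", "labels": ["bug", "security"]},
    ]
    print(draft_notes(sample))
```

The output is still a draft: it goes to the approval step in step 4 before anything is posted.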

Variant D — Build failure analysis (event trigger: ci/build-failed bus event)

A CI build fails. The agent reads the build log, identifies the failure category, suggests a probable fix, creates a Jira ticket, and notifies the PR author — without the on-call engineer needing to manually triage the log.

  1. Reads the build log via a data step (MCP tool or skill, depending on your CI provider).
  2. Identifies the failure category — test failure, compilation error, dependency conflict — via an agent step.
  3. Suggests a probable fix via a second agent step (category: coding, trust: high).
  4. Creates a Jira ticket with the diagnosis via an action step.
  5. Notifies the PR author via a notify step (Slack).

Best-fit use cases

  • Issue triage — high-volume repos where issues frequently sit unlabelled for days; teams where the on-call rotation is responsible for triage
  • PR summaries — teams with frequent large PRs where reviewer context-loading is a known bottleneck; onboarding new engineers who need codebase orientation
  • Release notes — teams that ship frequently and find release note writing time-consuming or inconsistent
  • Build failure analysis — monorepos with complex build graphs where identifying the root cause of a CI failure takes 15+ minutes manually

When not to use this agent

  • Autonomous PR merges — never. PR merges require human approval regardless of CI status or agent confidence.
  • Production deployments — the agent must not trigger, approve, or initiate production deployments.
  • Security vulnerability remediation — the agent can surface a Snyk finding and draft a ticket; it cannot commit a fix.
  • Performance reviews or headcount decisions — do not use GitHub activity data surfaced by this agent for any people management purpose.
  • Low-volume repos where manual triage takes less time than workflow setup — the setup investment is worth it at meaningful scale (roughly 20+ issues or PRs per week).

Required connections and data sources

Connection | Purpose | Auth method
GitHub | Issue data, PR diffs, merged PR lists, PR commenting, label management | OAuth 2.0
Jira | Remediation and triage ticket creation | API Key
Slack | Team notifications, PR author alerts, release note publishing | OAuth 2.0
Snyk (optional) | Security vulnerability context in PR summary and issue triage | API Key
PagerDuty (optional) | P1 incident escalation from issue triage | API Key
Datadog (optional) | Metrics context for build failure analysis | API Key

Configure connections at Settings → Connections. The GitHub OAuth 2.0 connection requires read scopes (issues:read, pull-requests:read, contents:read) for read-only steps. Add write scopes (issues:write, pull-requests:write) only for steps that post comments or apply labels, and only with approval gates on those action steps.


System prompt for the issue triage agent (Variant A):

You are an engineering issue triage assistant. Your job is to classify GitHub issues and recommend labels, severity, and team assignment.

Rules:
1. Classify each issue as one of: bug, feature-request, question, documentation, or chore.
2. Assign severity: P1 (production outage / data loss / security), P2 (major functionality broken, no workaround), P3 (functionality impaired, workaround exists), P4 (minor / cosmetic).
3. Include a confidence score (0–1) for your severity classification. If confidence is below 0.6, recommend escalation to a human triager rather than assigning automatically.
4. Never infer severity from author seniority or account history — classify based on the issue content only.
5. If the issue mentions credentials, secrets, API keys, or security vulnerabilities, immediately classify as P1 and flag for security team review.
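
Rules 3 and 5 are worth enforcing outside the prompt as well. The sketch below shows one way a downstream logic step might apply them to the agent's output. The `Triage` structure, the `route` function, the route names, and the keyword pattern are hypothetical; only the 0.6 confidence floor and the P1/P2 Jira rule come from this playbook.

```python
import re
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.6  # rule 3: below this, hand off to a human triager

# Hypothetical keyword pattern for rule 5; extend it for your own terminology.
SECURITY_TERMS = re.compile(
    r"\b(credentials?|secrets?|api keys?|security vulnerabilit(?:y|ies))\b",
    re.IGNORECASE,
)


@dataclass
class Triage:
    issue_type: str    # bug, feature-request, question, documentation, or chore
    severity: str      # P1 to P4
    confidence: float  # 0 to 1, as reported by the agent


def route(issue_text, triage):
    """Decide what happens after classification, based on issue content only."""
    if SECURITY_TERMS.search(issue_text):
        # Rule 5: security mentions are always P1 and flagged for security review.
        return {"severity": "P1", "route": "security-review", "create_jira": True}
    if triage.confidence < CONFIDENCE_FLOOR:
        # Rule 3: low confidence is escalated rather than assigned automatically.
        return {"severity": triage.severity, "route": "human-triage", "create_jira": False}
    return {
        "severity": triage.severity,
        "route": "auto",
        "create_jira": triage.severity in ("P1", "P2"),
    }


if __name__ == "__main__":
    print(route("API key leaked in logs", Triage("bug", "P3", 0.9)))
    print(route("App crashes on save", Triage("bug", "P2", 0.4)))
```

Checking the keyword rule in code means a P1 security issue is still caught when the model underrates it.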

System prompt for the PR summary agent (Variant B):

You are an engineering PR summary assistant. Your job is to summarise pull request changes for reviewers.

Rules:
1. Summarise: what changed, why (based on commit messages and linked issue), and what the reviewer should focus on.
2. Identify risk areas: changes touching authentication, authorisation, payment flows, data models, cryptography, secrets handling, or infrastructure configuration. Flag these explicitly.
3. Do not evaluate code style or formatting — focus on what the change does and its potential impact.
4. Label your summary clearly as "AI-generated summary — review before acting."
5. If the diff is too large to summarise accurately, say so and list the files changed by category rather than fabricating a summary.

Required skills and tools

Skill or tool | Kind | Description
GitHub MCP server | mcp | Provides rich code context: issue read, PR diff read, file content read, PR comment write, label write. Scope tool allowlist to read-only by default.
Changelog generation | skill | Category: transform. Groups merged PR titles and descriptions by type (feature / fix / security). Input: list of PR objects. Output: structured changelog sections.
Semantic diff analysis (optional) | skill | Category: data. Identifies high-risk file paths in a PR diff (auth, payments, infra). Used to improve risk area detection in Variant B.

GitHub MCP server — scoping (critical):

Use a GitHub MCP server for richer code context than the REST API alone. In the MCP Gateway policy, set the tool allowlist to read-only tools by default:

Allowed (read-only):
  - get-issue
  - list-issues
  - get-pull-request
  - get-pull-request-diff
  - get-file-contents
  - list-commits

Requires explicit justification and trust: high to add:
  - create-issue-comment
  - add-labels
  - create-pull-request-review

Configure the MCP Gateway policy at MCP Servers → Gateway Policies. Every tool call through the MCP Gateway is logged to the audit trail. See /docs/mcp-servers/index.


Variant A — Issue triage:

[Trigger: webhook — GitHub issue created]
       ↓
[Data: read issue — GitHub MCP (read-only)]
       ↓
[Agent: classify type + severity — reasoning category]
       ↓
[Action: apply labels + team]
       ↓
[Logic: is severity P1 or P2?]
       ↓ Yes                              ↓ No
[Action: create Jira ticket]       (no Jira ticket)
       ↓                                  ↓
[Notify: Slack team channel]
       ↓
[Storage: triage decision + run ID]

Variant B — PR summary:

[Trigger: webhook — GitHub PR opened]
       ↓
[Data: read diff + commits — GitHub MCP (read-only)]
       ↓
[Agent: generate summary — coding category]
       ↓
[Agent: identify risk areas — coding category]
       ↓
[Logic: risk areas found?]
       ↓ Yes                              ↓ No
[Approval: security-review@            [Action: post PR comment]
 risk: medium, SLA: 240min]                    ↓
       ↓ approved               [Storage: audit log]
[Action: post PR comment]
       ↓
[Storage: audit log]

Human approval rules

Approval required for:

  • Any PR merge
  • Production deploys (regardless of CI passing)
  • Closing or modifying issues filed by external contributors
  • Any security-related label change
  • Access to production secrets vault values

No approval required for:

  • Reading GitHub issues, PRs, and diffs (read-only operations)
  • Posting automated PR summaries labelled "AI-generated"
  • Creating Jira tickets
  • Sending Slack notifications
  • Drafting release notes (approval required before publishing)
  • Applying non-security labels to issues the agent triaged

For all action steps that write to GitHub (posting comments, applying labels), set the step's trust to medium. For any write that touches security-related data or configuration, set trust to high and add an approval step upstream.

Example approval for a high-risk PR summary:

action: "Post PR summary flagging auth-related changes"
summary: "PR #847 modifies the session token validation logic. Risk areas identified: authentication flow, token expiry handling. Security review recommended before merge."
risk: medium
slaMinutes: 240
assignees:
  - security-review@
rationale: "Changes detected in auth/session.ts and middleware/token-validator.ts. These files are in the high-risk path list."
confidence: 0.85
evidence:
  - label: "Files changed"
    value: "auth/session.ts, middleware/token-validator.ts"
    tone: amber
  - label: "Risk category"
    value: "Authentication"
    tone: red
  - label: "PR author"
    value: "external-contributor"
    tone: amber

Security and permission model

GitHub scopes — principle of least privilege:

Scope | Required for | Notes
issues:read | Variants A, B, D | Read-only; always permitted
pull-requests:read | Variants B, C | Read-only; always permitted
contents:read | Variants B, D | Read-only; always permitted
issues:write | Variant A (labels) | Needs approval gate on action step
pull-requests:write | Variant B (PR comment) | Needs approval gate on action step
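
To keep the connection at least privilege as variants are enabled one by one, you can derive the scope list from the table above. This is a minimal sketch; `scopes_for` and the dictionary layout are hypothetical, and the variant-to-scope data is copied from the table.

```python
# Variant letters per scope, copied from the scope table above.
READ_SCOPES = {
    "issues:read": {"A", "B", "D"},
    "pull-requests:read": {"B", "C"},
    "contents:read": {"B", "D"},
}
WRITE_SCOPES = {
    "issues:write": {"A"},
    "pull-requests:write": {"B"},
}


def scopes_for(variants):
    """Return the scopes needed for the enabled variants, with writes listed separately."""
    enabled = set(variants)
    read = sorted(scope for scope, used_by in READ_SCOPES.items() if used_by & enabled)
    write = sorted(scope for scope, used_by in WRITE_SCOPES.items() if used_by & enabled)
    return {"read": read, "write_needs_approval_gate": write}


if __name__ == "__main__":
    # Weeks 1 to 4 of the rollout enable Variant A only.
    print(scopes_for(["A"]))
```

Write scopes are returned under their own key so each one can be checked against an approval gate before it is granted.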

Secret and credentials protection:

Configure the MCP Gateway policy to deny tool calls that read file paths matching these patterns:

Deny file read for paths matching:
  - .env
  - .env.*
  - *.key
  - *.pem
  - *.p12
  - secrets/
  - .github/
  - config/credentials*

Test this policy in a staging repo with a known .env file before enabling in production.
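
Before the staging test, it can help to check which paths you expect the deny list to block. The sketch below assumes one plausible reading of the patterns: bare patterns match the file name, patterns ending in `/` match any directory component, and patterns containing `/` match the path from the repo root or any subdirectory. The Gateway's actual matching rules may differ, so treat this as a way to build test cases, not as the enforcement layer.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

DENY_PATTERNS = [
    ".env",
    ".env.*",
    "*.key",
    "*.pem",
    "*.p12",
    "secrets/",
    ".github/",
    "config/credentials*",
]


def is_denied(path):
    """Return True if a repo-relative path matches any deny pattern (assumed semantics)."""
    p = PurePosixPath(path)
    normalized = str(p)
    for pattern in DENY_PATTERNS:
        if pattern.endswith("/"):
            # Directory pattern: block any path with a component of that name.
            if pattern.rstrip("/") in p.parts:
                return True
        elif "/" in pattern:
            # Path pattern: match from the repo root or from any subdirectory.
            if fnmatchcase(normalized, pattern) or fnmatchcase(normalized, "*/" + pattern):
                return True
        elif fnmatchcase(p.name, pattern):
            # File-name pattern: match the final component only.
            return True
    return False


if __name__ == "__main__":
    for candidate in [".env", "services/api/.env.production", "src/app.ts"]:
        print(candidate, is_denied(candidate))
```

Use the paths this flags as the expected-blocked cases in staging, and include a few that should pass, such as ordinary source files.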

Roles:

Role | Can do
admin | Create and modify agent and workflow, update system prompts, publish to production, modify MCP Gateway policy
editor | Modify workflow steps, update skill parameters, view run debugger
viewer | View run history and agent outputs; cannot modify configuration
approvers group | Action approval requests for security-flagged PR summaries and P1 triage escalations

Audit events emitted: run.started, run.completed, run.failed, approval.granted, approval.rejected, approval.sla_breach, agent.updated, connection.accessed.


Evaluation checklist

  • Agent PR summaries tested on 20 historical PRs — accuracy reviewed by a senior engineer who knows those PRs
  • GitHub MCP Gateway policy limits to read-only tools by default — verify in the policy configuration, not just the README
  • No production secrets are accessible to the agent — verify via the MCP Gateway policy deny rules in a staging test
  • High-risk code changes (auth, payments, infra file paths) reliably trigger the approval step — test with three known high-risk PRs
  • Run debugger shows which files were read and what the agent reasoned — no black-box steps
  • Jira ticket quality reviewed against 10 manually created tickets — field completeness and description accuracy
  • Snyk integration (if used) tested with a known-vulnerable dependency — confirm the agent surfaces it correctly
  • Issue triage severity classifications compared against engineer-assigned severity for 20 historical issues

Rollout plan

Weeks 1–2 (shadow mode — issue triage only): Run Variant A in shadow mode. Agent classifies issues but takes no action. Compare agent triage to engineer triage daily. Track false positive rate on severity classification.
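
The daily shadow-mode comparison can be reduced to a few rates. The sketch below assumes you can export pairs of (agent severity, engineer severity) from the calibration datastore. It treats "false positive" as over-triage, meaning the agent rated an issue more severe than the engineer did. That definition, and the names `calibration_report` and `SEVERITY_RANK`, are assumptions for illustration.

```python
SEVERITY_RANK = {"P1": 1, "P2": 2, "P3": 3, "P4": 4}  # lower rank = more severe


def calibration_report(pairs):
    """Summarise agreement between agent and engineer severity labels.

    pairs: list of (agent_severity, engineer_severity) tuples.
    """
    total = len(pairs)
    if total == 0:
        raise ValueError("no triage decisions to compare")
    agree = sum(1 for agent, human in pairs if agent == human)
    # Over-triage: agent rated the issue more severe than the engineer did.
    over = sum(1 for agent, human in pairs if SEVERITY_RANK[agent] < SEVERITY_RANK[human])
    # Under-triage: agent rated it less severe; the riskier failure mode.
    under = sum(1 for agent, human in pairs if SEVERITY_RANK[agent] > SEVERITY_RANK[human])
    return {
        "agreement_rate": agree / total,
        "over_triage_rate": over / total,
        "under_triage_rate": under / total,
    }


if __name__ == "__main__":
    # Made-up sample decisions, not real data.
    sample = [("P1", "P1"), ("P2", "P3"), ("P3", "P1"), ("P4", "P4")]
    print(calibration_report(sample))
```

Report under-triage separately, because a P1 marked as P3 costs far more than a P3 marked as P1.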

Weeks 3–4 (live issue triage): Enable label application and Jira ticket creation for Variant A. Monitor output daily. Keep the calibration datastore running. Review any P1/P2 escalations immediately.

Month 2 (PR summaries): Enable Variant B on one low-traffic repository. Start with read-only summary posting — no risk-area gating yet. Collect engineer feedback via Slack or PR comment reactions.

Month 3 (release notes): Enable Variant C with manual trigger. Human approval required before every publish. This is a low-risk workflow to build trust in the agent's output quality.

Month 4 and beyond (build failure analysis): Enable Variant D if CI integration is stable and the engineering team has built confidence from the first three variants. Track time-to-diagnosis metrics.


Common failure modes

Agent misclassifies severity (marks P1 as P3)

High-severity issues are under-triaged and sit in the backlog. Mitigation: add a severity keyword list to the system prompt (e.g. "production outage", "data loss", "security vulnerability" → always P1). Require the agent to include a confidence score; route low-confidence triage to a human reviewer. Track misclassification rate in the calibration datastore.

Agent posts an inaccurate PR summary

Engineers act on an incorrect summary and miss a review concern. Mitigation: always label summaries as AI-generated, with a "review before acting" notice. Monitor PR comment reactions for corrections. Review the run debugger when an engineer flags an inaccurate summary. Revise the system prompt based on flagged examples.

GitHub write action fails due to scope

The action step that posts a PR comment or applies labels fails because the GitHub connection lacks write scope. Mitigation: test all write operations in a sandbox repository with the same connection before enabling in production. Use the run debugger to verify each action step's output. Check the connection configuration for the correct OAuth scopes.

Agent reads secrets from repository via MCP GitHub tool

The agent uses the get-file-contents tool to read a .env file or credentials file. Mitigation: configure the MCP Gateway deny list for file path patterns before enabling the GitHub MCP server. Test the deny rules in staging: attempt to read a .env file and verify the Gateway blocks it. Do not rely on the system prompt alone to prevent this; use the Gateway policy as the enforcement layer.


ROI assumptions

The table below uses illustrative assumptions. Replace with your team's actual values. Track build-failure diagnosis time savings post-implementation rather than estimating.

Input | Illustrative value
Issues triaged per week | 80
Minutes per issue triage, current (engineer time) | 8
Minutes per issue triage, with agent (review + confirm) | 2
PRs per week | 40
Minutes per PR summary, current (reviewer context-loading) | 6
Minutes per PR summary, with agent (review AI summary) | 1
Loaded hourly cost (senior engineer) | $120
Build-failure diagnosis time saved | Track post-implementation

At these assumptions: issue triage savings = 80 issues/week × (8 − 2) min × 52 weeks ÷ 60 min/hour × $120/hour = $49,920 per year. PR summary savings = 40 PRs/week × (6 − 1) min × 52 weeks ÷ 60 min/hour × $120/hour = $20,800 per year. Combined illustrative annual value: approximately $70,720, before build failure analysis savings.
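
The same arithmetic as a small function, so you can substitute your own inputs. The function name and parameter names are illustrative, and the values passed in are the illustrative assumptions from the table, not measured results.

```python
def annual_savings(items_per_week, minutes_before, minutes_after,
                   hourly_cost, weeks_per_year=52):
    """Annual value of engineer time saved, in the same currency as hourly_cost."""
    minutes_saved = items_per_week * (minutes_before - minutes_after) * weeks_per_year
    # Multiply before dividing by 60 to keep the result free of rounding noise.
    return minutes_saved * hourly_cost / 60


if __name__ == "__main__":
    triage = annual_savings(items_per_week=80, minutes_before=8, minutes_after=2, hourly_cost=120)
    pr_summary = annual_savings(items_per_week=40, minutes_before=6, minutes_after=1, hourly_cost=120)
    print(f"Issue triage: ${triage:,.0f}")          # Issue triage: $49,920
    print(f"PR summaries: ${pr_summary:,.0f}")      # PR summaries: $20,800
    print(f"Combined: ${triage + pr_summary:,.0f}")  # Combined: $70,720
```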

Use the interactive calculator to adjust these inputs: /tools/ai-agent-roi-calculator?use_case=engineering


FAQ

Can the agent merge pull requests automatically?

No. PR merges must have a human approval step regardless of CI status or agent confidence score. This is a hard requirement — do not remove or bypass the approval gate. If you want to reduce merge friction, use the PR summary workflow to give reviewers better context, not to replace their decision.

Can the agent write code?

The agent can draft code suggestions in PR comments or issue replies. It cannot commit code to a repository without a write-capable GitHub action step, which requires an approval gate. Even with approval gates in place, committing agent-generated code requires an engineer to review the diff before approving.

How do I prevent the agent from accessing secrets in the repo?

Configure the MCP Gateway policy to deny tool calls that read file paths matching secret patterns — .env, *.key, *.pem, secrets/, .github/. Test the policy in a staging repository with a known .env file before enabling in production. Do not rely on the system prompt alone: enforce it at the Gateway policy layer.

What GitHub permissions does the agent need?

Read-only for most workflows: issues:read, pull-requests:read, contents:read. Write permissions — issues:write, pull-requests:write — only where explicitly needed (label application, PR comment posting), with approval gates on those action steps. Never grant repo:admin, deploy keys, or workflow write scopes.

Will the agent slow down our engineering team?

Poorly configured: yes. Well configured: no. The agent handles classification and summary work asynchronously — engineers review the output in their own time rather than waiting for the agent. Start with issue triage, the lowest-disruption variant. Measure time-to-triage and false positive rate for the first two weeks before expanding to PR summaries or build failure analysis.