Engineering AI Agent Playbook
GitHub issues pile up. PRs sit without context. Release notes get written at 11pm before a deploy. This playbook describes four workflow variants — issue triage, PR summary, release note drafting, and build failure analysis — that a ProvenanceOne engineering agent can handle. Each variant is independently deployable. Start with issue triage: lowest disruption, fastest feedback, easiest to validate.
Warning: Never configure this agent to make autonomous changes to production systems, merge pull requests without human approval, or deploy code. These actions require explicit human review regardless of agent confidence.
Warning: Do not give the agent access to production secrets, API keys, or deployment credentials. Scope GitHub access to read-only unless writes are specifically needed for a step with approval gates in place.
What this agent does
Four independently deployable workflow variants:
Variant A — Issue triage (webhook trigger: GitHub issue created)
An engineer files an issue. Within seconds, the agent classifies it, assigns labels, creates a linked Jira ticket for P1/P2 issues, and notifies the right team channel — so no issue sits unclassified in the backlog.
- Reads issue title, body, labels, and author context via a data step using the GitHub MCP server.
- Classifies by type (bug / feature / question) and severity (P1–P4) via an agent step (category: reasoning).
- Applies labels, milestone, and team assignment via an action step (GitHub write, trust: medium).
- If severity is P1 or P2: a logic step triggers an action step to create a linked Jira ticket.
- A notify step posts to the relevant team Slack channel.
- A storage step writes the triage decision and run ID to the datastore for calibration tracking.
Variant B — PR summary (webhook trigger: GitHub PR opened)
A PR is opened. The agent reads the diff, generates a summary, identifies risk areas, and posts it as a PR comment — so reviewers arrive with context rather than starting from scratch.
- Reads the diff, commit messages, and linked issue via a data step (GitHub MCP, read-only).
- Generates a concise summary of changes and rationale via an agent step (category: coding).
- Identifies risk areas (changes touching auth, payments, data models, or security-sensitive paths) via a second agent step.
- If risk areas are identified: an approval step fires (risk: medium, SLA: 240 minutes, assignee: security-review@).
- Posts the summary as a PR comment via an action step (GitHub write scope: pull-requests:write).
- Logs the run to the audit datastore.
Variant C — Release notes (manual trigger)
An engineer triggers a release note run. The agent reads merged PRs since the last release, groups them, drafts notes, and routes them for manager review before posting.
- Reads merged PRs since the last release tag from GitHub via a data step.
- Groups by type (features, fixes, security patches) via a skill step (changelog generation).
- Drafts release notes in your standard format via an agent step (category: coding).
- Routes to the engineering manager for review via an approval step (risk: low, SLA: 1,440 minutes).
- On approval: posts to the #releases Slack channel via an action step.
Variant D — Build failure analysis (event trigger: ci/build-failed bus event)
A CI build fails. The agent reads the build log, identifies the failure category, suggests a probable fix, creates a Jira ticket, and notifies the PR author — without the on-call engineer needing to manually triage the log.
- Reads the build log via a data step (MCP tool or skill, depending on your CI provider).
- Identifies the failure category (test failure, compilation error, dependency conflict) via an agent step.
- Suggests a probable fix via a second agent step (category: coding, trust: high).
- Creates a Jira ticket with the diagnosis via an action step.
- Notifies the PR author via a notify step (Slack).
Best-fit use cases
- Issue triage — high-volume repos where issues frequently sit unlabelled for days; teams where the on-call rotation is responsible for triage
- PR summaries — teams with frequent large PRs where reviewer context-loading is a known bottleneck; onboarding new engineers who need codebase orientation
- Release notes — teams that ship frequently and find release note writing time-consuming or inconsistent
- Build failure analysis — monorepos with complex build graphs where identifying the root cause of a CI failure takes 15+ minutes manually
When not to use this agent
- Autonomous PR merges — never. PR merges require human approval regardless of CI status or agent confidence.
- Production deployments — the agent must not trigger, approve, or initiate production deployments.
- Security vulnerability remediation — the agent can surface a Snyk finding and draft a ticket; it cannot commit a fix.
- Performance reviews or headcount decisions — do not use GitHub activity data surfaced by this agent for any people management purpose.
- Low-volume repos where manual triage takes less time than workflow setup — the setup investment is worth it at meaningful scale (roughly 20+ issues or PRs per week).
Required connections and data sources
| Connection | Purpose | Auth method |
|---|---|---|
| GitHub | Issue data, PR diffs, merged PR lists, PR commenting, label management | OAuth 2.0 |
| Jira | Remediation and triage ticket creation | API Key |
| Slack | Team notifications, PR author alerts, release note publishing | OAuth 2.0 |
| Snyk (optional) | Security vulnerability context in PR summary and issue triage | API Key |
| PagerDuty (optional) | P1 incident escalation from issue triage | API Key |
| Datadog (optional) | Metrics context for build failure analysis | API Key |
Configure connections at Settings → Connections. The GitHub OAuth 2.0 connection requires read scopes (issues:read, pull-requests:read, contents:read) for read-only steps. Add write scopes (issues:write, pull-requests:write) only for steps that post comments or apply labels, and only with approval gates on those action steps.
Recommended agent instructions
Issue triage agent (Variant A):
You are an engineering issue triage assistant. Your job is to classify GitHub issues and recommend labels, severity, and team assignment.
Rules:
1. Classify each issue as one of: bug, feature-request, question, documentation, or chore.
2. Assign severity: P1 (production outage / data loss / security), P2 (major functionality broken, no workaround), P3 (functionality impaired, workaround exists), P4 (minor / cosmetic).
3. Include a confidence score (0–1) for your severity classification. If confidence is below 0.6, recommend escalation to a human triager rather than assigning automatically.
4. Never infer severity from author seniority or account history — classify based on the issue content only.
5. If the issue mentions credentials, secrets, API keys, or security vulnerabilities, immediately classify as P1 and flag for security team review.
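Rules 3 and 5 imply a routing decision downstream of the agent step. The following is a minimal sketch of that decision, assuming the agent's output is parsed into a structure with a severity, a confidence score, and a security flag. The `TriageResult` type, the `security_flag` field, the `route_triage` helper, and the route names are all hypothetical and not part of any documented ProvenanceOne API.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.6  # rule 3: below this, escalate to a human triager


@dataclass
class TriageResult:
    issue_type: str              # bug, feature-request, question, documentation, chore
    severity: str                # "P1" .. "P4"
    confidence: float            # 0.0 .. 1.0, as reported by the agent
    security_flag: bool = False  # hypothetical field: set when rule 5 applies


def route_triage(result: TriageResult) -> str:
    """Return a (hypothetical) route name for the next workflow step."""
    if result.security_flag:
        # Rule 5: security-related issues always go to security review.
        return "security-review"
    if result.confidence < CONFIDENCE_THRESHOLD:
        # Rule 3: low confidence means a human decides, not the agent.
        return "human-triage"
    return "auto-apply"
```

In a real workflow this check would most likely sit in a logic step between the agent step and the action step that applies labels.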
PR summary agent (Variant B):
You are an engineering PR summary assistant. Your job is to summarise pull request changes for reviewers.
Rules:
1. Summarise: what changed, why (based on commit messages and linked issue), and what the reviewer should focus on.
2. Identify risk areas: changes touching authentication, authorisation, payment flows, data models, cryptography, secrets handling, or infrastructure configuration. Flag these explicitly.
3. Do not evaluate code style or formatting — focus on what the change does and its potential impact.
4. Label your summary clearly as "AI-generated summary — review before acting."
5. If the diff is too large to summarise accurately, say so and list the files changed by category rather than fabricating a summary.
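Rule 2 can be backed by a deterministic path check that runs alongside the agent's own judgment. The sketch below assumes risk areas can be approximated by file path patterns. The `RISK_PATTERNS` mapping and the `flag_risk_areas` helper are hypothetical, and the patterns are placeholders to replace with your repository's layout.

```python
from fnmatch import fnmatchcase

# Hypothetical high-risk path patterns; adjust to your repository layout.
RISK_PATTERNS = {
    "authentication": ["auth/*", "middleware/token-*"],
    "payments": ["payments/*", "billing/*"],
    "infrastructure": ["infra/*", "*.tf"],
}


def flag_risk_areas(changed_files):
    """Map each risk category to the changed files matching its patterns."""
    flagged = {}
    for category, patterns in RISK_PATTERNS.items():
        hits = [
            path for path in changed_files
            if any(fnmatchcase(path, pattern) for pattern in patterns)
        ]
        if hits:
            flagged[category] = hits
    return flagged
```

A non-empty result could feed the logic step that decides whether the approval step fires, so that gating does not depend on the model output alone.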
Required skills and tools
| Skill or tool | Kind | Description |
|---|---|---|
| GitHub MCP server | mcp | Provides rich code context: issue read, PR diff read, file content read, PR comment write, label write. Scope tool allowlist to read-only by default. |
| Changelog generation | skill | Category: transform. Groups merged PR titles and descriptions by type (feature / fix / security). Input: list of PR objects. Output: structured changelog sections. |
| Semantic diff analysis (optional) | skill | Category: data. Identifies high-risk file paths in a PR diff (auth, payments, infra). Used to improve risk area detection in Variant B. |
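To make the changelog generation skill's input and output concrete, here is a minimal sketch of the grouping it describes. It assumes each PR object is a dictionary with `type`, `title`, and `number` keys. Those keys, the `other` fallback section, and the `group_prs` function are illustrative assumptions rather than the skill's actual schema.

```python
from collections import defaultdict

SECTIONS = ("feature", "fix", "security")


def group_prs(prs):
    """Group PR dicts into changelog sections; unknown types go to 'other'."""
    grouped = defaultdict(list)
    for pr in prs:
        pr_type = pr.get("type")
        section = pr_type if pr_type in SECTIONS else "other"
        grouped[section].append(f"{pr['title']} (#{pr['number']})")
    return dict(grouped)
```

Keeping an explicit `other` section means unclassified PRs stay visible to the reviewer instead of being silently dropped from the draft.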
GitHub MCP server — scoping (critical):
Use a GitHub MCP server for richer code context than the REST API alone. In the MCP Gateway policy, set the tool allowlist to read-only tools by default:
Allowed (read-only):
- get-issue
- list-issues
- get-pull-request
- get-pull-request-diff
- get-file-contents
- list-commits
Requires explicit justification and trust: high to add:
- create-issue-comment
- add-labels
- create-pull-request-review
Configure the MCP Gateway policy at MCP Servers → Gateway Policies. Every tool call through the MCP Gateway is logged to the audit trail. See /docs/mcp-servers/index.
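The allowlist above amounts to a default-deny check. The sketch below models that behaviour for reasoning and testing purposes only. The real policy is configured in the Gateway, and the `is_tool_allowed` function and its `write_enabled` parameter are hypothetical.

```python
READ_ONLY_TOOLS = frozenset({
    "get-issue", "list-issues", "get-pull-request",
    "get-pull-request-diff", "get-file-contents", "list-commits",
})
WRITE_TOOLS = frozenset({
    "create-issue-comment", "add-labels", "create-pull-request-review",
})


def is_tool_allowed(tool, trust="medium", write_enabled=False):
    """Model the allowlist: reads pass, writes need opt-in plus trust 'high'."""
    if tool in READ_ONLY_TOOLS:
        return True
    if tool in WRITE_TOOLS:
        # Write tools are only allowed when explicitly added to the policy
        # (write_enabled) and the calling step runs at trust: high.
        return write_enabled and trust == "high"
    return False  # anything not listed is denied by default
```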
Recommended workflow design
Variant A — Issue triage:
[Trigger: webhook — GitHub issue created]
↓
[Data: read issue — GitHub MCP (read-only)]
↓
[Agent: classify type + severity — reasoning category]
↓
[Logic: is severity P1 or P2?]
↓ Yes                          ↓ No
[Action: create Jira ticket]   [Action: apply labels + team]
↓                              ↓
[Notify: Slack team channel]
↓
[Storage: triage decision + run ID]
Variant B — PR summary:
[Trigger: webhook — GitHub PR opened]
↓
[Data: read diff + commits — GitHub MCP (read-only)]
↓
[Agent: generate summary — coding category]
↓
[Agent: identify risk areas — coding category]
↓
[Logic: risk areas found?]
↓ Yes                           ↓ No
[Approval: security-review@     [Action: post PR comment]
 risk: medium, SLA: 240min]     ↓
↓ approved                      [Storage: audit log]
[Action: post PR comment]
↓
[Storage: audit log]
Human approval rules
Approval required for:
- Any PR merge
- Production deploys (regardless of CI passing)
- Closing or modifying issues filed by external contributors
- Any security-related label change
- Access to production secrets vault values
No approval required for:
- Reading GitHub issues, PRs, and diffs (read-only operations)
- Posting automated PR summaries labelled "AI-generated"
- Creating Jira tickets
- Sending Slack notifications
- Drafting release notes (approval required before publishing)
- Applying non-security labels to issues the agent triaged
For all action steps that write to GitHub (posting comments, applying labels), set the step's trust to medium. For any write that touches security-related data or configuration, set trust to high and add an approval step upstream.
Example approval for a high-risk PR summary:
action: "Post PR summary flagging auth-related changes"
summary: "PR #847 modifies the session token validation logic. Risk areas identified: authentication flow, token expiry handling. Security review recommended before merge."
risk: medium
slaMinutes: 240
assignees:
  - [email protected]
rationale: "Changes detected in auth/session.ts and middleware/token-validator.ts. These files are in the high-risk path list."
confidence: 0.85
evidence:
  - label: "Files changed"
    value: "auth/session.ts, middleware/token-validator.ts"
    tone: amber
  - label: "Risk category"
    value: "Authentication"
    tone: red
  - label: "PR author"
    value: "external-contributor"
    tone: amber
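The `slaMinutes` field implies a deadline after which the approval counts as breached. How the platform computes this internally is not described here, so the sketch below is only an assumption about the arithmetic, with hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def sla_deadline(created_at: datetime, sla_minutes: int) -> datetime:
    """Deadline by which an approver is expected to respond."""
    return created_at + timedelta(minutes=sla_minutes)


def is_sla_breached(created_at: datetime, sla_minutes: int, now: datetime) -> bool:
    """True once 'now' is strictly past the SLA deadline."""
    return now > sla_deadline(created_at, sla_minutes)


# Example: a request created at 09:00 UTC with slaMinutes: 240 is due at 13:00 UTC.
created = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
due = sla_deadline(created, 240)
```

Using timezone-aware timestamps avoids off-by-hours errors when approvers and the workflow run in different regions.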
Security and permission model
GitHub scopes — principle of least privilege:
| Scope | Required for | Notes |
|---|---|---|
| issues:read | Variants A, B, D | Read-only; always permitted |
| pull-requests:read | Variants B, C | Read-only; always permitted |
| contents:read | Variants B, D | Read-only; always permitted |
| issues:write | Variant A (labels) | Needs approval gate on action step |
| pull-requests:write | Variant B (PR comment) | Needs approval gate on action step |
Secret and credentials protection:
Configure the MCP Gateway policy to deny tool calls that read file paths matching these patterns:
Deny file read for paths matching:
- .env
- .env.*
- *.key
- *.pem
- *.p12
- secrets/
- .github/
- config/credentials*
Test this policy in a staging repo with a known .env file before enabling in production.
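Before the staging test, it can help to reason about which paths the deny list should catch. The sketch below is one plausible interpretation of the patterns, not the Gateway's actual matching algorithm. It assumes that patterns ending in `/` match a directory name anywhere in the path, that patterns containing `/` match the full relative path, and that all others match the file name. The `is_denied` helper is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

DENY_PATTERNS = [
    ".env", ".env.*", "*.key", "*.pem", "*.p12",
    "secrets/", ".github/", "config/credentials*",
]


def is_denied(path: str) -> bool:
    """Return True if a repository-relative path matches any deny pattern."""
    p = PurePosixPath(path)
    for pattern in DENY_PATTERNS:
        if pattern.endswith("/"):
            # Directory pattern: deny if any parent directory has this name.
            if pattern.rstrip("/") in p.parts[:-1]:
                return True
        elif "/" in pattern:
            # Path pattern: match against the full relative path.
            if fnmatchcase(str(p), pattern):
                return True
        elif fnmatchcase(p.name, pattern):
            # File-name pattern: match against the final path component.
            return True
    return False
```

A list of expected results from a helper like this can double as the test plan for the staging repository: each path it denies should also be blocked by the Gateway.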
Roles:
| Role | Can do |
|---|---|
| admin | Create and modify agent and workflow, update system prompts, publish to production, modify MCP Gateway policy |
| editor | Modify workflow steps, update skill parameters, view run debugger |
| viewer | View run history and agent outputs; cannot modify configuration |
| approvers group | Action approval requests for security-flagged PR summaries and P1 triage escalations |
Audit events emitted: run.started, run.completed, run.failed, approval.granted, approval.rejected, approval.sla_breach, agent.updated, connection.accessed.
Evaluation checklist
- Agent PR summaries tested on 20 historical PRs — accuracy reviewed by a senior engineer who knows those PRs
- GitHub MCP Gateway policy restricts the agent to read-only tools by default; verify this in the policy configuration, not just the README
- No production secrets are accessible to the agent — verify via the MCP Gateway policy deny rules in a staging test
- High-risk code changes (auth, payments, infra file paths) reliably trigger the approval step — test with three known high-risk PRs
- Run debugger shows which files were read and what the agent reasoned — no black-box steps
- Jira ticket quality reviewed against 10 manually created tickets — field completeness and description accuracy
- Snyk integration (if used) tested with a known-vulnerable dependency — confirm the agent surfaces it correctly
- Issue triage severity classifications compared against engineer-assigned severity for 20 historical issues
Rollout plan
Weeks 1–2 (shadow mode — issue triage only): Run Variant A in shadow mode. Agent classifies issues but takes no action. Compare agent triage to engineer triage daily. Track false positive rate on severity classification.
Weeks 3–4 (live issue triage): Enable label application and Jira ticket creation for Variant A. Monitor output daily. Keep the calibration datastore running. Review any P1/P2 escalations immediately.
Month 2 (PR summaries): Enable Variant B on one low-traffic repository. Start with read-only summary posting — no risk-area gating yet. Collect engineer feedback via Slack or PR comment reactions.
Month 3 (release notes): Enable Variant C with manual trigger. Human approval required before every publish. This is a low-risk workflow to build trust in the agent's output quality.
Month 4 and beyond (build failure analysis): Enable Variant D if CI integration is stable and the engineering team has built confidence from the first three variants. Track time-to-diagnosis metrics.
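The shadow-mode comparison in weeks 1–2 can be reduced to a few rates computed from the calibration datastore. The sketch below assumes each record is a pair of agent-assigned and engineer-assigned severity, and treats over-triage (agent more severe than the engineer) as the false positive case. Both the record shape and that definition are assumptions to adapt to your own tracking.

```python
RANK = {"P1": 1, "P2": 2, "P3": 3, "P4": 4}  # lower rank = more severe


def severity_agreement(pairs):
    """pairs: list of (agent_severity, engineer_severity) tuples."""
    total = len(pairs)
    if total == 0:
        return {"agreement": None, "over_triaged": None, "under_triaged": None}
    agree = sum(1 for agent, eng in pairs if agent == eng)
    over = sum(1 for agent, eng in pairs if RANK[agent] < RANK[eng])
    under = sum(1 for agent, eng in pairs if RANK[agent] > RANK[eng])
    return {
        "agreement": agree / total,
        "over_triaged": over / total,    # agent rated it more severe
        "under_triaged": under / total,  # agent rated it less severe
    }
```

Under-triage is usually the more dangerous direction, since it corresponds to the "marks P1 as P3" failure mode, so it is worth tracking separately rather than folding into a single accuracy number.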
Common failure modes
Agent misclassifies severity (marks P1 as P3)
High-severity issues are under-triaged and sit in the backlog.
Mitigation: add a severity keyword list to the system prompt (e.g. "production outage", "data loss", "security vulnerability" → always P1). Require the agent to include a confidence score; route low-confidence triage to a human reviewer. Track misclassification rate in the calibration datastore.
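The keyword mitigation above lives in the system prompt. A complementary option, sketched here as an assumption rather than a platform feature, is a deterministic post-check that forces P1 whenever one of the listed phrases appears in the issue text. The `apply_keyword_override` helper is hypothetical.

```python
# Phrases taken from the mitigation above; extend with your own list.
P1_KEYWORDS = ("production outage", "data loss", "security vulnerability")


def apply_keyword_override(issue_text: str, agent_severity: str) -> str:
    """Return 'P1' if any P1 keyword appears, else the agent's severity."""
    text = issue_text.lower()
    if any(keyword in text for keyword in P1_KEYWORDS):
        return "P1"
    return agent_severity
```

A plain substring match will over-trigger on phrases such as "no data loss observed", which is acceptable here because the override can only raise severity, never lower it.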
Agent posts an inaccurate PR summary
Engineers act on an incorrect summary and miss a review concern.
Mitigation: always label summaries "AI-generated" and prompt readers to review before acting. Monitor PR comment reactions for corrections. Review the run debugger when an engineer flags an inaccurate summary. Revise the system prompt based on flagged examples.
GitHub write action fails due to scope
The action step that posts a PR comment or applies labels fails because the GitHub connection lacks write scope.
Mitigation: test all write operations in a sandbox repository with the same connection before enabling in production. Use the run debugger to verify each action step's output. Check the connection configuration for the correct OAuth scopes.
Agent reads secrets from repository via MCP GitHub tool
The agent uses the get-file-contents tool to read a .env file or credentials file.
Mitigation: configure the MCP Gateway deny list for file path patterns before enabling the GitHub MCP server. Test the deny rules in staging: attempt to read a .env file and verify the Gateway blocks it. Do not rely on the system prompt alone to prevent this — use the Gateway policy as the enforcement layer.
ROI assumptions
The table below uses illustrative assumptions. Replace with your team's actual values. Track build-failure diagnosis time savings post-implementation rather than estimating.
| Input | Illustrative value |
|---|---|
| Issues triaged per week | 80 |
| Minutes per issue triage — current (engineer time) | 8 |
| Minutes per issue triage — with agent (review + confirm) | 2 |
| PRs per week | 40 |
| Minutes per PR summary — current (reviewer context-loading) | 6 |
| Minutes per PR summary — with agent (review AI summary) | 1 |
| Loaded hourly cost (senior engineer) | $120 |
| Build-failure diagnosis time saved | Track post-implementation |
At these assumptions: issue triage savings = 80 issues/week × (8 − 2) min × 52 weeks / 60 × $120 = approximately $49,920 / year. PR summary savings = 40 PRs/week × (6 − 1) min × 52 weeks / 60 × $120 = approximately $20,800 / year. Combined illustrative annual value: approximately $70,000+, before build failure analysis savings.
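The arithmetic above can be reproduced with a short script, which makes it easy to substitute your own inputs. The `annual_savings` function name and the 52-week default are illustrative. The input values are the ones from the table.

```python
def annual_savings(items_per_week, minutes_before, minutes_after,
                   hourly_cost, weeks_per_year=52):
    """Annual value of time saved, in the same currency as hourly_cost."""
    minutes_saved = items_per_week * (minutes_before - minutes_after) * weeks_per_year
    return minutes_saved * hourly_cost / 60


triage = annual_savings(80, 8, 2, 120)     # issue triage
pr_summary = annual_savings(40, 6, 1, 120)  # PR summaries
combined = triage + pr_summary

print(f"Issue triage: ${triage:,.0f}")      # Issue triage: $49,920
print(f"PR summaries: ${pr_summary:,.0f}")  # PR summaries: $20,800
print(f"Combined:     ${combined:,.0f}")    # Combined:     $70,720
```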
Use the interactive calculator to adjust these inputs: /tools/ai-agent-roi-calculator?use_case=engineering
FAQ
Can the agent merge pull requests automatically?
No. PR merges must have a human approval step regardless of CI status or agent confidence score. This is a hard requirement — do not remove or bypass the approval gate. If you want to reduce merge friction, use the PR summary workflow to give reviewers better context, not to replace their decision.
Can the agent write code?
The agent can draft code suggestions in PR comments or issue replies. It cannot commit code to a repository without a write-capable GitHub action step, which requires an approval gate. Even with approval gates in place, committing agent-generated code requires an engineer to review the diff before approving.
How do I prevent the agent from accessing secrets in the repo?
Configure the MCP Gateway policy to deny tool calls that read file paths matching secret patterns — .env, *.key, *.pem, secrets/, .github/. Test the policy in a staging repository with a known .env file before enabling in production. Do not rely on the system prompt alone: enforce it at the Gateway policy layer.
What GitHub permissions does the agent need?
Read-only for most workflows: issues:read, pull-requests:read, contents:read. Write permissions — issues:write, pull-requests:write — only where explicitly needed (label application, PR comment posting), with approval gates on those action steps. Never grant repo:admin, deploy keys, or workflow write scopes.
Will the agent slow down our engineering team?
Poorly configured: yes. Well configured: no. The agent handles classification and summary work asynchronously — engineers review the output in their own time rather than waiting for the agent. Start with issue triage, the lowest-disruption variant. Measure time-to-triage and false positive rate for the first two weeks before expanding to PR summaries or build failure analysis.
Related pages
- Agents
- MCP Servers
- Approvals
- Skills
- Connections
- Bus
- Audit Log
- Knowledge Base Agent Playbook
- Compliance Review Agent Playbook
- Tool Permission Matrix — scope the GitHub MCP tools and write-capable actions to the correct steps
- System Prompt Template — starting-point structure for the triage and PR summary agents