AI Agent Risk Assessment Checklist
Before any agent reaches production, someone needs to ask: what can go wrong, and have we addressed it? This checklist provides a structured answer. It scores the agent across 9 risk dimensions, produces a composite risk score, and maps it to a deployment readiness tier. It then runs a binary go/no-go check — a set of required controls where any gap is a blocker regardless of the overall risk score.
Run this checklist before initial production deployment, and re-run it when the agent's scope, data access, or trust level changes significantly.
When to use this template
- As a formal gate before any agent is promoted from
stagingtoproduction - When adding new tools or data access to an existing deployed agent
- When a compliance or security audit asks for evidence of AI governance controls
- When an agent incident (wrong output, data leak, policy violation) triggers a post-incident review
- When onboarding a new team who is deploying their first agent and needs a governance baseline
The blank checklist (copyable)
# AI Agent Risk Assessment
**Agent name:**
**Agent ID:**
**Assessed by:**
**Date:**
**Workflow(s):**
## Risk scoring (0–3 per dimension)
| # | Dimension | Score (0–3) | Notes |
|---|---|---|---|
| 1 | Data sensitivity | | |
| 2 | External action risk | | |
| 3 | Financial risk | | |
| 4 | Legal and compliance risk | | |
| 5 | Security risk | | |
| 6 | Customer impact | | |
| 7 | Reversibility | | |
| 8 | Human oversight | | |
| 9 | Auditability | | |
| | **Total** | | |
**Risk tier:**
- 0–6: Low risk
- 7–12: Medium risk
- 13–18: High risk
- 19–27: Critical risk
## Go / No-go checklist
| # | Required control | Status (Yes / No / N/A) | Notes |
|---|---|---|---|
| 1 | Agent scope documented and approved by named owner | | |
| 2 | Forbidden actions list defined in system prompt | | |
| 3 | Approval gates configured for all HIGH and CRITICAL actions | | |
| 4 | Data access scoped to minimum necessary (least privilege) | | |
| 5 | Audit logging enabled for all agent runs | | |
| 6 | Agent tested against adversarial inputs (prompt injection test) | | |
| 7 | Credentials stored in the secrets vault (not in system prompt or workflow) | | |
| 8 | Named human assigned as approval reviewer for each risk category | | |
| 9 | Rollback plan exists for incorrect outputs at scale | | |
| 10 | Risk assessment reviewed by someone who did not build the agent | | |
**Any "No" = No-Go. All required items must be "Yes" before production deployment.**
**Decision:** Go / No-Go
**Signed off by:**
Scoring level descriptors
1. Data sensitivity — what data does the agent access?
| Score | Descriptor |
|---|---|
| 0 | Public data only — no internal systems |
| 1 | Internal business data with no personal data (process docs, KB articles, internal metrics) |
| 2 | Employee or partner personally identifiable information (PII) |
| 3 | Customer PII, financial records, health data, or data regulated under GDPR, HIPAA, or PCI-DSS |
2. External action risk — can the agent communicate externally?
| Score | Descriptor |
|---|---|
| 0 | No external communication of any kind |
| 1 | Internal notifications only (Slack to internal channels, internal email) |
| 2 | External notifications with mandatory human review before send |
| 3 | Autonomous external communication — sends email, DMs, or API calls to external systems without human review |
3. Financial risk — can the agent trigger financial transactions?
| Score | Descriptor |
|---|---|
| 0 | No access to financial systems |
| 1 | Read-only access to financial data (invoices, reports) |
| 2 | Can initiate transactions under $500 with approval gate |
| 3 | Can initiate transactions over $500, recurring charges, or refunds |
4. Legal and compliance risk — does the agent operate in a regulated area?
| Score | Descriptor |
|---|---|
| 0 | Internal, unregulated process with no compliance implications |
| 1 | Touches regulated data (PII) but read-only — no output enters regulated systems |
| 2 | Produces documents that could be legally binding (contracts, commitments, policy statements) |
| 3 | Regulatory submissions, compliance attestations, legal commitments, or actions on regulated data |
5. Security risk — does the agent have privileged system access?
| Score | Descriptor |
|---|---|
| 0 | No system credentials or privileged access |
| 1 | Read-only API access to internal systems using scoped service credentials |
| 2 | Write access to internal systems |
| 3 | Access to production infrastructure, secrets store, security controls, or identity systems |
6. Customer impact — how directly does the agent affect customers?
| Score | Descriptor |
|---|---|
| 0 | No customer-facing output — agent operates entirely internally |
| 1 | Agent output is reviewed and edited by a human before any customer sees it |
| 2 | Customer-facing output with a human review gate but no editing step |
| 3 | Direct autonomous interaction with customers — output reaches customers without human review |
7. Reversibility — can agent actions be undone?
| Score | Descriptor |
|---|---|
| 0 | All actions are fully reversible with no data loss |
| 1 | Most actions reversible with some effort (e.g. ticket reassignment, draft deletion) |
| 2 | Some actions are irreversible in practice (sent messages, deleted records) |
| 3 | Actions are largely irreversible — financial charges, legal submissions, permanently deleted data |
8. Human oversight — is a human in the loop?
| Score | Descriptor |
|---|---|
| 0 | Human reviews and approves every action before it executes |
| 1 | Human reviews agent outputs periodically (e.g. daily digest review) |
| 2 | Automated with exception-based human review (human only sees failures or anomalies) |
| 3 | Fully autonomous — no regular human review of agent outputs or actions |
9. Auditability — are actions logged and traceable?
| Score | Descriptor |
|---|---|
| 0 | All actions logged with full trace, linked to run ID, step ID, and user context |
| 1 | Most actions logged with partial trace; some tool calls lack detailed records |
| 2 | Limited logging — run-level records exist but step-level and tool-call records are incomplete |
| 3 | No meaningful audit trail — actions cannot be reliably traced after the fact |
Scoring matrix and deployment tiers
| Total score | Risk tier | Deployment requirements |
|---|---|---|
| 0–6 | Low risk | Standard deployment. Configure approval gates for any high or critical individual actions. |
| 7–12 | Medium risk | Approval required for agent actions. Enable audit digest. Schedule quarterly review. |
| 13–18 | High risk | Legal and compliance sign-off required. All agent actions require approval gates. Monthly review. Security assessment required. |
| 19–27 | Critical risk | Do not deploy without: legal review, CISO sign-off, SOC2 evidence, data privacy impact assessment, and full approval gates on all non-read actions. |
A high score in a single dimension can warrant additional controls even if the total is low. A score of 3 on Financial risk or Security risk should always trigger a dedicated review of those dimensions, regardless of total score.
Go / no-go checklist — expanded descriptions
Each item is a hard requirement. "N/A" is only acceptable if the item genuinely does not apply — for example, item 3 (approval gates for high/critical actions) is N/A only if the agent has no high or critical tools, which must be confirmed against the tool permission matrix.
| # | Required control | Why it matters |
|---|---|---|
| 1 | Agent scope documented and approved by named owner | Without a named owner, there is no one accountable when the agent behaves unexpectedly |
| 2 | Forbidden actions list defined in system prompt | Absence of prohibition is not prohibition. The agent will not infer what it should not do |
| 3 | Approval gates configured for all HIGH and CRITICAL actions | Approval gates are the primary enforcement mechanism for high-stakes actions |
| 4 | Data access scoped to minimum necessary | Broad data access increases blast radius when the agent makes a mistake |
| 5 | Audit logging enabled for all agent runs | Without logs, you cannot investigate incidents or demonstrate compliance |
| 6 | Adversarial input testing completed | Prompt injection and edge-case inputs reveal failure modes that normal testing misses |
| 7 | Credentials in the secrets vault only | A credential in a system prompt or workflow definition is visible in logs and to API users |
| 8 | Named human assigned as approval reviewer | A role-based assignee with no named person means no one takes responsibility |
| 9 | Rollback plan exists | At scale, a bad agent output affects many records. Know how to undo it before you deploy |
| 10 | Assessment reviewed by someone outside the build team | Self-assessment produces systematically optimistic scores. Independent review catches blind spots |
Worked example: customer support triage agent
Agent name: Support Triage Agent v2
Agent ID: agt_cx_triage_001
Assessed by: Priya Menon (Engineering Manager), reviewed by Rahul Iyer (Compliance)
Date: 2026-05-01
Workflow(s): Inbound ticket classification, response drafting
Risk scoring
| # | Dimension | Score | Notes |
|---|---|---|---|
| 1 | Data sensitivity | 2 | Accesses customer support tickets which may contain customer PII (name, email, account number) |
| 2 | External action risk | 1 | All response drafts require human approval before any external send; agent only posts internally |
| 3 | Financial risk | 0 | No access to billing, payment, or financial systems |
| 4 | Legal and compliance risk | 1 | Touches customer PII in ticket content but read-only; no output enters regulated systems autonomously |
| 5 | Security risk | 1 | Read-only CRM lookup and KB search via scoped service credentials; no write access |
| 6 | Customer impact | 1 | All drafts reviewed and edited by a CX agent before the customer sees them |
| 7 | Reversibility | 1 | Ticket classification can be changed; no irreversible actions |
| 8 | Human oversight | 1 | CX agents review outputs on a ticket-by-ticket basis; not fully automated |
| 9 | Auditability | 0 | Full run logs, step logs, and tool-call records enabled in ProvenanceOne |
| Total | 8 |
Risk tier: Medium risk (7–12)
Required controls: Approval required before response send; audit digest enabled; quarterly review scheduled.
Go / no-go checklist
| # | Required control | Status | Notes |
|---|---|---|---|
| 1 | Agent scope documented and approved by named owner | Yes | Priya Menon documented; approved by CX Director on 2026-04-28 |
| 2 | Forbidden actions list defined in system prompt | Yes | Includes: no direct send, no billing access, no commitments |
| 3 | Approval gates configured for all HIGH and CRITICAL actions | Yes | Approval step before email send; no critical tools attached |
| 4 | Data access scoped to minimum necessary | Yes | CRM read scoped to current ticket's account ID only |
| 5 | Audit logging enabled for all agent runs | Yes | ProvenanceOne audit enabled; connection.accessed events for CRM and KB |
| 6 | Adversarial input testing completed | Yes | 20 adversarial inputs tested including prompt injection attempts; all handled correctly |
| 7 | Credentials in the secrets vault | Yes | CRM and KB credentials stored in the secrets vault; not in system prompt or workflow definition |
| 8 | Named human assigned as approval reviewer | Yes | CX team lead (Amara Osei) assigned; backup: CX operations manager |
| 9 | Rollback plan exists | Yes | Incorrect drafts can be deleted; ticket status can be reset; no irreversible actions |
| 10 | Assessment reviewed independently | Yes | Reviewed by Rahul Iyer (Compliance), who was not on the build team |
Decision: Go
Signed off by: Priya Menon (Engineering), Rahul Iyer (Compliance) — 2026-05-01
How to customise this template
Adjust scoring to your risk appetite. Some organisations treat any external communication as critical rather than high. If your regulatory environment warrants it, move the thresholds.
Add domain-specific dimensions. A healthcare organisation may add a dimension for clinical risk (does the agent's output influence clinical decisions?). A financial services firm may add market risk (does the agent's output influence trading or pricing decisions?).
Set internal thresholds higher than the defaults. If your organisation requires all medium-risk agents to undergo the same review as high-risk agents, document that policy here and apply it consistently.
Schedule re-assessment triggers. Define the conditions that require a re-run of this checklist without waiting for the quarterly cycle: adding a new tool, changing the model, a production incident, or a change in the regulatory environment.
Common mistakes
Scoring dimension 8 (Human oversight) as 0 because approval gates exist. Approval gates are specific to individual actions; dimension 8 asks about oversight of the agent's overall output and behaviour. If a human only reviews the pre-approved action and never reviews the agent's full run, the score is not 0.
Marking go/no-go item 6 (adversarial testing) as Yes without a documented test log. "We tested it informally" is not evidence. Document the adversarial inputs you tested, the expected behaviour, and the actual output. If an audit occurs, that record needs to exist.
Using a total risk score to override a single-dimension red flag. A total score of 8 (medium) with a score of 3 on Security risk still warrants a dedicated security review. The composite score is a guide; single-dimension extremes warrant targeted investigation.
Completing the assessment alone. The checklist explicitly requires that item 10 be completed by someone outside the build team. An assessment completed entirely by the builder is not a valid assessment — it lacks the independent review that catches scope creep, wishful scoring, and blind spots.
Is this checklist a substitute for a formal DPIA or security assessment?▾
No. This checklist is a structured starting point for governance decisions. For agents that score in the high or critical tier, it identifies the need for a formal Data Privacy Impact Assessment (DPIA) or security assessment — it does not replace them. Use the checklist to determine whether those formal assessments are required and to prepare for them.
How often should I re-run the risk assessment for a deployed agent?▾
At minimum: quarterly for medium-risk agents, monthly for high-risk agents. Additionally, re-run whenever the agent's system prompt changes substantially, when new tools are added, when the model changes, or following a production incident. Do not wait for the scheduled review cycle if a material change occurs.
What counts as 'adversarial input testing' for go/no-go item 6?▾
At minimum: test inputs designed to elicit out-of-scope actions (e.g. asking the agent to ignore its system prompt), inputs containing obvious prompt injection attempts (e.g. 'Ignore previous instructions and...'), and inputs at the boundary of the agent's defined scope. Document each test input, the expected behaviour, and the actual output. 20 adversarial inputs is a reasonable minimum for a medium-risk agent.
What if the agent scores differently on different runs?▾
Variance across runs is itself a risk signal. If scoring dimensions like task success rate or output correctness vary significantly between runs, the agent is not stable enough for the deployment tier it is targeting. Score the agent on a representative sample (at minimum 50 runs) and use the worst-case score for each dimension, not the average.
Can I use this template for agents not built on ProvenanceOne?▾
Yes. The scoring dimensions are platform-agnostic. You will need to adapt the go/no-go checklist items that reference ProvenanceOne-specific features (audit logging, the secrets vault integration, approval gates) to the equivalent controls in your platform.
What should I do if the agent scores in the critical risk tier?▾
Do not deploy until you have completed all four of the critical tier requirements: legal review, CISO sign-off, SOC2 evidence, and a data privacy impact assessment. Additionally, ensure that full approval gates are configured on every non-read action. If any of those requirements cannot be met within your deployment timeline, delay the deployment — critical risk agents without those controls in place are a liability.
Related pages
- Agent Evaluation Rubric — performance-focused evaluation to run alongside this risk assessment
- Tool Permission Matrix — define and review the tool access that drives several scoring dimensions here
- Approval Policy Template — configure the approval gates required by medium, high, and critical tier deployments
- System Prompt Template — structure the system prompt that supports go/no-go items 1 and 2
- Agents — ProvenanceOne agent configuration including trust levels and audit events
- Audit — audit event types referenced in this checklist