Human in the Loop AI Agents

Human in the loop (HITL) means a human must review and approve an AI agent's proposed action before it executes. HITL is not an all-or-nothing decision: the goal is to apply human review at the right points in a workflow — the actions that carry enough risk, ambiguity, or consequence to justify the overhead. Applied correctly, HITL makes agents safer to deploy at scale without removing the efficiency that automation provides.


Definitions

Human in the loop (HITL) : A pattern in which a human reviews and approves a proposed action before an AI agent executes it. HITL can be applied at specific decision points rather than to every agent output.

Approval step : A step kind in a ProvenanceOne workflow that pauses execution and routes a decision to a designated reviewer. The run remains in approval status until the reviewer approves or rejects.

Evidence : Structured data items displayed to the approver to support their decision. Each evidence item has a label, a value, and a visual tone (slate, amber, red, emerald, blue) that signals significance.

SLA (service-level agreement) : The time limit in minutes for an approval decision. When the SLA expires, ProvenanceOne emits an approval.sla_breach event, which triggers notifications and optionally escalation.

Delegated grant : A record granting one person approval authority on behalf of another for a defined scope — useful for vacation coverage or cross-team authority.


Three HITL patterns

Not every agent action warrants the same level of oversight. Three patterns cover most production use cases:

Pattern 1: Draft-only mode

The agent produces an output — a draft email, a report, a support reply — but takes no action. A human reviews the output and acts manually.

This is the lowest-overhead pattern and is appropriate when:

  • The workflow is in early deployment and agent reliability is not yet established
  • The action frequency is low and the output is always reviewed before use
  • The action is high-stakes enough that no automation is acceptable yet

The tradeoff: draft-only mode captures little of the efficiency benefit of automation. It reduces the work of producing the first draft, but a human still performs every action.

Pattern 2: Approval gate

The agent reaches a decision point and pauses. Execution resumes only after a designated reviewer approves. If the reviewer rejects, the run ends or takes an alternative path. The reviewer can edit the proposed action payload before approving.

Use approval gates for medium-to-high risk actions where automation is desired but unreviewed execution is not acceptable. This is the most common HITL pattern in production agentic systems.

Pattern 3: Exception-based oversight

The agent acts autonomously for routine cases. It escalates to a human only when confidence is below a threshold or specific conditions are met (for example, a deal size above a limit, or a data change affecting more than a defined number of records).

This is the most efficient pattern but requires the agent to have reliable self-assessment. Use it only when the agent has a demonstrated track record in the environment, confidence scores are well-calibrated, and escalation conditions are clearly defined and tested.
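
The escalation rule described above can be sketched as a small function. This is a minimal illustration, not ProvenanceOne's implementation: the thresholds, the `ProposedAction` fields, and the reason strings are all hypothetical and should be replaced with values from your own policy.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds for illustration only; tune them to your own policy.
CONFIDENCE_THRESHOLD = 0.85
DEAL_SIZE_LIMIT = 50_000
RECORD_LIMIT = 100


@dataclass
class ProposedAction:
    confidence: float
    deal_size: float = 0.0
    records_affected: int = 0


def escalation_reasons(action: ProposedAction) -> List[str]:
    """Return every reason this action should go to a human (empty if none)."""
    reasons = []
    if action.confidence < CONFIDENCE_THRESHOLD:
        reasons.append("low_confidence")
    if action.deal_size > DEAL_SIZE_LIMIT:
        reasons.append("deal_size_above_limit")
    if action.records_affected > RECORD_LIMIT:
        reasons.append("too_many_records")
    return reasons


def needs_human(action: ProposedAction) -> bool:
    """The agent acts autonomously only when no escalation reason applies."""
    return bool(escalation_reasons(action))
```

Returning the list of reasons, rather than a bare boolean, lets the escalated request tell the reviewer why it was escalated.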


When is HITL required?

Use this table as a starting point for your organisation's approval policy. Adjust risk levels based on your specific context.

| Action type | Recommended HITL pattern | Risk level |
| --- | --- | --- |
| Draft generation (no send) | None (draft-only output) | low |
| Internal notification | None or exception-based | low |
| CRM field update | Exception-based (above defined deal size) | low–medium |
| External email send | Approval gate | medium |
| Account modification | Approval gate | high |
| Financial transaction | Approval gate | critical |
| Legal commitment | Approval gate, legal reviewer assigned | critical |
| Production system change | Approval gate, technical reviewer assigned | critical |
| Data deletion | Approval gate | high |
| Access change | Approval gate, security reviewer assigned | high |

The risk level on an approval step informs the visual display to reviewers and drives escalation behaviour. A critical risk approval should never auto-approve under any condition.
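
One way to make a policy like this enforceable is to encode it as a lookup that workflow code consults before acting. The sketch below is an assumption-laden illustration: the action-type keys, the pattern strings, and the fallback for unknown action types are hypothetical names chosen here, not ProvenanceOne configuration.

```python
from typing import Dict, Tuple

# (pattern, risk) per action type, following the table above.
# Keys and pattern names are hypothetical identifiers for this sketch.
POLICY: Dict[str, Tuple[str, str]] = {
    "draft_generation": ("none", "low"),
    # The table allows "none or exception-based"; the stricter option is used here.
    "internal_notification": ("exception_based", "low"),
    "crm_field_update": ("exception_based", "low-medium"),
    "external_email_send": ("approval_gate", "medium"),
    "account_modification": ("approval_gate", "high"),
    "financial_transaction": ("approval_gate", "critical"),
    "legal_commitment": ("approval_gate", "critical"),
    "production_system_change": ("approval_gate", "critical"),
    "data_deletion": ("approval_gate", "high"),
    "access_change": ("approval_gate", "high"),
}

# Assumption: an action type missing from the policy gets the strictest treatment.
DEFAULT_POLICY: Tuple[str, str] = ("approval_gate", "critical")


def required_pattern(action_type: str) -> Tuple[str, str]:
    """Look up the HITL pattern and risk level for an action type."""
    return POLICY.get(action_type, DEFAULT_POLICY)
```

Defaulting unknown action types to an approval gate is a design choice: a new, unclassified action fails safe instead of running unreviewed.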


Anatomy of a good approval request

An approval request is only as useful as the information it presents. A reviewer who lacks context is likely to approve without genuine review — which provides the appearance of oversight without the substance.

A well-designed approval request gives the reviewer:

  1. The specific action proposed, in plain language, not technical jargon. "Send email to [email protected] with subject line: Q3 partnership proposal" — not "invoke sendgrid.sendEmail".
  2. The agent's rationale for proposing the action: what it inferred and why it determined this action was appropriate.
  3. Evidence: the key data points that support the decision, displayed with visual tones that indicate significance: emerald for positive signals, amber for caution, red for risk factors, slate for neutral context, and blue for informational items.
  4. Risk classification and SLA deadline: what level of risk this action carries and how long the reviewer has to decide.
  5. Upstream context: what triggered this run, what previous steps produced, and how the proposed action fits in the broader workflow.
  6. An editable payload: the reviewer should be able to modify the proposed action — adjust a dollar amount, correct a recipient, edit a message — before approving. An approval that cannot be edited often results in blanket rejections when the right action is "approve with a small correction".

In ProvenanceOne, approval step fields include: action, summary, rationale, confidence, risk, slaMinutes, evidence (array of label/value/tone objects), an editable payload, and the upstream context from previous steps in the run. See Evidence and risk levels for configuration details.
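
To make the shape of these fields concrete, here is a minimal sketch of an approval request as a data structure. The field names follow the list above (with `sla_minutes` standing in for `slaMinutes`), but the classes and validation rules are assumptions for illustration, not ProvenanceOne's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

TONES = {"slate", "amber", "red", "emerald", "blue"}
RISKS = {"low", "medium", "high", "critical"}


@dataclass
class Evidence:
    label: str
    value: str
    tone: str = "slate"

    def __post_init__(self) -> None:
        if self.tone not in TONES:
            raise ValueError(f"unknown tone: {self.tone!r}")


@dataclass
class ApprovalRequest:
    # Hypothetical container mirroring the fields named in the text.
    action: str
    summary: str
    rationale: str
    confidence: float
    risk: str
    sla_minutes: int  # stands in for slaMinutes
    evidence: List[Evidence] = field(default_factory=list)
    payload: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.risk not in RISKS:
            raise ValueError(f"unknown risk level: {self.risk!r}")
        if self.sla_minutes <= 0:
            raise ValueError("sla_minutes must be positive")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


request = ApprovalRequest(
    action="Update payment terms for Vendor X",
    summary="Change terms from Net-30 to Net-60",
    rationale="Contract amendment CON-4821 specifies Net-60 terms",
    confidence=0.71,
    risk="high",
    sla_minutes=120,
    evidence=[
        Evidence("Contract reference", "CON-4821", "emerald"),
        Evidence("Current terms", "Net-30", "slate"),
        Evidence("Agent confidence", "0.71", "amber"),
    ],
    payload={"vendor": "Vendor X", "payment_terms": "Net-60"},
)
```

Validating at construction time means a request with a missing SLA or an unknown tone fails before it reaches a reviewer.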


SLA configuration and escalation

Every approval request must have a defined deadline. An approval with no SLA leaves a run in approval status indefinitely — which blocks the workflow, consumes resources, and creates a backlog that reviewers stop monitoring.

Guidelines for SLA configuration:

  • Set the SLA to the maximum time the business can tolerate waiting for the action. For a time-sensitive customer communication, this might be 30 minutes. For a quarterly financial adjustment, it might be 48 hours.
  • For critical risk actions, use a short SLA with aggressive escalation, not a long SLA with no escalation.
  • On SLA breach: notify the primary reviewer again, escalate to a secondary reviewer, or auto-reject. Never auto-approve a high or critical risk action when the SLA expires.

In ProvenanceOne, an SLA monitoring process runs every five minutes. When it finds an expired SLA, it emits an approval.sla_breach event. Because the check is periodic, a breach can be detected up to five minutes after the deadline passes. Configure notification channels for this event in Notification preferences so the breach reaches the right people as soon as it is detected.
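
The breach logic in the guidelines above can be sketched in a few lines. This is an illustrative model only: the function names and the returned action strings are hypothetical, and it does not describe how ProvenanceOne's monitor is implemented.

```python
from datetime import datetime, timedelta, timezone


def sla_deadline(created_at: datetime, sla_minutes: int) -> datetime:
    """The moment at which the approval request breaches its SLA."""
    return created_at + timedelta(minutes=sla_minutes)


def is_breached(created_at: datetime, sla_minutes: int, now: datetime) -> bool:
    return now >= sla_deadline(created_at, sla_minutes)


def breach_action(has_secondary_reviewer: bool) -> str:
    """Choose a response to a breach; this function never returns an approval.

    Assumption for this sketch: escalate when a secondary reviewer exists,
    otherwise reject so the run does not stay blocked.
    """
    return "escalate" if has_secondary_reviewer else "auto_reject"


created = datetime(2026, 1, 1, 9, 0, tzinfo=timezone.utc)
still_open = is_breached(created, 30, created + timedelta(minutes=29))
expired = is_breached(created, 30, created + timedelta(minutes=30))
```

Keeping auto-approval out of the set of possible return values makes the rule "never auto-approve on breach" structural instead of a convention.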


Reviewer roles and assignment

Not everyone should approve every type of action. Vague assignment ("any admin can approve") leads to approvals sitting unreviewed because each admin assumes someone else will handle it.

Assign approvals by action type:

  • Financial transactions → finance team lead or finance manager
  • Legal commitments → legal counsel
  • Production system changes → senior engineer or SRE on call
  • Access changes → security team lead
  • Customer communications → account owner or customer success manager

Use specific assignees on approval steps rather than relying on the approvers platform group for all approvals. The approvers group is appropriate for cross-workflow approval authority — a pool of people who can approve any pending request — but for specific action types, named assignment ensures the right reviewer is notified.

In ProvenanceOne, the assignees field on an approval step accepts one or more email addresses. Members of the approvers platform group can approve any pending request in the workspace.
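
The eligibility rule amounts to a simple membership check. The sketch below assumes reviewers are identified by email address and compares addresses case-insensitively; the function name and the case handling are assumptions made here for illustration.

```python
from typing import Iterable


def can_approve(user_email: str, assignees: Iterable[str], approvers_group: Iterable[str]) -> bool:
    """A user may approve if named on the step or a member of the approvers group."""
    user = user_email.strip().lower()
    named = {a.strip().lower() for a in assignees}
    group = {a.strip().lower() for a in approvers_group}
    return user in named or user in group


step_assignees = ["finance.lead@example.com"]
approvers = ["ops.approver@example.com"]
```

For example, `can_approve("Finance.Lead@example.com", step_assignees, approvers)` is true because the user is a named assignee, while an address in neither collection is refused.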


Delegated grants

Delegated grants allow one person to be granted approval authority on behalf of another for a defined scope. This is useful for:

  • Vacation coverage: a finance manager going on leave grants their deputy approval authority for payment-related approvals during the absence
  • Cross-team escalation: a security lead grants approval authority to a senior engineer for infrastructure change approvals when the security team is unavailable
  • Time-limited project authority: a legal counsel delegates approval authority for a specific project or workflow to a paralegal for the project duration

In ProvenanceOne, delegated grants are managed at /approvals/delegated/:grantId. The grant includes the grantor, the grantee, the scope (which workflows or approval types the grant covers), and an expiry. See Delegated grants for configuration.
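
A grant with these four parts can be checked with a short predicate. The sketch below is a hypothetical model: the class, the representation of scope as a set of strings, and the scope values are illustrative assumptions, not the ProvenanceOne grant format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet


@dataclass(frozen=True)
class DelegatedGrant:
    grantor: str
    grantee: str
    scope: FrozenSet[str]  # workflows or approval types covered (hypothetical values)
    expires_at: datetime


def grant_covers(grant: DelegatedGrant, user: str, approval_type: str, now: datetime) -> bool:
    """True only if the user is the grantee, the type is in scope, and the grant is unexpired."""
    return (
        user == grant.grantee
        and approval_type in grant.scope
        and now < grant.expires_at
    )


grant = DelegatedGrant(
    grantor="finance.manager@example.com",
    grantee="deputy@example.com",
    scope=frozenset({"payment"}),
    expires_at=datetime(2026, 8, 15, tzinfo=timezone.utc),
)
```

All three conditions must hold, so a grant cannot be used outside its scope or after its expiry even by the correct grantee.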


Audit trail for approvals

Every approval decision is logged with the actor ID, timestamp, and the payload at the time of the decision. This is your compliance evidence that human review occurred.

Audit events logged for approvals:

  • approval.granted — reviewer approved the action; logs actor, timestamp, and final payload
  • approval.rejected — reviewer rejected the action; logs actor, timestamp, and reason if provided
  • approval.sla_breach — SLA expired without a decision; logs run ID and elapsed time
  • approval.reassigned — approval was reassigned to a different reviewer

These events are signed with HMAC-SHA256 and retained for seven years. For regulated industries, this trail provides defensible evidence that a human reviewed and approved each action, rather than an automated system acting without oversight.
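
For readers unfamiliar with HMAC-SHA256, the sketch below shows how signing and verifying an event works in general, using Python's standard `hmac` and `hashlib` modules. How ProvenanceOne serialises events and manages keys is not described here, so the JSON canonicalisation, the key, and the event fields are all assumptions for illustration.

```python
import hashlib
import hmac
import json


def sign_event(event: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    # Sorted keys and fixed separators give the same bytes for the same event.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify_event(event: dict, signature: str, key: bytes) -> bool:
    """Constant-time comparison of the expected and supplied signatures."""
    return hmac.compare_digest(sign_event(event, key), signature)


key = b"demo-key-for-illustration-only"  # hypothetical; never hard-code real keys
event = {
    "type": "approval.granted",
    "actor": "reviewer@example.com",
    "timestamp": "2026-04-12T10:15:00Z",
    "payload": {"payment_terms": "Net-60"},
}
signature = sign_event(event, key)
tampered = dict(event, actor="someone.else@example.com")
```

Any change to a signed event, such as the altered actor in `tampered`, produces a different signature, which is what makes the log tamper-evident.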

See the audit event reference for the full event structure.


Examples

Low risk — no approval needed

A support agent retrieves a FAQ answer from the knowledge base (confidence: 0.92, specific KB article cited) and sends it to the customer via the ticketing system. The answer is factual, grounded in a cited source, and no external communication or data change is involved beyond the ticket response. No HITL is required. The run completes automatically.

Medium risk — approval before sending

A sales agent drafts an outreach email to a prospect. The email references specific claims about the prospect's business inferred from research — the agent cannot be certain all claims are accurate.

The approval step presents:

  • Proposed action: send email to Head of Engineering, Acme Corp
  • Rationale: prospect recently posted about scaling infrastructure challenges; email references this
  • Evidence: { label: "Recipient", value: "Head of Engineering, Acme Corp", tone: "slate" }, { label: "Personalisation source", value: "LinkedIn + company news", tone: "emerald" }, { label: "Agent confidence", value: "0.84", tone: "emerald" }
  • Risk: medium
  • SLA: 60 minutes

The sales rep reviews the draft, edits one paragraph that overstates a claim, and approves. The email sends.

High risk — approval with evidence and editing

An operations agent proposes to update a vendor's payment terms in the ERP from Net-30 to Net-60, citing a contract amendment.

The finance manager sees:

  • Proposed action: update payment terms for Vendor X
  • Rationale: contract amendment CON-4821 signed 2026-04-12 specifies Net-60 terms
  • Evidence: { label: "Contract reference", value: "CON-4821", tone: "emerald" }, { label: "Current terms", value: "Net-30", tone: "slate" }, { label: "Agent confidence", value: "0.71", tone: "amber" }
  • Risk: high
  • SLA: 120 minutes

Because the confidence score is only 0.71 (shown in amber), the manager opens the contract reference, confirms the terms, edits the effective date in the payload, and approves. The run then updates the ERP record.

Critical risk — dual approval

A compliance agent flags a data access violation and proposes to revoke a user's API key. Risk: critical. Two approvers are assigned: the security lead and the compliance officer. SLA: 30 minutes. The action cannot be auto-approved. If the SLA is breached, both approvers receive an escalation notification and the action is held pending a decision.
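
The dual-approval example can be modelled as a small status function. This page does not specify how two decisions are combined, so the sketch assumes that every assigned approver must approve and that a single rejection rejects the action; the function name and status strings are hypothetical.

```python
from typing import Dict, Iterable


def dual_approval_status(assignees: Iterable[str], decisions: Dict[str, str]) -> str:
    """Combine per-reviewer decisions ("approved" or "rejected") into one status."""
    reviewers = list(assignees)
    # Assumption: any single rejection rejects the whole action.
    if any(decisions.get(r) == "rejected" for r in reviewers):
        return "rejected"
    # Assumption: the action proceeds only when every assignee has approved.
    if reviewers and all(decisions.get(r) == "approved" for r in reviewers):
        return "approved"
    return "pending"


reviewers = ["security.lead@example.com", "compliance.officer@example.com"]
```

With only one of the two approvals recorded, the status stays `"pending"`, which matches the "held pending a decision" behaviour in the example.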


Benefits of human-in-the-loop

  • Catches errors before they reach customers, partners, or production systems
  • Creates a defensible audit record of human review for compliance purposes
  • Allows agents to operate at higher autonomy over time as their decision quality is validated
  • Gives teams confidence to deploy agents in high-stakes workflows earlier in the adoption cycle
  • Lets reviewers correct a proposed action, using evidence items and an editable payload, rather than simply approve or reject it, which improves output quality

Risks and limitations

  • Approval fatigue: if too many actions require approval, reviewers begin approving without genuine review. Reserve approval gates for actions where the risk justifies the overhead.
  • SLA misconfiguration: an SLA of zero or no SLA means the run blocks indefinitely. An SLA that is too long means the approval offers no real-time protection.
  • Unmonitored inboxes: assigning approvals to a shared inbox with no direct ownership means requests sit unread. Assign named individuals or use a monitored channel.
  • Context-poor requests: an approval that shows only "approve or reject" with no rationale, evidence, or editable payload does not enable informed review. Design approval requests to give reviewers what they need to decide.
  • HITL in theory only: it is possible to configure an approval step, configure no notification channel, and have the approval request reach no one. Test the notification path before deploying to production.

Implementation checklist

Before deploying any workflow with an approval step:

  • Risk level set on the approval step (low, medium, high, or critical)
  • SLA configured — not zero, not missing
  • Named assignee or assignees specified — not just the approvers group for high-risk actions
  • Evidence items configured with appropriate tones (amber/red for risk factors)
  • Payload is editable — reviewers can correct, not only approve or reject
  • Notification channel configured and tested — reviewer receives the request
  • SLA breach notification configured — someone is alerted if the SLA expires
  • Audit events reviewed after first production run — confirm approval.granted or approval.rejected is logged
  • Escalation path documented — what happens if the primary reviewer is unavailable
  • Delegated grant in place for vacation coverage if the assignee is a single individual
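
Several items on this checklist can be verified mechanically before deployment. The sketch below lints a step configuration held as a plain dictionary. The keys `risk`, `slaMinutes`, `assignees`, and `evidence` follow the field names used on this page, while `payloadEditable` and the dictionary format itself are hypothetical.

```python
from typing import Any, Dict, List

VALID_RISKS = ("low", "medium", "high", "critical")


def lint_approval_step(step: Dict[str, Any]) -> List[str]:
    """Return a list of checklist problems; an empty list means the step passes."""
    problems = []
    risk = step.get("risk")
    if risk not in VALID_RISKS:
        problems.append("risk level missing or invalid")
    sla = step.get("slaMinutes")
    if not isinstance(sla, int) or sla <= 0:
        problems.append("SLA missing or zero")
    if risk in ("high", "critical") and not step.get("assignees"):
        problems.append("high-risk step has no named assignee")
    if not step.get("evidence"):
        problems.append("no evidence items configured")
    # payloadEditable is a hypothetical flag used only in this sketch.
    if not step.get("payloadEditable", False):
        problems.append("payload is not editable")
    return problems


good_step = {
    "risk": "high",
    "slaMinutes": 120,
    "assignees": ["finance.lead@example.com"],
    "evidence": [{"label": "Contract reference", "value": "CON-4821", "tone": "emerald"}],
    "payloadEditable": True,
}
```

Checks that depend on the live system, such as whether a notification actually reaches the reviewer, still need a manual test.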

Common mistakes

| Mistake | Why it happens | Fix |
| --- | --- | --- |
| No SLA on approval step | Oversight during configuration | The run blocks indefinitely; always set slaMinutes |
| Approval assigned to a shared inbox with no owner | Feels like broader coverage | Assign named individuals; shared inboxes are not monitored reliably |
| Auto-approving on SLA breach | Added to prevent workflow blocking | On breach, escalate or reject; never auto-approve high or critical risk |
| Approval request without context or evidence | The step is added but not configured beyond the minimum | Add rationale, evidence items, and an editable payload; reviewers cannot make informed decisions from "approve or reject" alone |
| HITL configured but no notification set up | Approval step is added; notification preferences are not touched | No one receives the approval request; test the notification path before go-live |
| Same approver for all action types | Simpler to configure one group | Finance approvals need finance expertise; production changes need engineering review; tailor assignment by action type |
| Approval rate not monitored | Teams set up HITL and move on | High approval rates (near 100%) may indicate rubber-stamp behaviour; review periodically |
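
The last row, monitoring the approval rate, lends itself to a simple periodic report. The sketch below computes approval and edit rates from a list of decision records. The `event` values match the audit event names on this page, but the `edited` flag, the record format, and the thresholds are assumptions for illustration.

```python
from typing import Dict, List


def review_stats(decisions: List[dict]) -> Dict[str, float]:
    """Approval rate over all decisions, and edit rate over approved ones."""
    total = len(decisions)
    approved = [d for d in decisions if d["event"] == "approval.granted"]
    edited = [d for d in approved if d.get("edited")]  # hypothetical flag
    return {
        "approval_rate": len(approved) / total if total else 0.0,
        "edit_rate": len(edited) / len(approved) if approved else 0.0,
    }


def looks_like_rubber_stamping(stats: Dict[str, float],
                               min_approval_rate: float = 0.98,
                               max_edit_rate: float = 0.02) -> bool:
    """Flag near-universal approval combined with almost no edits (thresholds are illustrative)."""
    return stats["approval_rate"] >= min_approval_rate and stats["edit_rate"] <= max_edit_rate


sample = [
    {"event": "approval.granted", "edited": True},
    {"event": "approval.granted", "edited": False},
    {"event": "approval.granted", "edited": False},
    {"event": "approval.rejected"},
]
stats = review_stats(sample)
```

A flag from this check is a prompt to look closer, not proof of careless review: a well-tuned agent can legitimately earn a high approval rate.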

How ProvenanceOne helps

ProvenanceOne's approval step is designed to give reviewers the information they need to make genuine decisions rather than approving blindly. Evidence items with visual tones (emerald for positive signals, amber for caution, red for risk factors) surface the key data points without requiring the reviewer to read through run logs. The payload is editable before approval, so reviewers can correct rather than only approve or reject — which reduces rejection rates and improves output quality. SLA monitoring runs every five minutes and emits an approval.sla_breach event on expiry, which can be routed to a Slack channel or email address immediately. Delegated grants handle vacation coverage and cross-team authority without requiring workflow reconfiguration. Every approval decision is logged with HMAC-SHA256 signing and seven-year retention, giving compliance teams defensible evidence that human review occurred for each actioned decision.


FAQ

What is human in the loop in AI agents?

Human in the loop (HITL) means a human must review and approve an AI agent's proposed action before it executes. The agent pauses at a defined decision point, presents the proposed action and supporting evidence to a reviewer, and waits for an approve or reject decision before continuing. Not all agent actions require HITL — the goal is to apply it at the right risk level.

Does every AI agent action need human approval?

No. Low-risk, high-confidence, well-grounded actions — such as retrieving and returning a factual answer from a knowledge base — do not require approval. Reserve approval gates for actions that modify data, send external communications, touch production systems, or carry financial or legal consequences. See the risk level table above for guidance.

What happens if no one approves an AI agent action in time?

If the approval SLA expires, ProvenanceOne emits an `approval.sla_breach` event and triggers configured notifications. The correct response to an SLA breach is escalation to a secondary reviewer or auto-rejection — never auto-approval of a high-risk or critical-risk action. Configure SLA breach notifications in [Notification preferences](/docs/settings/notifications).

Can a reviewer edit the proposed action before approving?

Yes. The approval step in ProvenanceOne supports an editable payload. The reviewer can modify the proposed action — adjust a value, correct a recipient, change a date — before approving. The approved payload (including any edits) is what the workflow executes, and the edit is captured in the audit log.

What role do I need to approve an AI agent action?

You need to be either a named assignee on the approval step or a member of the `approvers` platform group. Members of the `approvers` group can approve any pending request in the workspace. Named assignees receive direct notification for the specific approval. The `admin` role does not automatically grant approval authority — the approvers group is separate.

How do I handle approval coverage when the assigned reviewer is on leave?

Configure a delegated grant at `/approvals/delegated/:grantId`. The grant specifies the grantor, grantee, scope (which workflows or approval types are covered), and an expiry date. During the grant period, the grantee can approve on behalf of the grantor. See [Delegated grants](/docs/approvals/delegated-grants) for setup instructions.

How do I know a human actually reviewed an AI approval and did not just click approve?

The audit log captures the actor ID, timestamp, and the payload at the time of the decision for every `approval.granted` event. If the reviewer edited the payload before approving, the change is recorded. You cannot determine from the audit log alone whether the reviewer read the evidence carefully, but payload edits, the time between notification and decision, and approval comments are signals. Approval fatigue — where approval rates are consistently near 100% with minimal edit rate — is worth monitoring.

What is the difference between an approval gate and exception-based oversight?

An approval gate pauses every run at a defined step and requires explicit human approval before proceeding. Exception-based oversight lets the agent act autonomously for routine cases and escalates only when confidence is below a threshold or specific conditions are met. Approval gates are appropriate for medium-to-high risk actions where no unreviewed execution is acceptable. Exception-based oversight is appropriate for high-volume workflows where the agent's reliability in the environment is established.