Knowledge Base AI Agent Playbook
An employee asks a question about leave policy, expense limits, or onboarding steps. Instead of waiting for an HR ticket to be resolved, a ProvenanceOne knowledge base agent retrieves the relevant policy document, cites the source, and returns a concise answer in seconds. This playbook walks through building that agent from trigger to response, including access controls, data freshness safeguards, and DLP enforcement. Estimated setup time is 30 minutes for a team with documents already in a ProvenanceOne datastore or Snowflake.
What this agent does
The knowledge base agent handles the full lifecycle of an employee query:
- Receives the question — a `trigger` step of type `api` or `event` accepts the incoming query. The most common trigger is a Slack slash command (e.g. `/ask-policy`) routed through the ProvenanceOne API, but you can also use a `manual` trigger for testing.
- Retrieves relevant documents — a `data` step queries the ProvenanceOne datastore (or Snowflake) for policy chunks matching the query. This step must be scoped to the querying user's role and department — see Security and permission model.
- Identifies the most relevant sections — a `skill` step runs semantic search or BM25 retrieval over the retrieved chunks to rank and select the most relevant passages.
- Generates a cited answer — an `agent` step (category: `reasoning`) synthesises the ranked passages into a concise response with direct source citations: document title, version, and section.
- Checks for human-verification flags — a `logic` step checks whether the matched document or topic is tagged `human_verification_required`. Sensitive topics such as disciplinary procedures or compensation queries should be flagged.
- Adds a caveat if flagged — if the logic step resolves to `true`, the agent appends "Please verify this answer with HR or Legal before acting on it" to the response.
- Stores the Q&A pair — a `storage` step writes the question, answer, source citations, and run ID to the datastore for analytics and continuous improvement.
- Returns the answer — a `notify` step posts the answer to Slack (or returns it via API) to the originating user.
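The steps above can be sketched end-to-end as plain Python. Everything here is a stand-in: the function names, the two-document catalogue, and the keyword-overlap ranking are illustrative only, not the ProvenanceOne step API.

```python
# Minimal sketch of the agent lifecycle. Every function is a stand-in
# for the corresponding ProvenanceOne step; all names are hypothetical.

def retrieve_chunks(question, role, department):
    # data step: in production this queries the datastore, scoped by role/department
    catalogue = [
        {"text": "Employees accrue 2 days of PTO per month.", "doc": "Leave Policy v3",
         "section": "Accrual", "access_groups": ["all"], "human_verification_required": False},
        {"text": "Salary bands are reviewed annually.", "doc": "Compensation Policy v2",
         "section": "Review cycle", "access_groups": ["hr"], "human_verification_required": True},
    ]
    return [c for c in catalogue
            if role in c["access_groups"] or "all" in c["access_groups"]]

def rank_chunks(question, chunks):
    # skill step: stand-in for semantic search / BM25, here naive keyword overlap
    words = set(question.lower().split())
    return sorted(chunks, key=lambda c: -len(words & set(c["text"].lower().split())))

def synthesise_answer(question, ranked):
    # agent step: stand-in for the reasoning agent; quote the top chunk with a citation
    top = ranked[0]
    return f'{top["text"]} (Source: {top["doc"]}, {top["section"]})'

def handle_query(question, role="all", department="eng"):
    chunks = retrieve_chunks(question, role, department)        # data step
    if not chunks:
        return "I don't have information on this topic in the current knowledge base."
    ranked = rank_chunks(question, chunks)                      # skill step
    answer = synthesise_answer(question, ranked)                # agent step
    if ranked[0]["human_verification_required"]:                # logic step
        answer += " Please verify this answer with HR or Legal before acting on it."
    return answer  # storage and notify steps would persist and post this

print(handle_query("How many PTO days do I accrue?"))
```

Keeping each stage a separate function mirrors the one-step-per-stage workflow design recommended later in this playbook.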
Data freshness warning: The agent answers only from documents currently indexed in the datastore. If your policies are updated in SharePoint, Notion, or Confluence but the datastore has not been synced, the agent will return outdated answers. Schedule a nightly sync workflow or trigger a sync on every document update. Add a `last_updated` timestamp to every document chunk so the agent can surface how recent its sources are.
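A minimal sketch of surfacing source recency, assuming each chunk carries an ISO-8601 `last_updated` field stamped at sync time (the field name is this playbook's convention, not a platform requirement):

```python
from datetime import datetime, timezone

def freshness_note(chunk):
    """Return a human-readable note on how recent a chunk's source is.

    Assumes the chunk dict carries an ISO-8601 `last_updated` string,
    stamped when the document was last synced into the datastore."""
    updated = datetime.fromisoformat(chunk["last_updated"])
    age_days = (datetime.now(timezone.utc) - updated).days
    return f'Source last updated {age_days} day(s) ago ({chunk["last_updated"]}).'

chunk = {"text": "Expense claims must be filed within 30 days.",
         "last_updated": "2024-01-02T00:00:00+00:00"}
print(freshness_note(chunk))
```

Appending this note to each citation lets employees judge for themselves whether an answer might be stale.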
Best-fit use cases
- Answering routine HR policy questions: leave entitlements, expense limits, reimbursement deadlines, public holiday schedules
- Onboarding guides: "How do I request equipment?" or "What is the PTO accrual schedule for new hires?"
- IT knowledge base lookups: VPN setup steps, approved software list, access request processes
- Benefits and perks Q&A: health insurance coverage, wellness budget, parental leave policy
- Compliance policy lookups: data handling guidelines, acceptable use policy, security incident reporting steps
When not to use this agent
- Legal advice — the agent must never advise on legal matters. Any question that touches employment law, contract interpretation, or regulatory compliance should be routed to your legal team. Add an explicit instruction to the system prompt: "If the user asks for legal advice, respond with: 'I can share our policy documents, but this question requires guidance from your legal team.'"
- Medical, financial, or safety decisions — do not use agent output as a basis for any decision in these domains.
- Personal or sensitive HR cases — performance management, disciplinary proceedings, individual compensation discussions, grievances. Route these to a human.
- Authoritative compliance determinations — the agent retrieves policy text; it does not interpret whether an employee's specific situation is compliant.
- Answers the user plans to act on without verification — add a disclaimer to every answer reminding users to verify before acting on anything consequential.
Required connections and data sources
| Connection | Purpose | Auth method |
|---|---|---|
| Slack | Trigger (slash command) and response delivery | OAuth 2.0 |
| ProvenanceOne datastore | Primary policy document store | Built-in (object-backed) |
| Snowflake (optional) | Alternative or supplementary document index | Service Account |
Configure connections at Settings → Connections. The datastore requires no external credentials — create it at Data → Datastores → New Datastore and upload or sync your policy documents. See /docs/data/index for datastore setup instructions.
Recommended agent instructions
The following system prompt is a starting point. Adapt it to your document taxonomy and tone of voice.
You are an internal knowledge base assistant for [Company Name]. Your job is to answer employee questions using only the policy and knowledge base documents provided to you in this conversation.
Rules you must follow:
1. Every answer must cite the source document: include the document title, version, and section name.
2. If you cannot find relevant information in the provided documents, respond with: "I don't have information on this topic in the current knowledge base. Please contact [team/email] for help."
3. Do not paraphrase policy text in ways that change its meaning. Quote directly from source documents when precision matters.
4. If the question touches legal advice, respond with: "I can share our policy documents, but this question requires guidance from your legal team."
5. If the question touches medical, financial, or safety decisions, respond with: "Please consult the appropriate specialist for this topic."
6. Do not reveal information from documents that are outside the user's access level.
7. Keep answers concise — 2–4 sentences with a source citation, unless a step-by-step list is clearly more helpful.
Store this in the agent's `systemPrompt` field (visible in the agent preview) and lock it via workspace policy so that editor-level changes require admin approval.
Required skills and tools
| Step | Kind | Description |
|---|---|---|
| Semantic search / BM25 retrieval | skill | Category: data. Ranks retrieved document chunks by relevance to the query. Input schema: { query: string, chunks: DocumentChunk[] }. |
| Document retrieval (scoped) | data | Queries the ProvenanceOne datastore or Snowflake, filtered by the user's role/department. |
| MCP Gateway DLP policy | mcp (optional) | Enforces DLP rules on agent output — redacts salary bands, PII, and confidential fields before the agent processes retrieved text. |
Configure the MCP Gateway DLP policy at MCP Servers → Gateway Policies. Define redaction rules for patterns matching salary information, NI/SSN numbers, and any fields tagged confidential in your document metadata. See /docs/mcp-servers/index for Gateway policy configuration.
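To reason about what the retrieval skill does, the classic BM25 scoring formula can be sketched in a few lines. This illustrates the ranking behaviour only, not the platform's implementation; the whitespace tokeniser and the `k1`/`b` defaults are simplifying assumptions.

```python
import math
from collections import Counter

def bm25_rank(query, chunks, k1=1.5, b=0.75):
    """Rank text chunks against a query with the classic BM25 formula."""
    docs = [c.lower().split() for c in chunks]   # naive whitespace tokeniser
    avgdl = sum(len(d) for d in docs) / len(docs)
    n = len(docs)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for doc in docs if term in doc)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            score += (idf * tf[term] * (k1 + 1)
                      / (tf[term] + k1 * (1 - b + b * len(d) / avgdl)))
        scores.append(score)
    return [c for _, c in sorted(zip(scores, chunks), key=lambda p: -p[0])]

chunks = [
    "Employees accrue 2 days of paid leave per month of service.",
    "Expense claims must be filed within 30 days of purchase.",
]
print(bm25_rank("paid leave accrual", chunks)[0])
```

Higher `k1` rewards repeated terms more; higher `b` penalises long chunks more aggressively, which matters if your document chunking produces uneven chunk sizes.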
Recommended workflow design
[Trigger: api / event]
↓
[Data: retrieve document chunks — scoped to user role]
↓
[Skill: semantic search / BM25 ranking]
↓
[Agent: synthesise answer with citations — reasoning category]
↓
[Logic: is topic flagged "human_verification_required"?]
↓ Yes ↓ No
[Notify: answer + caveat] [Notify: answer]
↓ ↓
[Storage: write Q&A pair + run ID to datastore]
Keep the workflow linear. Avoid branching into multiple agent steps — a single well-prompted reasoning agent is more predictable than a chain of agents passing partial answers between them.
Human approval rules
The knowledge base agent does not require an approval step by default because it is a read-only, information-retrieval workflow — it does not modify records, send bulk communications, or take irreversible actions.
Insert an approval step in these specific scenarios:
- Publishing new policy documents — if the workflow includes an action to publish or update a document in the datastore, require approval (risk: `medium`, SLA 24 hours, assignee: document owner).
- Bulk answer campaigns — if you build a variant that sends scheduled digest answers to groups of employees, require approval before the bulk notify step.
- Flag updates — if the agent can set `human_verification_required` flags on documents, that action requires approval.
For standard Q&A runs, no approval gate is needed. Monitor the agent's output in the run debugger rather than gating every answer.
Security and permission model
Document-level access control is mandatory. The data step must filter retrieved documents to only those the querying user is authorised to see. Never query the full datastore index; retrieval must always be scoped to the user, whatever the query asks for.
Implement this by:
- Including the user's role and department in the query payload passed to the `data` step.
- Configuring the datastore query to filter on an `access_groups` metadata field attached to each document.
- Testing explicitly: run the agent as a low-permission user and verify that HR-only documents do not appear in answers.
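The scoping rule above can be illustrated as a filter over chunk metadata. The `access_groups` field name follows this playbook's convention; the rest is hypothetical.

```python
def authorised_chunks(chunks, user_groups):
    """Keep only chunks whose access_groups intersect the user's groups.

    A chunk with no access_groups is treated as restricted, not public:
    failing closed is the safer default for an internal knowledge base."""
    return [c for c in chunks
            if set(c.get("access_groups", [])) & set(user_groups)]

chunks = [
    {"doc": "Leave Policy v3", "access_groups": ["all-staff"]},
    {"doc": "Compensation Bands v2", "access_groups": ["hr"]},
]
print([c["doc"] for c in authorised_chunks(chunks, ["all-staff", "engineering"])])
# → ['Leave Policy v3']
```

Note the fail-closed default: a document that was synced without access metadata is invisible until someone tags it, rather than visible to everyone.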
MCP Gateway DLP — configure a Gateway policy to redact the following before the agent processes any retrieved text:
- Salary bands and compensation ranges
- Personal identifiers: NI/SSN, date of birth, home address
- Fields tagged `confidential` in document metadata
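As an illustration of what such a policy redacts, here is a minimal regex-based sketch. The actual Gateway policy is configured in the UI, and these patterns (US SSN, UK NI number, currency ranges) are starting points, not a complete DLP rule set.

```python
import re

# Illustrative redaction patterns; a real Gateway policy is configured at
# MCP Servers → Gateway Policies, not in code.
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED-SSN]"),                     # US SSN
    (re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b", re.I), "[REDACTED-NI]"),   # UK NI number
    (re.compile(r"[$£€]\s?\d[\d,]*\s?(?:-|to)\s?[$£€]?\s?\d[\d,]*", re.I),
     "[REDACTED-SALARY]"),                                                        # salary ranges
]

def redact(text):
    """Apply each redaction pattern in turn, before the agent sees the text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text

print(redact("Band C pays $90,000 - $120,000; contact SSN 123-45-6789."))
```

Redaction must run before the agent step, not after: once sensitive text has entered the model context, output filtering alone cannot guarantee it never leaks into an answer.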
Roles and permissions:
| Role | Can do |
|---|---|
| admin | Create and modify agent, update system prompt, publish to production |
| editor | Modify workflow, update document sources |
| viewer | Trigger the agent (via Slack), view run history — cannot modify configuration |
The approvers platform group is not required for this agent unless you add an approval-gated variant.
Audit events emitted: `run.started`, `run.completed`, `run.failed`, `agent.updated`, `connection.accessed`, `secret.accessed` (if Snowflake credentials are accessed).
Evaluation checklist
- Agent cites the source document (title, version, section) for every answer
- Agent returns "I don't have information on this topic" when the document is not in the index — test with at least five out-of-scope questions
- Access-controlled documents are not returned to unauthorised users — test explicitly with a low-permission test user
- DLP policy redacts salary bands and PII from all retrieved text before the agent processes it
- Answers to 10 known policy questions are compared against actual policy text (not agent paraphrase) by an HR team member
- Datastore sync is scheduled and monitored — verify sync ran within the last 24 hours before going live
- The `human_verification_required` flag triggers the caveat message on at least three flagged test questions
- Run debugger shows the exact chunks retrieved for each answer — no black-box responses
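The citation checklist item can be spot-checked mechanically if you standardise a citation format. The sketch below assumes answers end with `(Source: <title> v<version>, <section>)`, which is this playbook's convention rather than a platform guarantee.

```python
import re

# Assumed convention from the system prompt: every answer carries a
# citation of the form "(Source: <document title> v<version>, <section>)".
CITATION = re.compile(r"\(Source: .+ v\d+, .+\)")

def has_citation(answer):
    """True if the answer contains a citation in the assumed format."""
    return bool(CITATION.search(answer))

print(has_citation("PTO accrues at 2 days/month. (Source: Leave Policy v3, Accrual)"))  # → True
print(has_citation("PTO accrues at 2 days/month."))                                     # → False
```

Run this over a batch of stored Q&A pairs from the datastore to catch uncited answers before your HR reviewer does.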
Rollout plan
Day 1 (setup): Create the datastore, upload or sync 10–20 core policy documents, configure Slack connection, build the workflow in the development environment. Run manually with test queries.
Day 2–3 (internal testing): Share with 3–5 HR/IT team members to test against known policy questions. Review every run in the debugger. Flag any paraphrasing errors or incorrect source citations.
Week 1 (limited rollout): Enable the Slack slash command for one team or department. Monitor run quality daily. Confirm the datastore sync is running on schedule.
Week 2–4 (broad rollout): Expand access to the full organisation. Set up a feedback mechanism (e.g. a Slack emoji reaction to flag incorrect answers). Review flagged answers weekly.
Ongoing: Schedule monthly review of the most frequently asked questions to confirm answers are still accurate. Track escalation rate — the percentage of queries routed to HR/IT after an agent answer.
Common failure modes
Agent paraphrases policy incorrectly
The agent summarises rather than quoting, introducing subtle errors in policy language.
Mitigation: add "quote directly from source documents when precision matters" to the system prompt. Review the run debugger to verify the agent is citing specific passages, not generating from memory.
Agent returns documents from the wrong department
Retrieval is not scoped correctly and returns HR-confidential documents to employees without HR role.
Mitigation: enforce department-scoped filtering in the data step. Test with a non-HR user account before going live. Review access control rules with your security team.
Stale policy answers
The agent confidently answers from an outdated document version.
Mitigation: add last_updated date and document version to every retrieved chunk. Add a logic step to flag or surface a warning if any retrieved document has not been updated in more than 90 days. Schedule and monitor nightly datastore syncs.
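The 90-day logic step might look like the following; the `last_updated` field and the fixed date in the example are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

STALENESS_LIMIT = timedelta(days=90)

def stale_sources(chunks, now=None):
    """Return chunks whose `last_updated` is older than the 90-day limit.

    `last_updated` is assumed to be an ISO-8601 string stamped at sync time."""
    now = now or datetime.now(timezone.utc)
    return [c for c in chunks
            if now - datetime.fromisoformat(c["last_updated"]) > STALENESS_LIMIT]

chunks = [
    {"doc": "Leave Policy v3", "last_updated": "2024-01-01T00:00:00+00:00"},
    {"doc": "Expense Policy v5", "last_updated": "2024-06-01T00:00:00+00:00"},
]
now = datetime.fromisoformat("2024-06-15T00:00:00+00:00")
print([c["doc"] for c in stale_sources(chunks, now)])  # → ['Leave Policy v3']
```

If this returns anything, the workflow can either prepend a staleness warning to the answer or route the question to a human instead.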
Agent says "I don't know" too aggressively
Retrieval returns low-relevance chunks and the agent refuses to answer questions that are actually in the knowledge base.
Mitigation: review the semantic search skill's relevance threshold. Add more document chunks for high-frequency topics. Check that document chunking is not cutting passages at unhelpful boundaries.
ROI assumptions
The table below uses conservative, illustrative assumptions. Replace with your organisation's actual values before presenting to stakeholders.
| Input | Illustrative value |
|---|---|
| Internal questions per day | 50 |
| Current resolution method | HR / IT helpdesk ticket |
| Minutes per ticket (helpdesk staff time) | 15 |
| Minutes per question with agent (review + confirm) | 2 |
| Escalation rate to human after agent answer | 20% |
| Loaded hourly cost (helpdesk staff) | $45 |
| Working days per year | 250 |
With these assumptions: (50 questions/day × 250 days × (15 − 2) minutes saved per question × 80% self-service rate) ÷ 60 × $45 ≈ $97,500 per year in helpdesk time redirected. This does not account for employee time saved waiting for ticket responses.
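The arithmetic behind that figure, written out so the inputs can be swapped for your organisation's values:

```python
# Inputs from the ROI table above; replace with your own values.
questions_per_day = 50
working_days = 250
minutes_saved = 15 - 2          # helpdesk minutes minus agent-assisted minutes
self_service_rate = 1 - 0.20    # 20% of answers still escalate to a human
hourly_cost = 45                # loaded helpdesk cost, $/hour

annual_saving = (questions_per_day * working_days * minutes_saved
                 * self_service_rate) / 60 * hourly_cost
print(f"${annual_saving:,.0f} / year")  # → $97,500 / year
```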
Use the interactive calculator to adjust these inputs: /tools/ai-agent-roi-calculator?use_case=knowledge-base
FAQ
Can the agent replace our HR helpdesk?
No. The knowledge base agent handles routine lookups — leave policy, expense policy, onboarding steps, benefits summaries. Sensitive, personal, or complex HR queries — performance issues, individual compensation, disciplinary matters, grievances — should always be escalated to a human. The agent is a first-line triage tool, not a replacement for HR expertise.
How do I keep the knowledge base up to date?
Schedule a ProvenanceOne workflow to sync your authoritative document source — SharePoint, Notion, Confluence, or a Google Drive folder — to the ProvenanceOne datastore. Run it nightly or trigger it on every document publish event. Add a document version and last_updated timestamp to every chunk so the agent can surface how recent its sources are.
How do I prevent the agent from leaking confidential documents?
Scope the data step retrieval to the querying user's role and department. Attach an access_groups metadata field to each document and filter the datastore query on that field. Use the MCP Gateway DLP policy to redact sensitive fields — salary bands, PII — from retrieved text before the agent processes it. Test access controls explicitly with a non-privileged test user before going live.
What if the agent says something wrong?
Every answer includes source citations. Open the run debugger to see exactly which document chunks were retrieved and how the agent formed its answer. If the answer is wrong, the most common causes are: an outdated document in the datastore (fix: update and re-sync), a retrieval ranking issue (fix: tune the semantic search skill), or an overly liberal paraphrase (fix: tighten the system prompt to require direct quotes).
Can I use this for external customer documentation?
Yes, but treat it as an entirely separate agent and datastore. Do not use the same agent instance or the same datastore index for both internal employee queries and external customer queries. Apply separate access controls, separate DLP policies, and separate system prompts. Internal policy documents — especially HR, legal, and compensation — must never be reachable from a customer-facing agent.
Related pages
- Agents
- Skills
- MCP Servers
- Data and Datastores
- Connections
- Compliance Review Agent Playbook
- Engineering Agent Playbook
- System Prompt Template — starting-point structure for the knowledge base reasoning agent
- Tool Permission Matrix — define which retrieval and DLP tools the agent can access