Data Handling and Storage

ProvenanceOne stores workspace operational data across four platform services: the platform database (structured records), object storage (object payloads and packages), the secrets vault (all credentials), and the platform identity service (user identities). Credentials are never returned in API responses. All API traffic is encrypted in transit via TLS 1.2+. Audit events are retained for 7 years with a tamper-evident HMAC on every record.


What customers need to know

  • Credentials — connection OAuth tokens, API keys, agent model keys — are stored exclusively in the secrets vault and are never surfaced in API responses.
  • When an agent step executes, its system prompt, workflow context, and tool descriptions are sent to the configured AI model provider. Review the What is sent to model providers section before deploying agents that handle sensitive data.
  • Audit events are retained for 7 years. A configurable dataRetentionDays field on workspace settings controls retention for operational data.
  • A GDPR member erasure endpoint is available. The PersonID field on audit events is erased; the event data itself is preserved.
  • DLP controls (via the MCP Gateway) apply to MCP server tool calls. Coverage of direct skill calls requires confirmation.

Data storage map

The following table maps data categories to their storage services.

| Data category | Storage service | Notes |
|---|---|---|
| Workflow definitions (DAGs) | object storage | Per-workspace bucket prefix |
| Workflow metadata, run records, step records | platform database | |
| Agent configurations, skills, MCP server registrations | platform database | |
| Connection metadata (names, types, status) | platform database | Credentials stored separately |
| Connection credentials (OAuth tokens, API keys, service accounts) | secrets vault | Never returned in API responses |
| Agent model API keys | secrets vault | Referenced via `APIKeySecretName` field; actual key never returned |
| Approval records | platform database | Approval task tokens encrypted before storage |
| Audit events | platform database | 7-year TTL; HMAC-SHA256 MAC on every record |
| Bus messages | platform database (≤400 KB), object storage (>400 KB) | Large payloads spill to object storage |
| Skill packages | object storage | Uploaded as `.zip` archives |
| Run step inputs/outputs (large) | object storage | Small payloads inline in platform database |
| Workspace datastores | object storage | Per-workspace object storage |
| API keys, subscriptions, workspace settings | platform database | |
| User identities, passwords, group memberships | platform identity service | |

Encryption at rest

Credentials in the secrets vault are encrypted at rest by the secrets vault's native platform key management integration. Approval task tokens are encrypted before being written to the platform database. Every audit event carries an HMAC-SHA256 MAC field as tamper evidence — this is a message authentication code, not full-record encryption.
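The tamper-evidence scheme above can be sketched as follows. This is an illustrative reconstruction, not the platform's actual implementation: the canonicalisation, key handling, and the `mac` field name are assumptions.

```python
import hashlib
import hmac
import json

def compute_mac(record: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tamper-evidence code over the canonical
    JSON form of an audit record (excluding the mac field itself)."""
    payload = {k: v for k, v in record.items() if k != "mac"}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()

def verify_mac(record: dict, key: bytes) -> bool:
    """Constant-time comparison of the stored mac against a recomputation."""
    return hmac.compare_digest(record.get("mac", ""), compute_mac(record, key))

key = b"example-signing-key"  # illustrative only; real key lives in the vault
event = {"event": "secret.accessed", "riskLevel": "high",
         "occurredAt": "2024-05-01T12:00:00Z"}
event["mac"] = compute_mac(event, key)

tampered = dict(event, riskLevel="low")  # any field change breaks the MAC
```

Note that verification detects modification but, unlike full-record encryption, does not hide the record's contents.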

Needs product confirmation: Key configuration for each service (platform-managed keys vs. customer-managed CMKs) — specifically the secrets vault's key management integration, object storage server-side encryption, and platform database encryption at rest.

Infrastructure regions

Data is stored in the EU region. CDN edge nodes operate globally.

Needs product confirmation: Full data region deployment map; whether all platform database tables and the secrets vault resources are colocated in the EU region; whether data leaves a specific region during processing; multi-region deployment availability.


What is sent to model providers

When an agent step executes in a workflow run, the following data is transmitted to the configured AI model provider (Anthropic, OpenAI, Google, or Azure OpenAI):

  • The agent's system prompt
  • Workflow context passed to the agent step at runtime
  • Tool descriptions for attached skills and MCP servers
  • The agent's persistent memory values (from the agent's key-value store), if any are set
  • Tool call results returned by skills and MCP servers during the step execution

Model providers currently supported: Anthropic, OpenAI, Google, Azure OpenAI.
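As a rough sketch of this trust boundary, the bulleted items above could be assembled into a provider request like this. The function name, field names, and payload shape are hypothetical — the exact serialised payload is itself flagged for product confirmation.

```python
def build_model_request(agent: dict, workflow_context: dict,
                        tools: list, memory: dict) -> dict:
    """Assemble the data sent to the model provider for one agent step.
    The payload shape is illustrative, not the platform's wire format."""
    return {
        "system": agent["system_prompt"],            # agent's system prompt
        "context": workflow_context,                 # runtime workflow context
        "tools": [{"name": t["name"], "description": t["description"]}
                  for t in tools],                   # skill/MCP tool descriptions
        "memory": memory,                            # persistent key-value memory
    }

request = build_model_request(
    agent={"system_prompt": "You are an invoice triage assistant."},
    workflow_context={"invoice_id": "INV-1042"},
    tools=[{"name": "lookup_vendor", "description": "Fetch vendor record"}],
    memory={"last_vendor": "Acme Ltd"},
)
# Everything in the returned dict crosses the boundary to the provider.
```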

Needs product confirmation: Whether model provider API calls are made direct from the platform or via a proxy; whether provider agreements include zero-retention or no-training provisions; the exact serialised payload sent in each model API call; whether the system prompt is stripped of credentials before transmission.

Implication for sensitive data: Any data present in workflow context, tool outputs, or agent memory at the time of an agent step execution will be included in the model API call. Do not place raw credentials, PII, or confidential content in agent memory values or workflow context fields unless the relevant model provider agreement covers that data class.


Data access boundaries for credentials

Connection credentials are accessed only by three paths:

  1. The workflow execution engine, when running an action step that uses a connection
  2. The rotation service, when performing an OAuth refresh token exchange
  3. The POST /connections/{id}/test endpoint, when testing a connection configuration

Every credential access via the execution engine or the test endpoint emits a connection.accessed audit event at risk level high. connection.listed is also logged, providing access-tracking evidence.

Agent model API keys are referenced in agent configuration by the APIKeySecretName field. The actual key value is never returned in any GET response for the agent.

POST /secrets/{id}/reveal — the only API path that returns a raw secret value — emits secret.accessed at risk level high. This call requires either the secrets:read API key scope or an admin session.
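The access rule for the reveal endpoint can be expressed as a small authorization check. The `can_reveal_secret` helper and the principal shape are hypothetical; only the `secrets:read` scope and the admin-session alternative come from the text above.

```python
def can_reveal_secret(principal: dict) -> bool:
    """Gate POST /secrets/{id}/reveal: requires either the secrets:read
    API key scope or an admin session, per the access rule above."""
    if principal.get("kind") == "api_key":
        return "secrets:read" in principal.get("scopes", [])
    if principal.get("kind") == "session":
        return principal.get("role") == "admin"
    return False  # unknown principal kinds are denied by default

# Illustrative principals:
api_caller = {"kind": "api_key", "scopes": ["secrets:read", "audit:read"]}
admin_user = {"kind": "session", "role": "admin"}
editor_user = {"kind": "session", "role": "editor"}
```

A real handler would also emit the `secret.accessed` audit event on every successful call, regardless of which path authorized it.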


Retention

Audit events

Every audit event has a platform database TTL set to 7 years (2,557 days) from the event's occurredAt timestamp. Audit events cannot be manually deleted by any role.

Note: platform database TTL deletion is asynchronous. Records past their TTL may persist for up to 48 hours after expiry before platform database removes them. This does not affect event integrity or API query results during that window — expired records are excluded from reads.
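The TTL behaviour can be sketched as a read-path filter. The helper names and record shape are assumptions; the 2,557-day window and the exclusion of expired records from reads come from the note above.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=2557)  # 7-year audit TTL from occurredAt

def is_expired(occurred_at: datetime, now: datetime) -> bool:
    return now >= occurred_at + RETENTION

def query_events(events: list, now: datetime) -> list:
    """Read path excludes records past TTL even if the store has not yet
    physically removed them (deletion can lag by up to 48 hours)."""
    return [e for e in events if not is_expired(e["occurredAt"], now)]

now = datetime(2031, 1, 1, tzinfo=timezone.utc)
events = [
    {"id": "a", "occurredAt": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"id": "b", "occurredAt": datetime(2023, 12, 30, tzinfo=timezone.utc)},
]
visible = query_events(events, now)  # "b" is past TTL and filtered out
```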

Needs product confirmation: Whether 7 years is a hard minimum floor or whether the workspace dataRetentionDays setting can reduce audit retention below 7 years.

Operational data

The workspace dataRetentionDays field configures retention for non-audit operational data (run records, step records, bus messages). The precise scope of which record types are governed by this field requires confirmation.

Needs product confirmation: Exact data categories governed by dataRetentionDays; minimum and maximum permissible values; whether records past the configured retention window are hard-deleted or soft-deleted.


GDPR controls

Member erasure

POST /workspace/members/{userId}/erase is available to the admin role only. This operation:

  1. Removes the member from the workspace
  2. Nulls the PersonID field on all historical audit events for that user (replacing the personally identifying value with a tombstone)
  3. Emits a member.erased audit event at risk level critical

The audit event record itself is preserved after erasure — only the PersonID field is removed. This maintains audit log integrity while satisfying the right to erasure under GDPR Article 17.
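The erasure semantics can be sketched as follows. The function and record shapes are hypothetical; the behaviour (tombstone `PersonID`, preserve the event body, emit `member.erased` at critical) follows the steps above.

```python
def erase_member(member_id: str, audit_events: list) -> dict:
    """Tombstone PersonID on historical audit events while preserving
    the event records, then describe the member.erased event."""
    touched = 0
    for event in audit_events:
        if event.get("PersonID") == member_id:
            event["PersonID"] = None  # tombstone; event body retained
            touched += 1
    return {"event": "member.erased", "riskLevel": "critical",
            "eventsTombstoned": touched}

log = [
    {"id": 1, "event": "secret.accessed", "PersonID": "u-42"},
    {"id": 2, "event": "connection.listed", "PersonID": "u-7"},
]
result = erase_member("u-42", log)
# log[0] keeps its event type and id; only its PersonID is gone.
```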

Limitations of current GDPR controls

The erasure endpoint addresses PersonID in the audit log. Erasure of personal data in other record types (run inputs, workflow context, agent memory) is not confirmed by the verified facts available.

Needs product confirmation: Full GDPR Article 28 Data Processing Agreement (DPA) availability; data residency options (EU-only, region selection); right-to-access fulfilment process for data subject access requests; breach notification procedures and SLA; whether personal data in run step inputs/outputs and agent memory is in scope for erasure.


Secrets management

All credentials stored by ProvenanceOne are held in the secrets vault with at-rest encryption. Secrets are never returned in:

  • GET /connections or GET /connections/{id} responses
  • GET /agents or GET /agents/{id} responses (the APIKeySecretName field is returned, not the key value)
  • Any list or detail endpoint for workspace configuration
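The scrubbing rule above can be sketched as a response filter. The field names in `SECRET_FIELDS` are assumptions; the contract (secret values dropped, the `APIKeySecretName` reference retained) comes from the list above.

```python
SECRET_FIELDS = {"oauth_token", "api_key", "service_account_key"}  # assumed names

def scrub(resource: dict) -> dict:
    """Return an API-safe copy: secret material dropped, while the
    APIKeySecretName reference (a pointer, not the key) is kept."""
    return {k: v for k, v in resource.items() if k not in SECRET_FIELDS}

agent = {
    "id": "agt-1",
    "name": "Triage",
    "APIKeySecretName": "anthropic-prod-key",  # reference is safe to return
    "api_key": "sk-live-example",              # value must never be serialised
}
safe = scrub(agent)
```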

Secret lifecycle events (all logged to the audit trail):

| Event | Risk level | Trigger |
|---|---|---|
| secret.created | low | New secret written to the secrets vault |
| secret.updated | medium | Secret value or metadata updated |
| secret.rotated | medium | `POST /secrets/{id}/rotate` called |
| secret.accessed | high | `POST /secrets/{id}/reveal` called |
| secret.listed | low | Secrets enumerated (SOC2 evidence) |

OAuth connection credentials have per-provider rotation services that handle refresh token exchanges automatically. Rotation does not require the raw credential to pass through the API layer.


DLP — Data Loss Prevention

The MCP Gateway sits between the workflow execution engine and all MCP server tool calls. Before a tool call is dispatched to an MCP server, and before the tool response is returned to the agent, the Gateway evaluates the configured policy for that MCP server.

Available gateway policy controls:

  • Tool allowlist / denylist — explicitly permit or block named tools for a given MCP server
  • Input redaction — rules applied to outbound tool call inputs before they reach the MCP server
  • Output redaction — rules applied to inbound tool responses before they reach the agent

When a policy rule triggers, a policy.violation event is emitted to the audit log with the matched policy identifier.
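A minimal sketch of the gateway's evaluation order, assuming regex-based redaction (the supported pattern types are still unconfirmed); function and policy field names are hypothetical.

```python
import re

def evaluate_call(policy: dict, tool: str, payload: str):
    """Apply the gateway controls to one MCP tool call: allow/deny
    check first, then input redaction rules. Returns (payload, violation);
    a blocked call yields (None, policy.violation event)."""
    if tool in policy.get("denylist", []):
        return None, {"event": "policy.violation",
                      "policyId": policy["id"], "tool": tool}
    if policy.get("allowlist") and tool not in policy["allowlist"]:
        return None, {"event": "policy.violation",
                      "policyId": policy["id"], "tool": tool}
    redacted = payload
    for pattern in policy.get("input_redaction", []):  # regex rules assumed
        redacted = re.sub(pattern, "[REDACTED]", redacted)
    return redacted, None

policy = {"id": "pol-1", "allowlist": ["search_docs"],
          "input_redaction": [r"\b\d{16}\b"]}  # e.g. card-number-like digits
out, violation = evaluate_call(policy, "search_docs", "card 4111111111111111")
blocked, v2 = evaluate_call(policy, "delete_all", "{}")
```

Output redaction would apply the same mechanism to the tool response on the way back to the agent.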

Needs product confirmation: Exact redaction pattern types supported (regex, named entity recognition, keyword lists, JSONPath selectors); whether DLP policies also apply to direct skill call inputs and outputs (current confirmed coverage is MCP tool calls only).

Note: DLP gateway policies apply to MCP server tool calls only. Direct skill calls are not confirmed to pass through the MCP Gateway. If your DLP posture requires coverage of all agent tool calls, confirm skill call coverage before relying on gateway policies alone.


Admin controls

| Control | Location | Role required |
|---|---|---|
| Configure gateway policies (tool allow/deny, redaction) | Settings → MCP Gateway | admin |
| Set dataRetentionDays | Settings → Workspace | admin |
| Perform member GDPR erasure | Settings → Members | admin |
| Rotate a secret | Data → Secrets | editor, admin |
| Reveal a secret value | Data → Secrets | admin only |
| Create / revoke API keys | Settings → API Keys | admin |

Security implications

  1. Model provider data exposure: every agent step execution sends data to an external AI provider. Ensure your model provider agreement covers the data classes present in your workflows before go-live.
  2. Connection credential access is narrow but audited: only the execution engine, the rotation services, and the test endpoint can access connection credentials. Every access produces a high-risk audit event. Monitor connection.accessed events for unexpected access patterns.
  3. Secret reveal is a high-risk operation: POST /secrets/{id}/reveal is the only mechanism by which a raw credential value can be read outside of a workflow run. This call is always audited. Issue the secrets:read API key scope only to integrations that genuinely require it.
  4. DLP coverage is currently MCP-scoped: gateway policies are confirmed to apply to MCP server tool calls. Skill calls may not be covered. Assess whether this gap is acceptable for your use case.

Auditability

All data access and configuration changes described in this page produce audit events. Events relevant to data handling:

| Event | Risk | What it covers |
|---|---|---|
| secret.accessed | high | Raw secret value revealed |
| secret.rotated | medium | Secret rotated |
| connection.accessed | high | Connection credential used |
| connection.listed | low | Connection list enumerated |
| member.erased | critical | GDPR erasure executed |
| policy.violation | medium | DLP gateway policy triggered |
| datastore.object_deleted | medium | Object deleted from datastore |
| gateway_policy.created/updated/deleted | medium/critical | DLP policy configuration changed |

Audit events are queryable via GET /audit with the audit:read scope. Events are retained for 7 years and include an HMAC-SHA256 `mac` field for tamper evidence.


Limitations and open questions

  • Model provider data handling: no confirmed zero-retention or no-training provisions with model providers. This is the most significant open question for organisations handling sensitive data.
  • DLP skill coverage: gateway DLP policies apply to MCP tool calls; direct skill call coverage is unconfirmed.
  • Encryption key ownership: whether customer-managed keys (CMK) are available through the platform key management integration, or only platform-managed keys are used, is unconfirmed.
  • Data residency: the regions for the platform database, the identity service, and the secrets vault are unconfirmed beyond the object storage primary bucket in the EU region.
  • GDPR DPA: Article 28 DPA availability is unconfirmed.
  • Operational data retention scope: the precise record types governed by dataRetentionDays are unconfirmed.

FAQ

Where are credentials stored?

All credentials — connection OAuth tokens, API keys, service account keys, and agent model API keys — are stored exclusively in the secrets vault with at-rest encryption. They are never returned in API responses for connections, agents, or workspace settings.

What data is sent to AI model providers?

When an agent step executes, the agent's system prompt, workflow context, tool descriptions, agent memory values, and tool call results are sent to the configured model provider. Whether provider agreements include zero-retention or no-training provisions is not confirmed — this must be verified before deploying agents that process sensitive data.

How long is data retained?

Audit events are retained for 7 years via a platform database TTL on every record. Operational data (runs, steps) is governed by the workspace `dataRetentionDays` setting. The precise scope and floor of that setting requires product confirmation.

Can a member's personal data be erased?

Yes. `POST /workspace/members/{userId}/erase` (admin only) removes the member and nulls the PersonID field on their historical audit events. The audit event records themselves are preserved. Erasure of personal data in run inputs, agent memory, and workflow context is not confirmed and requires product clarification.

Does DLP apply to all agent tool calls?

DLP gateway policies are confirmed to apply to MCP server tool calls routed through the MCP Gateway. Whether the same policies apply to direct skill call inputs and outputs is not confirmed. If full tool call coverage is required, confirm with the ProvenanceOne team before relying solely on gateway policies.