Sieve

Sieve is an HTTP proxy that sits between your agent and the model API. It strips personally identifiable information from every outbound request and restores original values in the response. The model processes clean data. The user gets back real results.

How it works

The agent points its model API calls at http://localhost:4243/v1/messages instead of the real endpoint. That is the only configuration change required. Sieve handles the rest transparently.

On each outbound request, Sieve runs a multi-tier detection pipeline over the message content, replaces detected entities with opaque tokens like [EMAIL_001] and [PERSON_002], and stores the original values in a session vault backed by the system credential store. The cleaned request is forwarded to the model API. On the response, Sieve scans the inbound content for unexpected data, then rehydrates all tokens before delivering the final response to the agent. The model never sees the raw values that Sieve has stripped; it only sees the tokens that stand in for them.
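The strip-and-rehydrate round trip can be illustrated with a minimal sketch. This is not Sieve's implementation: it assumes a single regex detector for email addresses in place of the multi-tier pipeline, a plain dict in place of the session vault, and the hypothetical function names strip_pii and rehydrate.

```python
import re

# Hypothetical single-pattern detector; the real pipeline is multi-tier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Matches opaque tokens of the form [TYPE_NNN], e.g. [EMAIL_001].
TOKEN_RE = re.compile(r"\[[A-Z]+_\d{3}\]")


def strip_pii(text: str, vault: dict) -> str:
    """Replace each detected email with an opaque token and record it in vault."""
    seen = {value: token for token, value in vault.items()}  # value -> token

    def _sub(match):
        value = match.group(0)
        if value not in seen:
            # The same value always maps to the same token within a session.
            token = f"[EMAIL_{len(seen) + 1:03d}]"
            seen[value] = token
            vault[token] = value
        return seen[value]

    return EMAIL_RE.sub(_sub, text)


def rehydrate(text: str, vault: dict) -> str:
    """Swap known tokens back to their original values; leave unknown tokens as-is."""
    return TOKEN_RE.sub(lambda m: vault.get(m.group(0), m.group(0)), text)


if __name__ == "__main__":
    vault = {}
    outbound = strip_pii("Email alice@example.com and bob@example.org.", vault)
    print(outbound)                                # what the model would receive
    print(rehydrate("Sent to [EMAIL_002].", vault))  # what the agent would receive
```

Reusing one token per distinct value keeps the cleaned text coherent for the model: two mentions of the same address stay recognisably the same entity.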

Four-tier sensitivity system

Not all personal data carries the same risk. Sieve applies different rules depending on how sensitive the data is.

  • Tier 1 — always strip: bank account numbers, SSNs, private keys, crypto addresses, IP addresses. Removed with no exceptions, regardless of whether they appear in the query.
  • Tier 2 — strip unless in query: phone numbers, physical addresses, dates of birth. Preserved only if the user explicitly included them in the current query.
  • Tier 3 — strip unless in query: names, email addresses, organisations. Preserved only if they appear verbatim in the current query.
  • Tier 4 — not stripped: public figures, company public information, general knowledge. These carry no personal risk.

The tier system means Sieve is not an all-or-nothing redaction filter. An agent can still refer to a person by name if the user named them in the same message. Context is preserved where it is safe to do so.
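The tier rules above reduce to a small decision function. The sketch below is illustrative only: the entity-type labels, the TIER_BY_ENTITY table, and should_strip are hypothetical names, the "in query" test for Tiers 2 and 3 is approximated by a plain substring check, and treating unknown entity types as Tier 1 is an assumption rather than documented behaviour.

```python
# Entity-type labels are hypothetical; tier assignments follow the list above.
TIER_BY_ENTITY = {
    "BANK_ACCOUNT": 1, "SSN": 1, "PRIVATE_KEY": 1, "CRYPTO_ADDRESS": 1, "IP_ADDRESS": 1,
    "PHONE": 2, "PHYSICAL_ADDRESS": 2, "DATE_OF_BIRTH": 2,
    "PERSON": 3, "EMAIL": 3, "ORGANISATION": 3,
    "PUBLIC_FIGURE": 4,
}


def should_strip(entity_type: str, value: str, current_query: str) -> bool:
    """Return True if the detected value should be replaced with a token."""
    # Assumption: anything the table does not know is treated as Tier 1.
    tier = TIER_BY_ENTITY.get(entity_type, 1)
    if tier == 1:
        return True   # always strip, even if the user typed it
    if tier == 4:
        return False  # public information is never stripped
    # Tiers 2 and 3: keep the value only if it appears in the current query.
    return value not in current_query


if __name__ == "__main__":
    query = "Draft a reply to Dana Reyes"
    print(should_strip("PERSON", "Dana Reyes", query))  # named in the query
    print(should_strip("PERSON", "Sam Ortiz", query))   # only in earlier context
```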

Session vault

Token-to-value mappings are stored in the system credential store: macOS Keychain, Windows DPAPI, or Linux Secret Service. Each entry is keyed to the current session. The vault expires after four hours of inactivity, at which point all stored values are deleted. Rehydration after expiry is not possible — a new session starts clean.

If the system credential store is unavailable, Sieve falls back to an in-memory dict and warns on startup. The in-memory store disappears when the process exits.
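The in-memory fallback and the four-hour inactivity window can be sketched as follows. The class name InMemoryVault, its methods, and the injectable clock are all hypothetical, and the sketch does not touch the system credential store.

```python
import time

FOUR_HOURS = 4 * 60 * 60  # inactivity window from the text, in seconds


class InMemoryVault:
    """Hypothetical stand-in for the fallback store: a dict with an inactivity timeout."""

    def __init__(self, ttl=FOUR_HOURS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be exercised without waiting
        self._values = {}
        self._last_used = clock()

    def _touch(self):
        now = self._clock()
        if now - self._last_used >= self._ttl:
            self._values.clear()  # expired: delete every stored value
        self._last_used = now

    def put(self, token, value):
        self._touch()
        self._values[token] = value

    def get(self, token):
        """Return the original value, or None if unknown or expired."""
        self._touch()
        return self._values.get(token)


if __name__ == "__main__":
    vault = InMemoryVault()
    vault.put("[EMAIL_001]", "alice@example.com")
    print(vault.get("[EMAIL_001]"))
```

Expiry here is lazy: values are cleared on the first access after the window has passed, not by a background timer. A real implementation may well delete them eagerly.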

Audit log

Every Sieve operation is logged to ~/.kogent/sieve_audit.log in newline-delimited JSON. Raw query text and PII values are never written to the log — only SHA-256 hashes. Each entry records what was detected and stripped, not what the values were. Audit entries can optionally be signed with the agent's ETH key for tamper-evidence.
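As an illustration of the hash-only logging described above, the sketch below builds one newline-delimited JSON entry. The field names (ts, query_sha256, stripped, value_sha256) and the audit_entry function are assumptions made for the example, not Sieve's actual schema, and signing is omitted.

```python
import hashlib
import json
import time


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def audit_entry(query: str, detected: dict) -> str:
    """Build one NDJSON line. Field names are hypothetical, not Sieve's schema."""
    entry = {
        "ts": time.time(),
        "query_sha256": sha256_hex(query),  # hash only, never the raw query
        "stripped": [
            {"token": token, "value_sha256": sha256_hex(value)}
            for token, value in detected.items()
        ],
    }
    # json.dumps emits no raw newlines by default, so each entry is one line.
    return json.dumps(entry, sort_keys=True)


if __name__ == "__main__":
    print(audit_entry("mail alice@example.com", {"[EMAIL_001]": "alice@example.com"}))
```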

Demo view

When Vigil is running, a live demo view is available at http://localhost:5173/sieve. It shows the last few Sieve operations side by side: what the agent sent, what the model received, what the model returned, and what the user saw. Tokens are highlighted in the model view. Real values are shown in the agent and user views. The audit log panel at the bottom streams new entries in real time.

Installation

pip install "kogent[sieve]"

The sieve extra installs the detection pipeline and all proxy dependencies. The proxy starts automatically when Vigil is running. To run it standalone:

python3 -m kogent.sieve.mcp_server

Point your agent at the proxy instead of the model API directly:

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:4243",   # Sieve proxy; the SDK appends /v1/messages
    api_key="...",
)