September 5, 2025

MCP Server: Security Team’s Playbook

Buchi Reddy B

CEO & Founder at LEVO

Levo AI Security Research Panel

Research Team

TL;DR

Treat the MCP layer as the control point for agent actions. You will harden transport and identity, move policy and DLP into the hot path, map threats to detections, instrument spans for attribution, build kill switches, and run adversarial tests in CI.

Who this is for

CISOs, Security Engineering, Detection and Response, AppSec. You own threat models, controls, detections, and incident outcomes.

What you will ship

Threat model tied to concrete policy and tool controls
mTLS, JWT claims, replay defense, rate limits, idempotency
Inline allow, deny, redact, region routing, vendor allow lists
OpenTelemetry spans with actor, scope, policy, PII counters
SIEM detections for risky chains, exfil paths, and scope drift
Kill switches, quarantine, read-only mode, and runbooks
CI suites for prompt injection, SSRF, mass assignment, BOLA/BFLA

Play 1 - Threat model and kill-chain (Week 1)

Goal: Align on what you defend against.

Top risks

Prompt injection, tool misuse, context poisoning
Over-scoped authority, lateral movement
SSRF via fetch tools, exfil via outputs and vendors
Denial of wallet (runaway cost) via loops and retries

Output
A single page with threats, controls, detections, and response steps.

Play 2 - Transport, auth, and identity (Week 1 - 2)

Goal: Close obvious doors.

Requirements

TLS 1.2+ and mTLS MCP↔tools↔APIs, cert rotation ≤ 90 days
JWT with aud, iss, sub, exp ≤ 10m, nbf, jti replay defense
Purpose-scoped tokens, deny wildcards, short TTL
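
The JWT requirements above can be sketched as a claims check that runs after signature verification. This is a minimal illustration, not a full JWT library: it assumes the signature was already verified elsewhere, and `validate_claims`, `ClaimError`, and the in-memory `seen_jti` replay cache are hypothetical names introduced here.

```python
import time

MAX_TTL_SECONDS = 600  # mirrors the "exp <= 10m" requirement


class ClaimError(Exception):
    """Raised when a token's claims fail validation."""


def validate_claims(claims, expected_iss, expected_aud, seen_jti, now=None):
    """Validate claims of a token whose signature was already verified.

    seen_jti is a hypothetical replay cache (jti -> expiry timestamp).
    A shared store with TTL eviction would replace it in production.
    """
    now = time.time() if now is None else now
    for name in ("aud", "iss", "sub", "exp", "nbf", "jti"):
        if name not in claims:
            raise ClaimError(f"missing claim: {name}")
    if claims["iss"] != expected_iss:
        raise ClaimError("unexpected issuer")
    aud = claims["aud"]
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_aud not in audiences:
        raise ClaimError("unexpected audience")
    if now < claims["nbf"]:
        raise ClaimError("token not yet valid")
    if now >= claims["exp"]:
        raise ClaimError("token expired")
    if claims["exp"] - claims["nbf"] > MAX_TTL_SECONDS:
        raise ClaimError("token lifetime exceeds 10 minutes")
    # Drop expired entries so the replay cache does not grow without bound.
    for jti in [j for j, exp in seen_jti.items() if exp <= now]:
        del seen_jti[jti]
    if claims["jti"] in seen_jti:
        raise ClaimError("replayed jti")
    seen_jti[claims["jti"]] = claims["exp"]
    return claims["sub"]
```

Each jti only needs to be remembered until its token expires, which is why a short TTL also keeps the replay cache small.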

Detection idea (KQL-ish)

where policy.decision == "allow"
and actor.type == "agent"
and scope.grants matches "write:*"
and ttl_sec > 600

Play 3 - Policy in the hot path (Week 2)

Goal: Prevent before detect.

Controls

Allow lists and strict input schemas
Redaction for prompts and outputs, region routing, vendor allow lists
Require ticket for destructive bulk changes

Policy examples

YAML

- rule: deny-bulk-without-ticket
  when: { tool: "crm.bulk_update", cond: "args.count > 1000 && !ctx.ticket" }
  decision: deny

- rule: block-ssrf
  when: { tool: "web.fetch", cond: "url.is_link_local || url.hostname in ['169.254.169.254']" }
  decision: deny
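
To make the two rules concrete, here is a rough Python sketch of how an inline evaluator might decide them. The function names and the shape of `args` and `ctx` are assumptions for illustration. The sketch falls through to "allow" only to keep it short; a real deployment would combine it with the allow lists from the controls above.

```python
import ipaddress
from urllib.parse import urlsplit

# Cloud metadata endpoint named in the block-ssrf rule.
BLOCKED_HOSTS = {"169.254.169.254"}


def is_ssrf_target(url):
    """Return True when the URL points at a blocked or link-local host."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no usable host: treat as unsafe
    if host in BLOCKED_HOSTS:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name. Resolving it and re-checking the resulting
        # address is out of scope for this sketch.
        return False
    return addr.is_link_local


def decide(tool, args, ctx):
    """Evaluate the two example rules; anything unmatched is allowed."""
    if tool == "crm.bulk_update":
        if args.get("count", 0) > 1000 and not ctx.get("ticket"):
            return "deny"
    if tool == "web.fetch" and is_ssrf_target(args.get("url", "")):
        return "deny"
    return "allow"
```

Note that a hostname can still resolve to a link-local address, so checking the literal URL is only the first layer; the resolved address should be checked at connect time as well.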

Play 4 - Observability and attribution (Week 2 - 3)

Goal: Reconstruct any chain in minutes.

Span names

agent.plan
mcp.tool.invoke
service.api.call
policy.evaluate
dlp.redact

Attributes

actor.id, actor.type, tool.name, scope.grants, policy.decision, pii.count, idem.key

Evidence
Signed append-only logs linked to spans and exported to SIEM.
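
One way to picture a signed append-only log is an HMAC chain, where each entry's signature covers the previous signature plus the span attributes. This is a sketch under assumptions: `append_entry` and `verify_log` are hypothetical helpers, the key would live in a KMS rather than in code, and a production system might use asymmetric signatures instead.

```python
import hashlib
import hmac
import json


def _sign(key, prev_sig, span):
    """HMAC over the previous signature plus the canonical span JSON."""
    payload = json.dumps(span, sort_keys=True)
    return hmac.new(key, (prev_sig + payload).encode(), hashlib.sha256).hexdigest()


def append_entry(log, key, span):
    """Append a span record chained to the previous entry's signature."""
    prev = log[-1]["sig"] if log else ""
    log.append({"span": span, "prev": prev, "sig": _sign(key, prev, span)})


def verify_log(log, key):
    """Return True only if every entry is intact and in its original order."""
    prev = ""
    for entry in log:
        expected = _sign(key, prev, entry["span"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["sig"], expected):
            return False
        prev = entry["sig"]
    return True
```

Editing or reordering any entry breaks verification from that point on. Truncating the tail is not detected by the chain alone, which is one reason to export entries to the SIEM as they are written.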

Play 5 - Detections that matter (Week 3)

Goal: Alerts you will act on.

Rules

Rogue chain: unseen tool combination across tenants
Exfil path: tool output size spike to a new vendor domain
Scope drift: new broad grant for an agent without change ticket
Loop arrest: >N retries or plan depth exceeding threshold

Detection example (pseudo)

if chain.novelty_score > 0.95 and policy.decision == "allow":
  alert "Novel tool chain allowed"
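
The pseudo rule leaves the novelty score undefined. As a simpler stand-in, the sketch below treats a chain as novel when its exact tool sequence has not been seen before, and adds the loop-arrest rule. It keeps one baseline per tenant, which is one reading of the rogue-chain rule. `ChainBaseline`, `loop_alert`, and the default thresholds are illustrative assumptions.

```python
class ChainBaseline:
    """Tracks tool chains already seen per tenant (in-memory, illustrative)."""

    def __init__(self):
        self._seen = {}  # tenant -> set of tool-name tuples

    def observe(self, tenant, tools, decision, learning=False):
        """Record a chain; return an alert string if it is new and allowed.

        With learning=True the chain is recorded without alerting, which
        lets a baseline warm up before alerts go live.
        """
        key = tuple(tools)
        seen = self._seen.setdefault(tenant, set())
        is_new = key not in seen
        seen.add(key)
        if is_new and decision == "allow" and not learning:
            return "Novel tool chain allowed: " + " > ".join(tools)
        return None


def loop_alert(retries, plan_depth, max_retries=5, max_depth=10):
    """Flag runaway plans; both thresholds are placeholder values."""
    if retries > max_retries or plan_depth > max_depth:
        return "Loop arrest: retries=%d depth=%d" % (retries, plan_depth)
    return None
```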

Play 6 - Response mechanics (Week 3 - 4)

Goal: Small blast radius.

Actions

Kill session, quarantine tool, MCP read-only mode
Block egress domain, pause consumer or reduce rate limits
Export evidence bundle and attach to incident ticket
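
The first three actions can be pictured as a gate that every tool invocation passes through. This is a minimal sketch: `McpGate` and its fields are hypothetical, and a real gate would keep its state in a shared store so that all MCP instances see a switch flip at once.

```python
class McpGate:
    """Holds kill-switch state and answers allow/deny per invocation."""

    def __init__(self):
        self.read_only = False
        self.quarantined_tools = set()
        self.killed_sessions = set()

    def check(self, session_id, tool, mutating):
        """Return a decision string for one tool invocation."""
        if session_id in self.killed_sessions:
            return "deny: session killed"
        if tool in self.quarantined_tools:
            return "deny: tool quarantined"
        if self.read_only and mutating:
            return "deny: read-only mode"
        return "allow"
```

Checking the switches in this order means a killed session stays blocked even after a tool leaves quarantine.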

Runbook basics

Roles and approvers
Steps and timers
Evidence and comms templates
Rollback and verification

Play 7 - OWASP API Top 10 mapping (Week 4)

Goal: Parity with API security maturity.

Risk	Control	Test
BOLA/BFLA	Resource ownership checks in tool logic	Unit tests with foreign IDs
Broken auth	JWT claims, mTLS, short TTL, replay defense	Replay jti → 401
Excess exposure	Redaction by default	Assert masked fields
Lack of limits	RPM/RPS and concurrency caps	Stress → 429 not 5xx
Mass assignment	Schema allow lists, reject unknown	Fuzzer rejects extras
Injection	Strict schemas, no eval, escaping	Payload with quotes safe
SSRF	Allow list, block link-local and raw IP targets	169.254.169.254 denied
Misconfig	Header hygiene, size caps	Scan headers
Inventory	Catalog, signed servers only	Discovery clean
Logging	PII scrubbing, token masking	Log scans clean
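
Two rows of the table, mass assignment and BOLA, translate directly into small checks with matching tests. The sketch below assumes a hypothetical tool whose only permitted arguments are `name` and `email`, and a simplified ownership model keyed on tenant.

```python
# Hypothetical schema allow list for one tool: field name -> required type.
ALLOWED_FIELDS = {"name": str, "email": str}


def validate_args(args):
    """Reject unknown fields and wrong types (mass assignment defense)."""
    unknown = set(args) - set(ALLOWED_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    for field, value in args.items():
        if not isinstance(value, ALLOWED_FIELDS[field]):
            raise ValueError(f"wrong type for field: {field}")
    return args


def can_access(actor_tenant, resource_tenant):
    """Resource ownership check inside tool logic (BOLA defense)."""
    return actor_tenant == resource_tenant
```

A test with a foreign ID then reduces to asserting that `can_access("tenant-a", "tenant-b")` is false, and a fuzzer case to asserting that an extra field such as `is_admin` raises.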

Play 8 - Supply chain and provenance (Week 4 - 5)

Goal: Trust what runs.

Generate SBOM on build, pin dependencies, verify signatures
Only run signed MCP servers and tool bundles
Record build provenance and environment
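
As a small illustration of pinning, the sketch below checks a tool bundle against a pinned SHA-256 digest before it is allowed to run. This is digest pinning only, not signature verification, which would need a signing tool and key management on top. `verify_pinned` and the pin table are hypothetical.

```python
import hashlib
import hmac


def verify_pinned(name, data, pins):
    """Return True only if the artifact matches its pinned SHA-256 digest.

    pins maps artifact name -> expected hex digest. Unknown artifacts fail.
    """
    expected = pins.get(name)
    if expected is None:
        return False
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected)
```

Failing closed on unknown artifacts matters as much as the digest comparison: a bundle with no pin should never run.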

Play 9 - Fuzzing, adversarial, and conformance (Week 5)

Goal: Break it before attackers do.

Contract fuzzers in CI for tool schemas
Adversarial suites for prompt injection, chain exploits, SSRF
Host conformance across Claude, VS Code, Cursor, Continue, Cline

Play 10 - Cost, limits, and stop rules (Week 5 - 6)

Goal: Cap spend, because denial of wallet is real.

Budgets per agent and per tool, daily caps
Backoff, circuit breakers, queue backpressure
Notify and throttle on anomalies
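
A per-agent daily cap with a throttle threshold might look like the sketch below. Amounts are in integer cents to avoid floating-point drift. `BudgetGuard`, the 80% throttle ratio, and the manual `reset` are illustrative assumptions.

```python
class BudgetGuard:
    """Tracks spend per agent against a daily cap (amounts in cents)."""

    def __init__(self, daily_cap_cents, throttle_ratio=0.8):
        self.cap = daily_cap_cents
        self.throttle_at = daily_cap_cents * throttle_ratio
        self._spent = {}  # agent id -> cents spent today

    def charge(self, agent, cost_cents):
        """Return 'allow', 'throttle', or 'block' for a proposed spend."""
        spent = self._spent.get(agent, 0) + cost_cents
        if spent > self.cap:
            return "block"  # rejected spend is not recorded
        self._spent[agent] = spent
        if spent >= self.throttle_at:
            return "throttle"
        return "allow"

    def reset(self):
        """Clear counters; a scheduler would call this once per day."""
        self._spent.clear()
```

Returning "throttle" before "block" gives on-call a window to react to an anomaly before the agent is stopped outright.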

KPIs and SLOs

Mean time to detect risky chains
Inline preventions and redactions per quarter with false positive rate
P95 tool latency and error budgets
Cost anomalies resolved within SLA
% actions with signed traces and complete attribution

IR checklist

Kill switch tested (read-only, quarantine, egress block)
Evidence export under 15 minutes
On-call runbooks current with approvers
Quarterly tabletop and replay with learnings merged into tests

Hardening checklist

mTLS everywhere, cert rotation ≤ 90 days
JWT claims verified, replay defense via jti
Secrets in a KMS, no production secrets in environment variables
Container seccomp/AppArmor, non-root, read-only FS
Rate limits and concurrency caps set per tenant
PII scrubbing in logs and spans, retention configured

How Levo can help

Levo brings pre-encryption capture, identity stitching, inline policy and DLP, signed evidence, SIEM-ready detections, exploit-aware tests for CI, and budget guards. You prevent more, detect sooner, and prove exactly what happened. Learn more: Levo MCP server use case → https://www.levo.ai/use-case/mcp-server
