September 5, 2025

MCP Server: Engineering Team’s Playbook

Buchi Reddy B

CEO & Founder at LEVO

Levo AI Security Research Panel

Research Team

TL;DR

You’ll stand up a minimal MCP server, wire two real tools (one read, one write), add inline guardrails, light tracing, and CI tests. By the end, your full-stack team can turn a chat prompt into governed actions across your APIs with predictable cost and clean debuggability.

Who this is for

Full-stack teams that own features end-to-end and want a single, safe way to let agents (and IDE copilots) call your internal tools and services.

What you’ll ship (today)

  • A minimal MCP server (Node or Python) running locally over stdio
  • Two tools: orders.export (read with masking) and flag.set (write with limits)
  • Inline policy: allow, deny, redact + region routing
  • Traces: spans stitched for agent → MCP → API with useful attributes
  • CI checks: golden flow + one adversarial “don’t allow bulk write without ticket”
  • Host config (Claude Desktop or VS Code + Continue) to use your server

Prerequisites

  • Node 18+ or Python 3.10+
  • A test API to hit (even a mock)
  • VS Code (optional), Git, a place to run CI

Architecture at a glance

Play 0 - Scaffold the server (10–15 min)

Option A: TypeScript (Node)

BASH
npm init -y
npm pkg set type=module
npm i @modelcontextprotocol/sdk zod
TS
// src/server.ts
// Requires "type": "module" in package.json (ESM imports and top-level await).
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const mcp = new McpServer({ name: "team-tools", version: "1.0.0" });

// Dates arrive as YYYY-MM-DD strings
const isoDate = z.string().regex(/^\d{4}-\d{2}-\d{2}$/);

// orders.export (read)
mcp.registerTool("orders.export", {
  title: "Export orders",
  description: "Export orders for a date range with optional masking",
  // registerTool expects a Zod shape here, not a raw JSON Schema object
  inputSchema: {
    from: isoDate,
    to: isoDate,
    mask: z.array(z.string()).optional()
  }
}, async ({ from, to, mask = [] }) => {
  // Stub rows; replace with a call to your orders API filtered by from/to
  const rows = [
    { orderId: "1001", email: "a@ex.com", total: 30.5 },
    { orderId: "1002", email: "b@ex.com", total: 18.0 }
  ];
  const masked = rows.map(r => ({
    ...r,
    email: mask.includes("email") ? "***@***" : r.email
  }));
  // MCP has no "json" content type, so serialize the payload into a text item
  return { content: [{ type: "text", text: JSON.stringify({ count: masked.length, rows: masked }) }] };
});

// flag.set (write)
mcp.registerTool("flag.set", {
  title: "Set feature flag",
  description: "Set rollout percentage for a feature flag in an environment",
  inputSchema: {
    name: z.string(),
    env: z.enum(["dev", "staging", "prod"]),
    rolloutPct: z.number().min(0).max(100),
    ticket: z.string().optional()
  }
}, async ({ name, env, rolloutPct, ticket }) => {
  if (env === "prod" && rolloutPct > 10 && !ticket) {
    // isError tells the host the call failed rather than succeeded with a message
    return { isError: true, content: [{ type: "text", text: "Denied: prod change >10% requires change ticket" }] };
  }
  return { content: [{ type: "text", text: `Flag ${name} set to ${rolloutPct}% in ${env}` }] };
});

await mcp.connect(new StdioServerTransport());
// stdout carries the JSON-RPC stream over stdio, so log to stderr
console.error("MCP server ready");

Option B: Python

BASH
pip install mcp
PYTHON
# server.py
from mcp.server.fastmcp import FastMCP

# FastMCP provides the @tool() decorator and a built-in stdio runner
srv = FastMCP("team-tools")

@srv.tool()
def orders_export(from_: str, to: str, mask: list[str] | None = None) -> dict:
    """Export orders for a date range with optional masking."""
    # Stub rows; replace with a call to your orders API filtered by from_/to
    rows = [{"orderId": "1001", "email": "a@ex.com", "total": 30.5},
            {"orderId": "1002", "email": "b@ex.com", "total": 18.0}]
    mask = mask or []
    for r in rows:
        if "email" in mask:
            r["email"] = "***@***"
    return {"count": len(rows), "rows": rows}

@srv.tool()
def flag_set(name: str, env: str, rolloutPct: float, ticket: str | None = None) -> str:
    """Set rollout percentage for a feature flag in an environment."""
    if env == "prod" and rolloutPct > 10 and not ticket:
        return "Denied: prod change >10% requires change ticket"
    return f"Flag {name} set to {rolloutPct}% in {env}"

if __name__ == "__main__":
    srv.run()  # stdio transport by default

Play 1 - Wire a host (Claude Desktop or VS Code)

Claude Desktop (example)

JSON
{
  "mcpServers": {
    "team-tools-node": {
      "command": "node",
      "args": ["--loader","ts-node/esm","/ABS/PATH/src/server.ts"]
    },
    "team-tools-py": {
      "command": "python",
      "args": ["/ABS/PATH/server.py"]
    }
  }
}

Restart Claude Desktop, approve tools, try:

  • “Export orders from 2025-08-25 to 2025-08-31 with emails masked”
  • “Set flag checkout_v2 to 10 percent in staging”
  • “Set flag checkout_v2 to 50 percent in prod” → expect deny without ticket

VS Code + Continue (conceptual)

Open Continue’s settings and register a “custom MCP server” with command and args pointing to your script. Test the same prompts inside the IDE.

Play 2 - Put policy in the path

Start with three rules: deny risky bulk, redact PII, route by region. Keep rules fast.

YAML
# policy.yaml
- rule: deny-prod-flag-over-10-without-ticket
  when: { tool: "flag.set", cond: "env == 'prod' && rolloutPct > 10 && !ticket" }
  decision: deny
  reason: "Prod change >10% requires ticket"

- rule: redact-email-on-orders-export
  when: { tool: "orders.export" }
  decision: allow
  redact:
    - field: "email"
      mode: "mask"

- rule: route-eu-orders
  when: { tool: "orders.export", cond: "subject.region == 'EU'" }
  route: "eu-west"

Tip: Start in “shadow mode” for new rules to measure impact before enforcing.

Play 3 - Identity and scopes for non-humans

Issue short-lived tokens, scope them by tool and purpose, and require just-in-time (JIT) elevation for prod writes.

JSON
{
  "principal": "agent:copilot",
  "allow": ["orders.read", "flags.write:staging"],
  "deny": ["flags.write:prod"],
  "ttlSeconds": 600
}

Elevation example:

JSON
{
  "principal": "agent:release",
  "request": "flags.write:prod",
  "reason": "hotfix-1234",
  "approver": "oncall-sre",
  "ttlSeconds": 900
}

Play 4 - Add traces you can actually use

Emit spans and attributes that make debugging obvious.

Useful span names

TEXT
agent.plan
mcp.tool.invoke
service.api.call
policy.evaluate
dlp.redact

Helpful attributes

INI
actor.id=agent:copilot
actor.type=agent|mcp|service
tool.name=orders.export
scope.grants=flags.write:staging
policy.decision=allow|deny|redact
pii.count=2

Even if you do not run a full OTEL stack yet, log these attributes next to each tool call.
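
If you start with structured logs, a small helper like the following sketch is enough. It uses only the standard library; the `span` helper and the `mcp.trace` logger name are assumptions for illustration, and the output is a plain JSON line, not an OTEL span.

```python
import json
import logging
import sys
import time
import uuid
from contextlib import contextmanager

# Log to stderr: on a stdio MCP server, stdout is reserved for protocol messages
logging.basicConfig(stream=sys.stderr, level=logging.INFO, format="%(message)s")
log = logging.getLogger("mcp.trace")
log.setLevel(logging.INFO)


@contextmanager
def span(name: str, **attrs):
    """Time a unit of work and emit one JSON log line with its attributes."""
    record = {"span": name, "span_id": uuid.uuid4().hex[:16], "status": "ok", **attrs}
    start = time.perf_counter()
    try:
        yield record  # callers can add attributes while the span is open
    except Exception as exc:
        record["status"] = "error"
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)
        log.info(json.dumps(record, sort_keys=True))


# Dotted attribute names are passed through a dict because they are not valid keywords
with span("mcp.tool.invoke", **{"actor.id": "agent:copilot", "tool.name": "orders.export"}) as s:
    s["policy.decision"] = "redact"
    s["pii.count"] = 2
```

Because every line is JSON with stable keys, you can later map `span` and `span_id` onto real trace IDs without changing call sites.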

Play 5 - Tests that prevent bad days

Add two test classes to CI:

  1. Golden flow: “Export orders with email masked” → expect count and masked emails.
  2. Adversarial: “Set checkout_v2 to 50 percent in prod” without ticket → expect deny and reason.

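Example (pseudo-JS; callTool is a test helper you supply that invokes the tool and returns its result):

JS
it("denies risky prod flag change without ticket", async () => {
  const res = await callTool("flag.set", { name: "checkout_v2", env: "prod", rolloutPct: 50 });
  // Tool results carry their text inside the content array
  expect(res.content[0].text).toMatch(/Denied/);
});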
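
The golden flow can be tested the same way. The sketch below is pytest-style and calls a stand-in copy of the Play 0 `orders_export` handler directly; in a real repo you would import the handler from your server module instead, and the test names are only suggestions.

```python
import json


def orders_export(from_, to, mask=None):
    """Stand-in for the Play 0 handler, using the same stubbed rows."""
    rows = [{"orderId": "1001", "email": "a@ex.com", "total": 30.5},
            {"orderId": "1002", "email": "b@ex.com", "total": 18.0}]
    mask = mask or []
    for r in rows:
        if "email" in mask:
            r["email"] = "***@***"
    return {"count": len(rows), "rows": rows}


def test_golden_export_masks_emails():
    out = orders_export("2025-08-25", "2025-08-31", mask=["email"])
    assert out["count"] == 2
    assert all(r["email"] == "***@***" for r in out["rows"])
    # No raw address should survive anywhere in the serialized result
    assert "@ex.com" not in json.dumps(out)


def test_export_without_mask_returns_raw_emails():
    out = orders_export("2025-08-25", "2025-08-31")
    assert [r["email"] for r in out["rows"]] == ["a@ex.com", "b@ex.com"]
```

Checking the serialized output, not just the `email` field, catches the case where an address leaks through a field nobody thought to mask.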

Play 6 - Rollout patterns that won’t wake you up at 2am

  • Roll flags to 10 percent first, auto-rollback on error spike
  • Require ticket for prod changes >10 percent
  • Use idempotency keys for any write tool to avoid double writes
  • Put rate limits and concurrency caps on hot tools
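
As an illustration of the idempotency-key bullet, here is a minimal in-memory sketch. `IdempotentWriter` is a hypothetical wrapper; a production version would need a shared store with expiry so that retries hitting a different process are also deduplicated.

```python
import hashlib
import json


class IdempotentWriter:
    """Run a write at most once per idempotency key and replay the stored result."""

    def __init__(self, write_fn):
        self._write_fn = write_fn
        self._results = {}  # key -> (argument fingerprint, result)

    def call(self, key: str, **args):
        fingerprint = hashlib.sha256(
            json.dumps(args, sort_keys=True).encode()
        ).hexdigest()
        if key in self._results:
            seen_fingerprint, result = self._results[key]
            if seen_fingerprint != fingerprint:
                # Same key with different arguments is a caller bug, not a retry
                raise ValueError("idempotency key reused with different arguments")
            return result
        result = self._write_fn(**args)
        self._results[key] = (fingerprint, result)
        return result
```

The agent or host generates the key once per intended change and resends it on every retry, so a timeout followed by a retry cannot set the flag twice.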

Play 7 - Data privacy in motion

  • Redact sensitive fields in prompts and outputs (email, PAN, tokens)
  • Route EU data to EU storage; block unknown vendors
  • Keep a “deny by default” for new external egress until reviewed
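
A regex-based redactor is a reasonable first pass for the redaction bullet. The sketch below is an assumption-heavy starting point, not a DLP product: the patterns are deliberately simple, card numbers are confirmed with a Luhn check to cut false positives, and `redact` returns a count you can log as `pii.count`.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PAN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits, optional separators
BEARER = re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*")


def _luhn_ok(digits: str) -> bool:
    """Luhn checksum, used to tell card numbers apart from other long digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text: str) -> tuple[str, int]:
    """Replace emails, card numbers, and bearer tokens; return (text, hits)."""
    count = 0

    def replace_with(label, validate=lambda m: True):
        def _repl(match):
            nonlocal count
            if not validate(match):
                return match.group(0)  # leave non-matching candidates untouched
            count += 1
            return f"[{label}]"
        return _repl

    text = EMAIL.sub(replace_with("EMAIL"), text)
    text = PAN.sub(
        replace_with("PAN", lambda m: _luhn_ok(re.sub(r"\D", "", m.group(0)))), text
    )
    text = BEARER.sub(replace_with("TOKEN"), text)
    return text, count
```

Run it on both the prompt going out and the tool output coming back, since sensitive values can enter from either side.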

Play 8 - Cost and loop control

  • Budgets per agent and per tool (e.g., a $50 daily LLM budget for agent:copilot)
  • Stop rules for long plans; alert on bursty retries
  • Per-tool rate limits and P95 latency SLOs

Budget guard example:

JSON
{
  "principal": "agent:copilot",
  "limits": { "rpm": 60, "concurrency": 5, "daily_cost_usd": 50 }
}

Ops checklist (pin this to your repo)

  • Stdio server runs locally; host config checked in (paths templated)
  • Tools have examples and schemas; inputs validated
  • Policy rules in repo; shadow mode for new ones; tests pass
  • Traces or structured logs with actor, tool, decision, pii.count
  • Idempotency keys on writes; rate limits and caps applied
  • Golden + adversarial tests in CI; PR template asks for both
  • README: how to run server, host config, common prompts

Troubleshooting quickies

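  • Host does not see tools: check the script path and that the server uses the stdio transport, then restart the host. Anything the server prints to stdout also corrupts the stdio stream, so log to stderr.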
  • Tool runs but returns nothing: return envelope must include content.
  • Policy blocks everything: switch new rules to shadow mode, print debug fields.
  • Double writes: add idempotency key, check retries and timeouts.
  • Telemetry spam: sample spans or log only on errors at first.

What good looks like after 2–3 sprints

  • New internal tool wrapped in 1–2 days, with examples and tests
  • 95 percent of actions have signed traces or structured logs
  • Inline policy denies at least one risky action per month with low false positives
  • No surprise LLM or API bills; budgets and caps visible in dashboards
  • Other teams reuse your tools instead of building one-offs

How Levo can help

Levo gives your team production-grade rails without the yak-shave: mesh visibility stitched by identity, inline policy and redaction in the path, signed evidence you can export, exploit-aware tests for CI, and budget guards. You keep building features; Levo keeps actions safe, observable, and within limits.

Want to see how this looks in practice? Book a demo.

Learn more: Levo MCP server use case https://www.levo.ai/use-case/mcp-server
