September 5, 2025

What is AI Security?

Buchi Reddy B

CEO & Founder at LEVO

Levo AI Security Research Panel

Research Team

What AI actually means (no buzzwords)

Plain meaning: AI systems learn patterns from data so they can predict, classify, generate, or decide. Everything you’ll meet is a variation on that theme.

The family at a glance

  • Machine Learning (ML): Learns from labeled or unlabeled data to make predictions or classifications (e.g., churn risk, image tagging).
  • Generative AI (GenAI) / LLMs: Produces new content (text, code, images, audio) based on a prompt; great at drafting, summarizing, and translating.
  • Multimodal models: Understand and generate across combined inputs (text + images + audio/video), enabling richer tasks like describing a chart or answering questions about a screenshot.
  • RAG (Retrieval-Augmented Generation): A model that consults your approved knowledge sources at query time, then answers using that context. This improves accuracy and keeps proprietary data out of training (see the sketch after this list).
  • Internal / private LLMs: Models you host or run within your VPC/region for tighter data control, consistent latency, and policy enforcement.
  • Agentic AI: A model wrapped with planning and tool-use so it can take multi-step actions (query systems, file tickets, update records) under policies and approvals.
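
A minimal sketch of the RAG flow described above, to make the pattern concrete. The embed, vector_index, and llm_complete callables are hypothetical placeholders for whatever embedding model, vector store, and LLM client your stack uses; this illustrates the pattern, not a specific product's API.

```python
# Minimal RAG sketch (illustrative). `embed`, `vector_index`, and
# `llm_complete` stand in for your embedding model, vector store, and LLM client.

def answer_with_rag(question: str, vector_index, embed, llm_complete, top_k: int = 4) -> str:
    # 1. Retrieve: pull the most relevant chunks from approved, allow-listed sources.
    query_vector = embed(question)
    chunks = vector_index.search(query_vector, top_k=top_k)

    # 2. Augment: ground the model in that context, keeping source IDs for audit.
    context = "\n\n".join(f"[{c.source_id}] {c.text}" for c in chunks)
    prompt = (
        "Answer using only the context below and cite source IDs.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate: the answer is grounded in retrieved context, not retraining.
    return llm_complete(prompt)
```

The security-relevant point: the model only sees the context you assemble here, so allow-listing, redaction, and provenance of those sources are where control lives.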

Why the distinction matters

  • A chat-only LLM mainly creates content risk (it can be wrong, biased, or disclose sensitive text).
  • An agent introduces effect risk: it can do things (click, post, transfer, run code). That means you must govern tools, scopes, spending, approvals, and logs just like any other production system, as sketched below.
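
As a rough illustration of what governing tools, scopes, and approvals looks like, the sketch below gates an agent's tool calls by capability tier and blocks anything effectful until a human approves it. The tool functions and request_human_approval() are hypothetical stand-ins, not a real agent framework.

```python
# Illustrative capability-tier gating for agent tool calls. The tools and
# request_human_approval() are hypothetical stand-ins, not a real framework.

def search_tickets(query: str) -> list:
    return []          # read-tier stand-in for a ticket-search integration

def issue_refund(order_id: str, amount: float) -> str:
    return "refunded"  # effect-tier stand-in for a real, irreversible action

def request_human_approval(tool_name: str, args: dict) -> bool:
    print(f"Approval required for {tool_name}({args})")
    return False       # default-deny until a human explicitly approves

TOOLS = {"search_tickets": search_tickets, "issue_refund": issue_refund}
TOOL_TIERS = {"search_tickets": "read", "issue_refund": "effect"}

def call_tool(name: str, args: dict, audit_log: list) -> dict:
    tier = TOOL_TIERS[name]
    # Effectful actions never run without explicit human approval.
    if tier == "effect" and not request_human_approval(name, args):
        audit_log.append({"tool": name, "args": args, "decision": "blocked"})
        return {"status": "pending_approval"}
    result = TOOLS[name](**args)
    audit_log.append({"tool": name, "args": args, "decision": "allowed"})
    return {"status": "ok", "result": result}

# Usage: reads pass through, effects wait for a human.
log = []
call_tool("search_tickets", {"query": "refund policy"}, log)
call_tool("issue_refund", {"order_id": "A-17", "amount": 42.0}, log)
```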

How we got here (short history)

From rules to agents, in five leaps:

  1. Rules-engines (expert systems)
    If-then logic encoded by humans. Great for narrow, deterministic domains, but brittle and hard to maintain.
    Security takeaway: behavior was predictable, but rule sprawl made systems opaque; change control and audit were the main concerns.

  2. Statistical learning
    Logistic regression, trees, SVMs, and ensembles learned patterns from data instead of hand-written rules.
    What changed: data and feature engineering became the fuel.
    Security takeaway: data quality and leakage risks started to matter more than code paths.

  3. Deep learning
    Neural networks + backprop scaled with GPUs, driving breakthroughs in vision and speech.
    What changed: representation learning reduced manual feature engineering, and accuracy jumped on complex tasks.
    Security takeaway: models became less interpretable, adversarial examples and data poisoning entered the mainstream.

  4. Transformers
    Sequence models that excel at long-range context and parallel training.
    What changed: pretraining on massive corpora unlocked general-purpose language and multimodal understanding.
    Security takeaway: models now follow instructions in natural language, so inputs themselves became an attack surface (prompt injection, jailbreaks).

  5. Foundation models → GenAI → Agents
    Pretrained models adapted to many tasks via fine-tuning and instruction alignment, connected to tools and data retrieval.
    What changed: from answering questions to taking actions such as creating tickets, posting messages, moving money, and changing configs.
    Security takeaway: you must assume stochastic answers + real-world side-effects. Guardrails, policy mediation, scopes, and evidence logging move from “nice to have” to table stakes.

Bottom line: the leap wasn’t only about accuracy; it was about generality and tool-use. That shift turns content risk into effect risk, so your controls must cover both the text a model produces and the actions it can trigger.

Why adoption exploded

Economic step-change

  • Quality at workable cost: Pretrained (foundation) models amortize training expense; you rent inference by the call. Smaller, specialized, or quantized models now hit “good enough” quality for many tasks.
  • Elastic consumption: Token-based pricing, serverless inference, and GPU bursts mean pilots can scale without upfront capex. Routing lets you send easy tasks to cheap models and hard ones to safer, bigger models (see the sketch below).
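
Routing can start as a simple heuristic in front of the model call, as in the sketch below. The model names, length threshold, and keyword list are illustrative assumptions only, not recommendations.

```python
# Rough sketch of cost-aware model routing. Model names, the length
# threshold, and the keyword heuristic are illustrative assumptions.

CHEAP_MODEL = "small-fast-model"      # hypothetical inexpensive model
STRONG_MODEL = "large-careful-model"  # hypothetical higher-quality model

SENSITIVE_HINTS = ("contract", "financial", "policy", "incident")

def choose_model(prompt: str) -> str:
    # Long or sensitive-looking tasks go to the stronger, safer model;
    # everything else is served cheaply.
    if len(prompt) > 2000 or any(h in prompt.lower() for h in SENSITIVE_HINTS):
        return STRONG_MODEL
    return CHEAP_MODEL
```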

Technical accessibility

  • APIs and SDKs everywhere: A capable model is literally a function call away; client libraries, tool-calling, and structured output make integration straightforward.
  • Plug-and-play data: Vector DBs, embeddings, and retrieval patterns (RAG) let teams point models at existing documents, tickets, and wikis without retraining.
  • Agent frameworks: Orchestrators expose step-plans, memory, and tool scopes, so prototypes turn into working assistants fast.

Organizational pull

  • Immediate productivity wins: Drafts, summaries, code hints, and search copilots deliver visible time savings across support, sales, engineering, and ops.
  • Competitive pressure: Customers expect AI-assisted experiences, and vendors bundle copilots into suites, nudging adoption across the stack.
  • Developer momentum: Templates, examples, and open weights collapse the path from paper → demo → production.

Result: speed first, controls later

  • Security and governance lag: Teams connect models to sensitive data and effectful tools before policies, provenance, and approvals are in place.
  • Cost sprawl: Unbounded prompts, long contexts, and agent loops create unpredictable bills.
  • Evidence gap: Many launches miss structured logs for prompts, tool calls, and policy decisions, making audits and incident response harder.

What to do about it (day-one guardrails)

  • Put an AI gateway/firewall at the app–model boundary for input/output policy, schema checks, and budgets (sketched after this list).
  • Start RAG hygiene: allow-listed sources, PII minimization, and provenance/signing for corpora.
  • Define agent capability tiers (read → write → effect) with human approvals on effectful actions.
  • Log for evidence from the start: prompts, contexts, model/router choice, tool calls, costs, safety verdicts (exportable to SIEM/GRC).
  • Track a small set of KPIs: injection-block rate, eval pass rate, cost per task, % actions requiring approval.
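
To make the gateway idea concrete, here is a minimal sketch of input policy, output schema validation, and a budget cap at the app–model boundary. The patterns, required fields, and limits are illustrative assumptions, not a product configuration.

```python
# Minimal sketch of gateway-style checks: input policy, output schema
# validation, and a per-app budget. Patterns, fields, and limits are
# illustrative assumptions only.

import json
import re

INJECTION_PATTERNS = [r"ignore (all|previous) instructions", r"system prompt"]
DAILY_BUDGET_USD = 50.0

def check_input(prompt: str) -> None:
    # Block obviously injected instructions before they reach the model.
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, prompt, re.IGNORECASE):
            raise ValueError("input blocked by injection policy")

def check_budget(spent_today_usd: float, call_cost_usd: float) -> None:
    # Hard cap on spend so agent loops and long contexts can't run away.
    if spent_today_usd + call_cost_usd > DAILY_BUDGET_USD:
        raise RuntimeError("budget exceeded; call rejected")

def check_output(raw_output: str, required_keys=("answer", "sources")) -> dict:
    # Require structured output so downstream actions can be validated.
    data = json.loads(raw_output)
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise ValueError(f"output missing required fields: {missing}")
    return data
```

In a real deployment these checks would live in a dedicated gateway or firewall service rather than in each application, so policies and budgets can be updated centrally.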

Two lenses to keep in mind

You’ll get farther, faster if you separate and resource two distinct missions:

Lens A - Security for AI

Goal: Protect the AI stack so it is safe, reliable, compliant, and auditable.
Scope: Data lifecycle → models/weights/prompts → retrieval indexes → apps/plugins/agents → tools and infra.
Owners: Product Security (lead), ML/Platform, Data Governance, Privacy/Legal, Risk.
Controls baseline:

  • Policy backbone via RMF / 42001 (risk appetite, roles, docs, audits)
  • Technical controls via SAIF (gateways, input/output policy, sandboxing, telemetry)
  • Posture & discovery (asset inventory, SBOM-M, control coverage)
  • Evaluation & red teaming (safety, PII/sensitive-data leakage, jailbreaks, drift)
  • Evidence bus (prompts, tool calls, router/model choice, safety verdicts, costs)
KPIs: % AI apps behind gateway, injection-block rate, eval pass rate, % assets with SBOM-M, time to mitigate red-team findings, incident MTTR, cost per task.

Lens B - AI for Security

Goal: Use AI to boost detection, triage, investigation, and response safely.
Scope: SOC copilots, alert summarization, query assistants, hunt hypotheses, playbook drafting, and controlled actions with human-in-the-loop (HITL) approval in EDR, SIEM, ticketing, and IAM.
Owners: SecOps (lead), Threat Intel, IR, Platform/SRE (for integrations).
Guardrails: Narrow tool scopes, staged approvals (HITL), replayable traces, allow-listed data sources, output validation before actions.
KPIs: MTTD/MTTR deltas, % alerts auto-triaged, false-action rate, analyst time saved, playbook adoption, evidence completeness.

Keep them distinct (but connected)

Why separation matters: Teams often fund the shiny “AI for Security” assistants and leave “Security for AI” under-resourced until an agent misfires or a data leak hits. Treat them as two programs with different charters, budgets, and scorecards.

RACI snapshot

  • CISO: accountable for both lenses, arbitrates risk/priority.
  • Head of Product Security: owns Security for AI controls and attestations.
  • SOC Director: owns AI for Security outcomes and runbooks.
  • Data/Privacy Lead: signs off on data sources, minimization, provenance.
  • Engineering: implements gateways, schemas, sandboxes, telemetry.

Handshakes between lenses

  • Shared evidence bus: SecOps copilots should consume the same structured prompts/tool logs the product stack emits.
  • Eval reuse: safety and leakage tests from Security for AI become quality gates for SOC copilots.
  • Incident loop: when an AI-assisted action goes wrong, IR consumes evidence artifacts and feeds findings back into policies and evals.

Framework pairing in practice

  • NIST AI RMF (Govern–Map–Measure–Manage) → sets the risk backbone.
  • ISO/IEC 42001 (AIMS) → operationalizes governance into an auditable management system.
  • SAIF → gives practitioners the concrete control catalog for gateways, tool policy, isolation, logging, and automation.

Working pattern: RMF defines what risk to manage; 42001 makes it auditably real; SAIF specifies how engineers build and run it.

Planning & budgeting cues

  • Two backlogs, two dashboards: one for hardening the AI stack, one for SOC augmentation.
  • Quarterly outcomes:
    • Security for AI: 100% of AI apps behind gateway, ≥90% eval pass, full SBOM-M coverage, evidence export to SIEM/GRC.
    • AI for Security: 30–50% auto-triage with ≤2% false-action, 20–30% MTTR reduction, analyst satisfaction up.
  • Funding rule of thumb: Ensure both lines appear in the plan (tools + headcount). If you must phase, land the gateway/evidence bus first, then scale SOC copilots on top of trustworthy data.

Where AI already shows up in your day

You’re likely using AI right now, which means you already own its data and evidence story.

Common touchpoints (by role)

  • Office & CX: document copilots, meeting summarizers, help-center bots, sales-enablement search.
  • Engineering: code assistants in IDEs, backlog triage, test-case generation, incident postmortem drafting.
  • Ops & Finance: invoice QA, reconciliations, routing and scheduling, knowledge lookup.
  • Data & Analytics: natural-language BI, SQL/text explainers, catalog assistants.
  • IT & Security: ticket drafting, enrichment, alert summaries, guided playbooks (with approvals).

What changes when AI is in the loop

  • Inputs become untrusted data. Anything users paste or upload can inject instructions or secrets.
  • Outputs can trigger effects. If a bot can file tickets, move money, or post messages, treat it like production automation.

Day-one guardrails (use everywhere)

  • Put the AI interaction behind a gateway: input/output policy, schema checks, budget/rate limits.
  • Minimize context: redact PII/secrets and send only what’s needed (see the sketch after this list).
  • Validate before acting: require structured outputs, human approval for effectful actions.
  • Isolate retrieval: allow-listed sources, per-index ACLs, provenance/signing of corpora.
  • Log for evidence: prompt/context IDs, model/router choice, tool calls, policy hits, and costs, exportable to SIEM/GRC.
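
As one example of context minimization, the sketch below redacts obvious PII and secret patterns and trims the payload before it reaches the model. The regexes are deliberately simple illustrations and would need tuning and testing against your real data.

```python
# Illustrative context minimization before a model call. The regex patterns
# are simple examples, not production-grade detectors.

import re

REDACTIONS = {
    "email":   re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":     re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(sk|key)-[A-Za-z0-9]{16,}\b"),
}

def redact(text: str) -> str:
    # Replace each match with a labeled placeholder so reviewers can see what was removed.
    for label, pattern in REDACTIONS.items():
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text

def build_context(documents: list, max_chars: int = 4000) -> str:
    # Send only what's needed: redact first, then trim to a hard cap.
    cleaned = [redact(d) for d in documents]
    return "\n\n".join(cleaned)[:max_chars]
```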

Quick “smell tests”

  • Can a user upload a file or paste data? → You need PII/secret scrubbing and provenance.
  • Can the AI call tools? → You need scopes, budgets, sandboxing, and approvals.
  • Could the answer be audited later? → You need structured traces, source IDs, and policy verdicts.

Evidence to retain (minimum)

  • Asset inventory and SBOM-M
  • Source allow-lists and licenses/consent records
  • Dataset and index signatures/versions
  • Eval/red-team results and regression history
  • Prompt/context IDs, tool calls, router/model choice, safety decisions, and costs
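
For a sense of shape, here is a sketch of one evidence record covering those fields. Field names are illustrative, not a fixed schema; records would typically stream to your SIEM/GRC pipeline as JSON lines.

```python
# Sketch of a single evidence record in the shape described above.
# Field names are illustrative assumptions, not a fixed schema.

import json
import time
import uuid

def evidence_record(prompt_id, context_ids, model, router_decision,
                    tool_calls, safety_verdict, cost_usd):
    return {
        "record_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "prompt_id": prompt_id,          # reference, not raw sensitive text
        "context_ids": context_ids,      # which sources/indexes were used
        "model": model,
        "router_decision": router_decision,
        "tool_calls": tool_calls,        # name, argument hash, allow/block verdict
        "safety_verdict": safety_verdict,
        "cost_usd": cost_usd,
    }

# Exportable as a JSON line to a SIEM/GRC pipeline.
print(json.dumps(evidence_record("p-123", ["kb-7"], "large-careful-model",
                                 "strong", [], "pass", 0.004)))
```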

Bottom line: if a feature accepts free-form input or can do anything on your behalf, it’s in scope for AI security today.

Conclusion

AI is no longer a lab experiment; it runs core workflows. The leap from chat-only models to agentic systems turns content risk into effect risk. That shift demands runtime guardrails at the app–model boundary, strict data hygiene for RAG, scoped permissions for tools, human approvals for high-impact actions, and complete evidence logging. Treat “Security for AI” and “AI for Security” as separate missions with their own owners, budgets, and KPIs, connected by a shared evidence bus. If a feature accepts untrusted input or can act on your behalf, it is in scope for AI security today.

FAQs

1) What is AI in plain terms?
Systems that learn patterns from data to predict, classify, generate, or decide. Everything else is a variation on that core idea.

2) How do ML, LLMs, multimodal, RAG, private LLMs, and agents differ?
ML predicts or classifies. LLMs generate text or code. Multimodal models handle text plus images or audio. RAG consults approved knowledge at query time. Private LLMs run in your VPC for control. Agents layer planning and tool-use so AI can take multi-step actions.

3) Why are agents riskier than chat-only LLMs?
Chat creates content risk. Agents introduce effect risk because they can act, post, update, transfer, or modify systems through tools and APIs.

4) What changed to make adoption explode?
APIs made capable models a function call away, retrieval patterns plugged in enterprise data, and elastic pricing aligned cost to usage. Teams saw fast wins, so adoption spread.

5) What day-one guardrails should every team use?
Gateway at the app–model boundary, schema and budget controls, PII minimization for context, allow-listed retrieval sources, human approval for effectful actions, and structured logs for prompts, tool calls, policy decisions, and costs.

6) What KPIs show AI is safely governed?
Injection-block rate, evaluation pass rate, cost per task, percent of actions requiring approval, percent of AI apps behind a gateway, time to mitigate red-team findings, incident MTTR.

7) What is the difference between Security for AI and AI for Security?
Security for AI protects the AI stack itself: data, models, agents, tools, and evidence. AI for Security uses AI to improve detection and response with tight scopes, approvals, and replayable traces.

8) How do I start without slowing teams?
Land the gateway and evidence logging first, define agent capability tiers, allow-list retrieval sources, add approval steps only for high-risk scopes, and track a small KPI set from the start.
