Who it’s for: Product and platform teams shipping RAG or agent features who need provable defences against prompt injection, data leaks, and unsafe tool use especially in connected device and industrial environments.
What you get (deliverables)
- Tailored test suite for regression: a focused set of security/eval tests concepts you can run locally and wire into CI to catch regressions.
- Jailbreak & prompt-injection testing: aligned to common LLM risk categories, with clear pass/fail thresholds.
- RAG/MCP context isolation review: check how prompts, tools, retrieval, and plugins connect; reduce the attack surface.
- Guardrail & enforcement plan: safe defaults, escalation rules, and fallback behaviour you can adopt immediately.
- handover + one re-test: a 60–90 min walkthrough and one follow-up test within 14 days after you integrate.
How we work
We will require access to a dev/test instance of your app with RAG/agent features and prompts. We deliver tests, reports, small code changes, and a workflow concepts. Your team owns deployment and wiring the workflow into your pipeline.
Pricing: From €17k per system per model (scope dependent).
FAQ
- Do you write policies? No-this is an engineering sprint. We review and measure defences, and map them to common LLM risks.
- Which stacks? Vendor-agnostic: bring your models, vector stores, and orchestration.
- What’s MCP? A standard for connecting AI apps to tools/data; we focus on making those connections safer.