How It Works
Agent Inspector implements a complete security lifecycle for AI agents — from static code analysis to runtime behavior monitoring, with correlation that connects the dots.
The Security Lifecycle
Agent Inspector guides your agent through a complete security lifecycle, from development to production readiness:
1. Development
Write your agent code. IDE integration provides real-time feedback and persistent memory across sessions.
2. Static Analysis
Scan your code for OWASP LLM Top 10 vulnerabilities. 7 security categories with automated detection.
3. Dynamic Analysis
Run your agent through test sessions. 16 security checks analyze runtime behavior and detect anomalies.
4. Correlation
Connect static findings with runtime evidence. Prioritize VALIDATED issues over THEORETICAL risks.
5. Production Gate
GO/NO-GO decision based on security posture. Clear criteria for deployment readiness.
6. Production
Deploy with confidence. Generate compliance reports for stakeholders and auditors.
Two-Actor Architecture
Agent Inspector uses a two-actor model: an IDE component for static analysis and an Server component for dynamic analysis. Together, they enable correlation.
IDE Actor (Static Analysis)
The IDE integration brings security analysis directly into your development workflow:
- Persistent Memory: Context persists across coding sessions
- Runtime Data Access: Query dynamic analysis results from your IDE
- Slash Commands:
/agent-scan,/agent-fix,/agent-gate - MCP Protocol: Standardized communication between IDE and server
Server Actor (Dynamic Analysis)
The server component monitors runtime behavior through a transparent proxy:
- Zero Instrumentation: No code changes required — just change
base_url - Complete Capture: All LLM calls, tool usage, and responses
- Behavioral Analysis: Clustering, outlier detection, stability scoring
- Dashboard: Real-time visualization at
localhost:7100
Static Analysis
Static analysis scans your agent code for security vulnerabilities before runtime. Findings are mapped to the OWASP LLM Top 10 framework.
7 Security Check Categories
| Category | What It Checks | OWASP Mapping |
|---|---|---|
| Prompt Injection | Unsanitized user input in prompts, missing input validation | LLM01 |
| Insecure Output | Unvalidated LLM output used in dangerous operations | LLM02 |
| Data Leakage | PII exposure, sensitive data in prompts, logging issues | LLM06 |
| Excessive Agency | Tools with dangerous capabilities, missing permissions | LLM08 |
| Supply Chain | Outdated dependencies, vulnerable packages | LLM05 |
| Model DoS | Missing rate limits, unbounded token usage | LLM04 |
| Overreliance | Missing human oversight, no validation of LLM decisions | LLM09 |
In your IDE with MCP integration: /agent-scan
Or from command line: agent-inspector scan ./my-agent
Dynamic Analysis
Dynamic analysis monitors your agent during runtime to detect behavioral issues that only emerge during execution.
16 Security Checks in 4 Categories
Resource Management
- Token usage patterns
- Session duration limits
- Tool call frequency
- Cost anomalies
Environment & Supply Chain
- Model version tracking
- Tool adoption patterns
- Dependency usage
- Configuration drift
Behavioral Stability
- Consistency scoring
- Predictability metrics
- Cluster analysis
- Outlier detection
Privacy & PII
- PII detection (Presidio)
- Sensitive data exposure
- Data retention issues
- Cross-session leakage
1. Point your agent to http://localhost:4000
2. Run 20+ test sessions with varied inputs
3. View results in dashboard or IDE: /agent-analyze
Correlation
Correlation connects static code findings with runtime evidence. This helps prioritize real risks over theoretical issues.
Correlation States
| State | Meaning | Priority |
|---|---|---|
| VALIDATED | Static finding confirmed by runtime evidence | Critical — fix immediately |
| UNEXERCISED | Static finding not yet triggered in runtime tests | High — needs more testing |
| THEORETICAL | Static finding with no path to exploitation in runtime | Low — monitor for changes |
Why Correlation Matters
Static analysis alone produces many false positives. Dynamic analysis alone misses code paths not exercised in testing. Correlation gives you the complete picture — prioritizing issues that are both detectable in code AND confirmed in runtime behavior.
Zero-Instrumentation Proxy
Traditional observability requires extensive instrumentation: adding logging statements, integrating SDKs, configuring collectors, and maintaining instrumentation code. Agent Inspector eliminates all of this.
The Proxy Architecture
Agent Inspector sits between your agent and the LLM API as a transparent proxy. Simply change your
base_url to point to Agent Inspector, and it automatically captures:
- Every LLM request and response
- All tool calls, parameters, and results
- Token usage at each step
- Timing and performance data
- Error messages and stack traces
Benefits
No Code Changes
Your agent code stays clean and focused on logic. No SDKs to integrate or logging to maintain.
Works with Any Framework
Anthropic SDK, OpenAI SDK, LangChain, n8n, custom agents, and more.
Minimal Overhead
Proxy adds less than 10ms latency. No performance impact on your agents.
Complete Visibility
Captures everything automatically, no manual logging or configuration required.
Supported Frameworks
Since Agent Inspector captures raw LLM requests and responses at the API level, it's completely platform-agnostic. Any framework that makes HTTP calls to supported LLM providers will work.
LLM Providers
- Anthropic Claude: All Claude models via the Anthropic SDK
- OpenAI: GPT-4, GPT-3.5, and other OpenAI models
Agent Frameworks
- LangChain: Any LangChain agent using supported LLMs
- Mastra: Full support for Mastra workflows
- Strands: Strands agent framework integration
- n8n: n8n AI nodes and workflows
- Custom Agents: Any custom implementation using the proxy
IDE Integration
- Cursor: MCP integration with slash commands
- Claude Code: Plugin marketplace integration
Local-First Architecture
Agent Inspector runs entirely on your local machine. Your data never leaves your environment:
- No cloud dependencies: Everything runs locally
- No data sharing: Your prompts and responses stay on your machine
- No accounts required: Just install and run
- Full control: You own all captured data
Storage Modes
In-Memory (Default)
Fast and lightweight. Data is cleared when the server stops. Best for quick debugging and CI pipelines.
Persistent (File)
Data survives restarts. Access historical sessions. Best for ongoing development and regression testing.