Loading...
Loading...
Run Semgrep static analysis for fast security scanning and pattern matching. Use when asked to scan code with Semgrep, write custom YAML rules, find vulnerabilities quickly, use taint mode, or set up Semgrep in CI/CD pipelines.
npx skill4agent add trailofbits/skills semgrep# pip
python3 -m pip install semgrep
# Homebrew
brew install semgrep
# Docker
docker run --rm -v "${PWD}:/src" returntocorp/semgrep semgrep --config auto /src
# Update
pip install --upgrade semgrepsemgrep --config auto . # Auto-detect rules
semgrep --config auto --metrics=off . # Disable telemetry for proprietary codesemgrep --config p/<RULESET> . # Single ruleset
semgrep --config p/security-audit --config p/trailofbits . # Multiple| Ruleset | Description |
|---|---|
| General security and code quality |
| Comprehensive security rules |
| OWASP Top 10 vulnerabilities |
| CWE Top 25 vulnerabilities |
| r2c security audit rules |
| Trail of Bits security rules |
| Python-specific |
| JavaScript-specific |
| Go-specific |
semgrep --config p/security-audit --sarif -o results.sarif . # SARIF
semgrep --config p/security-audit --json -o results.json . # JSON
semgrep --config p/security-audit --dataflow-traces . # Show data flowsemgrep --config p/python app.py # Single file
semgrep --config p/javascript src/ # Directory
semgrep --config auto --include='**/test/**' . # Include tests (excluded by default)rules:
- id: hardcoded-password
languages: [python]
message: "Hardcoded password detected: $PASSWORD"
severity: ERROR
pattern: password = "$PASSWORD"| Syntax | Description | Example |
|---|---|---|
| Match anything | |
| Capture metavariable | |
| Deep expression match | |
| Operator | Description |
|---|---|
| Match exact pattern |
| All must match (AND) |
| Any matches (OR) |
| Exclude matches |
| Match only inside context |
| Match only outside context |
| Regex matching |
| Regex on captured value |
| Compare values |
rules:
- id: sql-injection
languages: [python]
message: "Potential SQL injection"
severity: ERROR
patterns:
- pattern-either:
- pattern: cursor.execute($QUERY)
- pattern: db.execute($QUERY)
- pattern-not:
- pattern: cursor.execute("...", (...))
- metavariable-regex:
metavariable: $QUERY
regex: .*\+.*|.*\.format\(.*|.*%.*# Pattern `os.system($CMD)` catches this:
os.system(user_input) # Found# Same pattern misses this:
cmd = user_input
processed = cmd.strip()
os.system(processed) # Missed - no direct matchuser_inputcmd = ...processed = ...shlex.quote()os.system()rules:
- id: command-injection
languages: [python]
message: "User input flows to command execution"
severity: ERROR
mode: taint
pattern-sources:
- pattern: request.args.get(...)
- pattern: request.form[...]
- pattern: request.json
pattern-sinks:
- pattern: os.system($SINK)
- pattern: subprocess.call($SINK, shell=True)
- pattern: subprocess.run($SINK, shell=True, ...)
pattern-sanitizers:
- pattern: shlex.quote(...)
- pattern: int(...)rules:
- id: flask-sql-injection
languages: [python]
message: "SQL injection: user input flows to query without parameterization"
severity: ERROR
metadata:
cwe: "CWE-89: SQL Injection"
owasp: "A03:2021 - Injection"
confidence: HIGH
mode: taint
pattern-sources:
- pattern: request.args.get(...)
- pattern: request.form[...]
- pattern: request.json
pattern-sinks:
- pattern: cursor.execute($QUERY)
- pattern: db.execute($QUERY)
pattern-sanitizers:
- pattern: int(...)
fix: cursor.execute($QUERY, (params,))# test_rule.py
def test_vulnerable():
user_input = request.args.get("id")
# ruleid: flask-sql-injection
cursor.execute("SELECT * FROM users WHERE id = " + user_input)
def test_safe():
user_input = request.args.get("id")
# ok: flask-sql-injection
cursor.execute("SELECT * FROM users WHERE id = ?", (user_input,))semgrep --test rules/name: Semgrep
on:
push:
branches: [main]
pull_request:
schedule:
- cron: '0 0 1 * *' # Monthly
jobs:
semgrep:
runs-on: ubuntu-latest
container:
image: returntocorp/semgrep
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Required for diff-aware scanning
- name: Run Semgrep
run: |
if [ "${{ github.event_name }}" = "pull_request" ]; then
semgrep ci --baseline-commit ${{ github.event.pull_request.base.sha }}
else
semgrep ci
fi
env:
SEMGREP_RULES: >-
p/security-audit
p/owasp-top-ten
p/trailofbitstests/fixtures/
**/testdata/
generated/
vendor/
node_modules/password = get_from_vault() # nosemgrep: hardcoded-password
dangerous_but_safe() # nosemgrepsemgrep --config rules/ --time . # Check rule performance
ulimit -n 4096 # Increase file descriptors for large codebasesrules:
- id: my-rule
paths:
include: [src/]
exclude: [src/generated/]pip install semgrep-rules-manager
semgrep-rules-manager --dir ~/semgrep-rules download
semgrep -f ~/semgrep-rules .| Shortcut | Why It's Wrong |
|---|---|
| "Semgrep found nothing, code is clean" | Semgrep is pattern-based; it can't track complex data flow across functions |
| "I wrote a rule, so we're covered" | Rules need testing with |
| "Taint mode catches injection" | Only if you defined all sources, sinks, AND sanitizers correctly |
| "Pro rules are comprehensive" | Pro rules are good but not exhaustive; supplement with custom rules for your codebase |
| "Too many findings = noisy tool" | High finding count often means real problems; tune rules, don't disable them |