safety-scan
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseSafety Scan
安全扫描
Scan content for prompt injection, jailbreak attempts, and unsafe patterns.
扫描内容中的提示注入、越狱尝试以及不安全模式。
When to use
使用场景
Before processing untrusted input (user submissions, API payloads, webhook data), scan it to detect prompt injection, adversarial content, or policy violations.
在处理不可信输入(用户提交内容、API负载、Webhook数据)之前,对其进行扫描,以检测提示注入、对抗性内容或违反策略的情况。
Steps
操作步骤
- Quick safety check — call with the input text for a boolean safe/unsafe result
mcp__claude-flow__aidefence_is_safe - Deep analysis — call for detailed threat classification and confidence scores
mcp__claude-flow__aidefence_analyze - Full scan — call for comprehensive multi-layer scanning
mcp__claude-flow__aidefence_scan - Train defenses — call with confirmed threats to improve detection
mcp__claude-flow__aidefence_learn - View stats — call for detection rates and false positive metrics
mcp__claude-flow__aidefence_stats
- 快速安全检查 — 调用接口,传入输入文本,获取安全/不安全的布尔值结果
mcp__claude-flow__aidefence_is_safe - 深度分析 — 调用接口,获取详细的威胁分类和置信度评分
mcp__claude-flow__aidefence_analyze - 全面扫描 — 调用接口,进行全方位的多层扫描
mcp__claude-flow__aidefence_scan - 训练防御模型 — 调用接口,传入已确认的威胁样本,提升检测能力
mcp__claude-flow__aidefence_learn - 查看统计数据 — 调用接口,获取检测率和误报指标
mcp__claude-flow__aidefence_stats
Threat categories
威胁类别
- Prompt injection (direct and indirect)
- Jailbreak attempts
- Data exfiltration patterns
- Instruction override attacks
- Social engineering prompts
- 提示注入(直接和间接)
- 越狱尝试
- 数据泄露模式
- 指令覆盖攻击
- 社会工程学提示