Search Results: prompt-injection

Found 42 Skills

AI & Machine Learning | orchestra-research/ai-res...

prompt-guard

Meta's 86M-parameter prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2 ms on GPU). Multilingual (8 languages). Deploy with Hugging Face or batch processing for RAG security.

Security & Compliance | affaan-m/everything-claud...

llm-trading-agent-security

Security patterns for autonomous trading agents with wallet or transaction authority. Covers prompt injection, spend limits, pre-send simulation, circuit breakers, MEV protection, and key handling.

Security & Compliance | alex-ilgayev/mcpspy

security-integration-tests

Use this agent when working with prompt injection detection integration tests, including running tests, debugging failures, or adding new test samples.

AI & Machine Learning | absolutelyskilled/absolut...

skill-audit

Use this skill when auditing AI agent skills for security vulnerabilities, prompt injection, permission abuse, supply chain risks, or structural quality. Triggers on skill review, security audit, skill safety check, prompt injection detection, skill trust verification, skill quality gate, and any task requiring security analysis of AI agent skill files.

1 scripts/Attention

AI & Machine Learning | asgard-ai-platform/skills

tech-prompt-engineering

Debug and harden production LLM prompts — handle prompt injection, output format drift, instruction forgetting in long contexts, and cross-model portability issues. Use this skill when the user ships an LLM-powered feature to production and needs to diagnose why outputs are inconsistent, unsafe, or regressed after model updates — NOT for basic 'write a better prompt' questions.

Security & Compliance | borghei/claude-skills

skill-security-auditor

Security audit and vulnerability scanning for AI agent skills before installation. Detects prompt injection in SKILL.md files, dangerous code patterns (eval, exec, subprocess), network exfiltration, credential harvesting, dependency supply chain risks, file system boundary violations, and obfuscation. Produces PASS/WARN/FAIL verdicts with remediation guidance. Use when evaluating untrusted skills, pre-install security gates, or auditing skill repositories.

3 scripts/Attention

AI & Machine Learning | martinholovsky/claude-ski...

cloud-api-integration

Expert skill for integrating cloud AI APIs (Claude, GPT-4, Gemini). Covers secure API key management, prompt injection prevention, rate limiting, cost optimization, and protection against data exfiltration attacks.

AI & Machine Learning | patricio0312rev/skills

guardrails-safety-filter-builder

Implements content safety filters with PII redaction, policy constraints, prompt injection detection, and safe refusal templates. Use when adding "content moderation", "safety filters", "PII protection", or "guardrails".

Security & Compliance | fermionoid/senseguard

senseguard

Semantic security scanner for OpenClaw skills. Detects prompt injection, data exfiltration, and hidden instructions that traditional code scanners miss. Use when user asks to scan skills, check skill safety, or run a security audit.

6 scripts/Attention

Security & Compliance | aradotso/trending-skills

codex-session-patcher

Clean AI refusal responses from Codex CLI, Claude Code, and OpenCode session files, and inject CTF/pentest prompts to reduce refusals.

Security & Compliance | erom/claude-skill-maton

maton

Security auditor for Claude Code skills and agent definitions. Scans a skill or agent directory for prompt injection, data exfiltration, privilege escalation, memory poisoning, obfuscation, malicious persistence, and 12 other threat categories (18 total). Returns a graded verdict (OK / WARNING / CRITICAL) with detailed findings. Use this skill whenever you need to audit, review, or validate the safety of a skill, an agent definition, a system prompt, or any set of instruction files before installing or trusting them. Also use it when the user mentions security scanning, threat detection, prompt injection checking, or wants to verify that a skill is safe. Triggers on: /maton, "audit this skill", "is this skill safe", "check for injection", "scan for threats", "review this agent", "security check".

15 scripts/Attention

AI & Machine Learning | fatih-developer/fth-skill...

adaptive-guard

Protects LLM agent systems in real time with a 5-tier filter (hash cache, rule engine, ML classifier, LLM judge, human approval) and an async learning engine. Synthesizes new rules from every detected attack while adding less than 50 ms of latency. Trigger on 'add security layer', 'prevent prompt injection', 'adaptive guard', 'runtime protection', or 'agent security'.
