pentest-ai-llm-security

Original🇺🇸 English
Translated

AI/LLM application security testing — prompt injection, jailbreaking, data exfiltration, and insecure output handling per OWASP LLM Top 10.

9installs
Added on

NPX Install

npx skill4agent add jd-opensource/joysafeter pentest-ai-llm-security

Pentest AI/LLM Security

Purpose

AI-integrated applications introduce entirely new attack surfaces. Prompt injection is the "SQLi of AI." Neither Shannon nor any existing skill addresses this domain. OWASP LLM Top 10 (2025) defines the methodology.

Prerequisites

Authorization Requirements

  • Written authorization with AI/LLM testing scope explicitly included
  • Model access details — API endpoints, model versions, tool/function access
  • Data sensitivity classification — what data the LLM can access
  • Rate limit awareness — LLM API costs can escalate quickly

Environment Setup

  • Garak for automated LLM vulnerability scanning
  • Burp Suite for API interception of LLM requests/responses
  • Python scripts for custom prompt injection payloads
  • Local proxy to capture full request/response chains

Core Workflow

  1. Integration Point Discovery: Identify all LLM integration points — chat interfaces, content generation, RAG pipelines, AI search, code completion, summarization.
  2. Direct Prompt Injection: Override system prompts, extract system prompt content, inject instructions that change model behavior.
  3. Indirect Prompt Injection: Embed malicious instructions in documents/emails/web pages the LLM processes, poisoned RAG context.
  4. Data Exfiltration: Extract training data, PII from context windows, other users' conversation history, system config details.
  5. Insecure Output Handling: LLM output rendered as HTML (XSS via LLM), used in SQL queries (SQLi via LLM), used in system commands.
  6. Excessive Agency: LLM with tool access performing unauthorized actions, privilege escalation through tool chains, resource abuse.
  7. Classification: Document findings with OWASP LLM Top 10 (2025) classification and remediation guidance.

OWASP LLM Top 10 (2025) Coverage

CategoryTest FocusStatus
LLM01 Prompt InjectionDirect and indirect injection
LLM02 Sensitive Information DisclosureData exfiltration, PII leakage
LLM03 Supply ChainModel provenance, plugin trust
LLM04 Data and Model PoisoningTraining data integrity
LLM05 Improper Output HandlingXSS/SQLi via LLM output
LLM06 Excessive AgencyUnauthorized tool use
LLM07 System Prompt LeakageSystem prompt extraction
LLM08 Vector and Embedding WeaknessesRAG poisoning
LLM09 MisinformationHallucination exploitation
LLM10 Unbounded ConsumptionResource exhaustion

Tool Categories

CategoryToolsPurpose
LLM ScanningGarak, rebuffAutomated prompt injection testing
API InterceptionBurp Suite, mitmproxyLLM API request/response capture
Prompt FuzzingCustom Python scriptsPayload generation and testing
Output AnalysisBrowser DevTools, BurpInsecure output rendering detection

References

  • references/tools.md
    - Tool function signatures and parameters
  • references/workflows.md
    - Attack pattern definitions and test vectors