cairn-ai-pentest

Cairn AI Automated Penetration Testing System

Skill by ara.so — Daily 2026 Skills collection.
Cairn is an AI-driven framework for automated penetration testing and general problem solving, developed by the Bytex@起零衍迹实验室 team. It was the only entry to achieve an "AK" (All Killed, i.e. a full score) in the 2nd TCH Tencent Cloud Hackathon Intelligent Penetration Challenge, and ranked 4th online. The system uses LLM-based agents to autonomously reason about, plan, and execute multi-step security testing tasks.

What Cairn Does

  • Autonomous AI Agent Loop: Iteratively reasons about a target, selects tools, executes commands, and interprets results
  • Penetration Testing Automation: Web vulnerability discovery, exploitation, CTF-style challenge solving
  • General Problem Solving: Extensible to non-security tasks via tool/plugin architecture
  • Multi-step Planning: Breaks complex objectives into subtasks with memory and context management
  • Tool Integration: Wraps common pentest tools (nmap, sqlmap, curl, custom scripts) as callable agent actions

Project Status

⚠️ Code is still being organized and is expected to be open-sourced soon. The examples below reflect the architecture described in the competition writeup and visible repository structure.
Follow the writeup for architecture details: https://mp.weixin.qq.com/s/DlpEH7bVr0xi0VawPJs3XA

Installation

bash
# Clone the repository, then install dependencies from its root

# Install Python dependencies (expected)
pip install -r requirements.txt

# Or with uv (modern Python tooling)
uv sync

Environment Configuration

Create a .env file in the project root:
env
# LLM Provider (OpenAI-compatible endpoint)
OPENAI_API_KEY=your_api_key_here
OPENAI_BASE_URL=https://api.openai.com/v1
MODEL_NAME=gpt-4o

# OR use a local/alternative provider
# MODEL_NAME=deepseek-chat

# Agent configuration
MAX_ITERATIONS=30
TIMEOUT_PER_STEP=60

# Target scope (safety guard)
TARGET_SCOPE=192.168.1.0/24

# Logging
LOG_LEVEL=INFO
LOG_FILE=./logs/cairn.log
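
The TARGET_SCOPE guard can be pictured as a CIDR membership test that runs before any tool touches a host. The sketch below is an assumption about how such a check might work, not Cairn's actual code: `in_scope` is a hypothetical helper, and it deliberately treats hostnames as out of scope because resolving them would require a DNS lookup.

```python
import ipaddress
from urllib.parse import urlparse


def in_scope(target: str, scope: list[str]) -> bool:
    """Return True if the target's host is an IP inside one of the scope CIDRs.

    Hypothetical helper: hostnames (non-IP hosts) are rejected.
    """
    # Accept both bare hosts ("192.168.1.100") and URLs ("http://192.168.1.100/x").
    host = urlparse(target).hostname if "://" in target else target
    try:
        address = ipaddress.ip_address(host)
    except (ValueError, TypeError):
        return False
    return any(address in ipaddress.ip_network(cidr, strict=False) for cidr in scope)
```

With the example scope of 192.168.1.0/24, `in_scope("http://192.168.1.100/login", ["192.168.1.0/24"])` passes while a target such as 10.0.0.1 is rejected.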

---

Core Architecture

Cairn follows a ReAct (Reasoning + Acting) agent pattern:
User Goal
┌─────────────────────────────┐
│         Agent Loop          │
│  ┌────────────────────────┐ │
│  │  Think (LLM Reasoning) │ │
│  └──────────┬─────────────┘ │
│             │               │
│  ┌──────────▼─────────────┐ │
│  │  Act (Tool Selection)  │ │
│  └──────────┬─────────────┘ │
│             │               │
│  ┌──────────▼─────────────┐ │
│  │  Observe (Parse Result)│ │
│  └──────────┬─────────────┘ │
│             │               │
│         (loop until done)   │
└─────────────────────────────┘
Final Answer / Exploit / Report
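
The Think / Act / Observe cycle in the diagram can be sketched in a few lines. This is a minimal illustration of the general ReAct pattern, not Cairn's implementation: `TOOLS`, `scripted_llm`, and `react_loop` are hypothetical names, and the "LLM" is a scripted stand-in so the example runs without an API key.

```python
import json
from typing import Callable

# Hypothetical tool table: tool name -> callable taking keyword arguments.
TOOLS: dict[str, Callable[..., str]] = {
    "echo": lambda text: f"echo: {text}",
}


def scripted_llm(history: list[str]) -> str:
    """Stand-in for a real LLM call: acts once, then finishes."""
    if not any(entry.startswith("Observation:") for entry in history):
        return 'Thought: try the echo tool\nAction: echo\nAction Input: {"text": "hi"}'
    return "Thought: enough information\nFinal Answer: done"


def react_loop(goal: str, llm: Callable[[list[str]], str], max_iterations: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_iterations):
        reply = llm(history)  # Think
        history.append(reply)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        # Split each "Key: value" line of the reply into a dict.
        fields = dict(line.split(": ", 1) for line in reply.splitlines() if ": " in line)
        tool = TOOLS[fields["Action"]]
        observation = tool(**json.loads(fields["Action Input"]))  # Act
        history.append(f"Observation: {observation}")  # Observe
    return "stopped: iteration limit reached"
```

The iteration cap plays the same role as MAX_ITERATIONS: if the model never produces a final answer, the loop still terminates.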

Key Usage Patterns

1. Basic Agent Invocation (Expected CLI)

bash
# Run against a CTF challenge or target
python cairn.py --target "http://192.168.1.100" --goal "Find and exploit SQL injection to retrieve admin credentials"

# With custom model
python cairn.py --target "http://challenge.example.com" \
  --goal "Solve this web CTF challenge and get the flag" \
  --model gpt-4o \
  --max-iterations 25

# Dry run (plan only, no execution)
python cairn.py --target "http://192.168.1.100" \
  --goal "Enumerate all open services" \
  --dry-run

2. Python API Usage (Expected)

python
import os

from cairn import CairnAgent
from cairn.tools import ToolRegistry
from cairn.config import CairnConfig

# Initialize configuration
config = CairnConfig(
    model_name="gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
    max_iterations=30,
    target_scope=["192.168.1.0/24"],
)

# Build tool registry
tools = ToolRegistry()
tools.register_defaults()  # nmap, curl, sqlmap, ffuf, etc.

# Create and run agent
agent = CairnAgent(config=config, tools=tools)
result = agent.run(
    target="http://192.168.1.100",
    goal="Find all web vulnerabilities and attempt exploitation",
)
print(result.summary)
print(result.findings)

3. Custom Tool Registration

python
import subprocess

from cairn.tools import Tool, ToolResult


class CustomExploitTool(Tool):
    name = "custom_exploit"
    description = "Exploits a specific vulnerability in target application"

    def execute(self, target: str, payload: str, **kwargs) -> ToolResult:
        # Pass arguments as a list (no shell) so the payload cannot inject shell commands
        cmd = ["python", "exploit.py", "--target", target, "--payload", payload]
        output = subprocess.run(cmd, capture_output=True, text=True)
        return ToolResult(
            success=output.returncode == 0,
            output=output.stdout,
            error=output.stderr,
        )


# Register with agent (tools and config as built in the Python API example)
tools.register(CustomExploitTool())
agent = CairnAgent(config=config, tools=tools)
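
Since the Cairn source is not yet published, the classes imported above cannot be inspected. The following self-contained sketch shows one way a `Tool` / `ToolResult` / `ToolRegistry` trio could fit together. Every class body here is an assumption made for illustration, including the `describe` and `call` methods and the toy `UpperTool`.

```python
from dataclasses import dataclass


@dataclass
class ToolResult:
    success: bool
    output: str = ""
    error: str = ""


class Tool:
    """Base class: subclasses set name/description and implement execute()."""

    name = "tool"
    description = ""

    def execute(self, **kwargs) -> ToolResult:
        raise NotImplementedError


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def describe(self) -> str:
        # Text that could be substituted into a prompt's tool list.
        return "\n".join(f"{t.name}: {t.description}" for t in self._tools.values())

    def call(self, name: str, **kwargs) -> ToolResult:
        tool = self._tools.get(name)
        if tool is None:
            # Unknown tools become a failed result instead of an exception,
            # so the agent can observe the error and pick another action.
            return ToolResult(success=False, error=f"unknown tool: {name}")
        return tool.execute(**kwargs)


class UpperTool(Tool):
    name = "upper"
    description = "Upper-cases the given text"

    def execute(self, text: str, **kwargs) -> ToolResult:
        return ToolResult(success=True, output=text.upper())
```

Returning a failed `ToolResult` for an unknown tool name, rather than raising, keeps a mistaken model choice inside the observe step of the loop.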

4. Multi-Phase Penetration Test

python
from cairn import CairnAgent, Phase
from cairn.pipeline import PentestPipeline

# agent is the CairnAgent built in the previous examples
pipeline = PentestPipeline(agent=agent)

# Define phases
pipeline.add_phase(Phase(
    name="reconnaissance",
    goal="Enumerate all open ports and services on {target}",
))
pipeline.add_phase(Phase(
    name="vulnerability_scan",
    goal="Based on discovered services, identify exploitable vulnerabilities",
    depends_on=["reconnaissance"],
))
pipeline.add_phase(Phase(
    name="exploitation",
    goal="Exploit identified vulnerabilities and achieve {objective}",
    depends_on=["vulnerability_scan"],
))

# Run full pipeline
report = pipeline.run(
    target="192.168.1.100",
    objective="obtain root shell or read /flag",
)
report.save("./reports/pentest_report.json")
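
The depends_on fields imply an execution order in which each phase runs only after its prerequisites. One plausible way to derive that order, shown here as an assumption rather than as Cairn's pipeline code, is a topological sort with the standard library's `graphlib`. The `execution_order` helper is hypothetical.

```python
from graphlib import TopologicalSorter

# Phase name -> names it depends on, mirroring the depends_on fields above.
phases = {
    "reconnaissance": [],
    "vulnerability_scan": ["reconnaissance"],
    "exploitation": ["vulnerability_scan"],
}


def execution_order(dependencies: dict[str, list[str]]) -> list[str]:
    """Return phase names so that every phase comes after its dependencies."""
    # TopologicalSorter raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(dependencies).static_order())


print(execution_order(phases))
```

For the three phases above this yields reconnaissance, then vulnerability_scan, then exploitation.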

---

Tool Integration Examples

Built-in Tool Wrappers (Expected)

python
# nmap integration
from cairn.tools.network import NmapTool

nmap = NmapTool()
result = nmap.execute(target="192.168.1.100", flags="-sV -sC -p-")
# Returns structured service enumeration data

# HTTP request tool
from cairn.tools.web import HTTPTool

http = HTTPTool()
result = http.execute(
    url="http://target.com/login",
    method="POST",
    data={"username": "admin' OR '1'='1", "password": "x"},
    follow_redirects=True,
)

# Command execution tool (sandboxed)
from cairn.tools.shell import ShellTool

shell = ShellTool(allowed_commands=["curl", "nmap", "sqlmap", "ffuf"])
result = shell.execute(command="sqlmap -u 'http://target.com/?id=1' --dbs --batch")
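
The allowed_commands argument suggests an allow-list check on the program name before anything is executed. The sketch below shows one way to implement such a check with `shlex`. `parse_allowed` is a hypothetical helper, and how Cairn actually sandboxes commands is not documented in the source.

```python
import shlex
from typing import Optional


def parse_allowed(command: str, allowed_commands: list[str]) -> Optional[list[str]]:
    """Return the argv list if the program is allow-listed, else None."""
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return None
    if not argv or argv[0] not in allowed_commands:
        return None
    return argv
```

The returned list is meant to be passed to `subprocess.run` without `shell=True`, so that characters such as `;` or `|` in an argument are never interpreted by a shell.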

---

Agent Memory and Context

python
from cairn.memory import AgentMemory

# Memory persists findings across agent steps
memory = AgentMemory(
    short_term_limit=20,     # Recent observations in context
    long_term_enabled=True,  # Summarize older context
    facts_store=True,        # Extract and index key facts
)
agent = CairnAgent(config=config, tools=tools, memory=memory)

# Access collected facts after run
for finding in agent.memory.findings:
    print(f"[{finding.severity}] {finding.description}")
    print(f"  Evidence: {finding.evidence}")
    print(f"  Recommendation: {finding.remediation}")
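
The short_term_limit setting bounds how many recent observations stay in the prompt. A bounded deque is one simple way to get that behavior. The `ShortTermMemory` class below is a hypothetical illustration, and it only counts evicted entries where a real system might summarize them.

```python
from collections import deque


class ShortTermMemory:
    """Keeps only the most recent observations; older ones are counted, not stored."""

    def __init__(self, limit: int = 20) -> None:
        self._recent: deque[str] = deque(maxlen=limit)
        self.dropped = 0

    def add(self, observation: str) -> None:
        if len(self._recent) == self._recent.maxlen:
            # The oldest entry is about to be evicted by the bounded deque;
            # a real system might summarize it instead of discarding it.
            self.dropped += 1
        self._recent.append(observation)

    def context(self) -> list[str]:
        return list(self._recent)
```

Lowering the limit shrinks the prompt at the cost of forgetting earlier observations sooner.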

---

Configuration Reference

python
# cairn/config.py (expected structure)
import os
from dataclasses import dataclass, field


@dataclass
class CairnConfig:
    # LLM settings
    model_name: str = "gpt-4o"
    api_key: str = field(default_factory=lambda: os.environ["OPENAI_API_KEY"])
    base_url: str = "https://api.openai.com/v1"
    temperature: float = 0.1       # Low temp for consistent tool use
    max_tokens: int = 4096

    # Agent behavior
    max_iterations: int = 30       # Hard stop on runaway loops
    timeout_per_step: int = 60     # Seconds per tool execution
    verbose: bool = False

    # Safety
    target_scope: list[str] = field(default_factory=list)
    dry_run: bool = False          # Plan without executing
    require_confirmation: bool = False  # Interactive approval per step

    # Output
    report_format: str = "json"    # json | markdown | html
    report_path: str = "./reports"
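
The variable names in the .env file line up with several of these fields. How Cairn maps one to the other is not documented, so the sketch below is only an assumption: `EnvSettings` and `settings_from_env` are hypothetical names, and treating TARGET_SCOPE as a comma-separated list is a guess.

```python
import os
from dataclasses import dataclass, field


@dataclass
class EnvSettings:
    """Hypothetical subset of the config, filled from environment variables."""

    model_name: str = "gpt-4o"
    max_iterations: int = 30
    timeout_per_step: int = 60
    target_scope: list[str] = field(default_factory=list)


def settings_from_env(env=None) -> EnvSettings:
    # Allow a plain dict to be passed in so the function is easy to test.
    env = os.environ if env is None else env
    # Assumption: multiple CIDRs are separated by commas.
    raw_scope = env.get("TARGET_SCOPE", "")
    scope = [cidr.strip() for cidr in raw_scope.split(",") if cidr.strip()]
    return EnvSettings(
        model_name=env.get("MODEL_NAME", "gpt-4o"),
        max_iterations=int(env.get("MAX_ITERATIONS", "30")),
        timeout_per_step=int(env.get("TIMEOUT_PER_STEP", "60")),
        target_scope=scope,
    )
```

Values that are absent from the environment fall back to the same defaults as the dataclass above.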

---

Prompt Engineering Patterns

Cairn uses structured system prompts for reliable tool invocation:
python
# Example system prompt structure (inferred from competition writeup)
SYSTEM_PROMPT = """You are an expert penetration tester AI agent.

## Objective
{goal}

## Target
{target}

## Available Tools
{tool_descriptions}

## Rules
1. Always reason step-by-step before acting
2. Stay within scope: {scope}
3. Prefer non-destructive enumeration before exploitation
4. Document every finding with evidence

## Response Format
Thought: <your reasoning>
Action: <tool_name>
Action Input: <tool parameters as JSON>

After receiving Observation, continue until you reach a Final Answer.
"""
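
A response format like this is only useful if the agent can parse it back reliably. The following sketch parses one reply in the Thought / Action / Action Input shape with a regular expression and `json`. `parse_step` and `ACTION_RE` are hypothetical names, and the exact "Final Answer:" marker is an assumption based on the prompt text.

```python
import json
import re

ACTION_RE = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*"
    r"Action:\s*(?P<action>\S+)\s*"
    r"Action Input:\s*(?P<input>\{.*\})",
    re.DOTALL,
)


def parse_step(reply: str) -> dict:
    """Parse one model reply into either a final answer or a tool call."""
    if "Final Answer:" in reply:
        return {"final": reply.split("Final Answer:", 1)[1].strip()}
    match = ACTION_RE.search(reply)
    if match is None:
        raise ValueError("reply does not follow the Thought/Action/Action Input format")
    return {
        "thought": match["thought"],
        "action": match["action"],
        "input": json.loads(match["input"]),  # raises on malformed JSON
    }
```

Raising on a malformed reply lets the caller feed the error back to the model as an observation instead of executing a half-parsed action.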

---

CTF / Challenge Mode

bash
# Optimized for CTF flag capture
python cairn.py \
  --mode ctf \
  --target "http://ctf-challenge.com:8080" \
  --goal "Find the hidden flag in format FLAG{...}" \
  --model gpt-4o \
  --max-iterations 50 \
  --verbose

# With flag pattern matching
python cairn.py \
  --mode ctf \
  --target "http://target.com" \
  --flag-pattern "CTF{[a-zA-Z0-9_]+}" \
  --auto-submit
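
Flag pattern matching amounts to searching tool output for a regular expression. The sketch below applies the pattern from the command above to a response body. `find_flag` is a hypothetical helper, and the braces are escaped here so that every regex engine treats them as literal characters.

```python
import re
from typing import Optional

# Same pattern as the --flag-pattern example, with the braces escaped.
FLAG_PATTERN = re.compile(r"CTF\{[a-zA-Z0-9_]+\}")


def find_flag(text: str, pattern: "re.Pattern[str]" = FLAG_PATTERN) -> Optional[str]:
    """Return the first flag-shaped substring in text, or None if there is none."""
    match = pattern.search(text)
    return match.group(0) if match else None
```

An agent could run a check like this on every observation and stop as soon as it returns a value.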

---

Logging and Debugging

python
import logging
from cairn import CairnAgent

# Enable detailed agent trace logging
logging.basicConfig(level=logging.DEBUG)
agent = CairnAgent(config=config, tools=tools, verbose=True)

# Each step is logged:
# [THINK] Analyzing login form for injection points...
# [ACT] Calling tool: http_request
# [INPUT] {"url": "...", "method": "POST", "data": {...}}
# [OBS] Response 200, contains "Invalid credentials"
# [THINK] Response suggests valid injection point, trying UNION...

---

Troubleshooting

| Issue | Cause | Fix |
| --- | --- | --- |
| Agent loops without progress | Goal too vague or tools failing silently | Add --max-iterations 15, use --verbose to inspect the loop |
| Tool execution timeout | Slow network or heavy scan | Increase TIMEOUT_PER_STEP in config |
| LLM refuses tool call | Safety filter on model provider | Use a less restrictive model endpoint or rephrase goal |
| Out of context window | Long agent history | Reduce short_term_limit or enable memory summarization |
| Scope violation error | Target not in allowed scope | Add target CIDR to TARGET_SCOPE in .env |
| Empty findings report | Agent completed but found nothing | Check target accessibility, increase iterations |


Responsible Use

Cairn is licensed under AGPL-3.0. Usage must comply with:
  • ✅ Authorized penetration tests with written permission
  • ✅ CTF competitions and intentionally vulnerable lab environments
  • ✅ Personal security research on systems you own
  • ❌ Unauthorized access to systems you don't own
  • ❌ Commercial use without a separate commercial license
Contact the maintainer at the repository for commercial licensing inquiries.

Resources
