# itemized-functions
## Overview
This skill analyzes architecture files to identify all external integrations (APIs, services, databases, etc.), then generates:

- **Function wrappers (`function_*.py`)** — Clean, tested functions that use each integration exactly as specified in the architecture
- **Individual test files (`test_*.py`)** — Comprehensive tests for each function covering success cases, failure modes, edge cases, and diverse input types
- **Heavy API test suites (when needed)** — Separate test files for integrations requiring extensive data or API calls
- **Integration test runner (`run_all_tests.py`)** — Master test orchestrator that runs all tests, collects results, and generates the final report
- **Debug log (`integrations.debug.log`)** — Complete execution trace for troubleshooting
- **Summary report (`ITEMIZED_FUNCTIONS_REPORT.md`)** — Detailed findings: function signatures, actual API responses (sanitized), latency metrics, test results, learnings

**Purpose:** Understand the exact behavior of every 3rd-party integration before building the full project. No assumptions. Real execution. Real data.
## Activation
Trigger phrases:
- "generate itemized functions"
- "create integration tests"
- "itemized functions from architecture"
- "test all integrations"
Required input: Architecture files must be provided or referenced. The skill reads the full architecture to understand all integrations, their purpose, and how they're used.

Output: All generated files are created in an `integration_tests/` directory at the repository root.

## Workflow
### Phase 1: Architecture Analysis
- Read all provided architecture files (scan thoroughly, don't ask questions)
- Identify all external integrations:
  - API services (Ollama, OpenAI, Anthropic, etc.)
  - Database systems (PostgreSQL, MongoDB, Redis, etc.)
  - File processing tools (GitHub Linguist, ImageMagick, etc.)
  - Message queues, cache systems, external webhooks, etc.
- For each integration, extract:
  - Primary purpose (what the architecture says it's used for)
  - Specific use cases (e.g., "Ollama for tool calling" vs "Ollama for chat")
  - Expected inputs/outputs
  - Failure modes to test
  - Performance constraints or requirements
- Determine test complexity (see the sketch after this list):
  - Standard test: ≤10 API calls, <50MB data, simple responses
  - Heavy test: >10 API calls, >50MB data, streaming responses, or complex orchestration
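A minimal sketch of how this standard/heavy split might be encoded. The `IntegrationSpec` record and its field names are illustrative, not part of the generated output; only the thresholds come from this spec:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class IntegrationSpec:
    """Hypothetical record of what Phase 1 extracts per integration."""
    name: str
    purpose: str
    expected_api_calls: int = 0        # estimated calls needed to test it
    expected_data_mb: float = 0.0      # estimated data volume
    streaming: bool = False
    complex_orchestration: bool = False
    failure_modes: List[str] = field(default_factory=list)

def is_heavy(spec: IntegrationSpec) -> bool:
    """Apply the thresholds above: >10 calls, >50MB, streaming, or orchestration."""
    return (
        spec.expected_api_calls > 10
        or spec.expected_data_mb > 50
        or spec.streaming
        or spec.complex_orchestration
    )
```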
### Phase 2: Credential Setup
Generate `.env.dev` at the repository root with template entries for all required credentials:

**Ollama**

```
OLLAMA_API_URL=http://localhost:11434
OLLAMA_MODEL=neural-chat
```

**GitHub Linguist**

```
GITHUB_LINGUIST_PATH=/path/to/github-linguist
```

**PostgreSQL**

```
DB_HOST=localhost
DB_PORT=5432
DB_NAME=testdb
DB_USER=testuser
DB_PASSWORD=
```

[Other integrations...]
**Note:** User must fill in actual values before running tests.
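One plausible way for the generated code to pick up these values is python-dotenv; this is a sketch assuming that package is installed, not a hard dependency of the skill (the wrappers themselves only rely on `os.getenv`):

```python
# Sketch: load the template at the repository root before running tests.
from dotenv import load_dotenv  # pip install python-dotenv
import os

load_dotenv(".env.dev")
print(os.getenv("OLLAMA_API_URL"))  # -> http://localhost:11434 once filled in
```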
### Phase 3: Generate Function Wrappers
For each integration, create `integration_tests/function_[service].py`.

Requirements:
- Clean, production-ready function signatures
- Proper type hints (Python 3.8+)
- Error handling with meaningful error messages
- Timeout handling (appropriate per service)
- Input validation where needed
- Logging to `integrations.debug.log`
- Return consistent, testable response objects

Example structure:
```python
import os
import logging
from typing import Any, Dict, Optional

import requests
from datetime import datetime

logger = logging.getLogger(__name__)


def call_ollama_chat(prompt: str, model: Optional[str] = None, temperature: float = 0.7, timeout: int = 30) -> Dict[str, Any]:
    """
    Call Ollama API for chat completion.

    Args:
        prompt: The user prompt
        model: Model name (uses OLLAMA_MODEL env var if not provided)
        temperature: Sampling temperature (0.0-1.0)
        timeout: Request timeout in seconds

    Returns:
        Dict with keys: response, model, created_at, latency_ms

    Raises:
        ValueError: If credentials/config missing
        requests.Timeout: If request exceeds timeout
        requests.RequestException: For API errors
    """
    try:
        start_time = datetime.now()
        api_url = os.getenv("OLLAMA_API_URL", "http://localhost:11434")
        model = model or os.getenv("OLLAMA_MODEL")
        if not model:
            raise ValueError("OLLAMA_MODEL not set in environment")

        response = requests.post(
            f"{api_url}/api/chat",
            json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "temperature": temperature,
                "stream": False,
            },
            timeout=timeout,
        )
        response.raise_for_status()

        latency_ms = (datetime.now() - start_time).total_seconds() * 1000
        result = response.json()
        result["latency_ms"] = latency_ms
        logger.debug(f"Ollama chat call successful. Latency: {latency_ms}ms")
        return result
    except requests.Timeout:
        logger.error(f"Ollama API timeout after {timeout}s")
        raise
    except requests.RequestException as e:
        logger.error(f"Ollama API error: {str(e)}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error calling Ollama: {str(e)}")
        raise
```
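Assuming a local Ollama instance with the configured model pulled and `.env.dev` loaded, a smoke call might look like this (illustrative, not one of the generated files):

```python
# Quick manual check of the wrapper; the full payload goes to the debug log.
result = call_ollama_chat("What is 2+2?", temperature=0.2)
print(f"{result['latency_ms']:.0f}ms")  # e.g. 145ms
```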
### Phase 4: Generate Test Files
For each function, create `integration_tests/test_[service].py`.

Requirements:
- Use pytest framework
- Test success cases with typical inputs
- Test success cases with diverse/edge case inputs (smart per-service)
- Test failure modes: auth failure, timeout, malformed response, rate limiting, service down
- Capture actual API responses (sanitize credentials)
- Measure latency
- Each test logs to `integrations.debug.log`
- Each test validates response structure

Example structure:
```python
import pytest
import os
from unittest.mock import patch, MagicMock

import requests

from function_ollama import call_ollama_chat


@pytest.fixture
def setup_env(monkeypatch):
    """Setup environment variables for testing."""
    monkeypatch.setenv("OLLAMA_API_URL", "http://localhost:11434")
    monkeypatch.setenv("OLLAMA_MODEL", "neural-chat")


class TestOllamaChatSuccess:
    """Test successful Ollama chat calls."""

    def test_basic_chat(self, setup_env):
        """Test basic chat completion."""
        response = call_ollama_chat("What is 2+2?")
        assert "response" in response
        assert response["model"] == "neural-chat"
        assert "latency_ms" in response
        assert response["latency_ms"] > 0

    def test_chat_with_temperature(self, setup_env):
        """Test chat with different temperature values."""
        for temp in [0.0, 0.5, 1.0]:
            response = call_ollama_chat("Tell a story", temperature=temp)
            assert "response" in response
            assert response["latency_ms"] > 0

    def test_long_prompt(self, setup_env):
        """Test with very long prompt."""
        long_prompt = "What is the meaning of life? " * 100
        response = call_ollama_chat(long_prompt)
        assert "response" in response


class TestOllamaChatFailures:
    """Test failure modes."""

    def test_auth_failure(self, setup_env, monkeypatch):
        """Test behavior when API authentication fails."""
        monkeypatch.setenv("OLLAMA_API_URL", "http://invalid-url:11434")
        with pytest.raises(requests.RequestException):
            call_ollama_chat("test")

    def test_timeout(self, setup_env, monkeypatch):
        """Test timeout handling."""
        with patch("requests.post") as mock_post:
            mock_post.side_effect = requests.Timeout()
            with pytest.raises(requests.Timeout):
                call_ollama_chat("test", timeout=1)

    def test_missing_credentials(self, monkeypatch):
        """Test when required env vars are missing."""
        monkeypatch.delenv("OLLAMA_MODEL", raising=False)
        with pytest.raises(ValueError):
            call_ollama_chat("test")
```
### Phase 5: Heavy API Test Suites (When Applicable)
If an integration qualifies as "heavy" (>10 API calls, >50MB data, streaming, complex orchestration), create a separate `integration_tests/heavy_test_[service].py`.

Include in heavy tests:
- Large data processing (if applicable)
- Streaming response handling
- Multiple chained API calls
- Performance benchmarks
- Resource usage patterns
- Rate limiting behavior

Header with reasoning:
```python
"""
Heavy API test suite for [service].

Reasoning:
- [Service] requires extensive testing due to [specific reason]:
  - Streaming responses with large payloads
  - Multiple chained API calls (25+ total)
  - Data processing >50MB
  - Complex state management across calls
  - Critical performance path in architecture

These tests are separated from standard tests to avoid:
- Excessive API quota usage during CI/CD
- Extended test execution time
- Unnecessary load on rate-limited endpoints
"""
```
### Phase 6: Integration Test Runner
Create `integration_tests/run_all_tests.py`.

Responsibilities:
- Import and run all test files using pytest
- Collect results: passed, failed, skipped
- For each test, capture:
  - Test name
  - Status (passed/failed/skipped)
  - Execution time
  - Error message (if failed)
  - Sample response data (if success)
- Aggregate latency metrics per service
- Generate summary statistics
- Write all to `ITEMIZED_FUNCTIONS_REPORT.md`
Example output structure:

```
Test Results Summary:
- Total: 42 tests
- Passed: 38
- Failed: 2
- Skipped: 2

Service Latency Metrics:
- Ollama Chat: avg 145ms, min 89ms, max 287ms (10 calls)
- GitHub Linguist: avg 234ms, min 156ms, max 412ms (8 calls)
- PostgreSQL: avg 12ms, min 8ms, max 31ms (10 calls)

[Detailed results for each service...]
```
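A minimal sketch of the runner's core, assuming pytest is installed; collecting per-test outcomes via a small plugin is one option, not the only one:

```python
# Sketch of a possible run_all_tests.py core loop.
import pytest

class ResultCollector:
    """Minimal pytest plugin: record outcome and duration per test."""
    def __init__(self):
        self.results = []

    def pytest_runtest_logreport(self, report):
        # Record each test once: its call phase, or the setup phase for skips.
        if report.when == "call" or report.outcome == "skipped":
            self.results.append((report.nodeid, report.outcome, report.duration))

collector = ResultCollector()
exit_code = pytest.main(["-q", "integration_tests"], plugins=[collector])
for nodeid, outcome, duration in collector.results:
    print(f"{outcome:8s} {duration * 1000:7.1f}ms {nodeid}")
```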
### Phase 7: Debug Logging
All generated code writes to `integration_tests/integrations.debug.log`:
- Timestamp, log level, service name, message
- Request/response bodies (sanitize credentials)
- Timing information
- Error stack traces
- Environment info (for debugging credential/setup issues)

Format:

```
[2024-01-15 14:32:15.342] DEBUG [ollama] Calling /api/chat with model=neural-chat
[2024-01-15 14:32:15.521] DEBUG [ollama] Response received: 145ms latency, 1250 chars
[2024-01-15 14:32:16.012] ERROR [github-linguist] FAILED_TO_TEST - Connection refused (auth_required, network_error, timeout, api_error, etc.)
```
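One `logging` configuration that yields this line format; the handler wiring is an assumption, only the target file name comes from this spec:

```python
# Sketch: wire the root logger to the debug log in the documented format.
import logging

handler = logging.FileHandler("integration_tests/integrations.debug.log", encoding="utf-8")
handler.setFormatter(logging.Formatter(
    fmt="[%(asctime)s.%(msecs)03d] %(levelname)s [%(name)s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
))
logging.getLogger().addHandler(handler)
logging.getLogger().setLevel(logging.DEBUG)

# Produces: [2024-01-15 14:32:15.342] DEBUG [ollama] Calling /api/chat ...
logging.getLogger("ollama").debug("Calling /api/chat with model=neural-chat")
```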
### Phase 8: Summary Report
Create `ITEMIZED_FUNCTIONS_REPORT.md`.

Structure:

````markdown
# Itemized Functions Report

Generated: [timestamp]
Architecture Analyzed: [list of architecture files]
Total Integrations Tested: [count]
Test Success Rate: [X%]

## Executive Summary

- [count] integrations identified and tested
- [X] tests passed, [Y] failed, [Z] skipped
- Key findings and blockers (if any)

## Integration Details

### [Service Name] (e.g., Ollama)

Purpose (from architecture): [extracted from architecture]

Function Signature:
```python
def call_ollama_chat(prompt: str, model: Optional[str] = None, temperature: float = 0.7, timeout: int = 30) -> Dict[str, Any]
```

Test Coverage: [count tests, all passed/mixed/failed]

Latency: avg X ms, min Y ms, max Z ms (10 calls)

Sample API Response (sanitized):
```json
{
  "response": "2 + 2 = 4",
  "model": "neural-chat",
  "created_at": "2024-01-15T14:32:15Z",
  "latency_ms": 145
}
```

Failure Modes Tested:
- ✓ Timeout (handled correctly, raises Timeout exception)
- ✓ Auth failure (handled correctly, raises RequestException)
- ✓ Malformed response (handled correctly, raises JSONDecodeError)
- ✓ Service unavailable (raises ConnectionError)

Key Learnings:

Heavy Tests: None (or if applicable: `heavy_test_ollama.py` — [reason])

### [Next Service...]

[Same structure as above]

## Failed Tests & Blockers

### [Service Name] - FAILED_TO_TEST

Reason: [auth_required, network_error, timeout, api_error, service_down, etc.]
Error Message: [exact error]
Suggestion: [how to resolve, e.g., "Set OLLAMA_API_URL in .env.dev and ensure Ollama service is running"]

## Cross-Service Insights

[Any patterns, dependencies, or interactions discovered across integrations]

## Recommendations

- [Any critical issues or setup requirements]
- [Performance or scaling considerations]
- [Dependencies between services]

## Test Execution Log

[Link to or excerpt from integrations.debug.log]
````

## Standards & Requirements
### Code Generation
- Python 3.8+ compatible — Type hints, f-strings, async/await support if needed
- Error messages are meaningful — Not generic "error occurred", but "OLLAMA_API_URL not set" or "Connection refused on localhost:11434"
- Timeouts are sensible per service — LLM APIs: 60s, Database: 10s, File processing: 30s, etc.
- No hardcoded values — All config from environment
- Logging is comprehensive — Every significant action logged
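One way the generated wrappers could encode the per-service timeout rule above; the mapping keys are illustrative, only the numbers come from this spec:

```python
# Illustrative defaults mirroring the guidance above; keys are assumptions.
DEFAULT_TIMEOUTS_S = {
    "llm_api": 60,          # LLM APIs: generation can be slow
    "database": 10,         # Databases: fail fast on connectivity issues
    "file_processing": 30,  # File processing tools
}

def timeout_for(service_kind: str) -> int:
    """Return the per-service timeout, falling back to a conservative 30s."""
    return DEFAULT_TIMEOUTS_S.get(service_kind, 30)
```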
### Testing
- Pytest conventions — `test_*.py` files, fixtures, clear test names, assertions with messages
- Real execution — Actually call the APIs (not mocked), unless explicitly impossible
- Diverse inputs per service — Smart, not generic:
  - LLM APIs: different prompt lengths, temperatures, contexts
  - File processing: different file types/sizes
  - Databases: different query patterns, edge cases
  - Time-series: different time ranges, aggregations
- Failure mode coverage — Auth, network, timeout, rate limit, malformed data
- Response validation — Verify structure, types, required fields
- Latency tracking — Measure every call
### Report Generation
- No credentials exposed — Sanitize all env vars, passwords, tokens, API keys in report
- Full response bodies — Show actual data (sanitized) so developer understands exact format
- Timestamps throughout — When was this run, when was each test executed
- Actionable findings — Not just "failed" but "why" and "how to fix"
- Report all failures and anomalies — Do not omit unexpected behavior because it seems minor; every quirk discovered now prevents a production incident later
- The output belongs to the user — The report becomes their reference document; incomplete or softened findings waste the testing effort
## Special Cases
Streaming APIs (e.g., LLM chat with stream=true):
- Test both streamed and non-streamed responses
- Measure latency from first token to completion
- Validate chunk format
Databases:
- Test CRUD operations
- Test connection pooling if applicable
- Test transaction handling
- Test query timeouts
File Processing APIs:
- Test with multiple file types
- Test error handling for unsupported formats
- Test large file handling
Rate-Limited APIs:
- Detect rate limit headers
- Test behavior when rate limited
- Suggest backoff strategies in report (a sketch follows this section)
Webhook/Async APIs:
- If applicable, test callback handling
- Test idempotency if needed
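For the rate-limited case above, one conventional backoff shape (an illustration, not something this skill prescribes; assumes the callable returns a requests-style response with `status_code` and `headers`):

```python
import random
import time

def with_backoff(call, max_retries: int = 5):
    """Retry `call` with exponential backoff and jitter on HTTP 429."""
    for attempt in range(max_retries):
        response = call()
        if response.status_code != 429:
            return response
        # Honor Retry-After when the server provides it; otherwise back off.
        delay = float(response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay + random.uniform(0, 0.5))
    raise RuntimeError("rate limit not lifted after retries")
```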
## Directory Structure (Final)
```
integration_tests/
├── .env.dev                      # Template credentials file
├── integrations.debug.log        # Debug log from test execution
├── ITEMIZED_FUNCTIONS_REPORT.md  # Final summary report
├── run_all_tests.py              # Master test runner
├── function_ollama.py            # Function wrapper
├── test_ollama.py                # Standard tests
├── heavy_test_ollama.py          # (if needed) Heavy tests
├── function_github_linguist.py   # Another wrapper
├── test_github_linguist.py       # Standard tests
├── function_postgres.py          # Another wrapper
├── test_postgres.py              # Standard tests
└── [More function/test pairs...]
```
## Execution
- User provides architecture files and triggers skill
- Skill generates all files in `integration_tests/`
- User fills in `.env.dev` with actual credentials
- User runs: `python run_all_tests.py`
- All tests execute, report generated
- Developer reviews `ITEMIZED_FUNCTIONS_REPORT.md` and `integrations.debug.log`
## Quality Assurance
- No questions asked — Read architecture, infer integrations, generate tests
- LLM confidence in test design — Trust that generated tests cover the right scenarios
- Sanitization is rigorous — Scan all report content for credentials before writing
- File encoding is UTF-8 — Handle responses with special characters correctly
- All files importable — Generated Python is syntactically correct and runs
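A sketch of what rigorous sanitization might look like before report content is written; the heuristic of masking env values whose names contain secret-like substrings is an assumption, not the mandated mechanism:

```python
# Illustrative sanitizer: mask values of known secret-bearing env vars.
import os

SECRET_HINTS = ("KEY", "TOKEN", "PASSWORD", "SECRET")

def sanitize(text: str) -> str:
    """Replace any secret env var value appearing in text with a placeholder."""
    for name, value in os.environ.items():
        if value and any(hint in name.upper() for hint in SECRET_HINTS):
            text = text.replace(value, f"<{name}:redacted>")
    return text
```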