itemized-functions


Overview

This skill analyzes architecture files to identify all external integrations (APIs, services, databases, etc.), then generates:
  1. Function wrappers (`function_*.py`) — Clean, tested functions that use each integration exactly as specified in the architecture
  2. Individual test files (`test_*.py`) — Comprehensive tests for each function covering success cases, failure modes, edge cases, and diverse input types
  3. Heavy API test suites (when needed) — Separate test files for integrations requiring extensive data or API calls
  4. Integration test runner (`run_all_tests.py`) — Master test orchestrator that runs all tests, collects results, and generates the final report
  5. Debug log (`integrations.debug.log`) — Complete execution trace for troubleshooting
  6. Summary report (`ITEMIZED_FUNCTIONS_REPORT.md`) — Detailed findings: function signatures, actual API responses (sanitized), latency metrics, test results, learnings
Purpose: Understand the exact behavior of every 3rd-party integration before building the full project. No assumptions. Real execution. Real data.

Activation

Trigger phrases:
  • "generate itemized functions"
  • "create integration tests"
  • "itemized functions from architecture"
  • "test all integrations"
Required input: Architecture files must be provided or referenced. The skill reads the full architecture to understand all integrations, their purpose, and how they're used.
Output: All generated files are created in an `integration_tests/` directory at the repository root.

Workflow

Phase 1: Architecture Analysis

  1. Read all provided architecture files (scan thoroughly, don't ask questions)
  2. Identify all external integrations:
    • API services (Ollama, OpenAI, Anthropic, etc.)
    • Database systems (PostgreSQL, MongoDB, Redis, etc.)
    • File processing tools (GitHub Linguist, ImageMagick, etc.)
    • Message queues, cache systems, external webhooks, etc.
  3. For each integration, extract:
    • Primary purpose (what the architecture says it's used for)
    • Specific use cases (e.g., "Ollama for tool calling" vs "Ollama for chat")
    • Expected inputs/outputs
    • Failure modes to test
    • Performance constraints or requirements
  4. Determine test complexity:
    • Standard test: ≤10 API calls, <50MB data, simple responses
    • Heavy test: >10 API calls, >50MB data, streaming responses, or complex orchestration
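The integration metadata gathered in steps 3 and 4 above can be held in a small structure that the later phases consume. A minimal sketch (the `Integration` dataclass and its field names are illustrative, not something the skill mandates):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Integration:
    """One external integration identified in the architecture files."""
    name: str                                                  # e.g. "ollama", "postgres"
    purpose: str                                               # what the architecture says it is used for
    use_cases: List[str] = field(default_factory=list)         # e.g. ["tool calling", "chat"]
    expected_inputs: List[str] = field(default_factory=list)
    expected_outputs: List[str] = field(default_factory=list)
    failure_modes: List[str] = field(default_factory=list)     # failure modes worth testing
    heavy: bool = False                                        # True if >10 API calls, >50MB data, or streaming

# Example: the analysis phase might produce entries like this
ollama = Integration(
    name="ollama",
    purpose="LLM chat and tool calling",
    use_cases=["tool calling", "chat"],
    failure_modes=["timeout", "service down", "malformed response"],
    heavy=False,
)
```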

Phase 2: Credential Setup

Generate `.env.dev` at the repository root with template entries for all required credentials:

Ollama
OLLAMA_API_URL=http://localhost:11434
OLLAMA_MODEL=neural-chat

GitHub Linguist
GITHUB_LINGUIST_PATH=/path/to/github-linguist

PostgreSQL
DB_HOST=localhost
DB_PORT=5432
DB_NAME=testdb
DB_USER=testuser
DB_PASSWORD=

[Other integrations...]

**Note:** User must fill in actual values before running tests.

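The generated wrappers read configuration with `os.getenv`, so the values from `.env.dev` must be loaded into the environment before any test runs. A minimal sketch, assuming the `python-dotenv` package (the skill does not prescribe a particular loader):

```python
# Shared test setup, e.g. at the top of run_all_tests.py or in conftest.py.
# Assumes python-dotenv is installed; any other .env loader works the same way.
from pathlib import Path
from dotenv import load_dotenv

# .env.dev lives at the repository root, one level above integration_tests/
env_file = Path(__file__).resolve().parent.parent / ".env.dev"
load_dotenv(env_file)
```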

Phase 3: Generate Function Wrappers

For each integration, create `integration_tests/function_[service].py`:
Requirements:
  • Clean, production-ready function signatures
  • Proper type hints (Python 3.8+)
  • Error handling with meaningful error messages
  • Timeout handling (appropriate per service)
  • Input validation where needed
  • Logging to `integrations.debug.log`
  • Return consistent, testable response objects
Example structure:
python
import os
import logging
from typing import Any, Dict, List
import requests
from datetime import datetime

logger = logging.getLogger(__name__)

def call_ollama_chat(prompt: str, model: str = None, temperature: float = 0.7, timeout: int = 30) -> Dict[str, Any]:
    """
    Call Ollama API for chat completion.
    
    Args:
        prompt: The user prompt
        model: Model name (uses OLLAMA_MODEL env var if not provided)
        temperature: Sampling temperature (0.0-1.0)
        timeout: Request timeout in seconds
    
    Returns:
        Dict with keys: response, model, created_at, latency_ms
    
    Raises:
        ValueError: If credentials/config missing
        requests.Timeout: If request exceeds timeout
        requests.RequestException: For API errors
    """
    try:
        start_time = datetime.now()
        
        api_url = os.getenv("OLLAMA_API_URL", "http://localhost:11434")
        model = model or os.getenv("OLLAMA_MODEL")
        
        if not model:
            raise ValueError("OLLAMA_MODEL not set in environment")
        
        response = requests.post(
            f"{api_url}/api/chat",
            json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "temperature": temperature,
                "stream": False
            },
            timeout=timeout
        )
        response.raise_for_status()
        
        latency_ms = (datetime.now() - start_time).total_seconds() * 1000
        result = response.json()
        result["latency_ms"] = latency_ms
        
        logger.debug(f"Ollama chat call successful. Latency: {latency_ms}ms")
        return result
        
    except requests.Timeout:
        logger.error(f"Ollama API timeout after {timeout}s")
        raise
    except requests.RequestException as e:
        logger.error(f"Ollama API error: {str(e)}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error calling Ollama: {str(e)}")
        raise

Phase 4: Generate Test Files

For each function, create `integration_tests/test_[service].py`:
Requirements:
  • Use pytest framework
  • Test success cases with typical inputs
  • Test success cases with diverse/edge case inputs (smart per-service)
  • Test failure modes: auth failure, timeout, malformed response, rate limiting, service down
  • Capture actual API responses (sanitize credentials)
  • Measure latency
  • Each test logs to `integrations.debug.log`
  • Each test validates response structure
Example structure:
python
import pytest
import os
from unittest.mock import patch, MagicMock
import requests
from function_ollama import call_ollama_chat

@pytest.fixture
def setup_env(monkeypatch):
    """Setup environment variables for testing."""
    monkeypatch.setenv("OLLAMA_API_URL", "http://localhost:11434")
    monkeypatch.setenv("OLLAMA_MODEL", "neural-chat")

class TestOllamaChatSuccess:
    """Test successful Ollama chat calls."""
    
    def test_basic_chat(self, setup_env):
        """Test basic chat completion."""
        response = call_ollama_chat("What is 2+2?")
        assert "response" in response
        assert response["model"] == "neural-chat"
        assert "latency_ms" in response
        assert response["latency_ms"] > 0
    
    def test_chat_with_temperature(self, setup_env):
        """Test chat with different temperature values."""
        for temp in [0.0, 0.5, 1.0]:
            response = call_ollama_chat("Tell a story", temperature=temp)
            assert "response" in response
            assert response["latency_ms"] > 0
    
    def test_long_prompt(self, setup_env):
        """Test with very long prompt."""
        long_prompt = "What is the meaning of life? " * 100
        response = call_ollama_chat(long_prompt)
        assert "response" in response

class TestOllamaChatFailures:
    """Test failure modes."""
    
    def test_auth_failure(self, setup_env, monkeypatch):
        """Test behavior when API authentication fails."""
        monkeypatch.setenv("OLLAMA_API_URL", "http://invalid-url:11434")
        with pytest.raises(requests.RequestException):
            call_ollama_chat("test")
    
    def test_timeout(self, setup_env, monkeypatch):
        """Test timeout handling."""
        with patch('requests.post') as mock_post:
            mock_post.side_effect = requests.Timeout()
            with pytest.raises(requests.Timeout):
                call_ollama_chat("test", timeout=1)
    
    def test_missing_credentials(self, monkeypatch):
        """Test when required env vars are missing."""
        monkeypatch.delenv("OLLAMA_MODEL", raising=False)
        with pytest.raises(ValueError):
            call_ollama_chat("test")

Phase 5: Heavy API Test Suites (When Applicable)

If an integration qualifies as "heavy" (>10 API calls, >50MB data, streaming, complex orchestration), create a separate `integration_tests/heavy_test_[service].py`:
Include in heavy tests:
  • Large data processing (if applicable)
  • Streaming response handling
  • Multiple chained API calls
  • Performance benchmarks
  • Resource usage patterns
  • Rate limiting behavior
Header with reasoning:
python
"""
Heavy API test suite for [service].

Reasoning:
- [Service] requires extensive testing due to [specific reason]:
  - Streaming responses with large payloads
  - Multiple chained API calls (25+ total)
  - Data processing >50MB
  - Complex state management across calls
  - Critical performance path in architecture

These tests are separated from standard tests to avoid:
- Excessive API quota usage during CI/CD
- Extended test execution time
- Unnecessary load on rate-limited endpoints
"""

Phase 6: Integration Test Runner

Create `integration_tests/run_all_tests.py`:
Responsibilities:
  1. Import and run all test files using pytest
  2. Collect results: passed, failed, skipped
  3. For each test, capture:
    • Test name
    • Status (passed/failed/skipped)
    • Execution time
    • Error message (if failed)
    • Sample response data (if success)
  4. Aggregate latency metrics per service
  5. Generate summary statistics
  6. Write all results to `ITEMIZED_FUNCTIONS_REPORT.md`
Example output structure:
Test Results Summary:
- Total: 42 tests
- Passed: 38
- Failed: 2
- Skipped: 2

Service Latency Metrics:
- Ollama Chat: avg 145ms, min 89ms, max 287ms (10 calls)
- GitHub Linguist: avg 234ms, min 156ms, max 412ms (8 calls)
- PostgreSQL: avg 12ms, min 8ms, max 31ms (10 calls)

[Detailed results for each service...]
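A minimal sketch of how the runner can drive pytest programmatically and collect per-test outcomes (the `ResultCollector` plugin and the report-writing details are illustrative; the generated runner will typically record richer data, such as latency per service and sample responses):

```python
# run_all_tests.py -- minimal sketch of the orchestration loop
from pathlib import Path
import pytest

class ResultCollector:
    """pytest plugin that records the outcome and duration of every test."""
    def __init__(self):
        self.results = []

    def pytest_runtest_logreport(self, report):
        # "call" is the phase where the test body actually runs; setup-phase skips are counted too
        if report.when == "call" or (report.when == "setup" and report.skipped):
            self.results.append({
                "test": report.nodeid,
                "status": report.outcome,      # passed / failed / skipped
                "duration_s": report.duration,
            })

def main() -> int:
    collector = ResultCollector()
    exit_code = pytest.main(["-q", str(Path(__file__).parent)], plugins=[collector])

    passed = sum(r["status"] == "passed" for r in collector.results)
    failed = sum(r["status"] == "failed" for r in collector.results)
    skipped = sum(r["status"] == "skipped" for r in collector.results)

    report_lines = [
        "# Itemized Functions Report",
        "",
        "Test Results Summary:",
        f"- Total: {len(collector.results)} tests",
        f"- Passed: {passed}",
        f"- Failed: {failed}",
        f"- Skipped: {skipped}",
    ]
    Path(__file__).parent.joinpath("ITEMIZED_FUNCTIONS_REPORT.md").write_text(
        "\n".join(report_lines), encoding="utf-8"
    )
    return exit_code

if __name__ == "__main__":
    raise SystemExit(main())
```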

Phase 7: Debug Logging

All generated code writes to `integration_tests/integrations.debug.log`:
  • Timestamp, log level, service name, message
  • Request/response bodies (sanitize credentials)
  • Timing information
  • Error stack traces
  • Environment info (for debugging credential/setup issues)
Format:
[2024-01-15 14:32:15.342] DEBUG [ollama] Calling /api/chat with model=neural-chat
[2024-01-15 14:32:15.521] DEBUG [ollama] Response received: 145ms latency, 1250 chars
[2024-01-15 14:32:16.012] ERROR [github-linguist] FAILED_TO_TEST - Connection refused (auth_required, network_error, timeout, api_error, etc.)
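The wrappers only obtain a logger via `logging.getLogger(__name__)`, so a shared setup step has to attach the file handler that produces the format above. A minimal sketch (the module name and exact formatter string are assumptions):

```python
# logging_setup.py -- shared by the function wrappers and the test runner (name is illustrative)
import logging
from pathlib import Path

def configure_debug_log() -> None:
    """Send all DEBUG-and-above records to integration_tests/integrations.debug.log."""
    log_path = Path(__file__).parent / "integrations.debug.log"
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter(
        # Approximates the documented format: [timestamp] LEVEL [logger/service] message
        fmt="[%(asctime)s.%(msecs)03d] %(levelname)s [%(name)s] %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.addHandler(handler)
```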

Phase 8: Summary Report

Create `ITEMIZED_FUNCTIONS_REPORT.md` with the following structure:

Itemized Functions Report

Generated: [timestamp]
Architecture Analyzed: [list of architecture files]
Total Integrations Tested: [count]
Test Success Rate: [X%]

Executive Summary

  • [count] integrations identified and tested
  • [X] tests passed, [Y] failed, [Z] skipped
  • Key findings and blockers (if any)

Integration Details

[Service Name] (e.g., Ollama)

Purpose (from architecture): [extracted from architecture]
Function Signature:
```python
def call_ollama_chat(prompt: str, model: str = None, temperature: float = 0.7, timeout: int = 30) -> Dict[str, Any]
```
Test Coverage: [count tests, all passed/mixed/failed]
Latency: avg X ms, min Y ms, max Z ms (10 calls)
Sample API Response (sanitized):
```json
{
  "response": "2 + 2 = 4",
  "model": "neural-chat",
  "created_at": "2024-01-15T14:32:15Z",
  "latency_ms": 145
}
```
Failure Modes Tested:
  • ✓ Timeout (handled correctly, raises Timeout exception)
  • ✓ Auth failure (handled correctly, raises RequestException)
  • ✓ Malformed response (handled correctly, raises JSONDecodeError)
  • ✓ Service unavailable (raises ConnectionError)
Key Learnings:
Heavy Tests: None (or if applicable: `heavy_test_ollama.py` — [reason])

[Next Service...]

[Same structure as above]

Failed Tests & Blockers

[Service Name] - FAILED_TO_TEST

Reason: [auth_required, network_error, timeout, api_error, service_down, etc.]
Error Message: [exact error]
Suggestion: [how to resolve, e.g., "Set OLLAMA_API_URL in .env.dev and ensure Ollama service is running"]

Cross-Service Insights

[Any patterns, dependencies, or interactions discovered across integrations]

Recommendations

  • [Any critical issues or setup requirements]
  • [Performance or scaling considerations]
  • [Dependencies between services]

Test Execution Log

[Link to or excerpt from integrations.debug.log]

Standards & Requirements

Code Generation

  • Python 3.8+ compatible — Type hints, f-strings, async/await support if needed
  • Error messages are meaningful — Not generic "error occurred", but "OLLAMA_API_URL not set" or "Connection refused on localhost:11434"
  • Timeouts are sensible per service — LLM APIs: 60s, Database: 10s, File processing: 30s, etc.
  • No hardcoded values — All config from environment
  • Logging is comprehensive — Every significant action logged

Testing

  • Pytest conventions — `test_*.py` files, fixtures, clear test names, assertions with messages
  • Real execution — Actually call the APIs (not mocked), unless explicitly impossible
  • Diverse inputs per service — Smart, not generic:
    • LLM APIs: different prompt lengths, temperatures, contexts
    • File processing: different file types/sizes
    • Databases: different query patterns, edge cases
    • Time-series: different time ranges, aggregations
  • Failure mode coverage — Auth, network, timeout, rate limit, malformed data
  • Response validation — Verify structure, types, required fields
  • Latency tracking — Measure every call
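As an illustration of the "diverse inputs per service" point above, generated tests can use `pytest.mark.parametrize` to cover several input classes in one test (the prompts below are examples only, and the `setup_env` fixture is the one from the Phase 4 example):

```python
import pytest
from function_ollama import call_ollama_chat

@pytest.mark.parametrize("prompt", [
    "What is 2+2?",                                 # short, factual
    "Explain HTTP keep-alive in two sentences.",    # technical
    "日本の首都はどこですか?",                       # non-ASCII input
    "Tell a story. " * 200,                         # very long prompt
])
def test_diverse_prompts(setup_env, prompt):
    """Each prompt class should still yield a structurally valid response."""
    response = call_ollama_chat(prompt)
    assert "response" in response
    assert response["latency_ms"] > 0
```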

Report Generation

  • No credentials exposed — Sanitize all env vars, passwords, tokens, API keys in report
  • Full response bodies — Show actual data (sanitized) so developer understands exact format
  • Timestamps throughout — When was this run, when was each test executed
  • Actionable findings — Not just "failed" but "why" and "how to fix"
  • Report all failures and anomalies — Do not omit unexpected behavior because it seems minor. Every quirk discovered now prevents a production incident later
  • Output goes under the user's name — The report becomes their reference document. Incomplete or softened findings waste the testing effort
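A minimal sketch of the kind of sanitization pass applied before report content is written (the helper name and patterns are illustrative; generated code should mask every credential-bearing environment variable it knows about):

```python
import os
import re

# Env vars whose names look credential-like have their values masked wherever they appear
_SENSITIVE = re.compile(r"(key|token|secret|password|passwd)", re.IGNORECASE)

def sanitize(text: str) -> str:
    """Replace credential values from the environment with a redaction marker."""
    for name, value in os.environ.items():
        if value and _SENSITIVE.search(name):
            text = text.replace(value, "[REDACTED]")
    # Also catch obvious inline secrets such as "Authorization: Bearer <token>"
    text = re.sub(r"(?i)(bearer\s+)[A-Za-z0-9._\-]+", r"\1[REDACTED]", text)
    return text
```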

Special Cases

Streaming APIs (e.g., LLM chat with stream=true):
  • Test both streamed and non-streamed responses
  • Measure latency from first token to completion
  • Validate chunk format
Databases:
  • Test CRUD operations
  • Test connection pooling if applicable
  • Test transaction handling
  • Test query timeouts
File Processing APIs:
  • Test with multiple file types
  • Test error handling for unsupported formats
  • Test large file handling
Rate-Limited APIs:
  • Detect rate limit headers
  • Test behavior when rate limited
  • Suggest backoff strategies in report
Webhook/Async APIs:
  • If applicable, test callback handling
  • Test idempotency if needed
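For the rate-limited case above, a wrapper can surface whatever throttling information the service exposes so the report can suggest a backoff strategy. A minimal sketch (`X-RateLimit-*` and `Retry-After` are common conventions, not guaranteed to exist for every API):

```python
from typing import Any, Dict
import requests

def extract_rate_limit_info(response: requests.Response) -> Dict[str, Any]:
    """Pull common throttling headers out of a response, if the service sends them."""
    headers = response.headers
    return {
        "limit": headers.get("X-RateLimit-Limit"),
        "remaining": headers.get("X-RateLimit-Remaining"),
        "reset": headers.get("X-RateLimit-Reset"),
        "retry_after_s": headers.get("Retry-After"),
        "throttled": response.status_code == 429,
    }
```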

Directory Structure (Final)

integration_tests/
├── .env.dev                          # Template credentials file
├── integrations.debug.log            # Debug log from test execution
├── ITEMIZED_FUNCTIONS_REPORT.md      # Final summary report
├── run_all_tests.py                  # Master test runner
├── function_ollama.py                # Function wrapper
├── test_ollama.py                    # Standard tests
├── heavy_test_ollama.py              # (if needed) Heavy tests
├── function_github_linguist.py       # Another wrapper
├── test_github_linguist.py           # Standard tests
├── function_postgres.py              # Another wrapper
├── test_postgres.py                  # Standard tests
└── [More function/test pairs...]

Execution

  1. User provides architecture files and triggers the skill
  2. Skill generates all files in `integration_tests/`
  3. User fills in `.env.dev` with actual credentials
  4. User runs: `python run_all_tests.py`
  5. All tests execute and the report is generated
  6. Developer reviews `ITEMIZED_FUNCTIONS_REPORT.md` and `integrations.debug.log`

Quality Assurance

  • No questions asked — Read architecture, infer integrations, generate tests
  • LLM confidence in test design — Trust that generated tests cover the right scenarios
  • Sanitization is rigorous — Scan all report content for credentials before writing
  • File encoding is UTF-8 — Handle responses with special characters correctly
  • All files importable — Generated Python is syntactically correct and runs