software-developer


Software Developer Skill


Purpose


Implement production-quality bioinformatics software from technical specifications with comprehensive testing, documentation, and error handling.

When to Use This Skill


Use this skill when you need to:
  • Implement software from architecture specification
  • Write production-ready code (not exploratory analysis)
  • Create command-line tools or packages
  • Build reusable libraries
  • Ensure code quality through testing

Workflow Integration


Pattern: Receive Spec → Implement → Test → Document → Deliver
Systems Architect provides technical spec
Software Developer implements
    ↓  (Copilot reviews continuously)
Biologist Commentator validates biological correctness
Production-ready software

Archival Compliance


Before writing any output file:
  1. Check whether archival context was provided via handoff from an orchestrator:
    • If yes: use the provided archival_context block directly
    • If archival_context is "skip": bypass all compliance checks
  2. If no handoff context: check for .archive-metadata.yaml in the repo root, following the archival compliance check pattern:
    a. Read the reference document: ~/.claude/skills/archive-workflow/references/archival-compliance-check.md
    b. If the file is not found, degrade gracefully (log a warning and proceed without the archival check)
    c. Apply the 5-step pattern to all file creation operations
  3. Before writing output, validate the path against the guidelines
  4. On violation: if invoked standalone, present advisory options; if invoked via the Task tool (as a sub-agent), apply archival guidelines silently

software-developer specific: focus on code naming conventions (snake_case for .py files) and directory structure (src/, tests/) validation.
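The handoff-or-fallback logic above can be sketched as a small helper. The file name .archive-metadata.yaml comes from the steps above, but the function name `load_archival_context` and its return shape are hypothetical, not part of the skill's spec:

```python
from pathlib import Path
import logging

logger = logging.getLogger(__name__)


def load_archival_context(handoff_context=None, repo_root="."):
    """Resolve archival context following the steps above.

    A handoff_context of "skip" bypasses all compliance checks; a provided
    dict is used directly; otherwise we fall back to .archive-metadata.yaml
    in the repo root, degrading gracefully when it is absent.
    """
    if handoff_context == "skip":
        return None  # bypass all compliance checks
    if handoff_context is not None:
        return handoff_context  # use the provided archival_context block directly
    metadata = Path(repo_root) / ".archive-metadata.yaml"
    if not metadata.exists():
        # graceful degradation: log a warning, proceed without archival check
        logger.warning("No .archive-metadata.yaml found; skipping archival check")
        return None
    return {"metadata_file": str(metadata)}
```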

Core Capabilities


1. Implementation from Spec


  • Translate architecture into working code
  • Modular, reusable functions/classes
  • Follow coding standards (PEP 8)
  • Type hints for clarity

2. Error Handling


  • Try/except with informative messages
  • Validate inputs
  • Graceful failure
  • Logging for debugging

3. Testing


  • Unit tests (pytest)
  • Integration tests
  • Edge case coverage
  • >80% code coverage goal

4. Documentation


  • Docstrings (Google style)
  • README with usage examples
  • API reference
  • Troubleshooting guide

5. CLI Interface


  • argparse or Click
  • Help messages
  • Progress bars for long operations
  • Sensible defaults

Standard Package Structure


Use assets/package_structure_template/:

```
project_name/
├── src/
│   ├── __init__.py
│   ├── module1.py
│   ├── module2.py
│   └── cli.py
├── tests/
│   ├── test_module1.py
│   ├── test_module2.py
│   ├── fixtures/
│   └── test_data/
├── docs/
│   ├── usage.md
│   └── api.md
├── README.md
├── setup.py
├── pyproject.toml
├── requirements.txt
├── environment.yml
└── .gitignore
```

Code Quality Standards


Docstring Format (Google Style)


```python
import pandas as pd


def calculate_cpm(counts: pd.DataFrame) -> pd.DataFrame:
    """Calculate counts per million (CPM) normalization.

    Args:
        counts: Raw count matrix (genes × samples).

    Returns:
        CPM-normalized counts as a pd.DataFrame.

    Raises:
        ValueError: If counts contain negative values.

    Examples:
        >>> counts = pd.DataFrame({'A': [10, 20], 'B': [30, 40]})
        >>> cpm = calculate_cpm(counts)
        >>> cpm['A'].sum()  # Should be ~1,000,000
        1000000.0
    """
    if (counts < 0).any().any():
        raise ValueError("Counts cannot be negative")

    return (counts / counts.sum(axis=0)) * 1e6
```

Error Handling


✅ Good: Informative error messages

```python
from pathlib import Path

import pandas as pd

try:
    data = pd.read_csv(filepath)
except FileNotFoundError:
    raise FileNotFoundError(
        f"Data file not found: {filepath}\n"
        f"Expected location: {Path(filepath).absolute()}"
    )
except pd.errors.EmptyDataError:
    raise ValueError(
        f"Data file is empty: {filepath}\n"
        f"Check that the file was generated correctly"
    )
```
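To keep error messages actionable across a whole package, one common pattern is a small exception hierarchy with a single base class caught at the CLI boundary. The class names below are illustrative, not part of this skill's spec (the referenced error_handling_guide.md covers exception hierarchies):

```python
class PipelineError(Exception):
    """Base class for all pipeline errors; catch this at the CLI boundary."""


class InputValidationError(PipelineError):
    """Raised when user-supplied data fails validation."""


class MissingDependencyError(PipelineError):
    """Raised when a required external tool is not installed."""


def validate_counts(counts):
    """Reject invalid count values with a specific, actionable message."""
    negatives = [v for v in counts if v < 0]
    if negatives:
        raise InputValidationError(
            f"Counts cannot be negative; offending values: {negatives[:5]}"
        )
```

Because every custom error derives from `PipelineError`, the CLI can catch one type, log it, and exit non-zero without masking unexpected bugs.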

Logging


```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

def process_samples(sample_list):
    logger.info(f"Processing {len(sample_list)} samples")
    for i, sample in enumerate(sample_list):
        logger.debug(f"Processing sample {i+1}/{len(sample_list)}: {sample}")
        # ... processing code ...
    logger.info("Processing complete")
```

Testing with pytest


tests/test_normalization.py:

```python
import pytest
import pandas as pd
import numpy as np

from src.normalization import calculate_cpm


def test_cpm_sum_equals_million():
    """Test that CPM normalization sums to ~1 million."""
    counts = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})
    cpm = calculate_cpm(counts)
    assert np.allclose(cpm.sum(axis=0), 1e6)


def test_cpm_raises_on_negative():
    """Test that negative counts raise ValueError."""
    counts = pd.DataFrame({'A': [-10, 20], 'B': [30, 40]})
    with pytest.raises(ValueError, match="negative"):
        calculate_cpm(counts)


def test_cpm_handles_zero_sum():
    """Test behavior when a column sums to zero."""
    counts = pd.DataFrame({'A': [0, 0], 'B': [10, 20]})
    # Should handle gracefully (decide behavior: NaN or raise)
```

CLI Template


See assets/cli_template.py:

```python
#!/usr/bin/env python3
"""
QC Pipeline CLI

Usage:
    qc_pipeline samples.csv --output results/
"""

import click
import logging
from pathlib import Path

@click.command()
@click.argument('sample_file', type=click.Path(exists=True))
@click.option('--output', '-o', default='results/', help='Output directory')
@click.option('--threads', '-t', default=4, help='Number of threads')
@click.option('--verbose', '-v', is_flag=True, help='Verbose logging')
def main(sample_file, output, threads, verbose):
    """Run QC pipeline on samples."""

    # Setup logging
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(level=level)
    logger = logging.getLogger(__name__)

    # Validate inputs
    output_dir = Path(output)
    output_dir.mkdir(parents=True, exist_ok=True)

    logger.info(f"Processing samples from {sample_file}")
    logger.info(f"Output directory: {output_dir}")
    logger.info(f"Using {threads} threads")

    # Main logic
    try:
        # ... pipeline code ...
        logger.info("Pipeline complete!")
    except Exception as e:
        logger.error(f"Pipeline failed: {e}")
        raise

if __name__ == '__main__':
    main()
```
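Since the skill allows "argparse or Click", here is a minimal argparse sketch of the same interface, with flags and defaults mirrored from the Click template above (useful when avoiding the third-party dependency):

```python
import argparse


def build_parser():
    """Build an argparse parser equivalent to the Click CLI above."""
    parser = argparse.ArgumentParser(description="Run QC pipeline on samples.")
    parser.add_argument("sample_file", help="CSV file listing samples")
    parser.add_argument("-o", "--output", default="results/",
                        help="Output directory (default: results/)")
    parser.add_argument("-t", "--threads", type=int, default=4,
                        help="Number of threads (default: 4)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Verbose logging")
    return parser
```

Usage: `args = build_parser().parse_args()`, then pass `args.sample_file`, `args.output`, etc. into the pipeline. Note that argparse does not validate file existence the way `click.Path(exists=True)` does, so the input check must be done by hand.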

Testing Strategy


1. Unit Tests


Test individual functions in isolation.

2. Integration Tests


Test components working together.

3. Regression Tests


Save expected outputs, compare to current.
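A minimal golden-file sketch of "save expected outputs, compare to current", assuming JSON-serializable results; the helper name and the idea of an `update` flag are illustrative, not prescribed by this skill:

```python
import json
from pathlib import Path


def check_against_golden(result, golden_path, update=False):
    """Compare `result` to a saved expected output.

    With update=True (or on first run, when no golden file exists yet)
    the expected output is written instead of compared; afterwards the
    current result is checked against the saved one.
    """
    golden = Path(golden_path)
    if update or not golden.exists():
        golden.parent.mkdir(parents=True, exist_ok=True)
        golden.write_text(json.dumps(result, indent=2, sort_keys=True))
        return True
    expected = json.loads(golden.read_text())
    return expected == result
```

The golden files would typically live under version control (for example in tests/), so any behavior change shows up as a diff in review.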

4. Edge Case Tests


  • Empty input
  • Single element
  • All zeros
  • Missing values
  • Very large input
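Those cases map naturally onto a table-driven test. `safe_mean` below is a stand-in for whatever function is under test, not part of this skill's codebase:

```python
import math


def safe_mean(values):
    """Mean that survives the edge cases listed above."""
    clean = [v for v in values
             if v is not None and not (isinstance(v, float) and math.isnan(v))]
    if not clean:  # empty input, or every value missing
        return None
    return sum(clean) / len(clean)


EDGE_CASES = [
    ([], None),                          # empty input
    ([5], 5.0),                          # single element
    ([0, 0, 0], 0.0),                    # all zeros
    ([1.0, float("nan"), 3.0], 2.0),     # missing values
    (list(range(1_000_000)), 499999.5),  # very large input
]


def run_edge_cases():
    for values, expected in EDGE_CASES:
        assert safe_mean(values) == expected, (values[:3], expected)
```

With pytest, the same table would usually be fed through `@pytest.mark.parametrize` so each case reports as a separate test.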

Copilot Integration


During implementation:
  1. Write code section
  2. Copilot reviews immediately
  3. Fix critical issues before proceeding
  4. Iterate until approved
  5. Move to next section

Quality Checklist


Before delivery:
  • All code passes tests (pytest)
  • >80% test coverage
  • All public functions documented
  • Error messages are actionable
  • CLI help message clear
  • README with installation + usage
  • Example data/workflow provided
  • Copilot approved (no critical issues)
  • Biologist validated (biological correctness)

References


For detailed standards:
  • references/coding_standards.md
    - PEP 8, naming, function length
  • references/testing_patterns.md
    - pytest, fixtures, mocking
  • references/error_handling_guide.md
    - Exception hierarchy, logging
  • references/documentation_standards.md
    - Docstrings, README, API docs

Scripts


Available in scripts/:
  • project_template_generator.py
    - Creates project structure
  • test_runner.py
    - Runs pytest with coverage
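A sketch of what project_template_generator.py might do, using only pathlib; the exact directories and files it creates are assumed from the standard package structure shown earlier, not read from the actual script:

```python
from pathlib import Path

TEMPLATE_DIRS = ["src", "tests", "tests/fixtures", "tests/test_data", "docs"]
TEMPLATE_FILES = ["src/__init__.py", "README.md", "pyproject.toml", ".gitignore"]


def generate_project(name, root="."):
    """Create the standard package structure under root/name.

    Existing files are left untouched so the generator is safe to re-run.
    """
    project = Path(root) / name
    for d in TEMPLATE_DIRS:
        (project / d).mkdir(parents=True, exist_ok=True)
    for f in TEMPLATE_FILES:
        path = project / f
        if not path.exists():
            path.touch()
    return project
```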

Success Criteria


Code is ready for production when:
  • Implements full specification
  • All tests pass
  • Coverage >80%
  • Documentation complete
  • CLI functional
  • Copilot approved
  • Biologist validated
  • Ready for deployment