component-identification-sizing


Component Identification and Sizing

This skill identifies architectural components (logical building blocks) in a codebase and calculates size metrics to assess decomposition feasibility and identify oversized components.

How to Use

Quick Start

Request analysis of your codebase:
  • "Identify and size all components in this codebase"
  • "Find oversized components that need splitting"
  • "Create a component inventory for decomposition planning"
  • "Analyze component size distribution"

Usage Examples

Example 1: Complete Analysis
User: "Identify and size all components in this codebase"

The skill will:
1. Map directory/namespace structures
2. Identify all components (leaf nodes)
3. Calculate size metrics (statements, files, percentages)
4. Generate component inventory table
5. Flag oversized/undersized components
6. Provide recommendations

Example 2: Find Oversized Components
User: "Which components are too large?"

The skill will:
1. Calculate mean and standard deviation
2. Identify components more than 2 standard deviations above the mean or above the percentage threshold (e.g., 10%)
3. Analyze functional areas within large components
4. Suggest specific splits with estimated sizes

Example 3: Component Size Analysis
User: "Analyze component sizes and distribution"

The skill will:
1. Calculate all size metrics
2. Generate size distribution summary
3. Identify outliers
4. Provide statistics and recommendations

Step-by-Step Process

  1. Initial Analysis: Start with complete component inventory
  2. Identify Issues: Find components that need attention
  3. Get Recommendations: Request actionable split/consolidation suggestions
  4. Monitor Progress: Track component growth over time

When to Use

Apply this skill when:
  • Starting a monolith decomposition effort
  • Assessing codebase structure and organization
  • Identifying components that are too large or too small
  • Creating component inventory for migration planning
  • Analyzing code distribution across components
  • Preparing for component-based decomposition patterns

Core Concepts

Component Definition

A component is an architectural building block that:
  • Has a well-defined role and responsibility
  • Is identified by a namespace, package structure, or directory path
  • Contains source code files (classes, functions, modules) grouped together
  • Performs specific business or infrastructure functionality
Key Rule: Components are identified by leaf nodes in directory/namespace structures. If a namespace is extended (e.g., services/billing extended to services/billing/payment), the parent becomes a subdomain, not a component.

Size Metrics

Statements (not lines of code):
  • Count executable statements terminated by semicolons or newlines
  • More accurate than lines of code for size comparison
  • Reflects how much code there is, regardless of formatting or line-wrapping style
Component Size Indicators:
  • Percent of codebase: Component statements / Total statements
  • File count: Number of source files in component
  • Standard deviation: How many standard deviations the component's size lies from the mean component size

Analysis Process

Phase 1: Identify Components

Scan the codebase directory structure:
  1. Map directory/namespace structure
    • For Node.js: services/, routes/, models/, utils/
    • For Java: Package structure (e.g., com.company.domain.service)
    • For Python: Module paths (e.g., app/billing/payment)
  2. Identify leaf nodes
    • Components are the deepest directories containing source files
    • Example: services/BillingService/ is a component
    • Example: services/BillingService/payment/ extends it, making BillingService a subdomain
  3. Create component inventory
    • List each component with its namespace/path
    • Note any parent namespaces (subdomains)
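
The leaf-node rule can be sketched in a few lines. This is a minimal illustration and not the skill's actual implementation: the function name findComponents, the input shape (a flat list of source file paths using "/" separators), and the sample file names are all assumptions made for the example.

```javascript
// Hypothetical helper: split directories into components (leaf nodes) and
// subdomains (their parent namespaces), given a flat list of source file paths.
function findComponents(filePaths) {
  // Every directory that directly contains at least one source file.
  const sourceDirs = new Set()
  for (const p of filePaths) {
    const cut = p.lastIndexOf('/')
    if (cut > 0) sourceDirs.add(p.slice(0, cut)) // files at the root are skipped
  }
  const dirs = [...sourceDirs]

  // A component is a source directory with no source directory beneath it.
  const components = dirs
    .filter((dir) => !dirs.some((other) => other.startsWith(dir + '/')))
    .sort()

  // Every ancestor of a component is a subdomain, not a component.
  const subdomains = new Set()
  for (const component of components) {
    const parts = component.split('/')
    for (let i = 1; i < parts.length; i++) subdomains.add(parts.slice(0, i).join('/'))
  }
  return { components, subdomains: [...subdomains].sort() }
}

const result = findComponents([
  'services/BillingService/index.js',
  'services/BillingService/payment/charge.js', // hypothetical file
  'services/NotificationService/NotificationService.js',
])
console.log(result.components) // [ 'services/BillingService/payment', 'services/NotificationService' ]
console.log(result.subdomains) // [ 'services', 'services/BillingService' ]
```

Here services/BillingService still holds index.js even though it has become a subdomain. The sketch only reports such a directory as a subdomain and does not flag the leftover files.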

Phase 2: Calculate Size Metrics

For each component:
  1. Count statements
    • Parse source files in component directory
    • Count executable statements (not comments, blank lines, or bare declarations)
    • Sum across all files in component
  2. Count files
    • Total source files (.js, .ts, .java, .py, etc.)
    • Exclude test files, config files, documentation
  3. Calculate percentage
    component_percent = (component_statements / total_statements) * 100
  4. Calculate statistics
    • Mean component size: total_statements / number_of_components
    • Standard deviation (sample, with n = number_of_components): sqrt(sum((size - mean)^2) / (n - 1))
    • Component's deviation: (component_size - mean) / std_dev
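
As a worked illustration of these formulas, the sketch below computes all three metrics for a tiny inventory. The helper name sizeMetrics and the component sizes are made up for the example, and the function assumes at least two components, because the sample standard deviation divides by n - 1.

```javascript
// Hypothetical helper: percent of codebase and deviation from the mean
// for components shaped like { name, statements }.
function sizeMetrics(components) {
  const n = components.length
  const total = components.reduce((sum, c) => sum + c.statements, 0)
  const mean = total / n
  // Sample standard deviation: divide the squared deviations by n - 1.
  const variance = components.reduce((sum, c) => sum + (c.statements - mean) ** 2, 0) / (n - 1)
  const stdDev = Math.sqrt(variance)
  return {
    total,
    mean,
    stdDev,
    rows: components.map((c) => ({
      name: c.name,
      percent: (c.statements / total) * 100,
      deviation: (c.statements - mean) / stdDev, // in units of standard deviations
    })),
  }
}

// Made-up sizes chosen so the arithmetic is easy to check by hand.
const metrics = sizeMetrics([
  { name: 'A', statements: 100 },
  { name: 'B', statements: 200 },
  { name: 'C', statements: 300 },
])
console.log(metrics.mean) // 200
console.log(metrics.stdDev) // 100
console.log(metrics.rows[2].deviation) // 1
```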

Phase 3: Identify Size Issues

Oversized Components (candidates for splitting):
  • Exceeds 30% of total codebase (for small apps with <10 components)
  • Exceeds 10% of total codebase (for large apps with >20 components)
  • More than 2 standard deviations above the mean
  • Contains multiple distinct functional areas
Undersized Components (candidates for consolidation):
  • Less than 1% of codebase (may be too granular)
  • More than 1 standard deviation below the mean
  • Contains only a few files with minimal functionality
Well-Sized Components:
  • Within 1-2 standard deviations of the mean
  • Represents a single, cohesive functional area
  • Appropriate percentage for application size
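
The numeric rules can be combined into a small classifier, sketched below under stated assumptions. The name classifySize and its parameters are hypothetical. The text gives no percentage threshold for applications with 10 to 20 components, so the sketch falls back to the stricter 10% there, which is an assumption rather than a rule from this skill. The qualitative criteria (multiple functional areas, minimal functionality) still need human review.

```javascript
// Hypothetical classifier for one component.
// percent: share of the codebase (0-100)
// deviation: (size - mean) / stdDev
// componentCount: number of components in the application
function classifySize({ percent, deviation }, componentCount) {
  // ASSUMPTION: 30% applies below 10 components and 10% above 20 (from the text);
  // for 10-20 components this sketch uses 10% because no value is given.
  const maxPercent = componentCount < 10 ? 30 : 10
  if (percent > maxPercent || deviation > 2) return 'Too Large'
  if (percent < 1 || deviation < -1) return 'Too Small'
  return 'OK'
}

console.log(classifySize({ percent: 33, deviation: 4.4 }, 18)) // "Too Large"
console.log(classifySize({ percent: 5, deviation: -0.1 }, 18)) // "OK"
console.log(classifySize({ percent: 0.5, deviation: -0.8 }, 18)) // "Too Small"
```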

Output Format

Component Inventory Table


Component Inventory

| Component Name | Namespace/Path | Statements | Files | Percent | Status |
| --- | --- | --- | --- | --- | --- |
| Billing Payment | services/BillingService | 4,312 | 23 | 5% | ✅ OK |
| Reporting | services/ReportingService | 27,765 | 162 | 33% | ⚠️ Too Large |
| Notification | services/NotificationService | 1,433 | 7 | 2% | ✅ OK |

**Status Legend**:

- ✅ OK: Well-sized (within 1-2 std dev of mean)
- ⚠️ Too Large: Exceeds size threshold or >2 std dev above mean
- 🔍 Too Small: <1% of codebase or >1 std dev below mean
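
One possible way to produce the inventory table from computed rows is sketched below. The function name renderInventory and the row field names (name, path, statements, files, percent, status) are assumptions for illustration. Only the column headings come from the table format in this section.

```javascript
// Hypothetical renderer: turn inventory rows into a Markdown table
// with the six columns used by the component inventory.
function renderInventory(rows) {
  const header = [
    '| Component Name | Namespace/Path | Statements | Files | Percent | Status |',
    '| --- | --- | --- | --- | --- | --- |',
  ]
  const body = rows.map((r) => {
    const statements = r.statements.toLocaleString('en-US') // thousands separators
    const percent = `${Math.round(r.percent)}%`
    return `| ${r.name} | ${r.path} | ${statements} | ${r.files} | ${percent} | ${r.status} |`
  })
  return [...header, ...body].join('\n')
}

console.log(
  renderInventory([
    {
      name: 'Reporting',
      path: 'services/ReportingService',
      statements: 27765,
      files: 162,
      percent: 33.48,
      status: '⚠️ Too Large',
    },
  ])
)
```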

Size Analysis Summary


Size Analysis Summary

Total Components: 18
Total Statements: 82,931
Mean Component Size: 4,607 statements
Standard Deviation: 5,234 statements

Oversized Components (>2 std dev or >10%):
  • Reporting (33% - 27,765 statements) - Consider splitting into:
    • Ticket Reports
    • Expert Reports
    • Financial Reports
Well-Sized Components (within 1-2 std dev):
  • Billing Payment (5%)
  • Customer Profile (5%)
  • Ticket Assignment (9%)
Undersized Components (<1 std dev):
  • Login (2% - 1,865 statements) - Consider consolidating with Authentication

Component Size Distribution


Component Size Distribution

Component Size Distribution (by percent of codebase)

[Visual representation or histogram if possible]

Largest: ████████████████████████████████████ 33% (Reporting)
████████ 9% (Ticket Assign)
██████ 8% (Ticket)
██████ 6% (Expert Profile)
█████ 5% (Billing Payment)
████ 4% (Billing History)
...
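
A text histogram like the one above could be generated as sketched below. The function name renderDistribution and the scale of one block character per percentage point are assumptions, so bar lengths will differ slightly from the hand-drawn sample.

```javascript
// Hypothetical renderer: draw one bar per component, largest first,
// using one block character per (rounded) percentage point.
function renderDistribution(rows) {
  return [...rows] // copy so the caller's array is not reordered
    .sort((a, b) => b.percent - a.percent)
    .map((r) => {
      const points = Math.round(r.percent)
      const bar = '█'.repeat(Math.max(1, points)) // always show at least one block
      return `${bar} ${points}% (${r.name})`
    })
    .join('\n')
}

console.log(
  renderDistribution([
    { name: 'Billing Payment', percent: 5 },
    { name: 'Reporting', percent: 33 },
    { name: 'Ticket Assign', percent: 9 },
  ])
)
```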


Recommendations


Recommendations

High Priority: Split Large Components

Reporting Component (33% of codebase):
  • Current: Single component with 27,765 statements
  • Issue: Too large, contains multiple functional areas
  • Recommendation: Split into:
    1. Reporting Shared (common utilities)
    2. Ticket Reports (ticket-related reports)
    3. Expert Reports (expert-related reports)
    4. Financial Reports (financial reports)
  • Expected Result: Each component ~7-9% of codebase

Medium Priority: Review Small Components

Login Component (2% of codebase):
  • Current: 1,865 statements, 3 files
  • Consideration: May be too granular if related to broader authentication
  • Recommendation: Evaluate whether it should be consolidated with the Authentication/User components

Low Priority: Monitor Well-Sized Components

Most components are appropriately sized. Continue monitoring during decomposition.

Analysis Checklist

Component Identification:
  • Mapped all directory/namespace structures
  • Identified leaf nodes (components) vs parent nodes (subdomains)
  • Created complete component inventory
  • Documented namespace/path for each component
Size Calculation:
  • Counted statements (not lines) for each component
  • Counted source files (excluding tests/configs)
  • Calculated percentage of total codebase
  • Calculated mean and standard deviation
Size Assessment:
  • Identified oversized components (>threshold or >2 std dev above mean)
  • Identified undersized components (<1% or >1 std dev below mean)
  • Flagged components for splitting or consolidation
  • Documented size distribution
Recommendations:
  • Suggested splits for oversized components
  • Suggested consolidations for undersized components
  • Prioritized recommendations by impact
  • Created architecture stories for refactoring

Implementation Notes

For Node.js/Express Applications

Components typically found in:
  • services/ - Business logic components
  • routes/ - API endpoint components
  • models/ - Data model components
  • utils/ - Utility components
  • middleware/ - Middleware components
Example Component Identification:
services/
├── BillingService/          ← Component (leaf node)
│   ├── index.js
│   └── BillingService.js
├── CustomerService/          ← Component (leaf node)
│   └── CustomerService.js
└── NotificationService/      ← Component (leaf node)
    └── NotificationService.js

For Java Applications

Components identified by package structure:
  • com.company.domain.service - Service components
  • com.company.domain.model - Model components
  • com.company.domain.repository - Repository components
Example Component Identification:
com.company.billing.payment   ← Component (leaf package)
com.company.billing.history   ← Component (leaf package)
com.company.billing           ← Subdomain (parent of payment/history)

Statement Counting

JavaScript/TypeScript:
  • Count statements terminated by ; or newline
  • Include: assignments, function calls, returns, conditionals, loops
  • Exclude: comments, blank lines, declarations without assignment
Java:
  • Count statements terminated by ;
  • Include: method calls, assignments, returns, conditionals
  • Exclude: class/interface declarations, comments, blank lines
Python:
  • Count executable statements (not comments or blank lines)
  • Include: assignments, function calls, returns, conditionals
  • Exclude: docstrings, comments, blank lines
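
For semicolon-terminated languages such as Java, a rough approximation of these rules is to strip comments and string contents and then count the remaining semicolons. The sketch below is a heuristic under that assumption, not a parser, and the name approxStatementCount is made up for the example. It over-counts for-loop headers and bare declarations, and it misses conditionals and loops that have no semicolon of their own, so a language-aware parser would be more accurate.

```javascript
// Rough heuristic, NOT a parser: approximate the statement count of
// semicolon-terminated source by counting semicolons outside comments and strings.
function approxStatementCount(source) {
  // Matched in a single pass so that "//" inside a string and quotes inside
  // a comment are both handled by whichever token starts first.
  const tokenPattern = /\/\*[\s\S]*?\*\/|\/\/[^\n]*|"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*'/g
  // Drop comments entirely; replace string literals with an empty string literal.
  const stripped = source.replace(tokenPattern, (m) => (m.startsWith('/') ? '' : '""'))
  return (stripped.match(/;/g) || []).length
}

const sample = [
  'int total = 0; // running total; reset per request',
  '/* legacy; kept for reference */',
  'String sep = ";";',
  'return total;',
].join('\n')
console.log(approxStatementCount(sample)) // 3
```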

Fitness Functions

After identifying and sizing components, create automated checks:

Component Size Threshold

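```javascript
// Alert if any component exceeds the given share of the codebase (default 10%).
// Expects components shaped like { name: string, statements: number }.
function checkComponentSize(components, threshold = 0.1) {
  const totalStatements = components.reduce((sum, c) => sum + c.statements, 0)
  if (totalStatements === 0) return [] // nothing to measure; avoids dividing by zero
  return components
    .filter((c) => c.statements / totalStatements > threshold)
    .map((c) => ({
      component: c.name,
      percent: ((c.statements / totalStatements) * 100).toFixed(1), // string with one decimal
      issue: 'Exceeds size threshold',
    }))
}
```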

Standard Deviation Check

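```javascript
// Alert if a component is more than 2 standard deviations from the mean,
// in either direction. Uses the sample standard deviation (divides by n - 1).
function checkStandardDeviation(components) {
  if (components.length < 2) return [] // sample std dev needs at least 2 components
  const sizes = components.map((c) => c.statements)
  const mean = sizes.reduce((a, b) => a + b, 0) / sizes.length
  const stdDev = Math.sqrt(sizes.reduce((sum, size) => sum + Math.pow(size - mean, 2), 0) / (sizes.length - 1))
  if (stdDev === 0) return [] // all components are the same size

  return components
    .filter((c) => Math.abs(c.statements - mean) > 2 * stdDev)
    .map((c) => ({
      component: c.name,
      deviation: ((c.statements - mean) / stdDev).toFixed(2), // signed: negative means below the mean
      issue: 'More than 2 standard deviations from mean',
    }))
}
```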

Best Practices

Do's ✅

  • Use statements, not lines of code
  • Identify components as leaf nodes only
  • Calculate both percentage and standard deviation
  • Consider application size when setting thresholds
  • Document namespace/path for each component
  • Create visual size distribution if possible

Don'ts ❌

  • Don't count test files in component size
  • Don't treat parent directories as components
  • Don't use fixed thresholds without considering app size
  • Don't ignore small components (may need consolidation)
  • Don't skip standard deviation calculation
  • Don't mix infrastructure and domain components in same analysis

Next Steps

After completing component identification and sizing:
  1. Apply Gather Common Domain Components Pattern - Identify duplicate functionality
  2. Apply Flatten Components Pattern - Remove orphaned classes from root namespaces
  3. Apply Determine Component Dependencies Pattern - Analyze coupling between components
  4. Create Component Domains - Group components into logical domains

Notes

  • Component size thresholds vary by application size
  • Small apps (<10 components): 30% threshold may be appropriate
  • Large apps (>20 components): 10% threshold is more appropriate
  • Standard deviation is more reliable than fixed percentages because it is measured against the codebase's own size distribution
  • Well-sized components are within 1-2 standard deviations of the mean
  • Oversized components often contain multiple functional areas that can be split