component-identification-sizing

Component Identification and Sizing

This skill identifies architectural components (logical building blocks) in a codebase and calculates size metrics to assess decomposition feasibility and identify oversized components.

How to Use

Quick Start

Request analysis of your codebase:
  • "Identify and size all components in this codebase"
  • "Find oversized components that need splitting"
  • "Create a component inventory for decomposition planning"
  • "Analyze component size distribution"

Usage Examples

Example 1: Complete Analysis
User: "Identify and size all components in this codebase"

The skill will:
1. Map directory/namespace structures
2. Identify all components (leaf nodes)
3. Calculate size metrics (statements, files, percentages)
4. Generate component inventory table
5. Flag oversized/undersized components
6. Provide recommendations
Example 2: Find Oversized Components
User: "Which components are too large?"

The skill will:
1. Calculate mean and standard deviation
2. Identify components more than 2 standard deviations above the mean or over the percentage threshold (e.g., 10%)
3. Analyze functional areas within large components
4. Suggest specific splits with estimated sizes
Example 3: Component Size Analysis
User: "Analyze component sizes and distribution"

The skill will:
1. Calculate all size metrics
2. Generate size distribution summary
3. Identify outliers
4. Provide statistics and recommendations

Step-by-Step Process

  1. Initial Analysis: Start with complete component inventory
  2. Identify Issues: Find components that need attention
  3. Get Recommendations: Request actionable split/consolidation suggestions
  4. Monitor Progress: Track component growth over time

When to Use

Apply this skill when:
  • Starting a monolithic decomposition effort
  • Assessing codebase structure and organization
  • Identifying components that are too large or too small
  • Creating component inventory for migration planning
  • Analyzing code distribution across components
  • Preparing for component-based decomposition patterns

Core Concepts

Component Definition

A component is an architectural building block that:
  • Has a well-defined role and responsibility
  • Is identified by a namespace, package structure, or directory path
  • Contains source code files (classes, functions, modules) grouped together
  • Performs specific business or infrastructure functionality
Key Rule: Components are identified by leaf nodes in directory/namespace structures. If a namespace is extended (e.g., `services/billing` extended to `services/billing/payment`), the parent becomes a subdomain, not a component.

Size Metrics

Statements (not lines of code):
  • Count executable statements terminated by semicolons or newlines
  • More accurate than lines of code for size comparison
  • Accounts for code complexity, not formatting
Component Size Indicators:
  • Percent of codebase: Component statements / Total statements
  • File count: Number of source files in component
  • Standard deviation: How far a component's size lies from the mean component size, measured in standard deviations

Analysis Process

Phase 1: Identify Components

Scan the codebase directory structure:
  1. Map directory/namespace structure
    • For Node.js: `services/`, `routes/`, `models/`, `utils/`
    • For Java: Package structure (e.g., `com.company.domain.service`)
    • For Python: Module paths (e.g., `app/billing/payment`)
  2. Identify leaf nodes
    • Components are the deepest directories containing source files
    • Example: `services/BillingService/` is a component
    • Example: `services/BillingService/payment/` extends it, making `BillingService` a subdomain
  3. Create component inventory
    • List each component with its namespace/path
    • Note any parent namespaces (subdomains)
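
The leaf-node rule can be sketched as a small pure function over a list of source file paths. This is a minimal illustration, not the skill's actual implementation: the function name `identifyComponents`, the returned shape, and the `PaymentService.js` file name are all assumptions made for the example.

```javascript
// Sketch: classify directories as components (leaf nodes) or subdomains.
// Input: source file paths relative to the repository root (hypothetical data).
function identifyComponents(filePaths) {
  // Directories that directly contain at least one source file
  const dirsWithFiles = new Set(
    filePaths
      .filter((p) => p.includes('/'))
      .map((p) => p.slice(0, p.lastIndexOf('/')))
  )

  const components = []
  const subdomains = new Set()

  for (const dir of dirsWithFiles) {
    // A directory is "extended" when another source directory sits beneath it
    const isExtended = [...dirsWithFiles].some((other) => other.startsWith(dir + '/'))
    if (isExtended) {
      subdomains.add(dir)
    } else {
      components.push(dir)
    }
  }

  // Every ancestor of a component is a parent namespace, even if it holds no files
  for (const component of components) {
    const parts = component.split('/')
    for (let i = 1; i < parts.length; i++) {
      subdomains.add(parts.slice(0, i).join('/'))
    }
  }

  return { components: components.sort(), subdomains: [...subdomains].sort() }
}

const inventory = identifyComponents([
  'services/BillingService/index.js',
  'services/BillingService/payment/PaymentService.js',
  'services/CustomerService/CustomerService.js',
])
console.log(inventory.components)
// prints: [ 'services/BillingService/payment', 'services/CustomerService' ]
console.log(inventory.subdomains)
// prints: [ 'services', 'services/BillingService' ]
```

Note that `services/BillingService/index.js` ends up belonging to a subdomain rather than a component; files like this are the orphaned classes that a later flattening step would address.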

Phase 2: Calculate Size Metrics

For each component:
  1. Count statements
    • Parse source files in component directory
    • Count executable statements (not comments, blank lines, or declarations alone)
    • Sum across all files in component
  2. Count files
    • Total source files (`.js`, `.ts`, `.java`, `.py`, etc.)
    • Exclude test files, config files, documentation
  3. Calculate percentage
    component_percent = (component_statements / total_statements) * 100
  4. Calculate statistics
    • Mean component size: `total_statements / number_of_components`
    • Standard deviation: `sqrt(sum((size - mean)^2) / (n - 1))`
    • Component's deviation: `(component_size - mean) / std_dev`

Phase 3: Identify Size Issues

Oversized Components (candidates for splitting):
  • Exceeds 30% of total codebase (for small apps with <10 components)
  • Exceeds 10% of total codebase (for large apps with >20 components)
  • More than 2 standard deviations above mean
  • Contains multiple distinct functional areas
Undersized Components (candidates for consolidation):
  • Less than 1% of codebase (may be too granular)
  • More than 1 standard deviation below mean
  • Contains only a few files with minimal functionality
Well-Sized Components:
  • Within 1-2 standard deviations of mean
  • Represents a single, cohesive functional area
  • Appropriate percentage for application size
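
The numeric rules above could be combined into a single status check along these lines. This is a sketch under stated assumptions: the names `defaultPercentThreshold` and `classifyComponent` are hypothetical, and because the text gives no percentage cutoff for applications with 10 to 20 components, the helper returns `null` there and only the standard deviation rule applies. The qualitative criteria (distinct functional areas, cohesion) still need human judgement.

```javascript
// Percent cutoffs from the text: 30% for <10 components, 10% for >20.
// The 10-20 range is not specified, so null means "no percent rule" (assumption).
function defaultPercentThreshold(componentCount) {
  if (componentCount < 10) return 30
  if (componentCount > 20) return 10
  return null
}

// percent: share of codebase (0-100); deviation: std devs from the mean
function classifyComponent({ percent, deviation }, percentThreshold) {
  const overPercent = percentThreshold != null && percent > percentThreshold
  if (overPercent || deviation > 2) return 'Too Large'
  if (percent < 1 || deviation < -1) return 'Too Small'
  return 'OK'
}

console.log(classifyComponent({ percent: 33, deviation: 4.42 }, defaultPercentThreshold(18)))
// prints: Too Large
```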

Output Format

Component Inventory Table

Component Inventory

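| Component Name | Namespace/Path | Statements | Files | Percent | Status |
| --- | --- | --- | --- | --- | --- |
| Billing Payment | services/BillingService | 4,312 | 23 | 5% | ✅ OK |
| Reporting | services/ReportingService | 27,765 | 162 | 33% | ⚠️ Too Large |
| Notification | services/NotificationService | 1,433 | 7 | 2% | ✅ OK |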

**Status Legend**:

- ✅ OK: Well-sized (within 1-2 std dev of mean)
- ⚠️ Too Large: Exceeds size threshold or >2 std dev above mean
- 🔍 Too Small: <1% of codebase or >1 std dev below mean
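
An inventory in this shape can be generated from component records with a few lines of string formatting. The sketch below assumes a hypothetical row shape (`name`, `path`, `statements`, `files`, `percent`, `status`); the function name `renderInventoryTable` is illustrative only.

```javascript
// Sketch: render component records as a markdown inventory table.
function renderInventoryTable(rows) {
  const header = [
    '| Component Name | Namespace/Path | Statements | Files | Percent | Status |',
    '| --- | --- | --- | --- | --- | --- |',
  ]
  const body = rows.map(
    (r) =>
      // toLocaleString('en-US') adds thousands separators, e.g. 4312 -> "4,312"
      `| ${r.name} | ${r.path} | ${r.statements.toLocaleString('en-US')} | ${r.files} | ${Math.round(r.percent)}% | ${r.status} |`
  )
  return [...header, ...body].join('\n')
}

const table = renderInventoryTable([
  { name: 'Billing Payment', path: 'services/BillingService', statements: 4312, files: 23, percent: 5.2, status: '✅ OK' },
])
console.log(table)
```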

Size Analysis Summary

Size Analysis Summary

Total Components: 18
Total Statements: 82,931
Mean Component Size: 4,607 statements
Standard Deviation: 5,234 statements
Oversized Components (>2 std dev or >10%):
  • Reporting (33% - 27,765 statements) - Consider splitting into:
    • Ticket Reports
    • Expert Reports
    • Financial Reports
Well-Sized Components (within 1-2 std dev):
  • Billing Payment (5%)
  • Customer Profile (5%)
  • Ticket Assignment (9%)
Undersized Components (<1 std dev):
  • Login (2% - 1,865 statements) - Consider consolidating with Authentication
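
As a sanity check, the percentage and deviation formulas can be applied to the example figures above (total 82,931, mean 4,607, standard deviation 5,234). The results are rounded approximations:

$$
\begin{aligned}
\text{Reporting percent} &= \frac{27{,}765}{82{,}931} \times 100 \approx 33.5\% \\
\text{Reporting deviation} &= \frac{27{,}765 - 4{,}607}{5{,}234} \approx 4.42 \\
\text{Login deviation} &= \frac{1{,}865 - 4{,}607}{5{,}234} \approx -0.52
\end{aligned}
$$

Reporting lies well beyond the 2 standard deviation cutoff. Login sits only about half a standard deviation below the mean, so its listing as a consolidation candidate appears to rest on the granularity judgement (a few files closely related to authentication) rather than on the statistical rule alone.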

Component Size Distribution

Component Size Distribution


Component Size Distribution (by percent of codebase)

[Visual representation or histogram if possible]

Largest: ████████████████████████████████████ 33% (Reporting)
████████ 9% (Ticket Assign)
██████ 8% (Ticket)
██████ 6% (Expert Profile)
█████ 5% (Billing Payment)
████ 4% (Billing History)
...
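
A text chart like this can be produced with a short formatting function. The sketch below is one possible approach, assuming components carry a `name` and a `percent` field; the function name `renderDistribution` and the one-block-per-percent scale are assumptions, not the scale used in the sample above.

```javascript
// Sketch: render a text bar chart of component sizes, largest first.
// blocksPerPercent controls the bar scale (1 block per percentage point by default).
function renderDistribution(components, blocksPerPercent = 1) {
  return [...components] // copy so the caller's array is not reordered
    .sort((a, b) => b.percent - a.percent)
    .map((c) => {
      // Always draw at least one block so tiny components stay visible
      const bar = '█'.repeat(Math.max(1, Math.round(c.percent * blocksPerPercent)))
      return `${bar} ${Math.round(c.percent)}% (${c.name})`
    })
    .join('\n')
}

console.log(
  renderDistribution([
    { name: 'Billing History', percent: 4 },
    { name: 'Ticket', percent: 8 },
  ])
)
// prints:
// ████████ 8% (Ticket)
// ████ 4% (Billing History)
```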


Recommendations

Recommendations

High Priority: Split Large Components

Reporting Component (33% of codebase):
  • Current: Single component with 27,765 statements
  • Issue: Too large, contains multiple functional areas
  • Recommendation: Split into:
    1. Reporting Shared (common utilities)
    2. Ticket Reports (ticket-related reports)
    3. Expert Reports (expert-related reports)
    4. Financial Reports (financial reports)
  • Expected Result: Each component ~7-9% of codebase

Medium Priority: Review Small Components

Login Component (2% of codebase):
  • Current: 1,865 statements, 3 files
  • Consideration: May be too granular if related to broader authentication
  • Recommendation: Evaluate whether it should be consolidated with the Authentication/User components

Low Priority: Monitor Well-Sized Components

Most components are appropriately sized. Continue monitoring during decomposition.

Analysis Checklist

Component Identification:
  • Mapped all directory/namespace structures
  • Identified leaf nodes (components) vs parent nodes (subdomains)
  • Created complete component inventory
  • Documented namespace/path for each component
Size Calculation:
  • Counted statements (not lines) for each component
  • Counted source files (excluding tests/configs)
  • Calculated percentage of total codebase
  • Calculated mean and standard deviation
Size Assessment:
  • Identified oversized components (>threshold or >2 std dev above mean)
  • Identified undersized components (<1% or >1 std dev below mean)
  • Flagged components for splitting or consolidation
  • Documented size distribution
Recommendations:
  • Suggested splits for oversized components
  • Suggested consolidations for undersized components
  • Prioritized recommendations by impact
  • Created architecture stories for refactoring

Implementation Notes

For Node.js/Express Applications

Components typically found in:
  • `services/` - Business logic components
  • `routes/` - API endpoint components
  • `models/` - Data model components
  • `utils/` - Utility components
  • `middleware/` - Middleware components
Example Component Identification:
services/
├── BillingService/          ← Component (leaf node)
│   ├── index.js
│   └── BillingService.js
├── CustomerService/          ← Component (leaf node)
│   └── CustomerService.js
└── NotificationService/      ← Component (leaf node)
    └── NotificationService.js

For Java Applications

Components identified by package structure:
  • `com.company.domain.service` - Service components
  • `com.company.domain.model` - Model components
  • `com.company.domain.repository` - Repository components
Example Component Identification:
com.company.billing.payment   ← Component (leaf package)
com.company.billing.history   ← Component (leaf package)
com.company.billing           ← Subdomain (parent of payment/history)

Statement Counting

JavaScript/TypeScript:
  • Count statements terminated by `;` or newline
  • Include: assignments, function calls, returns, conditionals, loops
  • Exclude: comments, blank lines, declarations without assignment
Java:
  • Count statements terminated by `;`
  • Include: method calls, assignments, returns, conditionals
  • Exclude: class/interface declarations, comments, blank lines
Python:
  • Count executable statements (not comments or blank lines)
  • Include: assignments, function calls, returns, conditionals
  • Exclude: docstrings, comments, blank lines
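
For semicolon-terminated languages such as Java, a rough statement count can be approximated without a parser by blanking out comments and string literals and counting the remaining `;` characters. The sketch below is only that approximation, and the function name `approximateStatementCount` is hypothetical. It overcounts `for (;;)` headers, does not count block-only constructs such as a bare `if`, does not handle JavaScript regex or template literals, and does not apply to Python or to semicolon-free JavaScript. A real parser would give more accurate numbers.

```javascript
// Sketch: approximate statement count for semicolon-terminated source code.
function approximateStatementCount(source) {
  // Single pass: whichever token starts first (comment or string literal) is matched,
  // so a "//" inside a string or a quote inside a comment is handled correctly.
  const tokenPattern = /\/\*[\s\S]*?\*\/|\/\/[^\n]*|"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*'/g
  const stripped = source.replace(tokenPattern, (match) =>
    // Drop comments entirely; replace string/char literals with an empty string literal
    match.startsWith('/') ? '' : '""'
  )
  return (stripped.match(/;/g) || []).length
}

const sample = [
  '// setup; not counted',
  'int total = 0;',
  'String s = "a;b";',
  '/* block; comment; */',
  'return total;',
].join('\n')

console.log(approximateStatementCount(sample))
// prints: 3
```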

Fitness Functions

After identifying and sizing components, create automated checks:

Component Size Threshold

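```javascript
// Alert if any component exceeds 10% of codebase
function checkComponentSize(components, threshold = 0.1) {
  // Sum statements across all components to get the codebase total
  const totalStatements = components.reduce((sum, c) => sum + c.statements, 0)
  return components
    // Keep only components whose share of the total exceeds the threshold
    .filter((c) => c.statements / totalStatements > threshold)
    .map((c) => ({
      component: c.name,
      // Share of the codebase as a percentage string with one decimal place
      percent: ((c.statements / totalStatements) * 100).toFixed(1),
      issue: 'Exceeds size threshold',
    }))
}
```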

Standard Deviation Check

```javascript
// Alert if component is >2 standard deviations from mean
function checkStandardDeviation(components) {
  const sizes = components.map((c) => c.statements)
  const mean = sizes.reduce((a, b) => a + b, 0) / sizes.length
  // Sample standard deviation (divides by n - 1)
  const stdDev = Math.sqrt(sizes.reduce((sum, size) => sum + Math.pow(size - mean, 2), 0) / (sizes.length - 1))

  return components
    // Math.abs flags outliers in both directions (far above or far below the mean)
    .filter((c) => Math.abs(c.statements - mean) > 2 * stdDev)
    .map((c) => ({
      component: c.name,
      // Signed distance from the mean, in standard deviations
      deviation: ((c.statements - mean) / stdDev).toFixed(2),
      issue: 'More than 2 standard deviations from mean',
    }))
}
```

Best Practices

Do's ✅

  • Use statements, not lines of code
  • Identify components as leaf nodes only
  • Calculate both percentage and standard deviation
  • Consider application size when setting thresholds
  • Document namespace/path for each component
  • Create visual size distribution if possible

Don'ts ❌

  • Don't count test files in component size
  • Don't treat parent directories as components
  • Don't use fixed thresholds without considering app size
  • Don't ignore small components (may need consolidation)
  • Don't skip standard deviation calculation
  • Don't mix infrastructure and domain components in same analysis

Next Steps

After completing component identification and sizing:
  1. Apply Gather Common Domain Components Pattern - Identify duplicate functionality
  2. Apply Flatten Components Pattern - Remove orphaned classes from root namespaces
  3. Apply Determine Component Dependencies Pattern - Analyze coupling between components
  4. Create Component Domains - Group components into logical domains

Notes

  • Component size thresholds vary by application size
  • Small apps (<10 components): 30% threshold may be appropriate
  • Large apps (>20 components): 10% threshold is more appropriate
  • Standard deviation adapts to the application's own size distribution, so it is usually a more reliable signal than a fixed percentage
  • Well-sized components fall within 1-2 standard deviations of the mean
  • Oversized components often contain multiple functional areas that can be split