component-identification-sizing
Component Identification and Sizing
This skill identifies architectural components (logical building blocks) in a codebase and calculates size metrics to assess decomposition feasibility and identify oversized components.
How to Use
Quick Start
Request analysis of your codebase:
- "Identify and size all components in this codebase"
- "Find oversized components that need splitting"
- "Create a component inventory for decomposition planning"
- "Analyze component size distribution"
Usage Examples
Example 1: Complete Analysis
User: "Identify and size all components in this codebase"
The skill will:
1. Map directory/namespace structures
2. Identify all components (leaf nodes)
3. Calculate size metrics (statements, files, percentages)
4. Generate component inventory table
5. Flag oversized/undersized components
6. Provide recommendations
Example 2: Find Oversized Components
User: "Which components are too large?"
The skill will:
1. Calculate mean and standard deviation
2. Identify components more than 2 standard deviations above the mean or over the 10% threshold
3. Analyze functional areas within large components
4. Suggest specific splits with estimated sizes
Example 3: Component Size Analysis
User: "Analyze component sizes and distribution"
The skill will:
1. Calculate all size metrics
2. Generate size distribution summary
3. Identify outliers
4. Provide statistics and recommendations
Step-by-Step Process
- Initial Analysis: Start with complete component inventory
- Identify Issues: Find components that need attention
- Get Recommendations: Request actionable split/consolidation suggestions
- Monitor Progress: Track component growth over time
When to Use
Apply this skill when:
- Starting a monolith decomposition effort
- Assessing codebase structure and organization
- Identifying components that are too large or too small
- Creating component inventory for migration planning
- Analyzing code distribution across components
- Preparing for component-based decomposition patterns
Core Concepts
Component Definition
A component is an architectural building block that:
- Has a well-defined role and responsibility
- Is identified by a namespace, package structure, or directory path
- Contains source code files (classes, functions, modules) grouped together
- Performs specific business or infrastructure functionality
Key Rule: Components are identified by leaf nodes in directory/namespace structures. If a namespace is extended (e.g., `services/billing` extended to `services/billing/payment`), the parent becomes a subdomain, not a component.
Size Metrics
Statements (not lines of code):
- Count executable statements terminated by semicolons or newlines
- More accurate than lines of code for size comparison
- Accounts for code complexity, not formatting
Component Size Indicators:
- Percent of codebase: Component statements / Total statements
- File count: Number of source files in component
- Standard deviations from mean: How far the component's size lies from the mean component size, measured in standard deviations
Analysis Process
Phase 1: Identify Components
Scan the codebase directory structure:
- Map directory/namespace structure
  - For Node.js: `services/`, `routes/`, `models/`, `utils/`
  - For Java: Package structure (e.g., `com.company.domain.service`)
  - For Python: Module paths (e.g., `app/billing/payment`)
- Identify leaf nodes
  - Components are the deepest directories containing source files
  - Example: `services/BillingService/` is a component
  - Example: `services/BillingService/payment/` extends it, making `BillingService` a subdomain
- Create component inventory
  - List each component with its namespace/path
  - Note any parent namespaces (subdomains)
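As a rough illustration of the leaf-node rule, the sketch below classifies directories from a flat list of source file paths. It is not part of the skill itself: `classifyDirectories` is a hypothetical name, the input is assumed to be POSIX-style paths relative to the repository root, and treating every ancestor of a component as a subdomain is an assumption made for this example.

```javascript
// Hypothetical helper: split directories into components (leaf nodes) and subdomains.
// Assumes `filePaths` lists source files only, with '/' separators, relative to the repo root.
function classifyDirectories(filePaths) {
  // Directories that directly contain at least one source file
  const sourceDirs = new Set()
  for (const file of filePaths) {
    const idx = file.lastIndexOf('/')
    if (idx > 0) {
      sourceDirs.add(file.slice(0, idx)) // files at the repo root are skipped
    }
  }

  // A component is a source directory with no source directory beneath it
  const components = [...sourceDirs]
    .filter((dir) => ![...sourceDirs].some((other) => other.startsWith(dir + '/')))
    .sort()

  // Assumption: every ancestor of a component counts as a parent namespace (subdomain)
  const subdomains = new Set()
  for (const component of components) {
    const parts = component.split('/')
    for (let i = 1; i < parts.length; i++) {
      subdomains.add(parts.slice(0, i).join('/'))
    }
  }
  return { components, subdomains: [...subdomains].sort() }
}

const inventory = classifyDirectories([
  'services/BillingService/index.js',
  'services/BillingService/payment/PaymentService.js',
  'services/NotificationService/NotificationService.js',
])
// inventory.components → ['services/BillingService/payment', 'services/NotificationService']
// inventory.subdomains → ['services', 'services/BillingService']
```

Working from a plain list of paths keeps the rule independent of how the files are discovered, so the same function can be fed from a directory walk or a version-control listing.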
Phase 2: Calculate Size Metrics
For each component:
- Count statements
  - Parse source files in component directory
  - Count executable statements (not comments, blank lines, or declarations alone)
  - Sum across all files in component
- Count files
  - Total source files (`.js`, `.ts`, `.java`, `.py`, etc.)
  - Exclude test files, config files, documentation
- Calculate percentage
  - `component_percent = (component_statements / total_statements) * 100`
- Calculate statistics
  - Mean component size: `total_statements / number_of_components`
  - Standard deviation: `sqrt(sum((size - mean)^2) / (n - 1))`
  - Component's deviation: `(component_size - mean) / std_dev`
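The formulas above can be combined into one pass over the inventory. This is a minimal sketch under stated assumptions: `computeSizeMetrics` and the field names `name` and `statements` are illustrative, and the zero-value fallbacks for empty or single-component inputs are a choice made here, not something the skill prescribes.

```javascript
// Sketch: percent of codebase, mean, sample standard deviation, and per-component deviation.
// Assumes each component is an object like { name, statements }.
function computeSizeMetrics(components) {
  const n = components.length
  const totalStatements = components.reduce((sum, c) => sum + c.statements, 0)
  const mean = n > 0 ? totalStatements / n : 0

  // Sample standard deviation (n - 1 in the denominator); 0 when fewer than 2 components
  const squaredDiffs = components.reduce((sum, c) => sum + (c.statements - mean) ** 2, 0)
  const stdDev = n > 1 ? Math.sqrt(squaredDiffs / (n - 1)) : 0

  const rows = components.map((c) => ({
    ...c,
    percent: totalStatements > 0 ? (c.statements / totalStatements) * 100 : 0,
    deviation: stdDev > 0 ? (c.statements - mean) / stdDev : 0,
  }))
  return { totalStatements, mean, stdDev, rows }
}

const metrics = computeSizeMetrics([
  { name: 'A', statements: 100 },
  { name: 'B', statements: 200 },
  { name: 'C', statements: 300 },
])
// metrics.mean → 200, metrics.stdDev → 100
// metrics.rows[2] → { name: 'C', statements: 300, percent: 50, deviation: 1 }
```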
Phase 3: Identify Size Issues
Oversized Components (candidates for splitting):
- Exceeds 30% of total codebase (for small apps with <10 components)
- Exceeds 10% of total codebase (for large apps with >20 components)
- More than 2 standard deviations above mean
- Contains multiple distinct functional areas
Undersized Components (candidates for consolidation):
- Less than 1% of codebase (may be too granular)
- More than 1 standard deviation below mean
- Contains only a few files with minimal functionality
Well-Sized Components:
- Within 1-2 standard deviations of the mean
- Represents a single, cohesive functional area
- Appropriate percentage for application size
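The numeric rules above can be expressed as a small classifier. This is a sketch only: `classifySize` is a hypothetical name, `percent` (0-100) and `deviation` (standard deviations from the mean) are assumed input fields, and the undersized rule is read here as a deviation lower than -1. Whether a component contains multiple functional areas still needs human review and is not modeled.

```javascript
// Sketch of the size rules; returns one of the status labels used in the inventory.
// `percentThreshold` is the oversized cutoff in percent (for example 10 or 30).
function classifySize(component, percentThreshold) {
  const { percent, deviation } = component
  if (percent > percentThreshold || deviation > 2) {
    return 'Too Large'
  }
  // Assumption: "below mean" is interpreted as a deviation lower than -1
  if (percent < 1 || deviation < -1) {
    return 'Too Small'
  }
  return 'OK'
}
```

For instance, `classifySize({ percent: 33, deviation: 4.4 }, 10)` returns `'Too Large'`, while a component at 5% with a deviation near zero returns `'OK'`.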
Output Format
Component Inventory Table
```markdown
Component Inventory

| Component Name | Namespace/Path | Statements | Files | Percent | Status |
|---|---|---|---|---|---|
| Billing Payment | services/BillingService | 4,312 | 23 | 5% | ✅ OK |
| Reporting | services/ReportingService | 27,765 | 162 | 33% | ⚠️ Too Large |
| Notification | services/NotificationService | 1,433 | 7 | 2% | ✅ OK |

**Status Legend**:
- ✅ OK: Well-sized (within 1-2 std dev from mean)
- ⚠️ Too Large: Exceeds size threshold or >2 std dev above mean
- 🔍 Too Small: <1% of codebase or <1 std dev below mean
```
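One way to produce a table in this shape is sketched below. The function name `renderInventoryTable` and the row fields (`name`, `path`, `statements`, `files`, `percent`, `status`) are assumptions for illustration; the column layout follows the template above.

```javascript
// Sketch: format inventory rows as a Markdown table.
// Assumes `percent` is a number from 0 to 100 and `status` is already a display label.
function renderInventoryTable(rows) {
  const header = [
    '| Component Name | Namespace/Path | Statements | Files | Percent | Status |',
    '|---|---|---|---|---|---|',
  ]
  const body = rows.map((r) => {
    const statements = r.statements.toLocaleString('en-US') // thousands separators
    const percent = `${Math.round(r.percent)}%`
    return `| ${r.name} | ${r.path} | ${statements} | ${r.files} | ${percent} | ${r.status} |`
  })
  return [...header, ...body].join('\n')
}
```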
Size Analysis Summary
```markdown
Size Analysis Summary

Total Components: 18
Total Statements: 82,931
Mean Component Size: 4,607 statements
Standard Deviation: 5,234 statements

Oversized Components (>2 std dev or >10%):
- Reporting (33% - 27,765 statements) - Consider splitting into:
  - Ticket Reports
  - Expert Reports
  - Financial Reports

Well-Sized Components (within 1-2 std dev):
- Billing Payment (5%)
- Customer Profile (5%)
- Ticket Assignment (9%)

Undersized Components (<1 std dev):
- Login (2% - 1,865 statements) - Consider consolidating with Authentication
```
Component Size Distribution
```markdown
Component Size Distribution

Component Size Distribution (by percent of codebase)
[Visual representation or histogram if possible]

Largest: ████████████████████████████████████ 33% (Reporting)
         ████████ 9% (Ticket Assign)
         ██████ 8% (Ticket)
         ██████ 6% (Expert Profile)
         █████ 5% (Billing Payment)
         ████ 4% (Billing History)
...
```
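A text chart like this can be generated directly from the percentages. The sketch below assumes rows with `name` and `percent` fields and draws one block character per percentage point; that scale is an arbitrary choice for the example and differs from the sample above.

```javascript
// Sketch: one text bar per component, largest first.
// Every component gets at least one block so small components stay visible.
function renderDistribution(rows) {
  return [...rows]
    .sort((a, b) => b.percent - a.percent)
    .map((r) => {
      const bar = '█'.repeat(Math.max(1, Math.round(r.percent)))
      return `${bar} ${Math.round(r.percent)}% (${r.name})`
    })
    .join('\n')
}
```

Copying the array before sorting leaves the caller's inventory order untouched.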
Recommendations
```markdown
Recommendations

High Priority: Split Large Components

Reporting Component (33% of codebase):
- Current: Single component with 27,765 statements
- Issue: Too large, contains multiple functional areas
- Recommendation: Split into:
  - Reporting Shared (common utilities)
  - Ticket Reports (ticket-related reports)
  - Expert Reports (expert-related reports)
  - Financial Reports (financial reports)
- Expected Result: Each component ~7-9% of codebase

Medium Priority: Review Small Components

Login Component (2% of codebase):
- Current: 1,865 statements, 3 files
- Consideration: May be too granular if related to broader authentication
- Recommendation: Evaluate if should be consolidated with Authentication/User components

Low Priority: Monitor Well-Sized Components

Most components are appropriately sized. Continue monitoring during decomposition.
```
Analysis Checklist
Component Identification:
- Mapped all directory/namespace structures
- Identified leaf nodes (components) vs parent nodes (subdomains)
- Created complete component inventory
- Documented namespace/path for each component
Size Calculation:
- Counted statements (not lines) for each component
- Counted source files (excluding tests/configs)
- Calculated percentage of total codebase
- Calculated mean and standard deviation
Size Assessment:
- Identified oversized components (>threshold or >2 std dev)
- Identified undersized components (<1% or <1 std dev)
- Flagged components for splitting or consolidation
- Documented size distribution
Recommendations:
- Suggested splits for oversized components
- Suggested consolidations for undersized components
- Prioritized recommendations by impact
- Created architecture stories for refactoring
Implementation Notes
For Node.js/Express Applications
Components typically found in:
- `services/` - Business logic components
- `routes/` - API endpoint components
- `models/` - Data model components
- `utils/` - Utility components
- `middleware/` - Middleware components

Example Component Identification:
```
services/
├── BillingService/          ← Component (leaf node)
│   ├── index.js
│   └── BillingService.js
├── CustomerService/         ← Component (leaf node)
│   └── CustomerService.js
└── NotificationService/     ← Component (leaf node)
    └── NotificationService.js
```
For Java Applications
Components identified by package structure:
- `com.company.domain.service` - Service components
- `com.company.domain.model` - Model components
- `com.company.domain.repository` - Repository components

Example Component Identification:
```
com.company.billing.payment  ← Component (leaf package)
com.company.billing.history  ← Component (leaf package)
com.company.billing          ← Subdomain (parent of payment/history)
```
Statement Counting
JavaScript/TypeScript:
- Count statements terminated by `;` or newline
- Include: assignments, function calls, returns, conditionals, loops
- Exclude: comments, blank lines, declarations without assignment
Java:
- Count statements terminated by `;`
- Include: method calls, assignments, returns, conditionals
- Exclude: class/interface declarations, comments, blank lines
Python:
- Count executable statements (not comments or blank lines)
- Include: assignments, function calls, returns, conditionals
- Exclude: docstrings, comments, blank lines
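To make the JavaScript rules concrete, here is a deliberately rough, line-based approximation. `countStatements` is a hypothetical helper, not the skill's actual counter: it treats each remaining line as one statement, counts function and class headers, and strips `//` even inside string literals. A production counter would more likely walk a parser's syntax tree.

```javascript
// Rough sketch: approximate the statement count of one JavaScript source string.
function countStatements(source) {
  const withoutComments = source
    .replace(/\/\*[\s\S]*?\*\//g, '') // block comments
    .replace(/\/\/.*$/gm, '') // line comments (also strips "//" inside strings)

  return withoutComments
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // blank lines
    .filter((line) => !/^[{}()\[\];,]+$/.test(line)) // lone braces and punctuation
    .filter((line) => !/^(let|var)\s+[\w$]+(\s*,\s*[\w$]+)*;?$/.test(line)) // declarations without assignment
    .length
}
```

Summing this count over every non-test source file in a component directory gives the component's statement total under these simplifying assumptions.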
Fitness Functions
After identifying and sizing components, create automated checks:
Component Size Threshold
```javascript
// Alert if any component exceeds the given share of the codebase (default 10%)
function checkComponentSize(components, threshold = 0.1) {
  const totalStatements = components.reduce((sum, c) => sum + c.statements, 0)
  // Nothing to report for an empty codebase (avoids dividing by zero)
  if (totalStatements === 0) return []
  return components
    .filter((c) => c.statements / totalStatements > threshold)
    .map((c) => ({
      component: c.name,
      percent: ((c.statements / totalStatements) * 100).toFixed(1),
      issue: 'Exceeds size threshold',
    }))
}
```
Standard Deviation Check
```javascript
// Alert if a component is more than 2 standard deviations from the mean (in either direction)
function checkStandardDeviation(components) {
  // The sample standard deviation needs at least two components
  if (components.length < 2) return []
  const sizes = components.map((c) => c.statements)
  const mean = sizes.reduce((a, b) => a + b, 0) / sizes.length
  // Sample standard deviation (n - 1 in the denominator)
  const stdDev = Math.sqrt(sizes.reduce((sum, size) => sum + Math.pow(size - mean, 2), 0) / (sizes.length - 1))
  return components
    .filter((c) => Math.abs(c.statements - mean) > 2 * stdDev)
    .map((c) => ({
      component: c.name,
      deviation: ((c.statements - mean) / stdDev).toFixed(2),
      issue: 'More than 2 standard deviations from mean',
    }))
}
```
Best Practices
Do's ✅
- Use statements, not lines of code
- Identify components as leaf nodes only
- Calculate both percentage and standard deviation
- Consider application size when setting thresholds
- Document namespace/path for each component
- Create visual size distribution if possible
Don'ts ❌
- Don't count test files in component size
- Don't treat parent directories as components
- Don't use fixed thresholds without considering app size
- Don't ignore small components (may need consolidation)
- Don't skip standard deviation calculation
- Don't mix infrastructure and domain components in same analysis
Next Steps
After completing component identification and sizing:
- Apply Gather Common Domain Components Pattern - Identify duplicate functionality
- Apply Flatten Components Pattern - Remove orphaned classes from root namespaces
- Apply Determine Component Dependencies Pattern - Analyze coupling between components
- Create Component Domains - Group components into logical domains
Notes
- Component size thresholds vary by application size
  - Small apps (<10 components): 30% threshold may be appropriate
  - Large apps (>20 components): 10% threshold is more appropriate
- Standard deviation is more reliable than fixed percentages
- Well-sized components are within 1-2 standard deviations of the mean
- Oversized components often contain multiple functional areas that can be split
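The threshold guidance in these notes could be captured in a helper like the one below. `pickSizeThreshold` is a hypothetical name; the 30% and 10% values come from the notes, while the linear interpolation for applications with 10 to 20 components is purely an assumption, since that range is not specified here.

```javascript
// Sketch: choose the oversized threshold (in percent) from the number of components.
function pickSizeThreshold(componentCount) {
  if (componentCount < 10) return 30 // small apps
  if (componentCount > 20) return 10 // large apps
  // Assumption: slide linearly from 30% at 10 components to 10% at 20 components
  const t = (componentCount - 10) / 10
  return 30 - 20 * t
}
```

Divide the result by 100 before passing it to a check that expects a fraction, such as a `threshold = 0.1` style parameter.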