xlsx
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
Chinese输出要求
Output Requirements
所有 Excel 文件
All Excel Files
专业字体
Professional Fonts
- 除非用户另有说明,否则所有交付物使用一致的专业字体(例如:Arial、Times New Roman)
- Unless specified otherwise by the user, use a consistent professional font (e.g., Arial, Times New Roman) for all deliverables
零公式错误
Zero Formula Errors
- 每个 Excel 模型必须以零公式错误交付(#REF!、#DIV/0!、#VALUE!、#N/A、#NAME?)
- Each Excel model must be delivered with zero formula errors (#REF!, #DIV/0!, #VALUE!, #N/A, #NAME?)
保留现有模板(更新模板时)
Preserve Existing Templates (When Updating)
- 修改文件时,研究和完全匹配现有的格式、样式和约定
- 不要对有既定模式的文件强加标准化格式
- 现有模板约定始终覆盖这些指南
- When modifying files, research and fully match existing formatting, styles, and conventions
- Do not impose standardized formatting on files with established patterns
- Existing template conventions always override these guidelines
财务模型
Financial Modeling
颜色编码标准
Color Coding Standards
除非用户或现有模板另有说明
Unless specified otherwise by the user or existing templates
行业标准颜色约定
Industry Standard Color Conventions
- 蓝色文本 (RGB: 0,0,255):硬编码输入和用户将更改场景的数字
- 黑色文本 (RGB: 0,0,0):所有公式和计算
- 绿色文本 (RGB: 0,128,0):从同一工作簿内其他工作表提取的链接
- 红色文本 (RGB: 255,0,0):到其他文件的外部链接
- 黄色背景 (RGB: 255,255,0):需要注意的关键假设或需要更新的单元格
- Blue Text (RGB: 0,0,255): Hard-coded inputs and numbers users will change for scenarios
- Black Text (RGB: 0,0,0): All formulas and calculations
- Green Text (RGB: 0,128,0): Links extracted from other worksheets within the same workbook
- Red Text (RGB: 255,0,0): External links to other files
- Yellow Background (RGB: 255,255,0): Key assumptions requiring attention or cells needing updates
数字格式标准
Number Format Standards
必需格式规则
Mandatory Format Rules
- 年份:格式化为文本字符串(例如:"2024"而不是"2,024")
- 货币:使用$#,##0格式;始终在标题中指定单位("收入(百万美元)")
- 零:使用数字格式使所有零显示为"-",包括百分比(例如:"$#,##0;($#,##0);-")
- 百分比:默认为0.0%格式(一位小数)
- 倍数:格式化为0.0x用于估值倍数(EV/EBITDA、P/E)
- 负数:使用括号(123)而不是减号-123
- Years: Formatted as text strings (e.g., "2024" instead of "2,024")
- Currency: Use $#,##0 format; always specify units in headers ("Revenue (USD Millions)")
- Zeros: Use number formatting to display all zeros as "-", including percentages (e.g., "$#,##0;($#,##0);-")
- Percentages: Default to 0.0% format (one decimal place)
- Multiples: Formatted as 0.0x for valuation multiples (EV/EBITDA, P/E)
- Negative Numbers: Use parentheses (123) instead of hyphens -123
公式构建规则
Formula Construction Rules
假设位置
Assumption Placement
- 将所有假设(增长率、利润率、倍数等)放在单独的假设单元格中
- 在公式中使用单元格引用而不是硬编码值
- 示例:使用=B5*(1+$B$6)而不是=B5*1.05
- Place all assumptions (growth rates, profit margins, multiples, etc.) in separate assumption cells
- Use cell references in formulas instead of hard-coded values
- Example: Use =B5*(1+$B$6) instead of =B5*1.05
公式错误预防
Formula Error Prevention
- 验证所有单元格引用正确
- 检查范围中的差一错误
- 确保所有预测期间的公式一致
- 使用边缘情况测试(零值、负数)
- 验证没有意外的循环引用
- Verify all cell references are correct
- Check for off-by-one errors in ranges
- Ensure formulas are consistent across all forecast periods
- Test with edge cases (zero values, negative numbers)
- Verify there are no accidental circular references
硬编码的文档要求
Documentation Requirements for Hard-Coded Values
- 在单元格旁边注释(如果在表格末尾)。格式:"来源:[系统/文档],[日期],[具体引用],[URL(如适用)]"
- 示例:
- "来源:公司10-K,2024财年,第45页,收入说明,[SEC EDGAR URL]"
- "来源:公司10-Q,2025年第二季度,展品99.1,[SEC EDGAR URL]"
- "来源:Bloomberg终端,2025年8月15日,AAPL美国股票"
- "来源:FactSet,2025年8月20日,一致预期屏幕"
- Add comments next to cells (if at the end of a table). Format: "Source: [System/Document], [Date], [Specific Reference], [URL (if applicable)]"
- Examples:
- "Source: Company 10-K, FY2024, Page 45, Revenue Statement, [SEC EDGAR URL]"
- "Source: Company 10-Q, Q2 2025, Exhibit 99.1, [SEC EDGAR URL]"
- "Source: Bloomberg Terminal, August 15, 2025, AAPL US Equity"
- "Source: FactSet, August 20, 2025, Consensus Estimates Screen"
XLSX 创建、编辑和分析
XLSX Creation, Editing, and Analysis
概述
Overview
用户可能要求你创建、编辑或分析.xlsx文件的内容。对于不同的任务,你有不同的工具和工作流程。
Users may ask you to create, edit, or analyze the content of .xlsx files. You have different tools and workflows for different tasks.
重要要求
Key Requirements
LibreOffice 公式重新计算所需:你可以假设安装了LibreOffice,用于使用脚本重新计算公式值。该脚本在首次运行时自动配置LibreOffice,包括在Unix套接字受限的沙盒环境中(由处理)
scripts/recalc.pyscripts/office/soffice.pyLibreOffice Required for Formula Recalculation: You can assume LibreOffice is installed for recalculating formula values using the script. This script automatically configures LibreOffice on first run, including in sandbox environments with restricted Unix sockets (handled by )
scripts/recalc.pyscripts/office/soffice.py读取和分析数据
Reading and Analyzing Data
使用pandas进行数据分析
Data Analysis with pandas
对于数据分析、可视化和基本操作,使用pandas,它提供强大的数据处理功能:
python
import pandas as pdFor data analysis, visualization, and basic operations, use pandas, which provides powerful data processing capabilities:
python
import pandas as pd读取Excel
Read Excel
df = pd.read_excel('file.xlsx') # 默认:第一个工作表
all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # 所有工作表作为字典
df = pd.read_excel('file.xlsx') # Default: first worksheet
all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All worksheets as dictionary
分析
Analysis
df.head() # 预览数据
df.info() # 列信息
df.describe() # 统计信息
df.head() # Preview data
df.info() # Column information
df.describe() # Statistical information
写入Excel
Write Excel
df.to_excel('output.xlsx', index=False)
undefineddf.to_excel('output.xlsx', index=False)
undefinedExcel 文件工作流程
Excel File Workflow
关键:使用公式,而不是硬编码值
Critical: Use Formulas, Not Hard-Coded Values
**始终使用Excel公式,而不是在Python中计算值并硬编码它们。**这确保电子表格保持动态和可更新。
Always use Excel formulas instead of calculating values in Python and hard-coding them. This ensures the spreadsheet remains dynamic and updatable.
❌ 错误 - 硬编码计算值
❌ Incorrect - Hard-Coded Calculated Values
python
undefinedpython
undefined坏:在Python中计算并硬编码结果
Bad: Calculate in Python and hard-code the result
total = df['Sales'].sum()
sheet['B10'] = total # 硬编码5000
total = df['Sales'].sum()
sheet['B10'] = total # Hard-codes 5000
坏:在Python中计算增长率
Bad: Calculate growth rate in Python
growth = (df.iloc[-1]['Revenue'] - df.iloc[0]['Revenue']) / df.iloc[0]['Revenue']
sheet['C5'] = growth # 硬编码0.15
growth = (df.iloc[-1]['Revenue'] - df.iloc[0]['Revenue']) / df.iloc[0]['Revenue']
sheet['C5'] = growth # Hard-codes 0.15
坏:Python计算平均值
Bad: Calculate average in Python
avg = sum(values) / len(values)
sheet['D20'] = avg # 硬编码42.5
undefinedavg = sum(values) / len(values)
sheet['D20'] = avg # Hard-codes 42.5
undefined✅ 正确 - 使用Excel公式
✅ Correct - Use Excel Formulas
python
undefinedpython
undefined好:让Excel计算总和
Good: Let Excel calculate the sum
sheet['B10'] = '=SUM(B2:B9)'
sheet['B10'] = '=SUM(B2:B9)'
好:增长率作为Excel公式
Good: Growth rate as Excel formula
sheet['C5'] = '=(C4-C2)/C2'
sheet['C5'] = '=(C4-C2)/C2'
好:使用Excel函数计算平均值
Good: Calculate average using Excel function
sheet['D20'] = '=AVERAGE(D2:D19)'
这适用于所有计算——总计、百分比、比率、差异等。当源数据更改时,电子表格应该能够重新计算。sheet['D20'] = '=AVERAGE(D2:D19)'
This applies to all calculations—totals, percentages, ratios, differences, etc. The spreadsheet should be able to recalculate when source data changes.常见工作流程
Common Workflows
- 选择工具:pandas用于数据,openpyxl用于公式/格式化
- 创建/加载:创建新工作簿或加载现有文件
- 修改:添加/编辑数据、公式和格式
- 保存:写入文件
- 重新计算公式(如果使用公式则必须):使用scripts/recalc.py脚本
bash
python scripts/recalc.py output.xlsx - 验证并修复任何错误:
- 脚本返回包含错误详细信息的JSON
- 如果是
status,检查errors_found以获取特定错误类型和位置error_summary - 修复已识别的错误并重新计算
- 要修复的常见错误:
- :无效的单元格引用
#REF! - :除以零
#DIV/0! - :公式中错误的数据类型
#VALUE! - :无法识别的公式名称
#NAME?
- Select Tool: Use pandas for data, openpyxl for formulas/formatting
- Create/Load: Create a new workbook or load an existing file
- Modify: Add/edit data, formulas, and formatting
- Save: Write to file
- Recalculate Formulas (Mandatory if using formulas): Use the scripts/recalc.py script
bash
python scripts/recalc.py output.xlsx - Verify and Fix Any Errors:
- The script returns JSON with error details
- If is
status, checkerrors_foundfor specific error types and locationserror_summary - Fix identified errors and recalculate
- Common errors to fix:
- : Invalid cell reference
#REF! - : Division by zero
#DIV/0! - : Incorrect data type in formula
#VALUE! - : Unrecognized formula name
#NAME?
创建新的Excel文件
Creating a New Excel File
python
undefinedpython
undefined使用openpyxl进行公式和格式化
Use openpyxl for formulas and formatting
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
wb = Workbook()
sheet = wb.active
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
wb = Workbook()
sheet = wb.active
添加数据
Add data
sheet['A1'] = 'Hello'
sheet['B1'] = 'World'
sheet.append(['Row', 'of', 'data'])
sheet['A1'] = 'Hello'
sheet['B1'] = 'World'
sheet.append(['Row', 'of', 'data'])
添加公式
Add formula
sheet['B2'] = '=SUM(A1:A10)'
sheet['B2'] = '=SUM(A1:A10)'
格式化
Format
sheet['A1'].font = Font(bold=True, color='FF0000')
sheet['A1'].fill = PatternFill('solid', start_color='FFFF00')
sheet['A1'].alignment = Alignment(horizontal='center')
sheet['A1'].font = Font(bold=True, color='FF0000')
sheet['A1'].fill = PatternFill('solid', start_color='FFFF00')
sheet['A1'].alignment = Alignment(horizontal='center')
列宽
Column width
sheet.column_dimensions['A'].width = 20
wb.save('output.xlsx')
undefinedsheet.column_dimensions['A'].width = 20
wb.save('output.xlsx')
undefined编辑现有的Excel文件
Editing an Existing Excel File
python
undefinedpython
undefined使用openpyxl保留公式和格式
Use openpyxl to preserve formulas and formatting
from openpyxl import load_workbook
from openpyxl import load_workbook
加载现有文件
Load existing file
wb = load_workbook('existing.xlsx')
sheet = wb.active # 或wb['SheetName']用于特定工作表
wb = load_workbook('existing.xlsx')
sheet = wb.active # or wb['SheetName'] for specific worksheets
处理多个工作表
Process multiple worksheets
for sheet_name in wb.sheetnames:
sheet = wb[sheet_name]
print(f"工作表:{sheet_name}")
for sheet_name in wb.sheetnames:
sheet = wb[sheet_name]
print(f"Worksheet: {sheet_name}")
修改单元格
Modify cells
sheet['A1'] = 'New Value'
sheet.insert_rows(2) # 在位置2插入行
sheet.delete_cols(3) # 删除第3列
sheet['A1'] = 'New Value'
sheet.insert_rows(2) # Insert row at position 2
sheet.delete_cols(3) # Delete column 3
添加新工作表
Add new worksheet
new_sheet = wb.create_sheet('NewSheet')
new_sheet['A1'] = 'Data'
wb.save('modified.xlsx')
undefinednew_sheet = wb.create_sheet('NewSheet')
new_sheet['A1'] = 'Data'
wb.save('modified.xlsx')
undefined重新计算公式
Recalculating Formulas
由openpyxl创建或修改的Excel文件包含作为字符串的公式但没有计算值。使用提供的脚本重新计算公式:
scripts/recalc.pybash
python scripts/recalc.py <excel_file> [timeout_seconds]示例:
bash
python scripts/recalc.py output.xlsx 30该脚本:
- 首次运行时自动设置LibreOffice宏
- 重新计算所有工作表中的所有公式
- 扫描所有单元格的Excel错误(#REF!、#DIV/0!等)
- 返回包含详细错误位置和计数的JSON
- 在Linux和macOS上均可工作
Excel files created or modified by openpyxl contain formulas as strings but no calculated values. Use the provided script to recalculate formulas:
scripts/recalc.pybash
python scripts/recalc.py <excel_file> [timeout_seconds]Example:
bash
python scripts/recalc.py output.xlsx 30The script:
- Automatically sets up LibreOffice macros on first run
- Recalculates all formulas in all worksheets
- Scans all cells for Excel errors (#REF!, #DIV/0!, etc.)
- Returns JSON with detailed error locations and counts
- Works on both Linux and macOS
公式验证清单
Formula Verification Checklist
快速检查以确保公式正常工作:
Quick checks to ensure formulas work correctly:
基本验证
Basic Verification
- 测试2-3个示例引用:在构建完整模型之前验证它们提取正确的值
- 列映射:确认Excel列匹配(例如:第64列=BL,而不是BK)
- 行偏移:记住Excel行是1索引的(DataFrame第5行=Excel第6行)
- Test 2-3 sample references: Verify they extract the correct values before building the full model
- Column Mapping: Confirm Excel columns match (e.g., Column 64 = BL, not BK)
- Row Offset: Remember Excel rows are 1-indexed (DataFrame row 5 = Excel row 6)
常见陷阱
Common Pitfalls
- NaN处理:使用检查空值
pd.notna() - 最右列:财年数据通常在第50+列中
- 多个匹配:搜索所有出现,而不仅仅是第一个
- 除以零:在公式中使用之前检查分母(#DIV/0!)
/ - 错误引用:验证所有单元格引用指向预期的单元格(#REF!)
- 跨工作表引用:使用正确的格式(Sheet1!A1)链接工作表
- NaN Handling: Use to check for empty values
pd.notna() - Rightmost Columns: Fiscal year data is often in columns 50+
- Multiple Matches: Search for all occurrences, not just the first
- Division by Zero: Check denominators before using in formulas (#DIV/0!)
/ - Incorrect References: Verify all cell references point to intended cells (#REF!)
- Cross-Worksheet References: Use correct formatting (Sheet1!A1) to link worksheets
公式测试策略
Formula Testing Strategy
- 从小开始:在广泛应用之前在2-3个单元格上测试公式
- 验证依赖项:检查公式中引用的所有单元格都存在
- 测试边缘情况:包括零、负数和非常大的值
- Start Small: Test formulas on 2-3 cells before applying broadly
- Verify Dependencies: Check that all cells referenced in formulas exist
- Test Edge Cases: Include zero, negative, and very large values
解释scripts/recalc.py输出
Interpreting scripts/recalc.py Output
脚本返回包含错误详细信息的JSON:
json
{
"status": "success", // 或"errors_found"
"total_errors": 0, // 总错误计数
"total_formulas": 42, // 文件中的公式数量
"error_summary": { // 仅在发现错误时存在
"#REF!": {
"count": 2,
"locations": ["Sheet1!B5", "Sheet1!C10"]
}
}
}The script returns JSON with error details:
json
{
"status": "success", // or "errors_found"
"total_errors": 0, // Total error count
"total_formulas": 42, // Number of formulas in file
"error_summary": { // Only present if errors are found
"#REF!": {
"count": 2,
"locations": ["Sheet1!B5", "Sheet1!C10"]
}
}
}最佳实践
Best Practices
库选择
Library Selection
- pandas:最适合数据分析、批量操作和简单的数据导出
- openpyxl:最适合复杂格式化、公式和Excel特定功能
- pandas: Best for data analysis, bulk operations, and simple data export
- openpyxl: Best for complex formatting, formulas, and Excel-specific features
使用openpyxl
Using openpyxl
- 单元格索引从1开始(row=1, column=1指单元格A1)
- 使用读取计算值:
data_only=Trueload_workbook('file.xlsx', data_only=True) - 警告:如果使用打开并保存,公式将被值替换并永久丢失
data_only=True - 对于大文件:使用读取或
read_only=True写入write_only=True - 公式被保留但不被评估 - 使用scripts/recalc.py更新值
- Cell indexing starts at 1 (row=1, column=1 refers to cell A1)
- Use to read calculated values:
data_only=Trueload_workbook('file.xlsx', data_only=True) - Warning: If you open and save with , formulas will be replaced with values and lost permanently
data_only=True - For large files: Use for reading or
read_only=Truefor writingwrite_only=True - Formulas are preserved but not evaluated - use scripts/recalc.py to update values
使用pandas
Using pandas
- 指定数据类型以避免推断问题:
pd.read_excel('file.xlsx', dtype={'id': str}) - 对于大文件,读取特定列:
pd.read_excel('file.xlsx', usecols=['A', 'C', 'E']) - 正确处理日期:
pd.read_excel('file.xlsx', parse_dates=['date_column'])
- Specify data types to avoid inference issues:
pd.read_excel('file.xlsx', dtype={'id': str}) - For large files, read specific columns:
pd.read_excel('file.xlsx', usecols=['A', 'C', 'E']) - Handle dates correctly:
pd.read_excel('file.xlsx', parse_dates=['date_column'])
代码风格指南
Code Style Guidelines
重要:为Excel操作生成Python代码时:
- 编写最少的、简洁的Python代码,没有不必要的注释
- 避免冗长的变量名和冗余操作
- 避免不必要的print语句
对于Excel文件本身:
- 向具有复杂公式或重要假设的单元格添加注释
- 记录硬编码值的数据源
- 包含关键计算和模型部分的注释
Important: When generating Python code for Excel operations:
- Write minimal, concise Python code with no unnecessary comments
- Avoid verbose variable names and redundant operations
- Avoid unnecessary print statements
For Excel Files Themselves:
- Add comments to cells with complex formulas or important assumptions
- Document data sources for hard-coded values
- Include comments for key calculations and model sections