excel-automation
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseExcel Automation
Excel自动化
Create professional Excel files, parse complex financial models, and control Excel on macOS.
创建专业格式的Excel文件、解析复杂财务模型并在macOS上控制Excel。
Quick Start
快速开始
bash
undefinedbash
undefinedCreate a formatted Excel report
Create a formatted Excel report
uv run --with openpyxl scripts/create_formatted_excel.py output.xlsx
uv run --with openpyxl scripts/create_formatted_excel.py output.xlsx
Parse a complex xlsm that openpyxl can't handle
Parse a complex xlsm that openpyxl can't handle
uv run scripts/parse_complex_excel.py model.xlsm # List sheets
uv run scripts/parse_complex_excel.py model.xlsm "DCF" # Extract a sheet
uv run scripts/parse_complex_excel.py model.xlsm --fix # Fix corrupted names
uv run scripts/parse_complex_excel.py model.xlsm # List sheets
uv run scripts/parse_complex_excel.py model.xlsm "DCF" # Extract a sheet
uv run scripts/parse_complex_excel.py model.xlsm --fix # Fix corrupted names
Control Excel via AppleScript (with timeout to prevent hangs)
Control Excel via AppleScript (with timeout to prevent hangs)
timeout 5 osascript -e 'tell application "Microsoft Excel" to activate'
undefinedtimeout 5 osascript -e 'tell application "Microsoft Excel" to activate'
undefinedOverview
概述
Three capabilities:
| Capability | Tool | When to Use |
|---|---|---|
| Create formatted Excel | | Reports, mockups, dashboards |
| Parse complex xlsm/xlsx | | Financial models, VBA workbooks, >1MB files |
| Control Excel window | AppleScript ( | Zoom, scroll, select cells programmatically |
三大功能:
| 功能 | 工具 | 适用场景 |
|---|---|---|
| 创建格式化Excel文件 | | 报表、原型、仪表盘 |
| 解析复杂xlsm/xlsx文件 | | 财务模型、VBA工作簿、大于1MB的文件 |
| 控制Excel窗口 | AppleScript ( | 程序化调整缩放、滚动、选择单元格 |
Tool Selection Decision Tree
工具选择决策树
Is the file simple (data export, no VBA, <1MB)?
├─ YES → openpyxl or pandas
└─ NO
├─ Is it .xlsm or from investment bank / >1MB?
│ └─ YES → zipfile + xml.etree.ElementTree (stdlib)
└─ Is it truly .xls (BIFF format)?
└─ YES → xlrdSignals of "complex" Excel: file >1MB, extension, from investment bank/broker, contains VBA macros.
.xlsmIMPORTANT: Always run first — extensions lie. A file may actually be a ZIP-based xlsx.
file <path>.xlsIs the file simple (data export, no VBA, <1MB)?
├─ YES → openpyxl or pandas
└─ NO
├─ Is it .xlsm or from investment bank / >1MB?
│ └─ YES → zipfile + xml.etree.ElementTree (stdlib)
└─ Is it truly .xls (BIFF format)?
└─ YES → xlrd“复杂”Excel的特征:文件大小超过1MB、扩展名为、来自投资银行/券商、包含VBA宏。
.xlsm重要提示:始终先运行命令——文件扩展名不可信。文件实际上可能是基于ZIP的xlsx格式。
file <path>.xlsCreating Excel Files (openpyxl)
创建Excel文件(openpyxl)
Professional Color Convention (Investment Banking Standard)
专业配色规范(投资银行标准)
| Color | RGB Code | Meaning |
|---|---|---|
| Blue | | User input / assumption |
| Black | | Calculated value |
| Green | | Cross-sheet reference |
| White on dark blue | | Section headers |
| Dark blue text | | Title |
| 颜色 | RGB代码 | 含义 |
|---|---|---|
| 蓝色 | | 用户输入/假设值 |
| 黑色 | | 计算值 |
| 绿色 | | 跨工作表引用 |
| 深蓝底白字 | | 章节标题 |
| 深蓝色文本 | | 主标题 |
Core Formatting Patterns
核心格式模板
python
from openpyxl.styles import Font, PatternFill, Border, Side, Alignmentpython
from openpyxl.styles import Font, PatternFill, Border, Side, AlignmentFonts
Fonts
BLUE_FONT = Font(color="0000FF", size=10, name="Calibri")
BLACK_FONT_BOLD = Font(color="000000", size=10, name="Calibri", bold=True)
GREEN_FONT = Font(color="008000", size=10, name="Calibri")
HEADER_FONT = Font(color="FFFFFF", size=12, name="Calibri", bold=True)
BLUE_FONT = Font(color="0000FF", size=10, name="Calibri")
BLACK_FONT_BOLD = Font(color="000000", size=10, name="Calibri", bold=True)
GREEN_FONT = Font(color="008000", size=10, name="Calibri")
HEADER_FONT = Font(color="FFFFFF", size=12, name="Calibri", bold=True)
Fills
Fills
DARK_BLUE_FILL = PatternFill("solid", fgColor="4472C4")
LIGHT_BLUE_FILL = PatternFill("solid", fgColor="D9E1F2")
INPUT_GREEN_FILL = PatternFill("solid", fgColor="E2EFDA")
LIGHT_GRAY_FILL = PatternFill("solid", fgColor="F2F2F2")
DARK_BLUE_FILL = PatternFill("solid", fgColor="4472C4")
LIGHT_BLUE_FILL = PatternFill("solid", fgColor="D9E1F2")
INPUT_GREEN_FILL = PatternFill("solid", fgColor="E2EFDA")
LIGHT_GRAY_FILL = PatternFill("solid", fgColor="F2F2F2")
Borders
Borders
THIN_BORDER = Border(bottom=Side(style="thin", color="B2B2B2"))
BOTTOM_DOUBLE = Border(bottom=Side(style="double", color="000000"))
undefinedTHIN_BORDER = Border(bottom=Side(style="thin", color="B2B2B2"))
BOTTOM_DOUBLE = Border(bottom=Side(style="double", color="000000"))
undefinedNumber Format Codes
数字格式代码
| Format | Code | Example |
|---|---|---|
| Currency | | $1,234 |
| Currency with decimals | | $1,234.56 |
| Percentage | | 12.3% |
| Percentage (2 decimal) | | 12.34% |
| Number with commas | | 1,234 |
| Multiplier | | 1.5x |
| 格式 | 代码 | 示例 |
|---|---|---|
| 货币 | | $1,234 |
| 带小数的货币 | | $1,234.56 |
| 百分比 | | $12.3% |
| 两位小数的百分比 | | $12.34% |
| 带千分位的数字 | | 1,234 |
| 倍数 | | 1.5x |
Conditional Formatting (Sensitivity Tables)
条件格式(敏感性分析表)
Red-to-green gradient for sensitivity analysis:
python
from openpyxl.formatting.rule import ColorScaleRule
rule = ColorScaleRule(
start_type="min", start_color="F8696B", # Red (low)
mid_type="percentile", mid_value=50, mid_color="FFEB84", # Yellow (mid)
end_type="max", end_color="63BE7B" # Green (high)
)
ws.conditional_formatting.add(f"B2:F6", rule)用于敏感性分析的红-黄-绿渐变:
python
from openpyxl.formatting.rule import ColorScaleRule
rule = ColorScaleRule(
start_type="min", start_color="F8696B", # Red (low)
mid_type="percentile", mid_value=50, mid_color="FFEB84", # Yellow (mid)
end_type="max", end_color="63BE7B" # Green (high)
)
ws.conditional_formatting.add(f"B2:F6", rule)Execution
执行命令
bash
uv run --with openpyxl scripts/create_formatted_excel.pyFull template script: See
scripts/create_formatted_excel.pybash
uv run --with openpyxl scripts/create_formatted_excel.py完整模板脚本:参见
scripts/create_formatted_excel.pyParsing Complex Excel (zipfile + xml)
解析复杂Excel文件(zipfile + xml)
When openpyxl fails on complex xlsm files (corrupted DefinedNames, complex VBA), use stdlib directly.
当openpyxl无法处理复杂xlsm文件(如损坏的DefinedNames、复杂VBA)时,直接使用标准库。
XLSX Internal ZIP Structure
XLSX内部ZIP结构
file.xlsx (ZIP archive)
├── [Content_Types].xml
├── xl/
│ ├── workbook.xml ← Sheet names + order
│ ├── sharedStrings.xml ← All text values (lookup table)
│ ├── worksheets/
│ │ ├── sheet1.xml ← Cell data for sheet 1
│ │ ├── sheet2.xml ← Cell data for sheet 2
│ │ └── ...
│ └── _rels/
│ └── workbook.xml.rels ← Maps rId → sheetN.xml
└── _rels/.relsfile.xlsx (ZIP archive)
├── [Content_Types].xml
├── xl/
│ ├── workbook.xml ← Sheet names + order
│ ├── sharedStrings.xml ← All text values (lookup table)
│ ├── worksheets/
│ │ ├── sheet1.xml ← Cell data for sheet 1
│ │ ├── sheet2.xml ← Cell data for sheet 2
│ │ └── ...
│ └── _rels/
│ └── workbook.xml.rels ← Maps rId → sheetN.xml
└── _rels/.relsSheet Name Resolution (Two-Step)
工作表名称解析(两步法)
Sheet names in link to physical files via :
workbook.xml_rels/workbook.xml.relspython
import zipfile
import xml.etree.ElementTree as ET
MAIN_NS = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'
REL_NS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
RELS_NS = 'http://schemas.openxmlformats.org/package/2006/relationships'
def get_sheet_path(zf, sheet_name):
"""Resolve sheet name to physical XML file path inside ZIP."""
# Step 1: workbook.xml → find rId for the sheet name
wb_xml = ET.fromstring(zf.read('xl/workbook.xml'))
sheets = wb_xml.findall(f'.//{{{MAIN_NS}}}sheet')
rid = None
for s in sheets:
if s.get('name') == sheet_name:
rid = s.get(f'{{{REL_NS}}}id')
break
if not rid:
raise ValueError(f"Sheet '{sheet_name}' not found")
# Step 2: workbook.xml.rels → map rId to file path
rels_xml = ET.fromstring(zf.read('xl/_rels/workbook.xml.rels'))
for rel in rels_xml.findall(f'{{{RELS_NS}}}Relationship'):
if rel.get('Id') == rid:
return 'xl/' + rel.get('Target')
raise ValueError(f"No file mapping for {rid}")工作表名称在中通过关联到物理文件:
workbook.xml_rels/workbook.xml.relspython
import zipfile
import xml.etree.ElementTree as ET
MAIN_NS = 'http://schemas.openxmlformats.org/spreadsheetml/2006/main'
REL_NS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'
RELS_NS = 'http://schemas.openxmlformats.org/package/2006/relationships'
def get_sheet_path(zf, sheet_name):
"""Resolve sheet name to physical XML file path inside ZIP."""
# Step 1: workbook.xml → find rId for the sheet name
wb_xml = ET.fromstring(zf.read('xl/workbook.xml'))
sheets = wb_xml.findall(f'.//{{{MAIN_NS}}}sheet')
rid = None
for s in sheets:
if s.get('name') == sheet_name:
rid = s.get(f'{{{REL_NS}}}id')
break
if not rid:
raise ValueError(f"Sheet '{sheet_name}' not found")
# Step 2: workbook.xml.rels → map rId to file path
rels_xml = ET.fromstring(zf.read('xl/_rels/workbook.xml.rels'))
for rel in rels_xml.findall(f'{{{RELS_NS}}}Relationship'):
if rel.get('Id') == rid:
return 'xl/' + rel.get('Target')
raise ValueError(f"No file mapping for {rid}")Cell Data Extraction
单元格数据提取
python
def extract_cells(zf, sheet_path):
"""Extract all cell values from a sheet XML."""
# Build shared strings lookup
shared = []
try:
ss_xml = ET.fromstring(zf.read('xl/sharedStrings.xml'))
for si in ss_xml.findall(f'{{{MAIN_NS}}}si'):
texts = si.itertext()
shared.append(''.join(texts))
except KeyError:
pass # No shared strings
# Parse sheet cells
sheet_xml = ET.fromstring(zf.read(sheet_path))
rows = sheet_xml.findall(f'.//{{{MAIN_NS}}}row')
data = {}
for row in rows:
for cell in row.findall(f'{{{MAIN_NS}}}c'):
ref = cell.get('r') # e.g., "A1"
cell_type = cell.get('t') # "s" = shared string, None = number
val_el = cell.find(f'{{{MAIN_NS}}}v')
if val_el is not None and val_el.text:
if cell_type == 's':
data[ref] = shared[int(val_el.text)]
else:
try:
data[ref] = float(val_el.text)
except ValueError:
data[ref] = val_el.text
return datapython
def extract_cells(zf, sheet_path):
"""Extract all cell values from a sheet XML."""
# Build shared strings lookup
shared = []
try:
ss_xml = ET.fromstring(zf.read('xl/sharedStrings.xml'))
for si in ss_xml.findall(f'{{{MAIN_NS}}}si'):
texts = si.itertext()
shared.append(''.join(texts))
except KeyError:
pass # No shared strings
# Parse sheet cells
sheet_xml = ET.fromstring(zf.read(sheet_path))
rows = sheet_xml.findall(f'.//{{{MAIN_NS}}}row')
data = {}
for row in rows:
for cell in row.findall(f'{{{MAIN_NS}}}c'):
ref = cell.get('r') # e.g., "A1"
cell_type = cell.get('t') # "s" = shared string, None = number
val_el = cell.find(f'{{{MAIN_NS}}}v')
if val_el is not None and val_el.text:
if cell_type == 's':
data[ref] = shared[int(val_el.text)]
else:
try:
data[ref] = float(val_el.text)
except ValueError:
data[ref] = val_el.text
return dataFixing Corrupted DefinedNames
修复损坏的DefinedNames
Investment bank xlsm files often have corrupted entries containing "Formula removed":
<definedName>python
def fix_defined_names(zf_in_path, zf_out_path):
"""Remove corrupted DefinedNames and repackage."""
import shutil, tempfile
with tempfile.TemporaryDirectory() as tmp:
tmp = Path(tmp)
with zipfile.ZipFile(zf_in_path, 'r') as zf:
zf.extractall(tmp)
wb_xml_path = tmp / 'xl' / 'workbook.xml'
tree = ET.parse(wb_xml_path)
root = tree.getroot()
ns = {'main': MAIN_NS}
defined_names = root.find('.//main:definedNames', ns)
if defined_names is not None:
for name in list(defined_names):
if name.text and "Formula removed" in name.text:
defined_names.remove(name)
tree.write(wb_xml_path, encoding='utf-8', xml_declaration=True)
with zipfile.ZipFile(zf_out_path, 'w', zipfile.ZIP_DEFLATED) as zf:
for fp in tmp.rglob('*'):
if fp.is_file():
zf.write(fp, fp.relative_to(tmp))Full template script: See
scripts/parse_complex_excel.py投资银行的xlsm文件通常包含损坏的条目,内容为“Formula removed”:
<definedName>python
def fix_defined_names(zf_in_path, zf_out_path):
"""Remove corrupted DefinedNames and repackage."""
import shutil, tempfile
with tempfile.TemporaryDirectory() as tmp:
tmp = Path(tmp)
with zipfile.ZipFile(zf_in_path, 'r') as zf:
zf.extractall(tmp)
wb_xml_path = tmp / 'xl' / 'workbook.xml'
tree = ET.parse(wb_xml_path)
root = tree.getroot()
ns = {'main': MAIN_NS}
defined_names = root.find('.//main:definedNames', ns)
if defined_names is not None:
for name in list(defined_names):
if name.text and "Formula removed" in name.text:
defined_names.remove(name)
tree.write(wb_xml_path, encoding='utf-8', xml_declaration=True)
with zipfile.ZipFile(zf_out_path, 'w', zipfile.ZIP_DEFLATED) as zf:
for fp in tmp.rglob('*'):
if fp.is_file():
zf.write(fp, fp.relative_to(tmp))完整模板脚本:参见
scripts/parse_complex_excel.pyControlling Excel on macOS (AppleScript)
在macOS上控制Excel(AppleScript)
All commands verified on macOS with Microsoft Excel.
所有命令均已在macOS系统的Microsoft Excel上验证。
Verified Commands
已验证命令
bash
undefinedbash
undefinedActivate Excel (bring to front)
Activate Excel (bring to front)
osascript -e 'tell application "Microsoft Excel" to activate'
osascript -e 'tell application "Microsoft Excel" to activate'
Open a file
Open a file
osascript -e 'tell application "Microsoft Excel" to open POSIX file "/path/to/file.xlsx"'
osascript -e 'tell application "Microsoft Excel" to open POSIX file "/path/to/file.xlsx"'
Set zoom level (percentage)
Set zoom level (percentage)
osascript -e 'tell application "Microsoft Excel"
set zoom of active window to 120
end tell'
osascript -e 'tell application "Microsoft Excel"
set zoom of active window to 120
end tell'
Scroll to specific row
Scroll to specific row
osascript -e 'tell application "Microsoft Excel"
set scroll row of active window to 45
end tell'
osascript -e 'tell application "Microsoft Excel"
set scroll row of active window to 45
end tell'
Scroll to specific column
Scroll to specific column
osascript -e 'tell application "Microsoft Excel"
set scroll column of active window to 3
end tell'
osascript -e 'tell application "Microsoft Excel"
set scroll column of active window to 3
end tell'
Select a cell range
Select a cell range
osascript -e 'tell application "Microsoft Excel"
select range "A1" of active sheet
end tell'
osascript -e 'tell application "Microsoft Excel"
select range "A1" of active sheet
end tell'
Select a specific sheet by name
Select a specific sheet by name
osascript -e 'tell application "Microsoft Excel"
activate object sheet "DCF" of active workbook
end tell'
undefinedosascript -e 'tell application "Microsoft Excel"
activate object sheet "DCF" of active workbook
end tell'
undefinedTiming and Timeout
计时与超时设置
Always add between AppleScript commands and subsequent operations (e.g., screenshot) to allow UI rendering.
sleep 1IMPORTANT: will hang indefinitely if Excel is not running or not responding. Always wrap with :
osascripttimeoutbash
undefined在AppleScript命令和后续操作(如截图)之间始终添加,以等待UI渲染完成。
sleep 1重要提示:如果Excel未运行或无响应,会无限期挂起。务必用命令包裹:
osascripttimeoutbash
undefinedSafe pattern: 5-second timeout
Safe pattern: 5-second timeout
timeout 5 osascript -e 'tell application "Microsoft Excel" to activate'
timeout 5 osascript -e 'tell application "Microsoft Excel" to activate'
Check exit code: 124 = timed out
Check exit code: 124 = timed out
if [ $? -eq 124 ]; then
echo "Excel not responding — is it running?"
fi
undefinedif [ $? -eq 124 ]; then
echo "Excel not responding — is it running?"
fi
undefinedCommon Mistakes
常见错误
| Mistake | Correction |
|---|---|
| openpyxl fails on complex xlsm → try monkey-patching | Switch to |
Count Chinese characters with | Use |
| Trust file extension | Run |
openpyxl | Use |
| 错误操作 | 修正方法 |
|---|---|
| openpyxl处理复杂xlsm失败 → 尝试猴子补丁 | 立即切换为 |
用 | 使用 |
| 信任文件扩展名 | 先运行 |
openpyxl的 | 使用 |
Important Notes
重要注意事项
- Execute Python scripts with (never use system Python)
uv run --with openpyxl - LibreOffice () can convert formats and recalculate formulas
soffice --headless - Detailed formatting reference: See
references/formatting-reference.md
- 使用执行Python脚本(切勿使用系统Python)
uv run --with openpyxl - LibreOffice ()可用于格式转换和公式重计算
soffice --headless - 详细格式参考:参见
references/formatting-reference.md