game-performance-profiler
Game Performance Profiler
Use this skill when the main question is "what packet do we trust, which bottleneck family is most likely, and what is the next capture or review artifact worth producing?"
The job is not to dump generic optimization advice.
The job is to normalize the current packet, choose one operating mode, name one primary bottleneck family, recommend the smallest next capture that can improve confidence, and return one profiling artifact teams can actually act on.
Read references/mode-selection-and-route-outs.md before handling an unfamiliar request shape.
Read references/capture-packets-and-benchmark-routes.md before designing a reproducible pass.
Read references/device-review-and-steam-deck.md before trusting editor numbers for handheld or target-device review.
Read references/profiling-patterns.md before classifying weak or ambiguous evidence.
Read references/escalation-ladder.md before jumping from screenshots or stat packets to deeper engine or GPU tools.
When to use this skill
- Unity or Unreal projects with low FPS, frame-time spikes, hitches, or performance regressions where the bottleneck is not yet isolated
- Profiler screenshots, `stat unit` / `stat gpu` output, Unreal Insights traces, overlay captures, route notes, or benchmark complaints that need interpretation
- Steam Deck, handheld, mobile, console, VR, or low-spec review requests where packaged-on-device behavior matters more than editor impressions
- Requests for a profiling plan, benchmark route, device review packet, CPU/GPU split note, or escalation choice instead of immediate code edits
- Mixed performance packets where the next owner is still unclear and the first job is choosing between quick packet, deeper trace, or neighboring game skill
When not to use this skill
- The main issue is a Unity/Unreal build, cook, package, editor, or CI failure with no runtime frame-time diagnosis yet → use `game-build-log-triage`
- The main issue is generic web/app/service performance rather than engine-specific runtime capture interpretation → use `performance-optimization`
- The real task is broad milestone or production coordination across bugs, playtest notes, launch goals, and roadmap tradeoffs → use `bmad-gds`
- The packet is mainly playtest/demo/community feedback and the question is fix-first prioritization → use `game-demo-feedback-triage`
- The next move is already a deep implementation change with a confirmed cause; route to the implementation skill after producing the profiling brief
Instructions
Step 1: Frame the packet
Capture the minimum useful context before diagnosing anything.
Record:
- engine: `Unity` | `Unreal` | `Unknown`
- target: PC | Steam Deck / handheld | console | mobile | VR | low-spec laptop | unknown
- environment: editor | packaged/dev build | release/shipping build | unknown
- symptom: low average FPS | intermittent hitch | traversal hitch | combat spike | loading stall | thermal drift | unknown
- evidence available: profiler screenshot, trace/capture file, stat-command screenshot, overlay screenshot, video, benchmark notes, reproduction steps
- reproduction shape: exact scene, save slot, encounter, traversal path, menu, cutscene, or unknown
- quality variables: resolution, preset, frame cap, upscaler state, power mode, battery state if relevant
Quick frame:
Engine: Unreal
Target: Steam Deck
Environment: packaged build unknown
Symptom: traversal hitch after two minutes in market square
Evidence: `stat unit` screenshot + overlay photo
Repro: route not yet fixed
Rule: if the packet is thin, keep confidence low and make the next capture smaller, not broader.
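One way to make the frame checkable is to hold the Step 1 fields in a small record and count how many are still unknown. The sketch below is illustrative only: the `PacketFrame` class, its field names, and the "more than one unknown field means thin" threshold are assumptions, not part of the skill.

```python
from dataclasses import dataclass, fields

UNKNOWN = "unknown"


@dataclass
class PacketFrame:
    """Minimum context recorded before diagnosing anything (Step 1 fields)."""
    engine: str = UNKNOWN
    target: str = UNKNOWN
    environment: str = UNKNOWN
    symptom: str = UNKNOWN
    evidence: str = UNKNOWN
    repro: str = UNKNOWN
    quality_variables: str = UNKNOWN

    def unknown_fields(self) -> list[str]:
        """Names of fields that are blank or still marked unknown."""
        return [
            f.name
            for f in fields(self)
            if getattr(self, f.name).strip().lower() in ("", UNKNOWN)
        ]

    def is_thin(self, max_unknown: int = 1) -> bool:
        # Hypothetical threshold: more than `max_unknown` gaps counts as thin.
        return len(self.unknown_fields()) > max_unknown


# The quick frame above, with the unsettled items left as unknown.
frame = PacketFrame(
    engine="Unreal",
    target="Steam Deck",
    symptom="traversal hitch after two minutes in market square",
    evidence="stat unit screenshot + overlay photo",
)
print("still unknown:", frame.unknown_fields())
print("thin packet, keep confidence low:", frame.is_thin())
```

A thin frame here is a prompt to ask for one smaller capture, not a reason to block the brief.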
Step 2: Choose one primary mode
Use references/mode-selection-and-route-outs.md.
Pick exactly one primary mode:
- `quick-triage-packet`
- `bottleneck-classification`
- `benchmark-route-plan`
- `device-review`
- `tool-escalation`
Rule: one primary mode, optional secondary note. Do not try to handle every mode at once.
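The "exactly one primary mode" rule can be enforced mechanically. This is a minimal sketch: the mode strings come from the list above, while `Mode`, `ModeChoice`, and `choose_mode` are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Mode(Enum):
    QUICK_TRIAGE_PACKET = "quick-triage-packet"
    BOTTLENECK_CLASSIFICATION = "bottleneck-classification"
    BENCHMARK_ROUTE_PLAN = "benchmark-route-plan"
    DEVICE_REVIEW = "device-review"
    TOOL_ESCALATION = "tool-escalation"


class ModeChoice(NamedTuple):
    primary: Mode
    secondary_note: Optional[str] = None


def choose_mode(primary: str, secondary_note: Optional[str] = None) -> ModeChoice:
    """Accept exactly one known mode name; anything else raises ValueError."""
    # Mode(...) rejects unknown names and comma-joined lists such as
    # "device-review, tool-escalation", so only one primary mode gets through.
    return ModeChoice(Mode(primary.strip()), secondary_note)


choice = choose_mode("device-review", "benchmark route still needs fixing")
print(choice.primary.value, "|", choice.secondary_note)
```

Anything beyond the primary mode travels as a free-text note, so it cannot turn into a second mode.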
Step 3: Name the likely bottleneck family before proposing fixes
Choose one primary family and an optional secondary family.
Primary families:
- `cpu-gameplay-scripting`
- `cpu-render-thread-draw-call-pressure`
- `gpu-rendering-shaders-postfx`
- `memory-gc-allocation-churn`
- `loading-streaming-io`
- `physics-animation-simulation`
- `platform-config-thermal-device-specific`
- `unknown-needs-better-capture`
Good bottleneck statements:
- "The strongest signal points to streaming / IO hitching during traversal, not steady-state GPU load."
- "The packet suggests GC/allocation spikes during combat more than rendering saturation."
- "This looks device/config-bound because the team is still relying on editor impressions instead of packaged-on-target evidence."
Avoid: "performance is bad overall."
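A hypothesis record can reject vague statements before they reach the brief. The sketch below uses the family names listed above; the `BottleneckHypothesis` class and its validation rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

FAMILIES = (
    "cpu-gameplay-scripting",
    "cpu-render-thread-draw-call-pressure",
    "gpu-rendering-shaders-postfx",
    "memory-gc-allocation-churn",
    "loading-streaming-io",
    "physics-animation-simulation",
    "platform-config-thermal-device-specific",
    "unknown-needs-better-capture",
)


@dataclass(frozen=True)
class BottleneckHypothesis:
    primary: str
    why_it_fits: str
    secondary: Optional[str] = None

    def __post_init__(self) -> None:
        for family in (self.primary, self.secondary):
            if family is not None and family not in FAMILIES:
                raise ValueError(f"not a known bottleneck family: {family!r}")
        if self.secondary == self.primary:
            raise ValueError("secondary family must differ from the primary")
        if not self.why_it_fits.strip():
            # "performance is bad overall" style claims carry no reasoning.
            raise ValueError("state why the evidence fits this family")


hypothesis = BottleneckHypothesis(
    primary="loading-streaming-io",
    why_it_fits="hitches appear during traversal, not under steady-state GPU load",
    secondary="memory-gc-allocation-churn",
)
print(hypothesis.primary, "/", hypothesis.secondary)
```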
Step 4: Recommend the smallest next capture
Pick the cheapest capture that can materially separate the likely causes.
Typical next captures:
- one better Unity Profiler CPU/GPU/Memory packet from a representative player build
- one `stat unit` + `stat gpu` pair on the exact Unreal repro route
- one fixed traversal route with warm-up and repeat counts
- one packaged-on-device capture instead of more editor screenshots
- one deeper trace (Unreal Insights, GPU Visualizer, Frame Debugger, or external GPU tool) only after the first packet justifies it
Rule: prefer engine-native captures before vendor GPU tools unless the packet is already clearly render-path specific.
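The preference order in the rule above can be read as a ladder that is climbed one rung at a time. The sketch below is a rough illustration: the rung names and the `next_capture` helper are hypothetical, and the real ladder lives in references/escalation-ladder.md, which may differ.

```python
# Hypothetical rungs, ordered from cheapest to most expensive capture.
LADDER = (
    "screenshot-or-stat-packet",
    "engine-profiler-capture",
    "deep-engine-trace",
    "vendor-gpu-tool",
)


def next_capture(current: str, render_path_specific: bool = False) -> str:
    """Return the smallest step up from the capture the team already has."""
    if current not in LADDER:
        raise ValueError(f"unknown capture rung: {current!r}")
    if render_path_specific:
        # Only a clearly render-path-specific packet may skip the
        # engine-native rungs and go straight to a vendor GPU tool.
        return LADDER[-1]
    step = LADDER.index(current) + 1
    return LADDER[min(step, len(LADDER) - 1)]


print(next_capture("screenshot-or-stat-packet"))
print(next_capture("screenshot-or-stat-packet", render_path_specific=True))
```

Advancing a single rung per pass keeps the next capture small, which matches the "cheapest capture that can separate the likely causes" intent.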
Step 5: Make route and device context explicit
If reproducibility is missing, define the smallest repeatable pass.
Specify:
- save slot / checkpoint / scene
- start point and traversal path
- warm-up pass count
- measured repeat count or duration
- graphics preset / frame cap / power mode
- whether the packet is editor-only, packaged-only, or target-device
Do not treat "the market area feels bad" as a durable benchmark route.
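A route definition can be checked for the items listed above before anyone runs it. This is a minimal sketch: the `BenchmarkRoute` class, its field names, and the minimums (one warm-up pass, two measured repeats) are assumptions, not values the skill prescribes.

```python
from dataclasses import dataclass

SCOPES = ("editor-only", "packaged-only", "target-device")


@dataclass
class BenchmarkRoute:
    scene_or_save: str
    start_point: str
    traversal_path: str
    warmup_passes: int
    measured_repeats: int
    graphics_preset: str
    frame_cap: str
    power_mode: str
    packet_scope: str

    def problems(self) -> list[str]:
        """List what still keeps this route from being repeatable."""
        issues = []
        for name in ("scene_or_save", "start_point", "traversal_path",
                     "graphics_preset", "frame_cap", "power_mode"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is missing")
        if self.warmup_passes < 1:  # assumed minimum
            issues.append("add at least one warm-up pass")
        if self.measured_repeats < 2:  # assumed minimum
            issues.append("measure at least two repeats")
        if self.packet_scope not in SCOPES:
            issues.append("packet_scope must be one of: " + ", ".join(SCOPES))
        return issues


vague = BenchmarkRoute("market area", "", "", 0, 1, "", "", "", "feels bad")
print(vague.problems())
```

An empty list from `problems()` is the signal that the pass is specific enough to repeat.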
Step 6: Return one profiling brief
Always return one concise artifact with this shape:
Game Performance Profiling Brief
Scope
- Mode: ...
- Engine: ...
- Target: ...
- Environment: ...
- Symptom: ...
- Confidence: high | medium | low
Evidence packet
- What exists now: ...
- What's missing: ...
- Editor vs packaged / device note: ...
Primary bottleneck hypothesis
- Bucket: ...
- Why it fits: ...
- Evidence: ...
Secondary hypothesis
- Bucket: ...
- Why it still matters: ...
Next capture
- ...
- ...
- ...
Benchmark route / device review
- Repro route or save: ...
- Repeat / warm-up guidance: ...
- Device or packaged-build checks: ...
Escalation path
- Stay with quick packet | move to engine profiler | escalate to GPU tool
- Why: ...
Recommended next artifact
- Choose one: quick triage packet | profiling plan | benchmark route brief | CPU/GPU split note | memory/GC checklist | streaming hitch checklist | device review brief
What not to do yet
- 1-3 bullets that prevent premature optimization or blind rewrites
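If briefs are assembled by a script, a small renderer can keep the section order fixed and refuse to emit a brief with empty required sections. The sketch below is an assumption-laden illustration: `render_brief`, the choice to treat only "Secondary hypothesis" as optional, and the plain-text layout are not defined by the skill.

```python
SECTIONS = (
    "Scope",
    "Evidence packet",
    "Primary bottleneck hypothesis",
    "Secondary hypothesis",
    "Next capture",
    "Benchmark route / device review",
    "Escalation path",
    "Recommended next artifact",
    "What not to do yet",
)
OPTIONAL = {"Secondary hypothesis"}  # assumed: the secondary family is optional


def render_brief(content: dict[str, list[str]]) -> str:
    """Render bullets per section in the template's order."""
    missing = [s for s in SECTIONS if s not in OPTIONAL and not content.get(s)]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    lines = ["Game Performance Profiling Brief"]
    for section in SECTIONS:
        bullets = content.get(section)
        if not bullets:
            continue  # skip an omitted optional section
        lines.append(section)
        lines.extend(f"- {bullet}" for bullet in bullets)
    return "\n".join(lines)


placeholder = {section: ["..."] for section in SECTIONS}
print(render_brief(placeholder))
```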
Output format
Required qualities:
- classify the bottleneck before talking about fixes
- separate evidence from hypothesis
- recommend the next capture, route, or device review step instead of a giant backlog
- make editor-vs-packaged and packet-vs-trace boundaries explicit
- keep the report roughly 300-550 words unless the user asks for more
- use engine-native terms such as Unity Profiler, Frame Debugger, Unreal Insights, `stat unit`, `stat gpu`, frame time, draw-call pressure, GC, streaming, and packaged build
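The length guideline is easy to check automatically. This sketch treats the "roughly 300-550 words" range as hard bounds, which is a simplification, and the `length_verdict` helper is a hypothetical name.

```python
def word_count(report: str) -> int:
    """Count whitespace-separated words."""
    return len(report.split())


def length_verdict(report: str, low: int = 300, high: int = 550,
                   user_asked_for_more: bool = False) -> str:
    """Return "short", "ok", or "long" against the guideline range."""
    count = word_count(report)
    if count < low:
        return "short"
    if count > high and not user_asked_for_more:
        return "long"
    return "ok"


draft = "word " * 400  # stand-in for a 400-word brief
print(word_count(draft), length_verdict(draft))
```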
Examples
Example 1: Unity combat spike
Input: "Our Unity game drops from 120 to 45 FPS in combat. We have Profiler screenshots and someone suspects GC spikes. Triage what to look at first."
Expected shape: classify around `memory-gc-allocation-churn` or `cpu-gameplay-scripting`, keep the current screenshots as a real packet, recommend the smallest next capture, and avoid jumping to rendering advice first.
Example 2: Unreal open-world traversal hitch
Input: "Unreal is fine indoors but frame time explodes in our open world area. Help me triage whether this is CPU, GPU, streaming, or shaders."
Expected shape: use `bottleneck-classification` or `benchmark-route-plan`, recommend `stat unit` / `stat gpu` or Unreal Insights as appropriate, and define a reproducible traversal route instead of guessing fixes.
Example 3: Steam Deck review packet
Input: "We need a Steam Deck performance review plan before our demo release. The editor feels rough but we have not profiled the packaged build on device yet."
Expected shape: choose `device-review`, keep confidence limited, prioritize packaged-on-device evidence, and return a device review brief or benchmark route brief.
Example 4: Route-out to build failure triage
Input: "Our Unreal packaged build crashes during cook and we do not even have runtime numbers yet."
Expected shape: route to `game-build-log-triage` instead of pretending this is already a profiling problem.
Best practices
- Start from the packet the team already has instead of demanding an ideal trace immediately.
- Name one primary bottleneck family before discussing optimizations.
- Treat reproducibility as part of the diagnosis, not an optional extra.
- Prefer packaged-on-device evidence over editor impressions when the release target is a handheld or constrained machine.
- Escalate from screenshot/stat packets to engine-native profilers before jumping to GPU-vendor tooling.
- Recommend one next artifact, not a giant optimization backlog.
- Keep route-outs explicit so game-performance work does not sprawl into build triage, generic app tuning, or production coordination.