architecture-well-architected-commerce

Well-Architected Commerce on VTEX

When this skill applies

Use this skill when the task is cross-cutting or decision-oriented across VTEX commerce capabilities — not when a single product skill already fully defines the work.
  • Defining or reviewing solution architecture (storefront model + integrations + operations).
  • Choosing between native VTEX capabilities and custom services (IO apps, external BFFs, middleware).
  • Running an architecture or readiness review (security baseline, scalability posture, observability, delivery process).
  • Scoping work that will span FastStore, Headless, VTEX IO, Marketplace, Payments and/or any other VTEX module.
Do not use this skill as a substitute for product skills when the task is already localized (e.g. “implement PPP refunds” → payment track; “Feed v3 vs Hook” → marketplace track).

Three pillars (framework)

These pillars are the Well-Architected Commerce lens for every architecture choice. Summaries below follow the internal framework narrative; for full objectives, core values, and critical areas of focus, use the Well-Architected Commerce MCP (and your program’s canonical framework document).

Technical Foundation

Objective: A secure, reliable, compliant base that earns trust.
Core values: Reliability (consistent performance), Trust (transparent, accountable processes), Integrity (ethical handling of data, code, and resources).
Critical areas (examples): Advanced security (data protection, transaction security, threat awareness); reliable infrastructure (availability, scalability, recovery); compliance (regulations, audit trails, monitoring). Continuous learning keeps guidance current with technology and VTEX direction.
Nothing in Future-proof or Operational Excellence relaxes this pillar.

Future-proof

Objective: Solutions that stay adaptable and maintainable as the business and platform evolve.
Core values: Innovation (current VTEX and industry best practices), Simplicity (the overarching value—minimum viable custom surface, whole-solution coherence), Efficiency (optimize effort and platform use).
Critical areas (examples): Scalable solutions; business and market adaptability; modular / compositional design; rapid deployment (agile delivery, CI/CD); system integration (VTEX-centric, API-first connectivity).

Operational Excellence

Objective: Run the program with data-informed decisions and accountable execution.
Core values: Accuracy, Integrity, Accountability, Data-driven decision-making (plus operational excellence as a discipline).
Critical areas (examples): Process optimization (efficiency, lean, automation); data-driven strategies (analytics, predictive insight, monitoring); performance improvement (VTEX insights, continuous monitoring, agility); customer experience (personalization, feedback, omnichannel).

Routing to product tracks

Platform-specific “how” belongs in product skills, not in this meta-skill. After pillar alignment, use:
| Topic | Track skill |
| --- | --- |
| VTEX IO service paths, edge/CDN behavior, Cache-Control vs data scope | vtex-io-service-paths-and-cdn |
| VTEX IO application performance (caching layers, AppSettings, parallel fetches, tenant-scoped in-memory keys) | vtex-io-application-performance |
| Master Data storage fit (challenge whether MD is the right place), purchase path, BFF, single source of truth | vtex-io-masterdata |
| Marketplace fulfillment, simulation, integration flow | marketplace-fulfillment (and related marketplace skills) |
Cross-cutting VTEX rules (still architecture-level):
  1. Native and OOTB before VTEX IO — Prefer native VTEX capabilities and configuration before a VTEX IO extension. Use IO only when there is no suitable native path; document why not native if IO is chosen anyway.
  2. Simplicity and commodities — Prefer platform-native behaviors for commodity capabilities; reserve custom work for differentiators or genuine gaps, not for substituting process or ownership fixes.
  3. Integration discipline — Prefer fewer hops, clear ownership, and API-centric design (see Future-proof system integration)—detailed patterns live in IO, headless, and marketplace skills.
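As a rough illustration of the routing table, the lookup below maps a topic description to a track skill. The keyword list and the `route_topic` helper are assumptions made for this sketch; only the skill names come from the table.

```python
from typing import Optional

# Keywords are illustrative assumptions; skill names are taken from the routing table.
TRACK_SKILLS = {
    "service paths": "vtex-io-service-paths-and-cdn",
    "cdn": "vtex-io-service-paths-and-cdn",
    "cache-control": "vtex-io-service-paths-and-cdn",
    "application performance": "vtex-io-application-performance",
    "master data": "vtex-io-masterdata",
    "marketplace fulfillment": "marketplace-fulfillment",
}


def route_topic(topic: str) -> Optional[str]:
    """Return the first track skill whose keyword appears in the topic, else None."""
    lowered = topic.lower()
    for keyword, skill in TRACK_SKILLS.items():
        if keyword in lowered:
            return skill
    return None
```

A topic that matches no keyword returns `None`, which is a cue to decide the question at the architecture level or to find the right product skill by hand.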

Decision rules

  1. Classify every major decision under one or more pillars (see Three pillars (framework)). If a choice does not map to any pillar, question whether it is necessary.
  2. When extending the platform (VTEX IO, Master Data, integrations), use the Routing to product tracks table—implement caching, paths, MD usage, and marketplace flows with those skills, and record how the choice supports Future-proof and Operational Excellence without weakening Technical Foundation.
  3. Prefer fewer integration hops where custom code remains necessary: each hop adds failure modes and operational load. Additional services or backends are valid when they isolate failure domains or follow clear team boundaries, not by default.
  4. After architecture choices are clear, assign execution to product track skills (see Routing to product tracks and Related skills). This skill sets direction; product skills enforce VTEX-specific contracts.
  5. Operational discipline requires definable metrics and ownership (who runs it, how incidents are detected, how changes are released). Undocumented “best effort” operations violate Operational Excellence even if the design is lean.
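Rules 1 and 5 can be checked mechanically once decisions are recorded in a structured form. The sketch below is one possible shape, not a prescribed schema: the `Decision` fields and the `review` function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Pillar(Enum):
    TECHNICAL_FOUNDATION = "Technical Foundation"
    FUTURE_PROOF = "Future-proof"
    OPERATIONAL_EXCELLENCE = "Operational Excellence"


@dataclass
class Decision:
    title: str
    pillars: Set[Pillar] = field(default_factory=set)
    owner: str = ""            # who runs it
    detection: str = ""        # how incidents are detected
    release_process: str = ""  # how changes are released


def review(decision: Decision) -> List[str]:
    """Return findings for a decision; an empty list means both rules are met."""
    findings = []
    if not decision.pillars:
        findings.append("maps to no pillar: question whether it is necessary")
    if not (decision.owner and decision.detection and decision.release_process):
        findings.append("missing owner, incident detection, or release process")
    return findings
```

A decision with pillars but no owner still produces a finding, which mirrors the rule that undocumented "best effort" operations fail Operational Excellence even when the design is lean.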

Hard constraints

Constraint: Do not bypass Technical Foundation for speed

Security, credential handling, PCI scope, and private API access must follow VTEX and industry baselines. Architectural shortcuts that expose secrets, widen PCI scope, or call private APIs from untrusted clients are never acceptable tradeoffs for velocity.
Why this matters — Data breaches, fraud, and account compromise destroy customer trust and can invalidate compliance posture for the whole program.
Detection: If the design places VTEX_APP_KEY / VTEX_APP_TOKEN, raw card data, or shopper session tokens in browser code, public repos, or logs → stop and redesign using product skills (e.g. headless BFF, payment Secure Proxy).
Correct — Classify data and APIs; keep secrets and private calls server-side; reference PCI and authentication guides for the chosen integration style.
```text
Architecture decision log:
- Private VTEX APIs → server-side only (BFF or IO service).
- Card data → Payment Provider Protocol / Secure Proxy patterns only.
```
Wrong — “We will call Checkout OMS from the SPA for speed” or “store app token in NEXT_PUBLIC for dev convenience.”
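Part of the detection step can be automated. The sketch below flags environment variable names that look like credentials and carry a browser-exposed prefix. `NEXT_PUBLIC_` is the Next.js prefix for variables inlined into the client bundle; the secret-name pattern and the `exposed_secrets` helper are assumptions for illustration and do not replace a real secret scanner.

```python
import re
from typing import Iterable, List

# Next.js inlines variables with this prefix into browser code.
CLIENT_PREFIXES = ("NEXT_PUBLIC_",)

# Assumed naming heuristics for credentials; extend to match your own conventions.
SECRET_PATTERN = re.compile(r"APP_?KEY|APP_?TOKEN|SECRET|PASSWORD", re.IGNORECASE)


def exposed_secrets(env_names: Iterable[str]) -> List[str]:
    """Return, sorted, the variable names that are both client-exposed and secret-like."""
    return sorted(
        name
        for name in env_names
        if name.startswith(CLIENT_PREFIXES) and SECRET_PATTERN.search(name)
    )
```

Running this over the variable names in a storefront's configuration during CI gives an early signal. Any hit means the credential must move to a server-side component such as a BFF or IO service.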

Constraint: Future-proof means justified complexity, not maximal decomposition

Every new service, queue, or datastore must have a stated owner, failure mode, and reason tied to a pillar (e.g. isolation, scale, regulatory boundary). Unbounded service proliferation violates the Simplicity core value.
Why this matters — Undocumented distributed systems become impossible to operate, debug, or upgrade; they increase cost and incident duration.
Detection — If a diagram adds a new box “for flexibility” without a pillar mapping → challenge it. If two services could be one VTEX IO app with clear modules → merge or defer.
Correct — Document: “Service X owns partner webhook translation; Technical Foundation: audit log; Future-proof: replaceable adapter; Ops: on-call rotation Z.”
```text
Before adding a service:
1. Which pillar(s) require it?
2. What fails if it is absent?
3. Can native/OOTB, existing BFF, or a minimal IO surface cover it?
```
Wrong — “Microservices architecture” as a default with no operational model or without VTEX integration constraints from product skills.
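The "stated owner, failure mode, and reason tied to a pillar" requirement can be expressed as a small gate on each proposed component. This is a minimal sketch; `ServiceProposal` and `missing_justification` are hypothetical names, and the rejected phrases are examples only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceProposal:
    name: str
    owner: str
    failure_mode: str
    pillar_reason: str


# Phrases that state no pillar-based reason (illustrative, not exhaustive).
VAGUE_REASONS = {"flexibility", "for flexibility"}


def missing_justification(proposal: ServiceProposal) -> List[str]:
    """List the required fields that are empty or too vague to accept."""
    required = {
        "owner": proposal.owner,
        "failure mode": proposal.failure_mode,
        "pillar reason": proposal.pillar_reason,
    }
    gaps = [label for label, value in required.items() if not value.strip()]
    if proposal.pillar_reason.strip().lower() in VAGUE_REASONS:
        gaps.append("pillar reason")
    return gaps
```

An empty result means the box may stay on the diagram; anything else is a prompt to challenge, merge, or defer the component.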

Constraint: VTEX IO extension requires exhausted native/OOTB options

Choosing VTEX IO to implement a capability that already exists natively or OOTB (or could be met with configuration and standard integrations) without a documented exception rationale creates long-term cost, upgrade risk, and operational debt.
Why this matters — Duplicate implementations drift from platform evolution, break on upgrades, and consume engineering that should go to differentiation.
Detection — Proposal leads with “we will build an IO app for X” before listing native/OOTB alternatives considered. No written “why not native” when a Help Center or Developers guide describes a standard path.
Correct — Decision log entry: native/OOTB options evaluated → rejected because [specific gap] → IO scope minimized to that gap only.
```text
Native/OOTB check before IO:
1. What does VTEX ship for this (admin, module, API, partner app)?
2. If still building IO: what exactly cannot be done natively?
```
Wrong — “We always customize via IO” or “IO is simpler than learning native features” without evidence that native cannot meet the requirement.
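One way to keep the "why not native" rationale from being skipped is to make it a precondition for writing the decision log entry at all. The sketch below assumes a hypothetical `IoProposal` record and `decision_log_entry` formatter; the option and gap strings in the usage example are placeholders, not real VTEX features.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IoProposal:
    capability: str
    native_options: List[str] = field(default_factory=list)  # native/OOTB paths considered
    gap: str = ""  # what exactly cannot be done natively


def decision_log_entry(proposal: IoProposal) -> str:
    """Format the log entry, refusing proposals that skip the native/OOTB check."""
    if not proposal.native_options:
        raise ValueError("list the native/OOTB alternatives considered first")
    if not proposal.gap.strip():
        raise ValueError('write down "why not native" before choosing IO')
    options = ", ".join(proposal.native_options)
    return (
        f"{proposal.capability}: native/OOTB options evaluated ({options}) "
        f"-> rejected because {proposal.gap} -> IO scope limited to that gap"
    )
```

For example, `decision_log_entry(IoProposal("capability X", ["native option A"], "gap B"))` returns a one-line entry, while a proposal with no alternatives listed raises `ValueError`.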

Constraint: Master Data is not a general-purpose or checkout-critical datastore

Using Master Data without understanding its storage model, limits, and consistency—or using it as the default store for all custom data, or on the purchase critical path—risks latency, reliability, and support issues at scale.
Why this matters — MD is optimized for documented entity patterns, not arbitrary OLTP in the middle of checkout; misuse impacts revenue and stability.
Detection — Synchronous order flow calls MD on every cart mutation; “put all custom fields in MD” with no schema discipline; MD chosen before catalog, profile, or OMS-native options are evaluated.
Correct — Follow vtex-io-masterdata (storage fit, catalog-first, BFF); entities scoped to justified use cases; off critical path or async patterns for non-essential MD access during purchase.
```text
Before MD for new data:
1. Can Catalog, Checkout, Profile, or another native store hold this?
2. Is access in the hot path of purchase? If yes, redesign or justify.
```
Wrong — “MD for everything” or blocking checkout on MD round-trips under peak load without hard performance proof.
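To illustrate the "off critical path or async" option, the sketch below buffers non-essential writes so that the purchase-path handler never waits on Master Data. It is a simplified, in-process model: `on_cart_updated`, `drain`, and the `save_document` callback are hypothetical, and a production design would more likely use a durable queue or event mechanism chosen with the vtex-io-masterdata skill.

```python
import queue
from typing import Callable, Dict

# In-process buffer for illustration only; it is lost if the process restarts.
md_writes: "queue.Queue[Dict]" = queue.Queue()


def on_cart_updated(cart_id: str, custom_fields: Dict) -> str:
    """Purchase-path handler: enqueue the write and return without calling Master Data."""
    md_writes.put({"cart_id": cart_id, "fields": custom_fields})
    return "ok"


def drain(save_document: Callable[[Dict], None]) -> int:
    """Background step: flush buffered writes through the supplied client function."""
    saved = 0
    while True:
        try:
            document = md_writes.get_nowait()
        except queue.Empty:
            return saved
        save_document(document)
        saved += 1
```

The shopper-facing call returns as soon as the document is buffered, so a slow or failing Master Data request delays only the background flush, not checkout.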

Preferred pattern

  1. Align the team on the three pillars (framework) in-session; use the Well-Architected Commerce MCP when you need expanded framework wording or the latest narrative.
  2. Route IO, MD, and marketplace concerns to the Routing to product tracks table—native/OOTB first; then scoped IO or MD with documented rationale.
  3. Separate differentiator from ops gap: for each custom build, label strategic differentiator vs process/operational fix; if the latter, prefer process or native tooling before code.
  4. Produce a short decision log: pillars addressed, native vs IO vs MD choices, “why not native” where applicable, open risks.
  5. Attach the relevant product track guidance for each implementation stream (storefront, payments, IO, marketplace).
  6. Revisit Operational Excellence: commodities on platform, team focus on support and differentiators, plus metrics, release process, and incident response before go-live.

Common failure modes

  • Meta-skill overuse — Spending architecture narrative on problems already fully specified by a product skill (e.g. PPP idempotency rules).
  • Pillar theater — Labeling slides with three pillars without changing concrete decisions or ownership.
  • IO-default bias — Reaching for VTEX IO before proving native/OOTB cannot satisfy the requirement; treating customization as the first step.
  • MD-as-Postgres — Using Master Data for every entity without modeling, or coupling checkout to synchronous MD reads/writes.
  • Simplicity misunderstood — Interpreting “simple architecture” as “one big IO app that does everything” instead of “minimum custom surface, maximum native leverage.”
  • Tech over process — Automating or coding around broken operational workflows instead of fixing ownership, SLAs, or training.
  • Commodity customization — Customer tech teams maintaining bespoke implementations of behaviors VTEX already provides as standard product, starving real differentiators.
  • Missing handoff — Architecture doc with no pointers to which product skills and which official VTEX guides developers must follow.

Review checklist

  • Are Technical Foundation concerns (auth, secrets, PCI scope, private APIs) explicitly addressed?
  • For each VTEX IO extension: was native/OOTB evaluated first, and is there a written “why not native” when IO was chosen?
  • For Master Data use: does the team understand MD architecture (see Reference), and is MD off the purchase critical path unless strongly justified?
  • Is each custom component labeled differentiator vs operational/process gap, and are process fixes considered before new code?
  • Does Future-proof hold: each new service, queue, or datastore has owner, failure mode, and pillar-based reason; no unmotivated sprawl?
  • Are integration hops minimized; are extra services justified by failure isolation or team boundaries?
  • Does Operational Excellence show commodities on the platform, clear focus for support and differentiators, plus metrics, monitoring, and release/incident ownership?
  • Has every implementation stream been mapped to a product track skill?
  • Are official VTEX docs linked for areas that have platform-specific constraints?

Related skills

  • vtex-io-service-paths-and-cdn: service.json paths, edge/CDN and session behavior.
  • vtex-io-application-performance — LRU/VBase, AppSettings, parallel fetches, tenant keys on shared pods.
  • vtex-io-app-structure — IO manifest, builders, policies (use only after native/OOTB path is ruled out).
  • vtex-io-masterdata — Master Data v2 storage-fit scrutiny, BFF, single source of truth.
  • headless-bff-architecture — BFF and credential boundaries for headless.
  • payment-pci-security — PCI and Secure Proxy constraints.
  • faststore-data-fetching — GraphQL extensions and data layer.
  • marketplace-order-hook — Marketplace order integration patterns.

Reference
