marketplace-search-recsys-planning

Marketplace Engineering Two-Sided Search and Recsys Planning Best Practices

Comprehensive planning, design and diagnostic guide for search and recommendation systems in two-sided trust marketplaces. Covers OpenSearch index, query and ranking patterns, the methodology for planning retrieval work, the handoff points to recommendation-specific tooling, and the instrumentation and dashboard layer that turns measurement into ongoing decision making. Contains 57 rules across 10 categories ordered by cascade impact, plus two playbooks (plan a new system from scratch, diagnose an existing one) and explicit living-artefact conventions (decisions log, golden set, gotchas).

When to Apply

Reference this skill when:
  • Planning a new marketplace retrieval project from scratch
  • Reviewing an existing retrieval system that feels stale, unfair, or unpersonalised
  • Designing the OpenSearch index mapping, analyzers, or query DSL
  • Choosing retrieval primitives per product surface (search, recs, hybrid, curated)
  • Deciding which search quality metrics to track and dashboard
  • Running the weekly search-quality review ritual
  • Diagnosing a silent regression in ranking, coverage, or zero-result rate
  • Deciding when a retrieval problem is actually a personalisation problem
This skill is the precursor to marketplace-personalisation. Start here for planning and search work; hand off to the personalisation skill when the diagnosed bottleneck is impression tracking, feedback-loop bias, or AWS Personalize-specific design.

Living Context

This skill treats the system as evolving. Three living artefacts carry context across sessions, releases, and team changes. Read them before making suggestions and update them after every shipped change:
  • gotchas.md (in this skill folder): append-only diagnostic lessons. Every gotcha has a date and a short description of what surprised the team and how it was resolved.
  • Decisions log (maintained in the product repo, typically decisions/*.md): every ranking change, schema tweak, and synonym edit recorded with its hypothesis, offline and online evidence, ship criterion, outcome, and rollback path. See rule plan-maintain-a-decisions-log.
  • Golden query set (frozen per eval cycle, committed to the product repo): the reference set of queries against which every ranking change is offline-evaluated before an online test. See rule plan-version-the-golden-set.
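As a rough illustration of what one decisions-log entry could hold, the sketch below models the fields named above as a small Python record and renders it to Markdown. The class name `Decision`, the field names, and the heading layout are assumptions made for this example; the rule plan-maintain-a-decisions-log is the authority on the actual format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Decision:
    """One decisions-log entry (hypothetical shape, not the skill's official schema)."""

    decided_on: date
    change: str
    hypothesis: str
    offline_evidence: str
    online_evidence: str
    ship_criterion: str
    outcome: str
    rollback_path: str

    def to_markdown(self) -> str:
        """Render the entry as a Markdown section suitable for decisions/*.md."""
        fields = [
            ("Hypothesis", self.hypothesis),
            ("Offline evidence", self.offline_evidence),
            ("Online evidence", self.online_evidence),
            ("Ship criterion", self.ship_criterion),
            ("Outcome", self.outcome),
            ("Rollback path", self.rollback_path),
        ]
        lines = [f"## {self.decided_on.isoformat()} {self.change}", ""]
        lines += [f"- {label}: {value}" for label, value in fields]
        return "\n".join(lines) + "\n"
```

Making every field mandatory is a deliberate choice in this sketch: an entry without a rollback path or ship criterion cannot be constructed at all.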

Rule Categories

Categories are ordered by cascade impact on the retrieval lifecycle: a misunderstood intent poisons the architecture; a wrong architecture poisons the index; a wrong index poisons retrieval until the next reindex; and every downstream layer inherits the upstream error.
| # | Category | Prefix | Impact |
| --- | --- | --- | --- |
| 1 | Problem Framing and User Intent | intent- | CRITICAL |
| 2 | Surface Taxonomy and Architecture | arch- | CRITICAL |
| 3 | Index Design and Mapping | index- | HIGH |
| 4 | Planning and Improvement Methodology | plan- | HIGH |
| 5 | Query Understanding | query- | MEDIUM-HIGH |
| 6 | Retrieval Strategy | retrieve- | MEDIUM-HIGH |
| 7 | Relevance and Ranking | rank- | MEDIUM-HIGH |
| 8 | Search and Recommender Blending | blend- | MEDIUM |
| 9 | Measurement and Experimentation | measure- | MEDIUM |
| 10 | Instrumentation, Dashboards and Decision Triggers | monitor- | MEDIUM |

Quick Reference

1. Problem Framing and User Intent (CRITICAL)

  • intent-map-queries-to-intent-classes
    — classify before retrieving
  • intent-separate-known-item-from-discovery
    — different failure modes, different strategies
  • intent-audit-live-query-logs-first
    — design from real data, not imagined data
  • intent-distinguish-transactional-from-exploratory
    — precision vs diversity
  • intent-reject-one-search-for-everything
    — per-surface query shapes
  • intent-treat-no-search-as-first-class-choice
    — curated is a legitimate answer

2. Surface Taxonomy and Architecture (CRITICAL)

  • arch-map-surface-to-retrieval-primitive
    — a single-source-of-truth routing table
  • arch-split-candidate-generation-from-ranking
    — two-stage pipelines
  • arch-design-zero-result-fallback
    — declare fallback owner per surface
  • arch-design-for-cold-start-from-day-one
    — cold start is permanent, not bootstrap
  • arch-avoid-mono-stack-retrieval
    — diversify primary dependencies
  • arch-route-surfaces-deliberately
    — every routing decision recorded
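The routing-table and fallback-owner rules above could be expressed as a single in-code table, sketched here under assumptions. The surface names, team names, and the `Route` and `route_for` identifiers are hypothetical; only the four primitives (search, recs, hybrid, curated) come from this guide.

```python
from dataclasses import dataclass
from enum import Enum


class Primitive(Enum):
    SEARCH = "search"
    RECS = "recs"
    HYBRID = "hybrid"
    CURATED = "curated"


@dataclass(frozen=True)
class Route:
    primitive: Primitive   # primary retrieval primitive for the surface
    fallback: Primitive    # what serves the surface when the primary returns nothing
    fallback_owner: str    # team accountable for the fallback (hypothetical names)


# Single source of truth: every product surface appears exactly once.
ROUTING_TABLE = {
    "search_results": Route(Primitive.SEARCH, Primitive.CURATED, "search-team"),
    "home_feed": Route(Primitive.RECS, Primitive.CURATED, "recs-team"),
    "category_page": Route(Primitive.HYBRID, Primitive.SEARCH, "search-team"),
}


def route_for(surface: str) -> Route:
    """Look up a surface; an unrouted surface is an error, never a silent default."""
    try:
        return ROUTING_TABLE[surface]
    except KeyError:
        raise LookupError(f"surface {surface!r} has no declared route") from None
```

Raising on an unknown surface, rather than defaulting to search, keeps every routing decision explicit and reviewable.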

3. Index Design and Mapping (HIGH)

  • index-design-mappings-conservatively
    — reindex is expensive
  • index-use-keyword-and-text-as-multi-fields
    — full-text plus exact match
  • index-match-index-and-query-time-analyzers
    — tokens must agree
  • index-use-language-analyzers-for-language-fields
    — language-aware stemming
  • index-separate-searchable-from-display-fields
    — index only what you search
  • index-use-index-templates-for-consistency
    — prevent mapping drift
  • index-stream-listing-updates-via-cdc
    — freshness in seconds, not hours
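A minimal sketch of how several of the index rules above might combine in one mapping, written as a Python dictionary that could be sent as the body of an OpenSearch index template. The field names and the `listings-*` pattern are hypothetical; the mapping parameters used (`fields`, `analyzer`, `index`, `doc_values`, `ignore_above`, `dynamic`) are standard OpenSearch options.

```python
# Hypothetical listing mapping; field names are illustrative only.
LISTING_MAPPINGS = {
    # Reject unmapped fields so the mapping cannot drift silently.
    "dynamic": "strict",
    "properties": {
        # Full-text search plus exact match on the same source value.
        "title": {
            "type": "text",
            "analyzer": "english",  # language-aware stemming; reused at query time by default
            "fields": {"raw": {"type": "keyword", "ignore_above": 256}},
        },
        # Exact-match fields used in filter clauses.
        "city": {"type": "keyword"},
        "is_active": {"type": "boolean"},
        # Stable identifier, also usable as a sort tiebreaker.
        "listing_id": {"type": "keyword"},
        # Display-only field: kept in _source, not searchable or sortable.
        "thumbnail_url": {"type": "keyword", "index": False, "doc_values": False},
    },
}

# Composable index template body so every matching index shares one mapping.
LISTING_INDEX_TEMPLATE = {
    "index_patterns": ["listings-*"],
    "template": {"mappings": LISTING_MAPPINGS},
}
```

Because the `title` field declares no separate `search_analyzer`, the same analyzer applies at index and query time, which is the agreement the analyzer-matching rule asks for.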

4. Planning and Improvement Methodology (HIGH)

  • plan-audit-before-you-build
    — instrumentation gate on kick-off
  • plan-build-golden-query-set-first
    — the first artefact, not the last
  • plan-find-bottleneck-before-optimising
    — theory of constraints
  • plan-maintain-a-decisions-log
    — living context across team changes
  • plan-version-the-golden-set
    — frozen per eval cycle
  • plan-handoff-to-personalisation-skill
    — recognise the boundary

5. Query Understanding (MEDIUM-HIGH)

  • query-normalise-before-anything-else
    — canonical string in
  • query-use-language-analyzers-for-stemming
    — double-digit recall wins
  • query-curate-synonyms-by-domain
    — domain vocabulary not thesaurus
  • query-use-fuzzy-matching-for-typos
    — 10-15% of queries have typos
  • query-classify-before-routing
    — single-pass classifier
  • query-build-autocomplete-on-separate-index
    — latency isolation
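The normalisation and fuzzy-matching rules above can be sketched roughly as follows. The exact normalisation steps (NFKC, case folding, whitespace collapsing) and the `title` field are assumptions for illustration; `fuzziness` and `prefix_length` are standard options of the OpenSearch `match` query.

```python
import re
import unicodedata

_WHITESPACE = re.compile(r"\s+")


def normalise_query(raw: str) -> str:
    """Reduce a raw query to one canonical string before any other processing."""
    text = unicodedata.normalize("NFKC", raw)  # fold width and compatibility variants
    text = text.casefold()                     # case-insensitive canonical form
    return _WHITESPACE.sub(" ", text).strip()  # collapse runs of whitespace


def fuzzy_match_clause(query: str, field: str = "title") -> dict:
    """Build a typo-tolerant match clause; the field name is hypothetical."""
    return {
        "match": {
            field: {
                "query": normalise_query(query),
                "fuzziness": "AUTO",  # edit distance scales with term length
                "prefix_length": 1,   # first character must match exactly
            }
        }
    }
```

For example, `normalise_query("  Dog   WALKER ")` returns `"dog walker"`, so cache keys, logs, and classifiers all see the same string.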

6. Retrieval Strategy (MEDIUM-HIGH)

  • retrieve-use-filter-clauses-for-exact-matches
    — filter cache wins
  • retrieve-use-bool-structure-deliberately
    — must vs should vs filter
  • retrieve-run-expensive-signals-in-rescore
    — rescore window limits cost
  • retrieve-combine-bm25-and-knn-via-hybrid-search
    — lexical plus semantic
  • retrieve-paginate-with-search-after
    — constant-cost deep pagination
  • retrieve-choose-embedding-model-deliberately
    — re-embedding is expensive
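One way to picture the filter-clause, bool-structure, and `search_after` rules together is a small request-body builder. This is a sketch under assumptions: the field names (`title`, `city`, `is_active`, `listing_id`) and the function name are hypothetical, while `bool`, `must`, `filter`, `sort`, and `search_after` are standard OpenSearch query DSL.

```python
from typing import List, Optional


def build_search_body(
    text: str,
    city: str,
    page_size: int = 20,
    search_after: Optional[List] = None,
) -> dict:
    """Build a search body: scored text match, unscored cacheable filters."""
    body = {
        "size": page_size,
        "query": {
            "bool": {
                # must: contributes to the relevance score.
                "must": [{"match": {"title": {"query": text}}}],
                # filter: yes/no constraints, no scoring, eligible for caching.
                "filter": [
                    {"term": {"city": city}},
                    {"term": {"is_active": True}},
                ],
            }
        },
        # A unique tiebreaker makes the sort order total, which search_after needs.
        "sort": [{"_score": "desc"}, {"listing_id": "asc"}],
    }
    if search_after is not None:
        # Sort values of the last hit on the previous page.
        body["search_after"] = search_after
    return body
```

To fetch the next page, pass the `sort` array of the last hit from the previous response as `search_after` instead of increasing `from`.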

7. Relevance and Ranking (MEDIUM-HIGH)

  • rank-tune-bm25-parameters-last
    — upstream levers first
  • rank-use-function-score-for-business-signals
    — explicit named functions
  • rank-deploy-ltr-only-after-golden-set-exists
    — supervised learning needs labels
  • rank-apply-diversity-at-rank-time
    — after scoring, not before
  • rank-normalise-scores-across-retrieval-primitives
    — comparable scales
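To illustrate applying diversity after scoring rather than before, the sketch below re-orders an already ranked list so that no single group dominates the top. Capping items per provider is only one possible diversity policy and is an assumption here, as are the function and parameter names.

```python
from typing import Callable, Hashable, List, TypeVar

T = TypeVar("T")


def diversify(
    ranked: List[T],
    group_of: Callable[[T], Hashable],
    max_per_group: int,
) -> List[T]:
    """Keep score order but defer items once their group has hit the cap."""
    kept: List[T] = []
    deferred: List[T] = []
    counts: dict = {}
    for item in ranked:  # ranked is already sorted by final score
        group = group_of(item)
        if counts.get(group, 0) < max_per_group:
            counts[group] = counts.get(group, 0) + 1
            kept.append(item)
        else:
            deferred.append(item)  # demoted, not dropped, so recall is unchanged
    return kept + deferred
```

Because the function only reorders, the scoring stage stays untouched and the diversity policy can be changed or removed without retraining or reindexing.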

8. Search and Recommender Blending (MEDIUM)

  • blend-use-search-alone-for-specific-intent
    — precision queries
  • blend-combine-search-and-personalisation-scores
    — normalised weighted sum
  • blend-keep-hybrid-blending-explainable
    — traceable results
  • blend-never-return-zero-results
    — guaranteed cascade to non-empty
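The normalised weighted sum and the explainability rule above might look roughly like this. Min-max scaling is one normalisation choice among several and is an assumption of this sketch, as are the function names and the default weight.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to [0, 1] so different retrieval primitives are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}  # all equal: treat as equally strong
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def blend(
    search: Dict[str, float],
    personalisation: Dict[str, float],
    w_search: float = 0.7,  # illustrative weight, not a recommended value
) -> List[dict]:
    """Blend two score sets and keep each component for traceability."""
    s, p = min_max(search), min_max(personalisation)
    rows = []
    for doc in set(s) | set(p):
        s_part, p_part = s.get(doc, 0.0), p.get(doc, 0.0)
        rows.append({
            "id": doc,
            "search": s_part,
            "personalisation": p_part,
            "score": w_search * s_part + (1.0 - w_search) * p_part,
        })
    # Sort by blended score, then id, so the order is deterministic.
    rows.sort(key=lambda r: (-r["score"], r["id"]))
    return rows
```

Returning the per-source components next to the final score means any result's position can be explained from a logged row alone.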

9. Measurement and Experimentation (MEDIUM)

  • measure-define-session-success-per-surface
    — one definition per surface
  • measure-track-ndcg-mrr-zero-result-rate
    — three metrics for one picture
  • measure-track-reformulation-rate-as-failure-signal
    — cheapest failure metric
  • measure-use-click-models-for-implicit-judgments
    — scale beyond human judges
  • measure-run-interleaving-as-cheap-ab-proxy
    — 10x less sample needed
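For reference, the three metrics named above can be computed along these lines. This is a minimal sketch with linear gains; it assumes each query's results are given as a list of graded relevance labels in ranked order, and that the ideal ordering is taken from those same labels rather than from a full judgment pool.

```python
import math
from typing import List, Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """NDCG@k; the ideal ranking is the same labels sorted descending."""
    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal > 0 else 0.0


def reciprocal_rank(gains: Sequence[float]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, gain in enumerate(gains, start=1):
        if gain > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries: List[Sequence[float]]) -> float:
    """MRR over a set of queries."""
    return sum(reciprocal_rank(q) for q in queries) / len(queries) if queries else 0.0


def zero_result_rate(result_counts: Sequence[int]) -> float:
    """Share of queries that returned no results at all."""
    if not result_counts:
        return 0.0
    return sum(1 for c in result_counts if c == 0) / len(result_counts)
```

The three numbers answer different questions: NDCG covers overall ordering quality, MRR covers how quickly the first good result appears, and zero-result rate covers queries the ranking metrics never see.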

10. Instrumentation, Dashboards and Decision Triggers (MEDIUM)

  • monitor-log-every-query-with-full-context
    — structured replayable events
  • monitor-scrub-pii-from-query-logs
    — redact before warehouse ingestion
  • monitor-build-search-health-dashboard
    — threshold lines, colour bands
  • monitor-alert-on-decision-triggers
    — quality metrics, not error rates
  • monitor-track-ranking-stability-churn
    — RBO churn as leading indicator
  • monitor-run-weekly-search-quality-review
    — calendar-driven ritual
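Assuming RBO above refers to rank-biased overlap, ranking churn between two snapshots of the same query could be tracked with a sketch like the one below. It is a simplified, truncated variant normalised so that identical lists score 1.0, not the full extrapolated RBO; the persistence `p` and the depth are illustrative defaults.

```python
from typing import Hashable, Sequence


def rbo_truncated(
    a: Sequence[Hashable],
    b: Sequence[Hashable],
    p: float = 0.9,
    depth: int = 10,
) -> float:
    """Top-weighted overlap of two rankings, in [0, 1] (simplified variant)."""
    k = min(depth, max(len(a), len(b)))
    if k == 0:
        return 1.0  # two empty rankings are trivially identical
    seen_a, seen_b = set(), set()
    weighted, norm = 0.0, 0.0
    for d in range(1, k + 1):
        if d <= len(a):
            seen_a.add(a[d - 1])
        if d <= len(b):
            seen_b.add(b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the two top-d prefixes
        weight = p ** (d - 1)                 # earlier depths count more
        weighted += weight * agreement
        norm += weight
    return weighted / norm


def churn(a: Sequence[Hashable], b: Sequence[Hashable], p: float = 0.9, depth: int = 10) -> float:
    """Churn is the complement of similarity: 0.0 means the ranking did not move."""
    return 1.0 - rbo_truncated(a, b, p, depth)
```

Because early positions carry more weight, a swap at the top of the list produces more churn than the same swap further down, which is what makes the measure useful as an early signal.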

Planning and Improving

Two playbooks compose the rules into end-to-end workflows:
  • references/playbooks/planning.md: Plan a new marketplace retrieval system from scratch. Nine-step workflow from intent audit through the first A/B-tested online lift, with explicit exit criteria per step.
  • references/playbooks/improving.md: Diagnose and improve an existing retrieval system. Decision tree that walks through telemetry, index freshness, coverage, baseline gap, cold start, segment regressions, and algorithm iteration in that order, with hand-off points to marketplace-personalisation when the bottleneck is personalisation-specific.
Read the playbooks first when the task is "design a new search and recommender project" or "this retrieval system needs to get better". Read individual rules when a specific question arises during implementation or review.
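The ordering in the improving playbook can be read as a first-failure walk: stop at the earliest layer that is not healthy and fix that before looking further down. The sketch below encodes only that ordering; the check names, the boolean health map, and the function name are assumptions, and the playbook itself defines what each check involves.

```python
from typing import Dict, Optional

# Order taken from the improving playbook description above.
DIAGNOSTIC_ORDER = [
    "telemetry",
    "index_freshness",
    "coverage",
    "baseline_gap",
    "cold_start",
    "segment_regressions",
    "algorithm_iteration",
]


def first_bottleneck(health: Dict[str, bool]) -> Optional[str]:
    """Return the earliest check that is failing or has not been verified."""
    for check in DIAGNOSTIC_ORDER:
        # A check missing from the map counts as unverified, hence failing.
        if not health.get(check, False):
            return check
    return None  # every layer verified healthy
```

Treating an unverified check as failing reflects the cascade idea in this guide: a downstream fix is not trusted until the layers above it are known to be sound.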

How to Use

  • Read references/_sections.md for category structure and cascade rationale.
  • Read gotchas.md for diagnostic lessons accumulated from prior incidents.
  • Read references/playbooks/planning.md to plan a new system.
  • Read references/playbooks/improving.md to diagnose an existing one.
  • Read individual rule files when a specific task matches the rule title.
  • Use assets/templates/_template.md to author new rules as the skill grows.

Related Skills

  • marketplace-personalisation
    — The companion skill covering AWS Personalize implementation, impression tracking, schema design, two-sided matching, feedback loops, and the personalisation-specific diagnostic playbook. Hand off to this skill when the diagnostic identifies a personalisation-specific bottleneck.

Reference Files

| File | Description |
| --- | --- |
| references/_sections.md | Category definitions and impact ordering |
| references/playbooks/planning.md | Plan a new retrieval system |
| references/playbooks/improving.md | Diagnose an existing retrieval system |
| gotchas.md | Accumulated diagnostic lessons (living) |
| assets/templates/_template.md | Template for authoring new rules |
| metadata.json | Version, discipline, references |