Marketplace Engineering Two-Sided Search and Recsys Planning Best Practices
Comprehensive planning, design and diagnostic guide for search and recommendation systems
in two-sided trust marketplaces. Covers OpenSearch index, query and ranking patterns, the
methodology for planning retrieval work, the handoff points to recommendation-specific
tooling, and the instrumentation and dashboard layer that turns measurement into ongoing
decision making. Contains 57 rules across 10 categories ordered by cascade impact, plus
two playbooks (plan a new system from scratch, diagnose an existing one) and explicit
living-artefact conventions (decisions log, golden set, gotchas).
When to Apply
Reference this skill when:
- Planning a new marketplace retrieval project from scratch
- Reviewing an existing retrieval system that feels stale, unfair, or unpersonalised
- Designing the OpenSearch index mapping, analyzers, or query DSL
- Choosing retrieval primitives per product surface (search, recs, hybrid, curated)
- Deciding which search quality metrics to track and dashboard
- Running the weekly search-quality review ritual
- Diagnosing a silent regression in ranking, coverage, or zero-result rate
- Deciding when a retrieval problem is actually a personalisation problem
This skill is the precursor to marketplace-personalisation. Start here for planning and
search work; hand off to the personalisation skill when the diagnosed bottleneck is
impression tracking, feedback-loop bias, or AWS Personalize-specific design.
Living Context
This skill treats the system as evolving. Three living artefacts carry context across
sessions, releases, and team changes — read them before making suggestions, update them
after every shipped change:
- gotchas.md (in this skill folder) — append-only diagnostic lessons. Every gotcha
  has a date and a short description of what surprised the team and how it was resolved.
- Decisions log (maintained in the product repo) — every ranking change, schema tweak,
  and synonym edit recorded with its hypothesis, offline and online evidence, ship
  criterion, outcome, and rollback path. See rule plan-maintain-a-decisions-log.
- Golden query set (frozen per eval cycle, committed to the product repo) — the
  reference set of queries against which every ranking change is offline-evaluated
  before an online test. See rule plan-version-the-golden-set.
Rule Categories
Categories are ordered by cascade impact on the retrieval lifecycle: intent
misunderstanding poisons architecture; wrong architecture poisons the index; a wrong
index poisons every query until a reindex; each downstream layer inherits the
upstream error.
| # | Category | Prefix | Impact |
|---|---|---|---|
| 1 | Problem Framing and User Intent | intent- | CRITICAL |
| 2 | Surface Taxonomy and Architecture | arch- | CRITICAL |
| 3 | Index Design and Mapping | index- | HIGH |
| 4 | Planning and Improvement Methodology | plan- | HIGH |
| 5 | Query Understanding | query- | MEDIUM-HIGH |
| 6 | Retrieval Strategy | retrieve- | MEDIUM-HIGH |
| 7 | Relevance and Ranking | rank- | MEDIUM-HIGH |
| 8 | Search and Recommender Blending | blend- | MEDIUM |
| 9 | Measurement and Experimentation | measure- | MEDIUM |
| 10 | Instrumentation, Dashboards and Decision Triggers | monitor- | MEDIUM |
Quick Reference
1. Problem Framing and User Intent (CRITICAL)
- intent-map-queries-to-intent-classes — classify before retrieving
- intent-separate-known-item-from-discovery — different failure modes, different strategies
- intent-audit-live-query-logs-first — design from real data, not imagined data
- intent-distinguish-transactional-from-exploratory — precision vs diversity
- intent-reject-one-search-for-everything — per-surface query shapes
- intent-treat-no-search-as-first-class-choice — curated is a legitimate answer
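A minimal sketch of the single-pass intent classifier the rules above describe. The intent classes, regex heuristics, and function name are illustrative assumptions — a real taxonomy comes out of the live query-log audit, not from imagined patterns like these:

```python
import re

# Hypothetical intent classes; a real taxonomy comes from the query-log audit.
KNOWN_ITEM = "known_item"        # user can name the exact listing or seller
TRANSACTIONAL = "transactional"  # ready to book or buy, wants precision
EXPLORATORY = "exploratory"      # browsing, wants diversity

def classify_intent(query: str) -> str:
    """Toy single-pass classifier: cheap rules first, exploratory as the default."""
    q = query.strip().lower()
    # Quoted phrases or seller handles suggest known-item intent.
    if re.search(r'"[^"]+"|@\w+', q):
        return KNOWN_ITEM
    # Explicit price or time constraints suggest transactional intent.
    if re.search(r"\bunder \$?\d+|\btonight\b|\btomorrow\b", q):
        return TRANSACTIONAL
    return EXPLORATORY
```

The ordering matters: the classifier answers once, before routing, so precision-oriented rules fire before the diversity-oriented default.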
2. Surface Taxonomy and Architecture (CRITICAL)
- arch-map-surface-to-retrieval-primitive — a single-source-of-truth routing table
- arch-split-candidate-generation-from-ranking — two-stage pipelines
- arch-design-zero-result-fallback — declare fallback owner per surface
- arch-design-for-cold-start-from-day-one — cold start is permanent, not bootstrap
- arch-avoid-mono-stack-retrieval — diversify primary dependencies
- arch-route-surfaces-deliberately — every routing decision recorded
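The single-source-of-truth routing table can be as simple as a checked-in dict: one row per surface, each declaring its retrieval primitive and its zero-result fallback owner. Surface and fallback names below are invented for illustration:

```python
# Hypothetical routing table: one row per product surface. Each surface names
# its primary retrieval primitive AND the owner of its zero-result fallback.
SURFACE_ROUTING = {
    "search_results": {"primitive": "search",  "fallback": "category_popular"},
    "home_feed":      {"primitive": "recs",    "fallback": "curated_editorial"},
    "similar_items":  {"primitive": "hybrid",  "fallback": "same_category_recent"},
    "collections":    {"primitive": "curated", "fallback": "curated_editorial"},
}

def route(surface: str) -> dict:
    """Every surface is routed deliberately; an unrouted surface is a bug."""
    if surface not in SURFACE_ROUTING:
        raise KeyError(f"unrouted surface: {surface}")
    return SURFACE_ROUTING[surface]
```

Raising on an unknown surface, rather than silently defaulting, is what makes the table a single source of truth.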
3. Index Design and Mapping (HIGH)
- index-design-mappings-conservatively — reindex is expensive
- index-use-keyword-and-text-as-multi-fields — full-text plus exact match
- index-match-index-and-query-time-analyzers — tokens must agree
- index-use-language-analyzers-for-language-fields — language-aware stemming
- index-separate-searchable-from-display-fields — index only what you search
- index-use-index-templates-for-consistency — prevent mapping drift
- index-stream-listing-updates-via-cdc — freshness in seconds, not hours
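Several of these rules compose into a single mapping body. A sketch, with invented field names, expressed as the Python dict you would pass to the OpenSearch create-index API: a text field with a keyword sub-field (multi-fields), one named analyzer used at both index and query time, and display-only data excluded from search:

```python
# Illustrative OpenSearch mapping body; field names are examples, not a schema
# recommendation.
listing_mapping = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Named once and referenced from the mapping, so index-time and
                # query-time tokenisation agree by default.
                "listing_text": {"type": "standard"}  # swap per-language analyzers here
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "listing_text",
                # Multi-field: full-text search on "title", exact match,
                # sorting and aggregations on "title.raw".
                "fields": {"raw": {"type": "keyword"}},
            },
            "category_id": {"type": "keyword"},  # filterable, cache-friendly
            # Display-only payload: stored and returned, never indexed or searched.
            "display_blob": {"type": "object", "enabled": False},
        }
    },
}
```

Because reindexing is expensive, the conservative move is to start from a template like this and add fields, rather than index everything and prune later.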
4. Planning and Improvement Methodology (HIGH)
- plan-audit-before-you-build — instrumentation gate on kick-off
- plan-build-golden-query-set-first — the first artefact, not the last
- plan-find-bottleneck-before-optimising — theory of constraints
- plan-maintain-a-decisions-log — living context across team changes
- plan-version-the-golden-set — frozen per eval cycle
- plan-handoff-to-personalisation-skill — recognise the boundary
5. Query Understanding (MEDIUM-HIGH)
- query-normalise-before-anything-else — canonical string in
- query-use-language-analyzers-for-stemming — double-digit recall wins
- query-curate-synonyms-by-domain — domain vocabulary not thesaurus
- query-use-fuzzy-matching-for-typos — 10-15% of queries have typos
- query-classify-before-routing — single-pass classifier
- query-build-autocomplete-on-separate-index — latency isolation
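Normalisation and fuzzy matching in sketch form. The normaliser is a minimal canonicaliser (Unicode fold, whitespace collapse, casefold); the query body uses OpenSearch's `fuzziness: "AUTO"`, which scales allowed edits with term length (0 edits up to 2 characters, 1 for 3-5, 2 above). The field name `title` is an assumption:

```python
import unicodedata

def normalise(query: str) -> str:
    """Canonical string in: NFKC-fold, collapse whitespace, casefold."""
    q = unicodedata.normalize("NFKC", query)
    return " ".join(q.split()).casefold()

def fuzzy_match_query(raw_query: str) -> dict:
    """Illustrative OpenSearch match query with typo tolerance."""
    return {
        "query": {
            "match": {
                "title": {
                    "query": normalise(raw_query),
                    "fuzziness": "AUTO",
                    "prefix_length": 1,  # keep the first letter exact; cheaper, safer
                }
            }
        }
    }
```

Normalising before anything else means synonyms, classification, and caching all see the same canonical string.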
6. Retrieval Strategy (MEDIUM-HIGH)
- retrieve-use-filter-clauses-for-exact-matches — filter cache wins
- retrieve-use-bool-structure-deliberately — must vs should vs filter
- retrieve-run-expensive-signals-in-rescore — rescore window limits cost
- retrieve-combine-bm25-and-knn-via-hybrid-search — lexical plus semantic
- retrieve-paginate-with-search-after — constant-cost deep pagination
- retrieve-choose-embedding-model-deliberately — re-embedding is expensive
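The bool-structure and rescore rules compose into one query body. A sketch with assumed field names: the exact-match constraint sits in `filter` (cached, unscored), the text match in `must` (scored), and a more expensive phrase signal runs only over the top rescore window rather than the full match set:

```python
def listing_search(text: str, category: str, size: int = 20) -> dict:
    """Illustrative two-phase OpenSearch query: cheap bool retrieval, then an
    expensive signal applied only to the top window via rescore."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],             # scored
                "filter": [{"term": {"category_id": category}}],  # cached, unscored
            }
        },
        "rescore": {
            "window_size": 100,  # expensive signal touches 100 docs, not millions
            "query": {
                "rescore_query": {
                    "match_phrase": {"title": {"query": text, "slop": 2}}
                },
                "query_weight": 1.0,
                "rescore_query_weight": 1.5,
            },
        },
    }
```

Keeping exact matches in `filter` rather than `must` is what lets the filter cache absorb repeated category constraints.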
7. Relevance and Ranking (MEDIUM-HIGH)
- rank-tune-bm25-parameters-last — upstream levers first
- rank-use-function-score-for-business-signals — explicit named functions
- rank-deploy-ltr-only-after-golden-set-exists — supervised learning needs labels
- rank-apply-diversity-at-rank-time — after scoring, not before
- rank-normalise-scores-across-retrieval-primitives — comparable scales
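What "explicit named functions" looks like in practice: a function_score wrapper where each business signal is a separate, inspectable entry. Field names and constants below are illustrative assumptions, not tuned values:

```python
def ranked_query(base_query: dict) -> dict:
    """Illustrative function_score wrapper: each business signal is one
    explicit function, so its contribution can be inspected and removed."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": [
                    # Freshness: gaussian decay, roughly half weight at 30 days.
                    {"gauss": {"listed_at": {"origin": "now", "scale": "30d", "decay": 0.5}}},
                    # Trust: log-dampened review count so whales don't dominate.
                    {"field_value_factor": {
                        "field": "review_count", "modifier": "log1p", "factor": 0.5
                    }},
                ],
                "score_mode": "sum",       # how the functions combine with each other
                "boost_mode": "multiply",  # how they combine with the text score
            }
        }
    }
```

Separate named functions are what make a ranking change reviewable in the decisions log: each hypothesis maps to one function, not to an opaque script.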
8. Search and Recommender Blending (MEDIUM)
- blend-use-search-alone-for-specific-intent — precision queries
- blend-combine-search-and-personalisation-scores — normalised weighted sum
- blend-keep-hybrid-blending-explainable — traceable results
- blend-never-return-zero-results — guaranteed cascade to non-empty
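The normalised-weighted-sum rule in sketch form: min-max normalise each primitive's scores to a shared [0, 1] scale first, then blend. The weight and the normalisation choice are assumptions to be tuned, not recommendations:

```python
def minmax(scores: dict) -> dict:
    """Min-max normalise a {doc_id: score} map to [0, 1] so scores from
    different retrieval primitives become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # guard against a constant score set
    return {d: (s - lo) / span for d, s in scores.items()}

def blend(search_scores: dict, rec_scores: dict, w_search: float = 0.7) -> list:
    """Normalised weighted sum of search and personalisation scores;
    returns doc ids best-first."""
    s, r = minmax(search_scores), minmax(rec_scores)
    blended = {
        d: w_search * s.get(d, 0.0) + (1 - w_search) * r.get(d, 0.0)
        for d in set(s) | set(r)
    }
    return sorted(blended, key=blended.get, reverse=True)
```

Keeping the per-source contributions around (rather than discarding them after the sum) is the cheap way to keep each blended result explainable.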
9. Measurement and Experimentation (MEDIUM)
- measure-define-session-success-per-surface — one definition per surface
- measure-track-ndcg-mrr-zero-result-rate — three metrics for one picture
- measure-use-click-models-for-implicit-judgments — scale beyond human judges
- measure-track-reformulation-rate-as-failure-signal — cheapest failure metric
- measure-run-interleaving-as-cheap-ab-proxy — 10x less sample needed
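The two offline metrics from the tracking rule, as minimal reference implementations over golden-set judgments (graded gains in ranked order for nDCG; the 1-based rank of the first relevant hit for MRR):

```python
import math

def dcg(gains):
    """Discounted cumulative gain: gain at rank i discounted by log2(i + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(ranked_gains, k=10):
    """nDCG@k: DCG of the returned order divided by DCG of the ideal order."""
    ideal = dcg(sorted(ranked_gains, reverse=True)[:k])
    return dcg(ranked_gains[:k]) / ideal if ideal else 0.0

def reciprocal_rank(rank_of_first_relevant):
    """Per-query RR; None means nothing relevant was returned. MRR is the mean."""
    return 0.0 if rank_of_first_relevant is None else 1.0 / rank_of_first_relevant
```

The zero-result rate needs no formula — it is the share of logged queries whose result count is zero — which is why the three together give one picture: quality at the top (nDCG), time-to-first-good (MRR), and outright failure (zero-result rate).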
10. Instrumentation, Dashboards and Decision Triggers (MEDIUM)
- monitor-log-every-query-with-full-context — structured replayable events
- monitor-scrub-pii-from-query-logs — redact before warehouse ingestion
- monitor-build-search-health-dashboard — threshold lines, colour bands
- monitor-alert-on-decision-triggers — quality metrics, not error rates
- monitor-track-ranking-stability-churn — RBO churn as leading indicator
- monitor-run-weekly-search-quality-review — calendar-driven ritual
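A sketch of the RBO churn signal: rank-biased overlap (Webber et al.) between yesterday's and today's top-k for a golden query, truncated at the shorter list. The truncation makes it a lower-bound estimate (even identical lists score below 1), which is fine for a churn trend line; the persistence parameter p is an assumption to tune:

```python
def rbo_truncated(a, b, p=0.9):
    """Truncated rank-biased overlap of two ranked lists. Higher p weights
    deeper ranks more; the ignored tail makes this a lower-bound estimate."""
    k = min(len(a), len(b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, k + 1):
        seen_a.add(a[d - 1])
        seen_b.add(b[d - 1])
        # Agreement at depth d: overlap of the two top-d prefixes.
        score += (p ** (d - 1)) * len(seen_a & seen_b) / d
    return (1 - p) * score

def ranking_churn(yesterday, today, p=0.9):
    """Day-over-day churn for one query: 1 minus overlap, relative to the
    self-overlap ceiling so identical lists score exactly 0 churn."""
    ceiling = rbo_truncated(yesterday, yesterday, p)
    return 1.0 - rbo_truncated(yesterday, today, p) / ceiling if ceiling else 0.0
```

Tracked per golden query and averaged, a churn spike flags a ranking shift before any engagement metric moves — the leading-indicator role the rule assigns it.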
Planning and Improving
Two playbooks compose the rules into end-to-end workflows:
- references/playbooks/planning.md — Plan a new marketplace retrieval system from
  scratch. Nine-step workflow from intent audit through the first A/B-tested online
  lift, with explicit exit criteria per step.
- references/playbooks/improving.md — Diagnose and improve an existing retrieval
  system. Decision tree that walks through telemetry, index freshness, coverage,
  baseline gap, cold start, segment regressions, and algorithm iteration in that
  order, with hand-off points to marketplace-personalisation when the bottleneck is
  personalisation-specific.
Read the playbooks first when the task is "design a new search and recommender project"
or "this retrieval system needs to get better". Read individual rules when a specific
question arises during implementation or review.
How to Use
- Read references/_sections.md for category structure and cascade rationale.
- Read gotchas.md for diagnostic lessons accumulated from prior incidents.
- Read references/playbooks/planning.md to plan a new system.
- Read references/playbooks/improving.md to diagnose an existing one.
- Read individual rule files when a specific task matches the rule title.
- Use assets/templates/_template.md to author new rules as the skill grows.
Related Skills
- marketplace-personalisation — The companion skill covering AWS Personalize
  implementation, impression tracking, schema design, two-sided matching, feedback
  loops, and the personalisation-specific diagnostic playbook. Hand off to it when
  the diagnostic identifies a personalisation-specific bottleneck.
Reference Files
| File | Description |
|---|---|
| references/_sections.md | Category definitions and impact ordering |
| references/playbooks/planning.md | Plan a new retrieval system |
| references/playbooks/improving.md | Diagnose an existing retrieval system |
| gotchas.md | Accumulated diagnostic lessons (living) |
| assets/templates/_template.md | Template for authoring new rules |
| metadata.json | Version, discipline, references |