Master Data Strategy
When this skill applies
Use this skill when deciding whether Master Data v2 is the right mechanism for custom data in a VTEX IO app.
- Modeling reviews, wishlists, forms, or custom app records
- Choosing entity boundaries
- Planning schema indexing and lifecycle
- Reviewing long-term Master Data design
Do not use this skill for:
- low-level client usage details
- runtime or route structure
- app settings schemas
- frontend UI behavior
Decision rules
- Use this skill once Master Data is a serious candidate storage mechanism. For the broader choice between Master Data, VBase, VTEX core APIs, and external stores, use vtex-io-data-access-patterns.
- Use Master Data for structured custom data that needs validation, indexing, and query support.
- Use the builder when this app introduces a new business entity, owns the data model, and wants the schema to be created and versioned as part of the app contract.
- Prefer using only the Master Data client when the entity and schema already exist and are shared or centrally managed, and this app only needs to read or write records without redefining the schema itself.
- For stable schemas that the app owns but should not be recreated or updated on every app version, keep the schema definition in code and use the Master Data client in a controlled setup path to create or update the schema only when needed.
- Remember that Master Data entities are account-scoped. Changing a shared entity or schema affects every app in that account that depends on it, so prefer client-only consumption when the schema is centrally managed.
- Keep entity boundaries intentional and aligned with the business concept being stored.
- Index fields that are actually used for filtering and search.
- Plan schema lifecycle explicitly to avoid schema sprawl.
- Consider data volume and retention from the start. If the dataset will grow unbounded and there is no retention or archival strategy, Master Data is likely not the right storage mechanism.
- Do not treat Master Data as an unbounded dumping ground for arbitrary payloads.
- Do not use Master Data as an unbounded log or event store for high-volume append-only data. Prefer dedicated logging or storage mechanisms when the main need is raw history rather than structured queries.
- Do not store secrets, credentials, or global app configuration in Master Data. Use app settings or configuration apps instead.
- Do not generate one entity or schema per account, workspace, or feature flag. Keep a stable entity name and distinguish tenants or environments through record fields when necessary.
- Be careful when tying schema evolution directly to app versioning through the builder. Frequent schema changes coupled to app releases can generate excessive schema updates, indexing changes, and long-term schema sprawl.
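The volume and retention rule can be checked with back-of-envelope arithmetic before committing to Master Data. The following is a minimal sketch, assuming a steady write rate; `VolumePlan` and `projectSteadyStateRecords` are hypothetical names used only for illustration and are not part of any VTEX API.

```typescript
// Hypothetical planning helper: estimates how many records an entity holds
// once daily writes and retention-based deletes balance out.
interface VolumePlan {
  writesPerDay: number
  // Days a record is kept before deletion or archival; omitted = kept forever.
  retentionDays?: number
}

function projectSteadyStateRecords(plan: VolumePlan): number {
  if (plan.retentionDays === undefined) {
    // No retention or archival strategy: the dataset grows without bound.
    return Number.POSITIVE_INFINITY
  }
  return plan.writesPerDay * plan.retentionDays
}

const reviewVolume = projectSteadyStateRecords({ writesPerDay: 500, retentionDays: 365 })
const eventLogVolume = projectSteadyStateRecords({ writesPerDay: 200_000 })

console.log(reviewVolume) // 182500
console.log(eventLogVolume) // Infinity
```

An unbounded result is a signal to define retention first or to look at a different storage mechanism.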
Choosing between the builder and the Master Data client
There are three main ways for a VTEX IO app to work with Master Data:
- Owning the schema via the builder:
  - The app declares entities and schemas in builder files in the repository.
  - Schema fields, validation, and indexing evolve together with the app code.
  - Use this when the app is the primary owner of the data model, schema changes are relatively infrequent, and the schema should be rolled out as part of the app contract.
- Consuming an existing schema via the Master Data client only:
  - The app uses a Master Data client, but does not declare entities or schemas through the builder.
  - The app assumes a stable schema managed elsewhere and only reads or writes records that follow that contract.
  - Use this when the entity is shared across multiple apps or managed centrally, and this app should not redefine or fragment the schema across environments.
- Owning a stable schema definition in code and applying it through the client:
  - The app keeps a stable schema definition in code instead of builder files.
  - A controlled setup path checks whether the schema exists and creates or updates it only when needed.
  - Use this when the app truly owns the schema, but should not couple schema rollout to every app version or every release pipeline step.
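The three options above can be summarized as a small decision function. This is a sketch of the reasoning only; `SchemaStrategy`, `SchemaContext`, and `chooseSchemaStrategy` are hypothetical names, and real decisions will usually weigh more factors than two booleans.

```typescript
type SchemaStrategy = 'builder' | 'client-only' | 'client-managed-schema'

interface SchemaContext {
  // True when this app is the primary owner of the data model.
  appOwnsSchema: boolean
  // True when schema rollout should track app releases as part of the app contract.
  rolloutWithAppVersions: boolean
}

function chooseSchemaStrategy(context: SchemaContext): SchemaStrategy {
  // Shared or centrally managed schemas are consumed, never redefined.
  if (!context.appOwnsSchema) return 'client-only'
  // Owned schemas either ship with the app or are applied in a controlled setup path.
  return context.rolloutWithAppVersions ? 'builder' : 'client-managed-schema'
}

console.log(chooseSchemaStrategy({ appOwnsSchema: false, rolloutWithAppVersions: false })) // client-only
console.log(chooseSchemaStrategy({ appOwnsSchema: true, rolloutWithAppVersions: true })) // builder
console.log(chooseSchemaStrategy({ appOwnsSchema: true, rolloutWithAppVersions: false })) // client-managed-schema
```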
Hard constraints
Constraint: Master Data entities must have explicit schema boundaries
Each entity MUST represent a clear business concept and have a schema that matches its intended usage.
Why this matters
Weak entity boundaries create confusing queries, poor indexing choices, and schema drift.
Detection
If one entity mixes unrelated concepts or stores many unrelated record shapes, STOP and split the design.
Correct
json
{
  "title": "review-schema-v1",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "userId": { "type": "string" },
    "rating": { "type": "number" },
    "approved": { "type": "boolean" }
  },
  "required": ["productId", "userId", "rating"],
  "v-indexed": ["productId", "userId", "approved"]
}
Wrong
json
{
  "title": "everything-schema",
  "type": "object"
}
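In application code, one way to keep an entity tied to a single business concept is to mirror the schema as a type with a runtime guard. The sketch below assumes the review schema from the Correct example; `Review` and `isReview` are hypothetical names and are not generated by the platform.

```typescript
// Record shape mirroring the review schema: three required fields, one optional.
interface Review {
  productId: string
  userId: string
  rating: number
  approved?: boolean
}

// Runtime guard: rejects payloads that do not match the review concept,
// so unrelated record shapes never reach the entity.
function isReview(value: unknown): value is Review {
  if (typeof value !== 'object' || value === null) return false
  const record = value as Record<string, unknown>
  return (
    typeof record.productId === 'string' &&
    typeof record.userId === 'string' &&
    typeof record.rating === 'number' &&
    (record.approved === undefined || typeof record.approved === 'boolean')
  )
}

console.log(isReview({ productId: '42', userId: 'u-1', rating: 5 })) // true
console.log(isReview({ formName: 'contact', message: 'hello' })) // false
```

A payload that fails the guard usually belongs to a different entity rather than to a looser schema.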
Constraint: Indexed fields must match real query behavior
Fields used in filters or lookups MUST be indexed intentionally.
Why this matters
Missing indexes lead to poor query behavior and unnecessary operational risk.
Detection
If queries depend on fields that are not represented in indexing strategy, STOP and align schema and access patterns.
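The detection step can be automated by comparing the fields a query filters on against the schema's `v-indexed` list. This is a minimal sketch; `IndexedSchema` and `findUnindexedFields` are hypothetical names, and the list of queried fields is assumed to be maintained by hand or extracted from the app's queries.

```typescript
interface IndexedSchema {
  properties: Record<string, { type: string }>
  'v-indexed'?: string[]
}

// Returns the queried fields that the schema does not index.
function findUnindexedFields(schema: IndexedSchema, queriedFields: string[]): string[] {
  const indexed = new Set(schema['v-indexed'] ?? [])
  return queriedFields.filter((field) => !indexed.has(field))
}

const indexCheckSchema: IndexedSchema = {
  properties: {
    productId: { type: 'string' },
    rating: { type: 'number' },
    approved: { type: 'boolean' },
  },
  'v-indexed': ['productId', 'approved'],
}

console.log(findUnindexedFields(indexCheckSchema, ['productId', 'approved'])) // []
console.log(findUnindexedFields(indexCheckSchema, ['productId', 'rating'])) // ['rating']
```

A non-empty result means either the schema needs the index or the query should change.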
Correct
json
{
  "v-indexed": ["productId", "approved"]
}
Wrong
Constraint: Schema lifecycle must be managed explicitly
Master Data schema evolution MUST be planned with cleanup and versioning in mind.
Why this matters
Unmanaged schema growth creates long-term operational pain and can run into platform limits.
Detection
If schema versions are added with no lifecycle or cleanup plan, STOP and define that plan.
Correct
text
review-schema-v1 -> review-schema-v2 with cleanup plan
Wrong
text
review-schema-v1, v2, v3, v4, v5 with no cleanup strategy
Remember that changing indexed fields or field types can affect how existing documents are indexed and queried. When schema evolution is coupled to frequent app version changes, this risk increases.
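Risky changes between two schema versions can be surfaced before rollout by diffing field types and indexed fields. The sketch below is illustrative only; `SchemaDef`, `SchemaDiff`, and `diffSchemas` are hypothetical names, and the comparison covers top-level properties only.

```typescript
interface SchemaDef {
  properties: Record<string, { type: string }>
  'v-indexed'?: string[]
}

interface SchemaDiff {
  typeChanges: string[]
  addedIndexes: string[]
  removedIndexes: string[]
}

// Reports fields whose type changed and indexes that were added or removed.
function diffSchemas(previous: SchemaDef, next: SchemaDef): SchemaDiff {
  const typeChanges = Object.keys(previous.properties).filter(
    (field) =>
      field in next.properties &&
      next.properties[field].type !== previous.properties[field].type
  )
  const before = new Set(previous['v-indexed'] ?? [])
  const after = new Set(next['v-indexed'] ?? [])
  return {
    typeChanges,
    addedIndexes: Array.from(after).filter((field) => !before.has(field)),
    removedIndexes: Array.from(before).filter((field) => !after.has(field)),
  }
}

const schemaV1: SchemaDef = {
  properties: { productId: { type: 'string' }, rating: { type: 'number' } },
  'v-indexed': ['productId'],
}
const schemaV2: SchemaDef = {
  properties: { productId: { type: 'string' }, rating: { type: 'string' } },
  'v-indexed': ['rating'],
}

console.log(diffSchemas(schemaV1, schemaV2))
// { typeChanges: ['rating'], addedIndexes: ['rating'], removedIndexes: ['productId'] }
```

Any non-empty list in the result is a prompt to review how existing documents will be indexed and queried after the change.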
Constraint: Entity and schema names must remain stable across environments
Entity names and schema identifiers MUST remain stable across accounts, workspaces, and environments. Do not encode account names, workspaces, or rollout flags into the entity or schema name itself.
Why this matters
Per-account or per-workspace schema naming leads to schema sprawl, harder lifecycle management, and operational limits that are difficult to clean up later.
Detection
If the design proposes one entity or schema per workspace, per account, or per environment, STOP and redesign around stable names with scoped fields or records instead.
Correct
Wrong
text
review-schema-brazil-master
RV_US_MASTER
Using one clearly managed schema for development and one for production can be acceptable when there is a deliberate plan to keep them synchronized. Avoid generating schema names per workspace, per account, or per feature flag.
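Scoping through record fields rather than names might look like the following sketch. The `workspace` field, `ReviewFields`, and `toScopedDocument` are hypothetical and used only for illustration; a scope field that is filtered on would also need to be declared and indexed in the schema.

```typescript
// Stable entity name taken from the examples in this document.
const REVIEW_ENTITY = 'RV'

interface ReviewFields {
  productId: string
  userId: string
  rating: number
}

// The scope travels with the record; the entity name never changes.
function toScopedDocument(fields: ReviewFields, workspace: string) {
  return {
    dataEntity: REVIEW_ENTITY,
    fields: { ...fields, workspace },
  }
}

const scopedDocument = toScopedDocument(
  { productId: '42', userId: 'u-1', rating: 4 },
  'master'
)

console.log(scopedDocument.dataEntity) // RV
console.log(scopedDocument.fields.workspace) // master
```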
Preferred pattern
Use Master Data for structured custom records, index only what you query, and plan schema evolution deliberately.
Example: app owning a schema through the builder
- masterdata/review-schema-v1.json declares the schema and indexes for the entity.
- The app then uses a dedicated Master Data client to create and query documents.
json
{
  "title": "review-schema-v1",
  "v-entity": "RV",
  "type": "object",
  "properties": {
    "productId": { "type": "string" },
    "userId": { "type": "string" },
    "rating": { "type": "number" },
    "approved": { "type": "boolean" }
  },
  "required": ["productId", "userId", "rating"],
  "v-indexed": ["productId", "userId", "approved"]
}
Example: app consuming an existing schema through the client only
- This app declares no builder files.
- It uses the Master Data client against an existing, stable entity managed elsewhere.
typescript
// New reviews start unapproved; the entity and schema are managed elsewhere.
await ctx.clients.masterdata.createDocument({
  dataEntity: 'RV',
  fields: {
    productId,
    userId,
    rating,
    approved: false,
  },
})
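Reading from the same entity follows the same client-only pattern, filtering on indexed fields only. The sketch below defines its own minimal `SearchClient` interface, which is an assumption modeled on the `searchDocuments` call of the `@vtex/api` Master Data client rather than a copy of its types; `approvedReviewsQuery` and `listApprovedReviews` are hypothetical helpers.

```typescript
// Assumed minimal shape of the search call; verify against the client you use.
interface SearchInput {
  dataEntity: string
  fields: string[]
  where: string
  pagination: { page: number; pageSize: number }
}

interface SearchClient {
  searchDocuments<T>(input: SearchInput): Promise<T[]>
}

interface ReviewSummary {
  productId: string
  rating: number
}

// Builds a query that filters only on indexed fields (productId, approved).
// Assumes productId is a trusted identifier, since it is interpolated as-is.
function approvedReviewsQuery(productId: string): SearchInput {
  return {
    dataEntity: 'RV',
    fields: ['productId', 'rating'],
    where: `productId=${productId} AND approved=true`,
    pagination: { page: 1, pageSize: 20 },
  }
}

async function listApprovedReviews(client: SearchClient, productId: string) {
  return client.searchDocuments<ReviewSummary>(approvedReviewsQuery(productId))
}
```

In a handler this would be called as `listApprovedReviews(ctx.clients.masterdata, productId)`, assuming the real client is compatible with the interface above.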
Example: app owning a stable schema in code and ensuring it exists through the client
- The app keeps a stable schema definition in code.
- A controlled setup path ensures the schema exists instead of relying on the builder for every rollout.
typescript
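const schema = {
  title: 'review-schema-v1',
  'v-entity': 'RV',
}
// Assumes the object-argument signatures of the @vtex/api MasterData client.
const existing = await ctx.clients.masterdata.getSchema({
  dataEntity: 'RV',
  schema: 'review-schema-v1',
})
// A missing schema may also surface as a thrown 404, depending on the client.
if (!existing) {
  await ctx.clients.masterdata.createOrUpdateSchema({
    dataEntity: 'RV',
    schemaName: 'review-schema-v1',
    schemaBody: schema,
  })
}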
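The controlled setup path can be wrapped in a reusable helper that tolerates a missing schema. This is a sketch under stated assumptions: `SchemaClient` is a minimal interface modeled on the `@vtex/api` Master Data client, `ensureSchema` is a hypothetical helper, and a missing schema is assumed to appear either as an empty result or as an error carrying `response.status === 404`.

```typescript
// Assumed minimal shape of the schema calls; verify against the client you use.
interface SchemaClient {
  getSchema(input: { dataEntity: string; schema: string }): Promise<unknown>
  createOrUpdateSchema(input: {
    dataEntity: string
    schemaName: string
    schemaBody: object
  }): Promise<unknown>
}

// Returns true when the schema had to be created, false when it already existed.
async function ensureSchema(
  client: SchemaClient,
  dataEntity: string,
  schemaName: string,
  schemaBody: object
): Promise<boolean> {
  let existing: unknown = null
  try {
    existing = await client.getSchema({ dataEntity, schema: schemaName })
  } catch (err) {
    // Assumption: "not found" is reported as an HTTP 404 on the error object.
    const status = (err as { response?: { status?: number } })?.response?.status
    if (status !== 404) throw err
  }
  if (existing) return false
  await client.createOrUpdateSchema({ dataEntity, schemaName, schemaBody })
  return true
}
```

Calling this from a one-time setup route or install event, rather than on every request, keeps schema writes rare and deliberate.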
Common failure modes
- Creating entities that are too broad.
- Querying on fields that are not indexed.
- Accumulating schema versions with no lifecycle plan.
- Using Master Data as a high-volume log or event sink without retention or archival strategy.
- Storing configuration, secrets, or cross-app shared settings in Master Data instead of using configuration-specific mechanisms.
- Generating per-account or per-workspace entities instead of using a single stable entity with scoped record fields.
- Relying on the builder for frequent schema changes tied to every app version, causing excessive schema updates and indexing side effects over time.
Review checklist
Related skills
- vtex-io-data-access-patterns - Use when deciding between Master Data, VBase, VTEX core APIs, or external stores for a given dataset
Reference
- Master Data - Platform data storage context