sales-data-hygiene

CRM Data Hygiene & Quality

Help the user clean, deduplicate, normalize, and maintain CRM data quality. This skill is tool-agnostic but includes platform-specific guidance for ZoomInfo OperationsOS, Salesforce native tools, HubSpot Operations Hub, Clay, LeanData, RingLead, DemandTools, and Clearbit.

Step 1 — Gather context

Ask the user:
  1. What's the main data problem?
    • A) Duplicate contacts, leads, or accounts
    • B) Stale/outdated records (job changes, company changes)
    • C) Missing fields (no phone, no email, incomplete company data)
    • D) Inconsistent data (job titles, industries, company names formatted differently)
    • E) Compliance issues (opt-outs, GDPR, stale consent)
    • F) General data audit — don't know what's wrong yet
    • G) Setting up ongoing data hygiene automation
    • H) Other — describe it
  2. What CRM are you using?
    • A) Salesforce
    • B) HubSpot
    • C) Microsoft Dynamics
    • D) Pipedrive
    • E) Other CRM
    • F) Custom/in-house system
  3. How many records are affected?
    • A) Under 1,000 (small cleanup)
    • B) 1,000-10,000 (moderate)
    • C) 10,000-100,000 (large)
    • D) 100,000+ (enterprise-scale)
    • E) Not sure — need to audit first
  4. What tools do you have for data operations?
    • A) ZoomInfo OperationsOS
    • B) HubSpot Operations Hub
    • C) Salesforce native (Duplicate Management, Data.com)
    • D) Clay
    • E) LeanData / RingLead / Openprise
    • F) DemandTools (Validity)
    • G) None — using manual processes
    • H) Other — describe it

Step 2 — Strategy and approach

Data quality audit framework

Before fixing data, measure what's broken:
  1. Completeness — what % of records have all critical fields filled?
    • Email: target 95%+ for contacts
    • Phone: target 70%+ for key personas
    • Company: target 100% for accounts
    • Title/department: target 90%+ for contacts
  2. Accuracy — what % of filled fields are actually correct?
    • Email deliverability: verify a sample with an email verification tool
    • Phone connectivity: check a sample of direct dials
    • Job title currency: compare against LinkedIn for a sample
  3. Duplication rate — what % of records are duplicates?
    • Contact-level: same person, multiple records
    • Account-level: same company, different spellings
    • Cross-object: leads that are also contacts
  4. Decay rate — how fast does your data go stale?
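    • Commonly cited industry estimate: roughly 30% of B2B data decays annually (treat this and the figures below as rough benchmarks and measure your own rate)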
    • Sales contacts: ~20% change jobs each year
    • Direct dials: ~15% become invalid annually
    • Emails: ~22% bounce rate after 12 months without refresh
  5. Consistency — are the same things called the same thing?
    • Job titles: "VP Sales" vs "Vice President of Sales" vs "VP, Sales"
    • Industries: "SaaS" vs "Software" vs "Technology"
    • Company names: "IBM" vs "International Business Machines" vs "IBM Corp"
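
The completeness and duplication checks above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the sample rows, the field names (email, phone, company, title), and the helper names completeness and duplicate_rate are all assumptions, and in practice the rows would come from a CRM export.

```python
from collections import Counter

# Hypothetical sample rows; real data would come from a CRM export.
contacts = [
    {"email": "ana@acme.com", "phone": "+14155550100", "company": "Acme", "title": "VP Sales"},
    {"email": "ANA@acme.com ", "phone": "", "company": "Acme", "title": "VP, Sales"},
    {"email": "", "phone": None, "company": "Globex", "title": "Director"},
    {"email": "li@initech.com", "phone": "+14155550101", "company": "Initech", "title": ""},
]


def completeness(records, field):
    """Share of records whose field is non-empty after trimming whitespace."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if str(r.get(field) or "").strip())
    return filled / len(records)


def duplicate_rate(records, key="email"):
    """Share of keyed records whose normalized key appears more than once."""
    keys = [str(r.get(key) or "").strip().lower() for r in records]
    keys = [k for k in keys if k]  # records without a key cannot be matched
    if not keys:
        return 0.0
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] > 1) / len(keys)


for field in ("email", "phone", "company", "title"):
    print(f"{field}: {completeness(contacts, field):.0%}")
print(f"duplicate rate (email): {duplicate_rate(contacts):.0%}")
```

Comparing each percentage with the targets listed above gives a first-pass scorecard. Accuracy and decay still need sampling against an external source, which this sketch does not attempt.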

Deduplication strategy

Approach | When to use | Risk level
Exact match | Email, phone, domain (safest) | Low
Fuzzy match | Names, company names, addresses | Medium (review matches before merging)
Rule-based | Combine multiple fields (name + company + title) | Medium
ML-based | Large datasets with complex patterns | Low if trained well, but expensive
Merge rules (which record wins):
  • Most recently updated record keeps modifiable fields
  • Most complete record keeps enrichment data
  • Oldest record keeps the original owner/source
  • Always preserve: original lead source, first touch date, opt-in status
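
As a rough sketch of how these merge rules might be applied, the Python below merges a group of duplicates field by field. It simplifies the rules: instead of choosing one "most complete" record, the newest non-empty value wins for each field. The field names (lead_source, first_touch_date, owner, opted_out, created, updated) are hypothetical, and treating an opt-out on any duplicate as binding on the survivor is an assumption about how to preserve consent status.

```python
from datetime import date

# Fields that always come from the oldest record (hypothetical field names).
PRESERVED_FIELDS = ("lead_source", "first_touch_date", "owner")


def merge_records(records):
    """Merge duplicates of one person into a single surviving record."""
    oldest = min(records, key=lambda r: r["created"])
    merged = {}
    # Apply the least recently updated record first so newer non-empty values win.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for field, value in rec.items():
            if value not in (None, ""):
                merged[field] = value
    # Restore attribution fields from the oldest record.
    for field in PRESERVED_FIELDS:
        if oldest.get(field) not in (None, ""):
            merged[field] = oldest[field]
    merged["created"] = oldest["created"]
    # Assumption: an opt-out on any duplicate carries over to the survivor.
    merged["opted_out"] = any(r.get("opted_out") for r in records)
    return merged


a = {"email": "ana@acme.com", "title": "Director", "phone": "",
     "lead_source": "Webinar", "first_touch_date": date(2021, 3, 1), "owner": "sam",
     "created": date(2021, 3, 1), "updated": date(2022, 1, 5), "opted_out": True}
b = {"email": "ana@acme.com", "title": "VP Sales", "phone": "+14155550100",
     "lead_source": "List import", "first_touch_date": date(2023, 6, 1), "owner": "kim",
     "created": date(2023, 6, 1), "updated": date(2024, 2, 1), "opted_out": False}

merged = merge_records([a, b])
print(merged["title"], merged["lead_source"], merged["opted_out"])
```

Here the survivor takes the newer title and phone from b but keeps the lead source, first touch date, and owner from a, the older record.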

Data normalization

Field | Common problems | Solution
Job title | Abbreviations, variations, custom titles | Map to standard taxonomy (C-Level, VP, Director, Manager, IC)
Industry | Free-text, overlapping categories | Map to SIC/NAICS or your internal taxonomy
Company name | Abbreviations, legal suffixes, DBA names | Normalize to official name, store variants as aliases
Phone | Mixed formats, extensions, country codes | E.164 format (+1XXXXXXXXXX)
Address | Inconsistent formatting, abbreviations | USPS standardization or Google Maps API
Country | Mix of codes and names | ISO 3166-1 alpha-2 codes
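
Three of these rules (job title, phone, country) are simple enough to sketch in Python. Everything here is illustrative: the keyword patterns in TITLE_LEVELS, the small COUNTRY_CODES lookup, and the function names are assumptions, and the phone helper handles US/Canada numbers only. A production setup would more likely use a maintained phone-parsing library and a complete country list.

```python
import re

# Hypothetical keyword patterns, checked in order; extend to fit your taxonomy.
TITLE_LEVELS = [
    (r"\b(chief|ceo|cfo|cto|coo|cmo|cro)\b", "C-Level"),
    (r"\b(vp|svp|evp|vice president)\b", "VP"),
    (r"\b(director|head of)\b", "Director"),
    (r"\b(manager|lead)\b", "Manager"),
]

# Deliberately tiny lookup; a real one would cover every ISO 3166-1 entry.
COUNTRY_CODES = {"united states": "US", "usa": "US", "us": "US",
                 "united kingdom": "GB", "uk": "GB", "germany": "DE"}


def title_level(title):
    """Map a free-text job title to a seniority level, defaulting to IC."""
    text = title.lower()
    for pattern, level in TITLE_LEVELS:
        if re.search(pattern, text):
            return level
    return "IC"


def to_e164_us(raw):
    """Convert a US/Canada number to E.164, or return None if it does not fit.

    A trailing extension such as 'x123' or 'ext. 22' is dropped, not preserved.
    """
    main = re.sub(r"(?i)\s*(?:ext\.?|x)\s*\d+\s*$", "", raw)
    digits = re.sub(r"\D", "", main)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return "+1" + digits


def country_code(value):
    """Return an ISO 3166-1 alpha-2 code for a known name or code, else None."""
    return COUNTRY_CODES.get(value.strip().lower())


for t in ("VP Sales", "Vice President of Sales", "VP, Sales"):
    print(t, "->", title_level(t))
print(to_e164_us("(415) 555-0100 x123"), country_code("USA"))
```

Returning None for values that do not fit, rather than guessing, leaves unmatched records visible for manual review.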

Enrichment automation

Set up ongoing enrichment to prevent decay:
  1. Trigger-based — enrich when a record is created or updated
  2. Scheduled — monthly/quarterly batch enrichment of all records
  3. Decay-based — re-enrich records older than X days
  4. Event-based — re-enrich when a contact's company has a news event (funding, acquisition)
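
The decay-based trigger is the easiest of these to express in code. The sketch below assumes each record carries a last_enriched date and an opted_out flag (both hypothetical field names) and uses a 90-day default as a stand-in for "X days". It also skips opted-out contacts, so a scheduled job built on it would not re-enrich people who have withdrawn consent.

```python
from datetime import date, timedelta


def needs_reenrichment(record, today, max_age_days=90):
    """True if the record is stale and the contact has not opted out."""
    if record.get("opted_out"):
        return False  # never re-enrich opted-out contacts
    last = record.get("last_enriched")
    if last is None:
        return True  # never enriched, so treat as stale
    return (today - last) > timedelta(days=max_age_days)


today = date(2024, 6, 1)
records = [
    {"id": 1, "last_enriched": date(2024, 1, 1)},                     # stale
    {"id": 2, "last_enriched": date(2024, 5, 1)},                     # fresh
    {"id": 3, "last_enriched": None},                                 # never enriched
    {"id": 4, "last_enriched": date(2023, 1, 1), "opted_out": True},  # skipped
]
queue = [r["id"] for r in records if needs_reenrichment(r, today)]
print(queue)
```

Passing today in explicitly, rather than calling date.today() inside the function, keeps the check easy to test and to backfill.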

Step 3 — Platform-specific guidance

In ZoomInfo (OperationsOS)

ZoomInfo OperationsOS is purpose-built for CRM data management at scale.
Deduplication:
  • OperationsOS identifies duplicates across contacts, leads, and accounts using fuzzy matching
  • Configurable match rules: email, name+company, phone, domain
  • Bulk merge with configurable "winning record" rules
  • Cross-object dedup: find leads that already exist as contacts
Data orchestration:
  • Build automated workflows: new record → match existing → enrich → normalize → route
  • Configure matching rules to prevent duplicates before they're created
  • Set up enrichment triggers on record creation, field change, or scheduled interval
  • Normalization rules for job titles, industries, company names
Data decay management:
  • Auto-detect job changes and company changes
  • Flag stale records based on last-enriched date
  • Configure re-enrichment schedules (monthly recommended for active pipeline)
  • Track data quality metrics over time
Setup: ZoomInfo admin → OperationsOS → Data Orchestration → Create Workflow.

In Salesforce (Native)

Duplicate Management:
  • Setup → Duplicate Management → Duplicate Rules
  • Standard rules: match on email, name+company, phone
  • Custom matching rules for complex scenarios
  • Block or alert on duplicate creation
Data.com Clean (if licensed):
  • Batch cleaning of contacts and accounts
  • Auto-enrichment on record creation
  • Scheduled cleanups
Limitations: Native dedup is basic. Fuzzy matching is limited to the methods built into matching rules, cross-object handling is limited, and there is no automated bulk merge. For enterprise-scale cleanup, pair with DemandTools or ZoomInfo OperationsOS.

In HubSpot (Operations Hub)

Deduplication:
  • Operations Hub includes AI-powered duplicate detection
  • Suggests merge candidates with confidence scores
  • Bulk merge with "primary record" selection
  • Available on Operations Hub Professional+
Data quality automation:
  • Programmable automation (Operations Hub Professional): custom code actions for normalization
  • Data quality command center: monitor property completeness, formatting issues
  • Automated formatting: capitalize names, standardize phone numbers, clean URLs
Limitations: HubSpot dedup is contact/company only — no custom object dedup. Formatting automation requires Operations Hub Professional ($800/mo+).

In Clay

  • CRM enrichment & refresh: Import contacts/accounts from Salesforce, HubSpot, or Dynamics 365 into Clay tables. Run waterfall enrichment to fill missing fields and refresh stale data. Push updated records back via bidirectional sync.
  • Automated data maintenance: Set up scheduled imports to regularly pull CRM records into Clay, re-enrich, and sync back. Keeps contact data fresh without manual effort.
  • Duplicate detection: Use enrichment data (LinkedIn URLs, company domains, verified emails) to identify and flag duplicate records before syncing back to CRM.
  • Data standardization: Use Sculptor workflows to normalize job titles, company names, industry classifications, and other fields. Apply consistent formatting before pushing to CRM.
  • Plan gate: CRM sync requires Growth plan ($446-495/mo). Free/Launch users can export enriched data as CSV for manual CRM import.
  • Best for: RevOps teams wanting to automate CRM data enrichment and cleanup on a recurring basis.

In LeanData / RingLead

LeanData:
  • Lead-to-account matching (which leads belong to which accounts)
  • Lead routing based on matching results
  • Deduplication with merge automation
  • Salesforce-native (runs inside SFDC)
RingLead (acquired by ZoomInfo):
  • Duplicate prevention on record creation
  • Bulk dedup with configurable match rules
  • Data normalization and standardization
  • Works with Salesforce, HubSpot, Marketo

In DemandTools (Validity)

  • Often considered one of the most capable Salesforce dedup tools
  • Scenario-based: build complex match/merge rules
  • Mass operations: update, delete, deduplicate at scale
  • Import management: clean data before it enters CRM
  • Single Table Dedupe, Table-to-Table Dedupe (cross-object)

In Clearbit

Clearbit (now Breeze Intelligence in HubSpot) focuses on enrichment-driven data hygiene — standardizing and filling CRM records with firmographic and demographic data.
Enrichment-based cleanup:
  • Enrich existing CRM records with standardized firmographic/demographic data
  • Bulk enrichment via API or Breeze Intelligence in HubSpot
  • Continuous data refresh: subscribe to enrichment updates when data changes (subscribe: true parameter), so records stay current without manual re-enrichment
Normalization:
  • Standardized industry codes (NAICS, GICS, SIC) — normalizes messy free-text industry fields
  • Normalized role and seniority classifications — standardizes job titles across records into consistent categories
  • Corporate hierarchy: parent company and ultimate parent domain fields help deduplicate subsidiary records
Data quality signals:
  • emailProvider flag identifies personal email addresses (gmail, yahoo) vs business emails, which is useful for filtering low-quality records
  • Tech stack data helps identify outdated technology fields in CRM
Best for: Filling missing fields, standardizing industries and titles, flagging personal emails, and keeping records fresh with continuous enrichment. Pair with a dedup tool (ZoomInfo, DemandTools) for full hygiene coverage.

Step 4 — Actionable guidance

Quick wins (do these first)

  1. Remove obvious duplicates — exact email match dedup is safe and fast
  2. Fix formatting — standardize phone numbers, capitalize names, normalize countries
  3. Fill critical gaps — bulk enrich records missing email or phone
  4. Remove dead records — hard bounces, invalid emails, disconnected phones
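
For the first quick win, a minimal sketch of exact-match email dedup: group records by a normalized email key and surface only the groups with more than one record. The id and email field names and the group_by_email helper are assumptions for illustration. The output is a review list, not an automatic merge.

```python
from collections import defaultdict


def group_by_email(records):
    """Return {normalized_email: [records]} for emails shared by 2+ records."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec.get("email") or "").strip().lower()
        if key:  # records without an email cannot be exact-matched
            groups[key].append(rec)
    return {k: v for k, v in groups.items() if len(v) > 1}


records = [
    {"id": 1, "email": "Ana@Acme.com"},
    {"id": 2, "email": " ana@acme.com"},
    {"id": 3, "email": "li@initech.com"},
    {"id": 4, "email": None},
]
groups = group_by_email(records)
for email, dupes in groups.items():
    print(email, [r["id"] for r in dupes])
```

Lowercasing and trimming before comparison catches the common case where the same address was typed with different capitalization or stray whitespace.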

Ongoing hygiene program

  1. Prevent duplicates at entry — enable duplicate rules on record creation
  2. Enrich on create — auto-enrich new records within minutes of creation
  3. Monthly dedup sweep — run fuzzy match dedup monthly, review and merge
  4. Quarterly refresh — re-enrich all active records every 90 days
  5. Annual purge — remove records with no activity in 12+ months (archive, don't delete)

Metrics to track

  • Duplicate rate — % of records with duplicates (target: <2%)
  • Field completeness — % of critical fields filled (target: 95%+)
  • Bounce rate — email bounce rate on outbound (target: <3%)
  • Data age — median days since last enrichment (target: <90)
  • Merge rate — duplicates merged per month (should trend down over time)
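
A small scorecard can check the upper-bound metrics above against their targets. The sketch below is one possible shape, with hypothetical metric keys and sample values. It covers only the metrics where lower is better (duplicate rate, bounce rate, data age). Field completeness is a lower-bound target and would need the opposite comparison.

```python
from datetime import date
from statistics import median

# Upper-bound targets taken from the list above.
TARGETS = {"duplicate_rate": 0.02, "bounce_rate": 0.03, "median_age_days": 90}


def median_age_days(last_enriched_dates, today):
    """Median number of days since each record was last enriched."""
    return median((today - d).days for d in last_enriched_dates)


def scorecard(metrics):
    """Label each metric 'ok' if it is below its target, else 'over target'."""
    return {name: ("ok" if value < TARGETS[name] else "over target")
            for name, value in metrics.items()}


today = date(2024, 6, 1)
age = median_age_days([date(2024, 5, 1), date(2024, 3, 1), date(2023, 12, 1)], today)
report = scorecard({"duplicate_rate": 0.015, "bounce_rate": 0.04, "median_age_days": age})
print(age, report)
```

Tracking the same report month over month shows whether the hygiene program is working, which a single snapshot cannot.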

Gotchas

  1. Merge before you enrich — enriching duplicate records wastes credits. Dedup first, then enrich the surviving records.
  2. Test dedup rules on a sample first — fuzzy matching can produce false positives (merging records that shouldn't be merged). Always review a sample of 50-100 merge candidates before running bulk operations.
  3. Preserve lead source on merge — the most common post-merge complaint is losing original lead source attribution. Configure merge rules to keep the oldest record's lead source.
  4. Don't delete — archive — instead of deleting stale records, move them to an archive status. Deleted records lose history; archived records can be reactivated if the contact returns.
  5. GDPR and compliance — data hygiene must respect opt-out and consent records. Never re-enrich a contact who has opted out. Check compliance status before any bulk enrichment operation.

Related skills

  • /sales-clay: Clay platform help
  • /sales-zoominfo: ZoomInfo platform help (for OperationsOS-specific setup)
  • /sales-clearbit: Clearbit platform help (enrichment, reveal, prospector)
  • /sales-enrich: enrichment strategy across all providers
  • /sales-lead-routing: lead assignment and territory rules (often paired with dedup)
  • /sales-lead-score: lead scoring models (depend on clean data)
  • /sales-integration: connecting data tools to CRM
  • /sales-prospect-list: building prospect lists (data quality at the source)
  • /sales-do: Not sure which skill to use? The router matches any sales objective to the right skill. Install:
    npx skills add sales-skills/sales --skills sales-do

Examples

Example 1: CRM data audit

User says: "Our Salesforce has 50,000 contacts and I suspect a lot of them are duplicates or outdated. Where do I start?"
Skill does: Walks through the data quality audit framework: measure completeness, accuracy, duplication rate, and decay. Recommends starting with exact-match email dedup (safest), then running a field completeness report, then sampling 100 records against LinkedIn to estimate accuracy.
Result: User has a data quality scorecard and prioritized cleanup plan.

Example 2: Setting up ongoing hygiene

User says: "We keep getting duplicates in HubSpot and our data goes stale within months. How do we automate this?"
Skill does: Recommends HubSpot Operations Hub for dedup + ZoomInfo or Clay for enrichment. Sets up duplicate prevention rules on creation, auto-enrichment for new records, and a quarterly re-enrichment schedule.
Result: User has an automated hygiene program that prevents duplicates and keeps data fresh.

Example 3: Pre-campaign data cleanup

User says: "We're about to launch a big outbound campaign to 10,000 contacts. How do I make sure the data is clean first?"
Skill does: Recommends a pre-campaign checklist: dedup the list, verify emails with a dedicated verification tool, re-enrich records older than 90 days, remove contacts at companies that no longer fit ICP, and check opt-out/DNC status.
Result: User launches campaign with verified, deduplicated, compliant data, giving a lower bounce rate and higher deliverability.

Troubleshooting

Dedup merging wrong records

Symptom: Fuzzy match dedup merged two different people who happen to have similar names at the same company.
Cause: Match rules too loose: matching on name + company without additional criteria.
Solution: Tighten match rules: require email OR phone match in addition to name + company. Always run in "review" mode before "auto-merge" mode. Add title or department as a tiebreaker.
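
The tightened rule can be sketched as follows: a fuzzy name match and a company match are necessary but not sufficient, and a shared non-empty email or phone is also required. This is an illustration under assumptions, not any vendor's matching engine. It uses Python's difflib.SequenceMatcher for name similarity, and the 0.85 threshold and the field names are hypothetical values to tune on a reviewed sample.

```python
from difflib import SequenceMatcher

NAME_THRESHOLD = 0.85  # assumed cutoff; tune it on a reviewed sample


def norm(value):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(str(value or "").lower().split())


def names_similar(a, b):
    """Fuzzy name comparison based on difflib's similarity ratio."""
    return SequenceMatcher(None, norm(a), norm(b)).ratio() >= NAME_THRESHOLD


def same_nonempty(a, b, field):
    """True only if both records have the field and the values match."""
    left, right = norm(a.get(field)), norm(b.get(field))
    return bool(left) and left == right


def is_merge_candidate(a, b):
    """Name + company must match, and email or phone must match as well."""
    if not names_similar(a["name"], b["name"]):
        return False
    if norm(a["company"]) != norm(b["company"]):
        return False
    return same_nonempty(a, b, "email") or same_nonempty(a, b, "phone")


jon = {"name": "Jon Smith", "company": "Acme",
       "email": "jon.smith@acme.com", "phone": "+14155550100"}
john = {"name": "John Smith", "company": "Acme",
        "email": "john.smith@acme.com", "phone": "+14155550199"}
jon_dupe = {"name": "Jon  Smith", "company": "ACME",
            "email": "Jon.Smith@acme.com", "phone": ""}

print(is_merge_candidate(jon, john), is_merge_candidate(jon, jon_dupe))
```

Jon and John pass the fuzzy name check and share a company, which is exactly the false positive described above. The extra email-or-phone requirement keeps them apart while still catching the genuine duplicate.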

Enrichment not filling expected fields

Symptom: Auto-enrichment runs but many records still have empty phone or email fields.
Cause: Single enrichment provider doesn't have coverage for all contacts. Coverage varies by geography, seniority, and industry.
Solution: Implement waterfall enrichment: try Provider A, if no result try Provider B, then Provider C. Use /sales-enrich for waterfall setup. Common waterfall: ZoomInfo → Apollo → Lusha.
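
The waterfall pattern itself is vendor-neutral and can be sketched with stand-in functions. In the Python below, provider_a through provider_d are hypothetical stubs, not real ZoomInfo, Apollo, or Lusha API calls. Each takes a record and returns a dict of found fields or None. The loop stops calling providers once no tracked field is missing, which is what saves credits.

```python
def waterfall_enrich(record, providers, fields=("email", "phone")):
    """Fill missing fields by trying providers in order until none are missing."""
    enriched = dict(record)
    for name, lookup in providers:
        missing = [f for f in fields if not enriched.get(f)]
        if not missing:
            break  # nothing left to fill, so skip the remaining providers
        result = lookup(enriched) or {}
        for f in missing:
            if result.get(f):
                enriched[f] = result[f]
                enriched.setdefault("sources", {})[f] = name  # record provenance
    return enriched


calls = []  # tracks which stub providers were actually called


def provider_a(record):
    calls.append("a")
    return {"email": "ana@acme.com"}


def provider_b(record):
    calls.append("b")
    return None  # no coverage for this contact


def provider_c(record):
    calls.append("c")
    return {"phone": "+14155550100", "email": "other@acme.com"}


def provider_d(record):
    calls.append("d")
    return {"phone": "+14155550111"}


providers = [("provider_a", provider_a), ("provider_b", provider_b),
             ("provider_c", provider_c), ("provider_d", provider_d)]
result = waterfall_enrich({"name": "Ana Ruiz", "email": "", "phone": None}, providers)
print(result["email"], result["phone"], result["sources"], calls)
```

Only missing fields are filled, so a later provider never overwrites a value found earlier, and the sources map records which provider supplied each field.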

Data quality metrics not improving

Symptom: Running monthly dedup and enrichment but duplicate rate and completeness aren't improving.
Cause: New duplicates are being created faster than they're being merged. Root cause is usually web forms, imports, or integrations creating records without duplicate checks.
Solution: Fix the source: enable duplicate prevention rules on all record creation paths (web forms, API imports, manual creation, integration syncs). Prevention is more effective than cleanup.