sales-data-hygiene
CRM Data Hygiene & Quality
Help the user clean, deduplicate, normalize, and maintain CRM data quality. This skill is tool-agnostic but includes platform-specific guidance for ZoomInfo OperationsOS, Salesforce native tools, HubSpot Operations Hub, Clay, LeanData, RingLead, Openprise, and DemandTools.
Step 1 — Gather context
Ask the user:
- What's the main data problem?
  - A) Duplicate contacts, leads, or accounts
  - B) Stale/outdated records (job changes, company changes)
  - C) Missing fields (no phone, no email, incomplete company data)
  - D) Inconsistent data (job titles, industries, company names formatted differently)
  - E) Compliance issues (opt-outs, GDPR, stale consent)
  - F) General data audit (don't know what's wrong yet)
  - G) Setting up ongoing data hygiene automation
  - H) Other (describe it)
- What CRM are you using?
  - A) Salesforce
  - B) HubSpot
  - C) Microsoft Dynamics
  - D) Pipedrive
  - E) Other CRM
  - F) Custom/in-house system
- How many records are affected?
  - A) Under 1,000 (small cleanup)
  - B) 1,000-10,000 (moderate)
  - C) 10,000-100,000 (large)
  - D) 100,000+ (enterprise-scale)
  - E) Not sure (need to audit first)
- What tools do you have for data operations?
  - A) ZoomInfo OperationsOS
  - B) HubSpot Operations Hub
  - C) Salesforce native (duplicate management, data.com)
  - D) Clay
  - E) LeanData / RingLead / Openprise
  - F) DemandTools (Validity)
  - G) None (using manual processes)
  - H) Other (describe it)
Step 2 — Strategy and approach
Data quality audit framework
Before fixing data, measure what's broken:
- Completeness: what % of records have all critical fields filled?
  - Email: target 95%+ for contacts
  - Phone: target 70%+ for key personas
  - Company: target 100% for accounts
  - Title/department: target 90%+ for contacts
- Accuracy: what % of filled fields are actually correct?
  - Email deliverability: verify a sample with an email verification tool
  - Phone connectivity: check a sample of direct dials
  - Job title currency: compare against LinkedIn for a sample
- Duplication rate: what % of records are duplicates?
  - Contact-level: same person, multiple records
  - Account-level: same company, different spellings
  - Cross-object: leads that are also contacts
- Decay rate: how fast does your data go stale?
  - Industry average: 30% of B2B data decays annually
  - Sales contacts: ~20% change jobs each year
  - Direct dials: ~15% become invalid annually
  - Emails: ~22% bounce rate after 12 months without refresh
- Consistency: are the same things called the same thing?
  - Job titles: "VP Sales" vs "Vice President of Sales" vs "VP, Sales"
  - Industries: "SaaS" vs "Software" vs "Technology"
  - Company names: "IBM" vs "International Business Machines" vs "IBM Corp"
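As a rough illustration of the completeness check, the sketch below computes the fill rate per critical field. It assumes records have been exported from the CRM as a list of dictionaries; the field names (`email`, `phone`, `company`, `title`) are placeholders, not any CRM's actual schema.

```python
from typing import Iterable

# Hypothetical field names; map these to your CRM's actual schema.
CRITICAL_FIELDS = ("email", "phone", "company", "title")


def completeness(records: Iterable[dict], fields=CRITICAL_FIELDS) -> dict:
    """Return the percentage of records with a non-blank value, per field."""
    records = list(records)
    total = len(records)
    report = {}
    for field in fields:
        # None, empty strings, and whitespace-only strings all count as missing.
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        report[field] = round(100 * filled / total, 1) if total else 0.0
    return report


sample = [
    {"email": "a@example.com", "phone": "", "company": "Acme", "title": "VP Sales"},
    {"email": "b@example.com", "phone": "+14155550100", "company": "Acme", "title": None},
    {"email": "", "phone": None, "company": "Globex", "title": "Director"},
    {"email": "d@example.com", "phone": "+14155550101", "company": "Globex", "title": "Manager"},
]
print(completeness(sample))
```

Comparing each percentage against the targets listed above gives a first-pass scorecard. Accuracy and decay still need sampling against an external source.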
Deduplication strategy
| Approach | When to use | Risk level |
|---|---|---|
| Exact match | Email, phone, domain — safest | Low |
| Fuzzy match | Names, company names, addresses | Medium — review matches before merging |
| Rule-based | Combine multiple fields (name + company + title) | Medium |
| ML-based | Large datasets with complex patterns | Low (if trained well) — but expensive |
Merge rules (which record wins):
- Most recently updated record keeps modifiable fields
- Most complete record keeps enrichment data
- Oldest record keeps the original owner/source
- Always preserve: original lead source, first touch date, opt-in status
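The exact-match approach and the merge rules can be sketched together as follows. This is a simplified illustration, not any platform's merge engine: all field names are assumptions, "most complete record keeps enrichment data" is approximated by letting newer non-empty values fill or overwrite older ones, and opt-in status is handled conservatively (an opt-out on any duplicate survives the merge), which is an assumption about what "preserve" should mean.

```python
from collections import defaultdict
from datetime import date

# Fields taken from the oldest record in a duplicate group (assumed names).
PRESERVED_FIELDS = ("owner", "lead_source", "first_touch_date")


def merge_group(group: list) -> dict:
    """Merge duplicates: newest non-empty values win, oldest keeps attribution."""
    by_update = sorted(group, key=lambda r: r["updated_at"])
    oldest = min(group, key=lambda r: r["created_at"])
    merged = {}
    # Apply least recently updated first so later non-empty values overwrite.
    for record in by_update:
        for key, value in record.items():
            if value not in (None, ""):
                merged[key] = value
    # Restore original owner, source, and first touch from the oldest record.
    for key in PRESERVED_FIELDS:
        if oldest.get(key) not in (None, ""):
            merged[key] = oldest[key]
    # Assumption: treat the merged contact as opted in only if every copy was.
    merged["opted_in"] = all(r.get("opted_in", False) for r in group)
    merged["created_at"] = oldest["created_at"]
    return merged


def dedupe_by_email(records: list) -> list:
    """Group records by normalized email and merge each group."""
    groups = defaultdict(list)
    unmatched = []
    for record in records:
        email = (record.get("email") or "").strip().lower()
        if email:
            groups[email].append(record)
        else:
            unmatched.append(record)  # no email: leave for fuzzy or manual review
    return [merge_group(g) for g in groups.values()] + unmatched


a = {"email": "Ann@Example.com", "title": "VP Sales", "phone": "", "owner": "rep_a",
     "lead_source": "Webinar", "first_touch_date": date(2022, 3, 1), "opted_in": True,
     "created_at": date(2022, 3, 1), "updated_at": date(2022, 6, 1)}
b = {"email": "ann@example.com", "title": "SVP Sales", "phone": "+14155550100",
     "owner": "rep_b", "lead_source": "List import",
     "first_touch_date": date(2023, 5, 1), "opted_in": False,
     "created_at": date(2023, 5, 1), "updated_at": date(2024, 1, 15)}
print(dedupe_by_email([a, b]))
```

In the example, the merged record takes the newer title and phone but keeps the original owner, lead source, and first touch date from the older record.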
Data normalization
| Field | Common problems | Solution |
|---|---|---|
| Job title | Abbreviations, variations, custom titles | Map to standard taxonomy (C-Level, VP, Director, Manager, IC) |
| Industry | Free-text, overlapping categories | Map to SIC/NAICS or your internal taxonomy |
| Company name | Abbreviations, legal suffixes, DBA names | Normalize to official name, store variants as aliases |
| Phone | Mixed formats, extensions, country codes | E.164 format (for example, +1XXXXXXXXXX for US numbers) |
| Address | Inconsistent formatting, abbreviations | USPS standardization or Google Maps API |
| Country | Mix of codes and names | ISO 3166-1 alpha-2 codes |
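Two of the rows above, job title and phone, can be sketched in a few lines. This is a deliberately crude illustration: the keyword patterns are assumptions rather than a complete taxonomy, the first matching level wins, and the phone helper assumes a US default country code and does no per-country validation. A production setup would more likely rely on a dedicated phone-number library or the platform's own normalization rules.

```python
import re
from typing import Optional

# Ordered from most to least senior; the first matching pattern wins.
# The keyword lists are illustrative assumptions, not a full taxonomy.
SENIORITY_RULES = [
    ("C-Level", re.compile(r"\b(chief|ceo|cfo|cto|coo|cmo|cro)\b")),
    ("VP", re.compile(r"\b(vp|svp|evp|vice president)\b")),
    ("Director", re.compile(r"\b(director|head of)\b")),
    ("Manager", re.compile(r"\b(manager|lead)\b")),
]


def normalize_title(title: str) -> str:
    """Map a free-text job title to one of five seniority levels."""
    cleaned = re.sub(r"[^a-z ]+", " ", title.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    for level, pattern in SENIORITY_RULES:
        if pattern.search(cleaned):
            return level
    return "IC"


def to_e164(raw: str, default_country_code: str = "1") -> Optional[str]:
    """Convert a phone string to E.164, dropping any trailing extension."""
    # Cut trailing extensions such as "x22" or "ext. 22".
    main = re.split(r"(?:ext\.?|x)\s*\d+\s*$", raw.strip(), flags=re.IGNORECASE)[0]
    digits = re.sub(r"\D", "", main)
    if main.strip().startswith("+"):
        candidate = digits  # country code already present
    elif len(digits) == 10:
        candidate = default_country_code + digits  # assume a national number
    elif len(digits) == 11 and digits.startswith(default_country_code):
        candidate = digits
    else:
        return None  # ambiguous: flag for manual review instead of guessing
    # E.164 numbers carry at most 15 digits.
    if not candidate or len(candidate) > 15:
        return None
    return "+" + candidate


for t in ("VP Sales", "Vice President of Sales", "VP, Sales", "Account Executive"):
    print(t, "->", normalize_title(t))
for p in ("(415) 555-0100", "415.555.0100 ext. 22", "+44 20 7946 0958", "555-0100"):
    print(p, "->", to_e164(p))
```

Storing the original value alongside the normalized one makes it easy to audit or roll back a mapping that turns out to be wrong.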
Enrichment automation
Set up ongoing enrichment to prevent decay:
- Trigger-based — enrich when a record is created or updated
- Scheduled — monthly/quarterly batch enrichment of all records
- Decay-based — re-enrich records older than X days
- Event-based — re-enrich when a contact's company has a news event (funding, acquisition)
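The decay-based trigger reduces to a date comparison. The sketch below assumes each record carries a `last_enriched_at` date and an `opted_out` flag; both field names are hypothetical, and the 90-day default simply mirrors the quarterly refresh cadence used elsewhere in this skill. Opted-out contacts are skipped so that the selection never queues them for re-enrichment.

```python
from datetime import date, timedelta
from typing import Optional


def needs_reenrichment(record: dict, today: date, max_age_days: int = 90) -> bool:
    """Return True when a record is older than the allowed enrichment age."""
    if record.get("opted_out"):
        return False  # never re-enrich contacts who have opted out
    last: Optional[date] = record.get("last_enriched_at")
    # Records that were never enriched are treated as stale.
    return last is None or (today - last) > timedelta(days=max_age_days)


today = date(2024, 6, 30)
records = [
    {"id": 1, "last_enriched_at": date(2024, 1, 1)},
    {"id": 2, "last_enriched_at": date(2024, 5, 1)},
    {"id": 3, "last_enriched_at": None},
    {"id": 4, "last_enriched_at": date(2023, 1, 1), "opted_out": True},
]
stale_ids = [r["id"] for r in records if needs_reenrichment(r, today)]
print(stale_ids)
```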
Step 3 — Platform-specific guidance
In ZoomInfo (OperationsOS)
ZoomInfo OperationsOS is purpose-built for CRM data management at scale.
Deduplication:
- OperationsOS identifies duplicates across contacts, leads, and accounts using fuzzy matching
- Configurable match rules: email, name+company, phone, domain
- Bulk merge with configurable "winning record" rules
- Cross-object dedup: find leads that already exist as contacts
Data orchestration:
- Build automated workflows: new record → match existing → enrich → normalize → route
- Configure matching rules to prevent duplicates before they're created
- Set up enrichment triggers on record creation, field change, or scheduled interval
- Normalization rules for job titles, industries, company names
Data decay management:
- Auto-detect job changes and company changes
- Flag stale records based on last-enriched date
- Configure re-enrichment schedules (monthly recommended for active pipeline)
- Track data quality metrics over time
Setup: ZoomInfo admin → OperationsOS → Data Orchestration → Create Workflow.
In Salesforce (Native)
Duplicate Management:
- Setup → Duplicate Management → Duplicate Rules
- Standard rules: match on email, name+company, phone
- Custom matching rules for complex scenarios
- Block or alert on duplicate creation
Data.com Clean (legacy; applies only if your org is still licensed, since Salesforce has retired Data.com):
- Batch cleaning of contacts and accounts
- Auto-enrichment on record creation
- Scheduled cleanups
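Limitations: Native dedup is basic. Fuzzy matching is limited to the methods built into matching rules, cross-object matching covers only a few standard objects (such as leads against contacts), and there is no automated bulk merge. For enterprise-scale, pair with DemandTools or ZoomInfo OperationsOS.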
In HubSpot (Operations Hub)
Deduplication:
- Operations Hub includes AI-powered duplicate detection
- Suggests merge candidates with confidence scores
- Bulk merge with "primary record" selection
- Available on Operations Hub Professional+
Data quality automation:
- Programmable automation (Operations Hub Professional): custom code actions for normalization
- Data quality command center: monitor property completeness, formatting issues
- Automated formatting: capitalize names, standardize phone numbers, clean URLs
Limitations: HubSpot dedup is contact/company only — no custom object dedup. Formatting automation requires Operations Hub Professional ($800/mo+).
In Clay
- CRM enrichment & refresh: Import contacts/accounts from Salesforce, HubSpot, or Dynamics 365 into Clay tables. Run waterfall enrichment to fill missing fields and refresh stale data. Push updated records back via bidirectional sync.
- Automated data maintenance: Set up scheduled imports to regularly pull CRM records into Clay, re-enrich, and sync back. Keeps contact data fresh without manual effort.
- Duplicate detection: Use enrichment data (LinkedIn URLs, company domains, verified emails) to identify and flag duplicate records before syncing back to CRM.
- Data standardization: Use Sculptor workflows to normalize job titles, company names, industry classifications, and other fields. Apply consistent formatting before pushing to CRM.
- Plan gate: CRM sync requires Growth plan ($446-495/mo). Free/Launch users can export enriched data as CSV for manual CRM import.
- Best for: RevOps teams wanting to automate CRM data enrichment and cleanup on a recurring basis.
In LeanData / RingLead
LeanData:
- Lead-to-account matching (which leads belong to which accounts)
- Lead routing based on matching results
- Deduplication with merge automation
- Salesforce-native (runs inside SFDC)
RingLead (now ZoomInfo — acquired):
- Duplicate prevention on record creation
- Bulk dedup with configurable match rules
- Data normalization and standardization
- Works with Salesforce, HubSpot, Marketo
In DemandTools (Validity)
- Most powerful Salesforce dedup tool
- Scenario-based: build complex match/merge rules
- Mass operations: update, delete, deduplicate at scale
- Import management: clean data before it enters CRM
- Single Table Dedupe, Table-to-Table Dedupe (cross-object)
In Clearbit
Clearbit (now Breeze Intelligence in HubSpot) focuses on enrichment-driven data hygiene — standardizing and filling CRM records with firmographic and demographic data.
Enrichment-based cleanup:
- Enrich existing CRM records with standardized firmographic/demographic data
- Bulk enrichment via API or Breeze Intelligence in HubSpot
- Continuous data refresh: subscribe to enrichment updates when data changes (subscribe: true parameter); records stay current without manual re-enrichment
Normalization:
- Standardized industry codes (NAICS, GICS, SIC) — normalizes messy free-text industry fields
- Normalized role and seniority classifications — standardizes job titles across records into consistent categories
- Corporate hierarchy: parent company and ultimate parent domain fields help deduplicate subsidiary records
Data quality signals:
- emailProvider flag identifies personal email addresses (gmail, yahoo) vs business emails, useful for filtering low-quality records
- Tech stack data helps identify outdated technology fields in CRM
Best for: Filling missing fields, standardizing industries and titles, flagging personal emails, and keeping records fresh with continuous enrichment. Pair with a dedup tool (ZoomInfo, DemandTools) for full hygiene coverage.
Step 4 — Actionable guidance
Quick wins (do these first)
- Remove obvious duplicates — exact email match dedup is safe and fast
- Fix formatting — standardize phone numbers, capitalize names, normalize countries
- Fill critical gaps — bulk enrich records missing email or phone
- Remove dead records — hard bounces, invalid emails, disconnected phones
Ongoing hygiene program
- Prevent duplicates at entry — enable duplicate rules on record creation
- Enrich on create — auto-enrich new records within minutes of creation
- Monthly dedup sweep — run fuzzy match dedup monthly, review and merge
- Quarterly refresh — re-enrich all active records every 90 days
- Annual purge — remove records with no activity in 12+ months (archive, don't delete)
Metrics to track
- Duplicate rate — % of records with duplicates (target: <2%)
- Field completeness — % of critical fields filled (target: 95%+)
- Bounce rate — email bounce rate on outbound (target: <3%)
- Data age — median days since last enrichment (target: <90)
- Merge rate — duplicates merged per month (should trend down over time)
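Two of these metrics are simple enough to compute from an export. The sketch below is one possible reading of the definitions: duplicate rate counts records whose normalized email appears more than once (exact match only, so it understates fuzzy duplicates), and data age is the median number of days since `last_enriched_at`. Both field names are assumptions.

```python
from collections import Counter
from datetime import date
from statistics import median
from typing import Optional


def duplicate_rate(records: list) -> float:
    """Percentage of records sharing a normalized email with another record."""
    emails = [(r.get("email") or "").strip().lower() for r in records]
    counts = Counter(e for e in emails if e)
    dupes = sum(1 for e in emails if e and counts[e] > 1)
    return round(100 * dupes / len(records), 1) if records else 0.0


def median_data_age(records: list, today: date) -> Optional[float]:
    """Median days since last enrichment, ignoring never-enriched records."""
    ages = [(today - r["last_enriched_at"]).days
            for r in records if r.get("last_enriched_at")]
    return median(ages) if ages else None


today = date(2024, 6, 30)
records = [
    {"email": "ann@example.com", "last_enriched_at": date(2024, 6, 20)},
    {"email": "Ann@Example.com ", "last_enriched_at": date(2024, 3, 22)},
    {"email": "bob@example.com", "last_enriched_at": date(2024, 5, 21)},
    {"email": "", "last_enriched_at": None},
]
print(duplicate_rate(records), median_data_age(records, today))
```

Tracking both numbers on the same schedule as the monthly dedup sweep shows whether prevention at entry is working or cleanup is only keeping pace.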
Gotchas
- Merge before you enrich: enriching duplicate records wastes credits. Dedup first, then enrich the surviving records.
- Test dedup rules on a sample first: fuzzy matching can produce false positives (merging records that shouldn't be merged). Always review a sample of 50-100 merge candidates before running bulk operations.
- Preserve lead source on merge: the most common post-merge complaint is losing original lead source attribution. Configure merge rules to keep the oldest record's lead source.
- Archive, don't delete: instead of deleting stale records, move them to an archive status. Deleted records lose history; archived records can be reactivated if the contact returns.
- GDPR and compliance: data hygiene must respect opt-out and consent records. Never re-enrich a contact who has opted out. Check compliance status before any bulk enrichment operation.
Related skills
- /sales-clay: Clay platform help
- /sales-zoominfo: ZoomInfo platform help (for OperationsOS-specific setup)
- /sales-clearbit: Clearbit platform help (enrichment, reveal, prospector)
- /sales-enrich: enrichment strategy across all providers
- /sales-lead-routing: lead assignment and territory rules (often paired with dedup)
- /sales-lead-score: lead scoring models (depend on clean data)
- /sales-integration: connecting data tools to CRM
- /sales-prospect-list: building prospect lists (data quality at the source)
- /sales-do: Not sure which skill to use? The router matches any sales objective to the right skill. Install: npx skills add sales-skills/sales --skills sales-do
Examples
Example 1: CRM data audit
User says: "Our Salesforce has 50,000 contacts and I suspect a lot of them are duplicates or outdated. Where do I start?"
Skill does: Walks through the data quality audit framework — measure completeness, accuracy, duplication rate, and decay. Recommends starting with exact-match email dedup (safest), then running a field completeness report, then sampling 100 records against LinkedIn to estimate accuracy.
Result: User has a data quality scorecard and prioritized cleanup plan.
Example 2: Setting up ongoing hygiene
User says: "We keep getting duplicates in HubSpot and our data goes stale within months. How do we automate this?"
Skill does: Recommends HubSpot Operations Hub for dedup + ZoomInfo or Clay for enrichment. Sets up duplicate prevention rules on creation, auto-enrichment for new records, and a quarterly re-enrichment schedule.
Result: User has an automated hygiene program that prevents duplicates and keeps data fresh.
Example 3: Pre-campaign data cleanup
User says: "We're about to launch a big outbound campaign to 10,000 contacts. How do I make sure the data is clean first?"
Skill does: Recommends a pre-campaign checklist: dedup the list, verify emails with a dedicated verification tool, re-enrich records older than 90 days, remove contacts at companies that no longer fit ICP, and check opt-out/DNC status.
Result: User launches campaign with verified, deduplicated, compliant data — lower bounce rate, higher deliverability.
Troubleshooting
Dedup merging wrong records
Symptom: Fuzzy match dedup merged two different people who happen to have similar names at the same company
Cause: Match rules too loose — matching on name + company without additional criteria
Solution: Tighten match rules: require email OR phone match in addition to name + company. Always run in "review" mode before "auto-merge" mode. Add title or department as a tiebreaker.
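A tightened rule of this kind might look like the sketch below: name and company must be similar, and at least one hard identifier (email or phone) must match exactly. The similarity measure uses Python's standard `difflib.SequenceMatcher`, and the 0.85 threshold is an arbitrary assumption to tune against a reviewed sample. Phones are assumed to be normalized already (for example to E.164).

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy string comparison on lowercased, trimmed values."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


def is_duplicate(a: dict, b: dict) -> bool:
    """Name + company must be similar AND email or phone must match exactly."""
    if not (similar(a["name"], b["name"]) and similar(a["company"], b["company"])):
        return False
    email_a = (a.get("email") or "").strip().lower()
    email_b = (b.get("email") or "").strip().lower()
    same_email = bool(email_a) and email_a == email_b
    same_phone = bool(a.get("phone")) and a.get("phone") == b.get("phone")
    return same_email or same_phone


jon = {"name": "Jon Smith", "company": "Acme", "email": "jon.smith@acme.example"}
john = {"name": "John Smith", "company": "Acme", "email": "john.smith@acme.example"}
jon_again = {"name": "John Smith", "company": "Acme", "email": "Jon.Smith@acme.example"}

print(is_duplicate(jon, john))       # similar names, different emails
print(is_duplicate(jon, jon_again))  # similar names, same email
```

Pairs that pass the name and company check but fail the identifier check are good candidates for a manual review queue rather than automatic merging.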
Enrichment not filling expected fields
Symptom: Auto-enrichment runs but many records still have empty phone or email fields
Cause: Single enrichment provider doesn't have coverage for all contacts. Coverage varies by geography, seniority, and industry.
Solution: Implement waterfall enrichment. Try Provider A; if it returns no result, try Provider B, then Provider C. Use /sales-enrich for waterfall setup. Common waterfall: ZoomInfo → Apollo → Lusha.
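The waterfall pattern itself is independent of any vendor. In the sketch below each provider is a plain function that takes a record and returns a dictionary of fields; these are hypothetical stand-ins, not the real ZoomInfo, Apollo, or Lusha APIs. Providers are tried in order, only still-missing fields are filled, and the loop stops as soon as nothing is missing so later providers are not called (and not billed).

```python
from typing import Callable, Dict, List, Tuple

# A provider is any callable that takes a record and returns found fields.
Provider = Callable[[dict], dict]


def waterfall_enrich(
    record: dict,
    providers: List[Tuple[str, Provider]],
    wanted: Tuple[str, ...] = ("email", "phone"),
) -> Tuple[dict, Dict[str, str]]:
    """Fill missing fields from providers in priority order."""
    enriched = dict(record)
    sources: Dict[str, str] = {}
    for name, lookup in providers:
        missing = [f for f in wanted if not enriched.get(f)]
        if not missing:
            break  # everything filled: skip the remaining providers
        try:
            result = lookup(enriched) or {}
        except Exception:
            continue  # sketch only: treat a provider failure as "no result"
        for field in missing:
            if result.get(field):
                enriched[field] = result[field]
                sources[field] = name  # remember where each value came from
    return enriched, sources


# Hypothetical stub providers for demonstration.
def provider_a(record: dict) -> dict:
    return {"email": "ann@acme.example"}


def provider_b(record: dict) -> dict:
    return {"email": "other@acme.example", "phone": "+14155550100"}


enriched, sources = waterfall_enrich(
    {"name": "Ann Lee", "company": "Acme"},
    [("provider_a", provider_a), ("provider_b", provider_b)],
)
print(enriched, sources)
```

Recording the source per field makes it possible to measure each provider's actual coverage and reorder the waterfall accordingly.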
Data quality metrics not improving
Symptom: Running monthly dedup and enrichment but duplicate rate and completeness aren't improving
Cause: New duplicates are being created faster than they're being merged. Root cause is usually web forms, imports, or integrations creating records without duplicate checks.
Solution: Fix the source — enable duplicate prevention rules on all record creation paths (web forms, API imports, manual creation, integration syncs). Prevention is more effective than cleanup.