crm-data-quality

Original🇺🇸 English
Translated

Find incomplete records, normalize field values in bulk, dedupe with `hubspot objects merge`, and audit custom properties. Builds on `bulk-operations` for JSONL piping and dry-run/digest/confirm.

3installs
Added on

NPX Install

npx skill4agent add hubspot/agent-cli-skills crm-data-quality
Read
bulk-operations/SKILL.md
first — JSONL piping, batch read, pagination, and dry-run/digest/confirm gating apply to every command below.

Property discovery

Don't guess property names. List them:
bash
hubspot properties list --type contacts --format table
hubspot properties list --type contacts | jq -c 'select(.type=="enumeration") | {name, label}'
Same for
--type companies
,
deals
, or any custom type (
hubspot objects types
).

1. Find incomplete records

!name
= NOT_HAS_PROPERTY (missing or empty). Bare
name
= HAS_PROPERTY. Within one
--filter
, chain with
AND
; multiple
--filter
flags are OR'd.
bash
hubspot objects search --type contacts --filter "!email" --properties firstname,lastname,company
hubspot objects search --type contacts --filter "!phone AND !mobilephone" --properties email
hubspot objects search --type contacts --filter "!hubspot_owner_id" --properties email,lifecyclestage
For >100 results, use the pagination loop from
bulk-operations
.

2. Normalize field values

Search → reshape with
jq
→ pipe into
update
. Always
--dry-run
first;
bulk-operations
covers digest/confirm escalation for >100 rows. Reshape patterns:
bulk-operations/resources/json-patterns.md
.
bash
# Collapse spellings into one canonical value
hubspot objects search --type contacts --filter "company~acme" \
| jq -c '{id, properties:{company:"Acme Corporation"}}' \
| hubspot objects update --type contacts --dry-run

# Lowercase emails (read, reshape, write)
hubspot objects search --type contacts --filter "email" --properties email \
| jq -c '{id, properties:{email: (.properties.email | ascii_downcase)}}' \
| hubspot objects update --type contacts --dry-run

3. Dedupe with
hubspot objects merge

Secondary is folded into primary and deleted. Irreversible. Dry-run/digest/confirm gating applies.
bash
# Single pair
hubspot objects merge --type contacts --primary 149 --secondary 425 --dry-run
hubspot objects merge --type contacts --primary 149 --secondary 425   # execute (≤100 pairs)
Bulk: pipe JSONL
{"primary":"...","secondary":"..."}
on stdin (omit
--primary
/
--secondary
).
Pagination required.
objects search
caps at 100 rows per call and
jq -s
slurps a single stream into memory — running the snippet below against a raw
search
will silently miss every duplicate that crosses a page boundary. Collect the full set first with the pagination loop from
bulk-operations/SKILL.md
(write to
/tmp/contacts.jsonl
), then dedupe from the file:
bash
# /tmp/contacts.jsonl produced by the pagination loop (bulk-operations/SKILL.md)
jq -s -c '
    group_by(.properties.email)[]
    | select(length > 1)
    | sort_by(.id | tonumber)
    | .[0].id as $p | .[1:][] | {primary: $p, secondary: .id}
  ' /tmp/contacts.jsonl \
| hubspot objects merge --type contacts --dry-run | tee /tmp/merge-preview.jsonl
For >100 pairs, lift
digest
and
impact.records_affected
from the
BulkData
line and re-pipe the same producer with
--digest
/
--confirm
(see
bulk-operations
).

4. Audit properties

hubspot properties list
(and
get
,
batch-read
) emits
{name, label, type, fieldType, groupName}
per row. Enum option values are not currently exposed by the CLI — read them off a real record (
hubspot objects search ... --properties <enum>
) or the HubSpot UI.
bash
# Count properties per group (HubSpot groups standard fields; custom groups stand out)
hubspot properties list --type contacts | jq -rs 'group_by(.groupName) | map({group: .[0].groupName, count: length}) | .[]'

# All enumeration properties
hubspot properties list --type contacts | jq -c 'select(.type=="enumeration") | {name, label, fieldType}'

# Create a DQ flag property, then set it via the normalize pattern in section 2
hubspot properties create --type contacts --name dq_missing_phone --label "DQ: Missing Phone" --prop-type string --field-type text

Recovery

Merge is irreversible. After any merge,
hubspot history --since 1h
captures the audit trail. If wrong direction, restore the secondary from the UI's recycle bin.