# Arize Annotation Skill
This skill focuses on annotation configs (the schema for human feedback) and on programmatically annotating project spans via the Python SDK. Human review in the Arize UI (including annotation queues, datasets, and experiments) still depends on these configs; there is no CLI for queues yet.
Scope: Human labeling in Arize attaches values defined by configs to spans, dataset examples, experiment-related records, and queue items in the product UI. What is documented here: annotation config CRUD via the `ax` CLI, and bulk span updates with `ArizeClient.spans.update_annotations`.
## Prerequisites
Three things are needed: the `ax` CLI, an API key (env var or profile), and a space ID. If `ax` is not installed, not on PATH, or below the required version, see ax-setup.md.
macOS/Linux:

```bash
ax --version && echo "--- env ---" && if [ -n "$ARIZE_API_KEY" ]; then echo "ARIZE_API_KEY: (set)"; else echo "ARIZE_API_KEY: (not set)"; fi && echo "ARIZE_SPACE_ID: ${ARIZE_SPACE_ID:-(not set)}" && echo "--- profiles ---" && ax profiles show 2>&1
```
Windows (PowerShell):

```powershell
ax --version; Write-Host "--- env ---"; Write-Host "ARIZE_API_KEY: $(if ($env:ARIZE_API_KEY) { '(set)' } else { '(not set)' })"; Write-Host "ARIZE_SPACE_ID: $env:ARIZE_SPACE_ID"; Write-Host "--- profiles ---"; ax profiles show 2>&1
```
Proceed immediately if an env var or a profile provides an API key. Ask the user only if both are missing.
- No API key in env and no profile → AskQuestion: "Arize API key (https://app.arize.com/admin > API Keys)"
- Space ID unknown → list the accessible spaces via the CLI, or AskQuestion
## Concepts
### What is an Annotation Config?
An annotation config defines the schema for a single type of human feedback label. Before anyone can annotate a span, dataset record, experiment output, or queue item, a config must exist for that label in the space.
| Field | Description |
|---|---|
| Name | Descriptive identifier (e.g. `Correctness`, `Quality Score`). Must be unique within the space. |
| Type | `categorical` (pick from a list), `continuous` (numeric range), or `freeform` (free text). |
| Values | For categorical: array of `{"label": str, "score": number}` pairs. |
| Min/Max Score | For continuous: numeric bounds. |
| Optimization Direction | Whether higher scores are better (`maximize`) or worse (`minimize`). Used to render trends in the UI. |
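As a mental model, the three config types can be sketched as plain dicts. The field names below mirror the table above; this is an illustrative sketch, not the Arize API's actual wire format:

```python
# Illustrative sketches of the three config shapes -- NOT the API wire format.
correctness = {
    "name": "Correctness",
    "type": "categorical",
    "values": [{"label": "correct", "score": 1}, {"label": "incorrect", "score": 0}],
    "optimization_direction": "maximize",
}
quality = {
    "name": "Quality Score",
    "type": "continuous",
    "minimum_score": 0,
    "maximum_score": 10,
    "optimization_direction": "maximize",
}
notes = {"name": "Reviewer Notes", "type": "freeform"}

def labels(config):
    """Return the selectable labels for a categorical config, else []."""
    return [v["label"] for v in config.get("values", [])]

print(labels(correctness))  # ['correct', 'incorrect']
print(labels(notes))        # []
```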
### Where labels get applied (surfaces)
| Surface | Typical path |
|---|---|
| Project spans | Python SDK (below) and/or the Arize UI |
| Dataset examples | Arize UI (human labeling flows); configs must exist in the space |
| Experiment outputs | Often reviewed alongside datasets or traces in the UI — see arize-experiment, arize-dataset |
| Annotation queue items | Arize UI; configs must exist — no queue commands documented here yet |
Always ensure the relevant annotation config exists in the space before expecting labels to persist.
## Basic CRUD: Annotation Configs

### List

```bash
ax annotation-configs list --space-id SPACE_ID
ax annotation-configs list --space-id SPACE_ID -o json
ax annotation-configs list --space-id SPACE_ID --limit 20
```
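If a script needs a config ID by name, one approach is to parse the `-o json` output. The JSON shape assumed here (an array of objects with `id` and `name` keys) is a guess; verify it against your CLI version's actual output first:

```python
import json

def find_config_id(list_json: str, name: str):
    """Return the id of the config whose name matches, or None.

    ASSUMPTION: `ax annotation-configs list ... -o json` emits a JSON
    array of objects with "id" and "name" keys. Check your CLI's real
    output shape before relying on this.
    """
    for cfg in json.loads(list_json):
        if cfg.get("name") == name:
            return cfg.get("id")
    return None

# Hypothetical output used for illustration only
sample = '[{"id": "cfg_123", "name": "Correctness"}, {"id": "cfg_456", "name": "Reviewer Notes"}]'
print(find_config_id(sample, "Correctness"))  # cfg_123
```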
### Create — Categorical

Categorical configs present a fixed set of labels for reviewers to choose from.

```bash
ax annotation-configs create \
  --name "Correctness" \
  --space-id SPACE_ID \
  --type categorical \
  --values '[{"label": "correct", "score": 1}, {"label": "incorrect", "score": 0}]' \
  --optimization-direction maximize
```
Common binary label pairs include `correct`/`incorrect`, `pass`/`fail`, and `thumbs_up`/`thumbs_down`.
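Rather than hand-writing the `--values` JSON, it can be generated. The helper below is a convenience sketch (the pair names are conventions, not required by Arize):

```python
import json

def binary_values(positive: str, negative: str) -> str:
    """Build the --values JSON for a two-label categorical config
    (the positive label scores 1, the negative label scores 0)."""
    return json.dumps([
        {"label": positive, "score": 1},
        {"label": negative, "score": 0},
    ])

print(binary_values("correct", "incorrect"))
# [{"label": "correct", "score": 1}, {"label": "incorrect", "score": 0}]
```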
### Create — Continuous

Continuous configs let reviewers enter a numeric score within a defined range.

```bash
ax annotation-configs create \
  --name "Quality Score" \
  --space-id SPACE_ID \
  --type continuous \
  --minimum-score 0 \
  --maximum-score 10 \
  --optimization-direction maximize
```
### Create — Freeform

Freeform configs collect open-ended text feedback. No additional flags are needed beyond name, space, and type.

```bash
ax annotation-configs create \
  --name "Reviewer Notes" \
  --space-id SPACE_ID \
  --type freeform
```
### Get

```bash
ax annotation-configs get ANNOTATION_CONFIG_ID
ax annotation-configs get ANNOTATION_CONFIG_ID -o json
```
### Delete

```bash
ax annotation-configs delete ANNOTATION_CONFIG_ID
ax annotation-configs delete ANNOTATION_CONFIG_ID --force  # skip confirmation
```
Note: Deletion is irreversible. Deleting a config also removes its associations with any annotation queues in the product (the queues themselves remain; fix their config associations in the Arize UI if needed).
## Applying Annotations to Spans (Python SDK)
Use the Python SDK to bulk-apply annotations to project spans when you already have labels (e.g., from a review export or an external labeling tool).
```python
import pandas as pd
from arize import ArizeClient

client = ArizeClient(api_key="your-api-key")

# Build a DataFrame with annotation columns.
# Required: context.span_id plus at least one
# annotation.<name>.label or annotation.<name>.score
annotations_df = pd.DataFrame([
    {
        "context.span_id": "span_001",
        "annotation.Correctness.label": "correct",
        "annotation.Correctness.updated_by": "reviewer@example.com",
    },
    {
        "context.span_id": "span_002",
        "annotation.Correctness.label": "incorrect",
        "annotation.Correctness.updated_by": "reviewer@example.com",
    },
])

response = client.spans.update_annotations(
    space_id="your-space-id",
    project_name="your-project",
    dataframe=annotations_df,
    validate=True,
)
```
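When labels come from an export (for example, a list of span ID / label pairs from an external labeling tool), the rows can be built programmatically before handing them to pandas. This is a stdlib-only sketch using the documented column names; the reviewer email is a placeholder:

```python
import time

def annotation_rows(labeled_spans, annotation_name, reviewer):
    """Turn (span_id, label) pairs into row dicts matching the
    annotation column schema used by update_annotations."""
    now_ms = int(time.time() * 1000)  # updated_at is ms since epoch
    return [
        {
            "context.span_id": span_id,
            f"annotation.{annotation_name}.label": label,
            f"annotation.{annotation_name}.updated_by": reviewer,
            f"annotation.{annotation_name}.updated_at": now_ms,
        }
        for span_id, label in labeled_spans
    ]

rows = annotation_rows(
    [("span_001", "correct"), ("span_002", "incorrect")],
    "Correctness",
    "reviewer@example.com",  # placeholder identity
)
# Then upload via: client.spans.update_annotations(..., dataframe=pd.DataFrame(rows))
print(rows[0]["annotation.Correctness.label"])  # correct
```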
DataFrame column schema:

| Column | Required | Description |
|---|---|---|
| `context.span_id` | yes | The span to annotate |
| `annotation.<name>.label` | one of | Categorical or freeform label |
| `annotation.<name>.score` | one of | Numeric score |
| `annotation.<name>.updated_by` | no | Annotator identifier (email or name) |
| `annotation.<name>.updated_at` | no | Timestamp in milliseconds since epoch |
| `annotation.notes` | no | Freeform notes on the span |
Limitation: Annotations apply only to spans within 31 days prior to submission.
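A lightweight pre-flight check can catch schema mistakes before a bulk upload. This is a local sanity check of my own devising, not part of the SDK (`validate=True` performs the authoritative validation):

```python
def check_row(row: dict) -> list:
    """Return a list of problems with one annotation row (empty means OK)."""
    problems = []
    # context.span_id is required for every row
    if not row.get("context.span_id"):
        problems.append("missing context.span_id")
    # at least one annotation.<name>.label or annotation.<name>.score is required
    has_value = any(
        key.startswith("annotation.") and (key.endswith(".label") or key.endswith(".score"))
        for key in row
    )
    if not has_value:
        problems.append("needs at least one annotation.<name>.label or .score")
    return problems

good = {"context.span_id": "span_001", "annotation.Correctness.label": "correct"}
bad = {"annotation.Correctness.updated_by": "reviewer@example.com"}
print(check_row(good))  # []
print(check_row(bad))   # two problems reported
```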
## Troubleshooting

| Problem | Solution |
|---|---|
| `ax` missing or outdated | See ax-setup.md |
| Space not found / access denied | API key may not have access to this space. Verify at https://app.arize.com/admin > API Keys |
| Annotation config not found | `ax annotation-configs list --space-id SPACE_ID` |
| Create fails with a duplicate name | Name already exists in the space. Use a different name or get the existing config ID. |
| Human review / queues in UI | Use the Arize app and ensure configs exist; there is no annotation-queue CLI yet |
| Span SDK errors or missing spans | Confirm `space_id`, `project_name`, and span IDs; use arize-trace to export spans |
## Related Skills
- arize-trace: Export spans to find span IDs and time ranges
- arize-dataset: Find dataset IDs and example IDs
- arize-evaluator: Automated LLM-as-judge alongside human annotation
- arize-experiment: Experiments tied to datasets and evaluation workflows
- arize-link: Deep links to annotation configs and queues in the Arize UI
## Save Credentials for Future Use
At the end of the session, if the user manually provided any credentials during this conversation and those values were NOT already loaded from a saved profile or environment variable, offer to save them.
Skip this entirely if:
- The API key was already loaded from an existing profile or env var
- The space ID was already set via env var
How to offer: Use AskQuestion: "Would you like to save your Arize credentials so you don't have to enter them next time?" with options Yes / No.
If the user says yes:
- API key: See ax-profiles.md. Run `ax profiles show` to check the current state, then use the profile commands described there with the appropriate flags to save the key (and region if relevant).
- Space ID: See ax-profiles.md (Space ID section) to persist it as an environment variable.