arize-compliance-audit


Arize Compliance Audit Skill


Use this skill when the user wants to audit their AI agent or LLM application for regulatory compliance. The skill scans the codebase for compliance gaps, cross-references Arize instrumentation for audit trail coverage, and produces a tailored checklist with optional remediation.
Triggers: "audit my app for compliance", "EU AI Act requirements", "NIST AI RMF checklist", "GDPR for AI", "is my AI app compliant", "compliance checklist", "regulatory audit", "ISO 42001", "AI management system", "AIMS certification".

Disclaimer


Before doing anything else, present this disclaimer verbatim to the user:

⚠️ Legal disclaimer
This audit is for guidance only and does not constitute legal advice or a complete compliance assessment. It identifies common technical patterns and gaps based on publicly available regulatory frameworks, but cannot assess your organisation's specific legal obligations, contractual commitments, data processing agreements, or operational processes.
Do not rely on this output as a substitute for qualified legal counsel. Regulatory compliance is a complex, jurisdiction-specific, and fact-dependent determination. Always engage a qualified attorney or compliance specialist for binding assessments.


Core principles


  • Prefer inspection over mutation — understand the codebase before suggesting changes.
  • Be practical, not legal — produce developer-actionable items, not legal opinions.
  • Tailor to jurisdiction and use case — a chatbot has different obligations than a hiring tool. Do not dump the entire regulatory framework.
  • Cross-reference instrumentation — compliance requires audit trails; check whether Arize tracing captures what regulators expect.
  • Offer remediation, always confirm — after presenting the checklist, offer to implement specific fixes, but never modify code without explicit user confirmation.
  • Keep output concise and production-focused — do not generate extra documentation or summary files unless requested.
  • Never embed literal credential values — always reference environment variables.

Phase 0: Framework selection and use case


Before scanning code, determine which compliance frameworks apply.

Step 1 — Framework selection


Use the `AskUserQuestion` tool to ask the user which frameworks apply. Do not infer or auto-select — always ask explicitly.

Ask:

Which compliance frameworks should this audit cover?
Select all that apply (reply with numbers, e.g. "1, 3"):

1. EU frameworks — EU AI Act, GPAI Code of Practice, GDPR
   (choose if end-users or data subjects are located in the EU)

2. US frameworks — NIST AI RMF, state laws (Colorado AI Act, NYC LL144),
   HIPAA (if processing health data)
   (choose if operating in the United States)

3. ISO 42001 — International AI Management System standard
   (choose if pursuing ISO 42001 certification, operating globally,
   or wanting an internationally recognised baseline)

You can select any combination. If unsure, select all that seem relevant
and we can narrow down during the audit.
Based on the selection:
  • 1 selected — EU AI Act, GPAI Code of Practice, GDPR apply. See references/eu-ai-act-gpai.md.
  • 2 selected — NIST AI RMF, Colorado AI Act, NYC LL144, HIPAA may apply. See references/us-ai-compliance.md.
  • 3 selected — ISO 42001 AIMS controls apply. See references/iso-42001.md. Note: ISO 42001 is an organisational management system — the audit will cover technically-auditable controls only; purely organisational clauses (leadership review, internal audits) are flagged separately.
  • Multiple selected — all selected frameworks apply; the audit covers the union of requirements, with cross-references where frameworks overlap.

Step 2 — Determine use case category


Use the `AskUserQuestion` tool to ask: What does your AI application do?
  • General chatbot / assistant — Limited risk (EU), general obligations (US)
  • Hiring / HR — High risk (EU Art. 6, Annex III); Colorado AI Act applies; NYC LL144 applies if NYC
  • Healthcare — High risk (EU); HIPAA applies if processing PHI
  • Credit / financial — High risk (EU); Colorado AI Act applies
  • Education — High risk (EU)
  • Content generation — Limited risk (EU Art. 50 transparency); general obligations (US)
  • GPAI model provider — GPAI Code of Practice applies (EU)

Step 3 — Determine risk tier


Based on the use case and selected frameworks:
  • EU selected: Classify as Unacceptable / High / Limited / Minimal per references/eu-ai-act-gpai.md
  • US selected: Classify as High-risk (consequential decisions per Colorado AI Act) or General
  • ISO 42001 selected: Risk tier is not a formal classification in ISO 42001, but note whether the system is high-stakes (which elevates the priority of impact assessment and bias controls)

Phase 0 output


Present a brief summary:
Frameworks selected: {EU / US / ISO 42001 / combination}
Use case:            {category}
Risk tier:           {EU tier if applicable} / {US tier if applicable}
Applicable:          {list of specific regulations and standards}
ISO 42001 note:      {if selected} Audit covers technically-auditable controls only;
                     organisational clauses will be flagged but not code-audited.
Then proceed directly to Phase 1.

Phase 1: Codebase audit (read-only)


Do not write any code or create any files during this phase.
Systematically scan the codebase for evidence of compliance and gaps across seven domains. For each domain, run the listed searches and record findings.

A. Transparency and disclosure


What to look for:
  • User-facing strings disclosing AI involvement: search for terms like `AI`, `artificial intelligence`, `automated`, `bot`, `machine learning`, `generated by`, `powered by` in UI templates, API responses, and user-facing code
  • Content labelling: markers on AI-generated output (text, images, audio)
  • Terms of service, privacy policy references in the codebase

Signals of concern: Absence of any AI disclosure in user-facing code, especially if the application generates content or makes recommendations.
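The disclosure search above can be sketched as a simple scan. This is a minimal, stdlib-only illustration of the kind of matching this step implies (the function name is hypothetical; a real audit would walk the repository and apply this per file):

```python
import re

# Terms from the transparency checklist above; word boundaries keep
# "AI" from matching inside words such as "maintain".
DISCLOSURE_PATTERN = re.compile(
    r"\b(AI|artificial intelligence|automated|bot|machine learning|"
    r"generated by|powered by)\b",
    re.IGNORECASE,
)

def find_disclosures(source: str) -> list[str]:
    """Return the disclosure-like terms found in a user-facing source string."""
    return [m.group(0) for m in DISCLOSURE_PATTERN.finditer(source)]
```

An empty result across all user-facing templates and API responses is the "signal of concern" described above, not proof of non-compliance by itself.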

B. Data protection and privacy


What to look for:
  • PII field names in code: `email`, `phone`, `ssn`, `social_security`, `date_of_birth`, `address`, `name` in prompts, context, or retrieved documents
  • PII in trace span attributes: check if `input.value` or `output.value` could contain personal data sent to Arize without redaction
  • Consent mechanisms: `consent`, `opt-in`, `opt-out`, `gdpr`, `ccpa` references
  • DPIA or privacy assessment references
  • Data retention and deletion handlers
  • Data subject rights: `right_to_access`, `right_to_erasure`, `data_subject_request`, `data_protection_officer`
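The PII field-name search can be expressed as one pattern built from the list above. A minimal sketch, assuming snippet-by-snippet scanning (the helper name is illustrative):

```python
import re

# Field names from the data-protection checklist above.
PII_FIELDS = [
    "email", "phone", "ssn", "social_security",
    "date_of_birth", "address", "name",
]
PII_PATTERN = re.compile(r"\b(" + "|".join(PII_FIELDS) + r")\b", re.IGNORECASE)

def pii_fields_in(snippet: str) -> set[str]:
    """Return the PII field names referenced in a code or prompt snippet."""
    return {m.group(0).lower() for m in PII_PATTERN.finditer(snippet)}
```

Hits in prompt-building code or span-attribute assignments are the places to check for missing redaction.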

C. Security


What to look for:
  • Prompt injection defences: input validation, guardrail libraries (`guardrails-ai`, `nemo-guardrails`, `rebuff`, `lakera`), content filtering, system prompt protection
  • Data loss prevention: output scanning before returning to users, sensitive data detection
  • Tool/function calling controls: permission boundaries, allowlists, sandboxing for tool execution
  • Rate limiting and authentication on AI endpoints
  • Hardcoded secrets: `api_key`, `secret`, `password`, `token` literals in source files (not env var references)
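The hardcoded-secret check hinges on distinguishing a quoted literal from an environment-variable read. A rough, stdlib-only sketch of that distinction (the regexes and function name are illustrative heuristics, not a complete secret scanner):

```python
import re

# A secret-looking assignment: a flagged key name assigned a quoted literal
# of plausible length.
SECRET_ASSIGNMENT = re.compile(
    r"\b\w*(?:api_key|secret|password|token)\w*\s*[=:]\s*[\"'][^\"']{8,}[\"']",
    re.IGNORECASE,
)
ENV_REFERENCE = re.compile(r"os\.(environ|getenv)")

def flag_hardcoded_secret(line: str) -> bool:
    """True if the line assigns a secret-like name a string literal
    and is not reading it from the environment."""
    return bool(SECRET_ASSIGNMENT.search(line)) and not ENV_REFERENCE.search(line)
```

For production scanning, a dedicated tool (e.g. a secret scanner in CI) is the right remediation; this only shows the pattern the audit looks for.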

D. Testing and evaluation


What to look for:
  • Bias and fairness testing: references to demographic parity, impact ratios, fairness metrics
  • Red teaming or adversarial test suites: prompt injection tests, jailbreak tests
  • Evaluation frameworks: Arize evaluators, custom eval scripts, `pytest`-based evals, experiment infrastructure
  • A/B testing or model comparison infrastructure

E. Documentation


What to look for:
  • Model cards: `MODEL_CARD.md`, `model_card.json`, `model_card.yaml`, or similar
  • System architecture documentation
  • Change logs or version tracking for prompts and model updates
  • Incident response documentation

F. Monitoring and observability


What to look for:
  • Arize tracing setup: `arize-otel`, `register()`, `TracerProvider`, `opentelemetry`, `openinference` imports
  • If tracing exists, check coverage:
    • All LLM calls traced (not just some)
    • Session IDs for conversation continuity
    • User IDs for data subject request support
    • Error tracking and exception spans
  • Alerting and drift detection configuration
  • Trace retention configuration
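One way to approximate the "all LLM calls traced" coverage check is to compare files that import an LLM client with files that show any tracing setup. A minimal sketch under that assumption (module names and the helper are illustrative; real coverage needs call-level inspection, not just imports):

```python
import re

LLM_CLIENT = re.compile(r"^\s*(?:from|import)\s+(openai|anthropic|cohere)\b", re.M)
TRACING = re.compile(r"arize|opentelemetry|openinference", re.I)

def untraced_llm_files(sources: dict[str, str]) -> list[str]:
    """Given {path: source}, return paths that import an LLM client
    but show no sign of tracing instrumentation."""
    return sorted(
        path for path, src in sources.items()
        if LLM_CLIENT.search(src) and not TRACING.search(src)
    )
```

Any path this flags is a candidate audit-trail gap to confirm manually in Phase 1 output.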

G. Vendor management


What to look for:
  • Third-party AI API usage: OpenAI, Anthropic, Google, Azure, Bedrock, Cohere imports or client instantiation
  • Model versioning: are specific model versions pinned (e.g., `gpt-4-0613`) or using `latest` / unversioned identifiers
  • Fallback and failover logic between providers
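The pinned-versus-unpinned distinction can be approximated by checking for an explicit date or version suffix. A rough heuristic sketch (the suffix pattern covers common formats like `-0613` and `-2024-08-06`; other providers use different schemes, so treat misses as prompts for manual review):

```python
import re

# A trailing 4-digit or date-style suffix suggests a pinned model version.
PINNED = re.compile(r"-\d{4}(-\d{2}-\d{2})?$")

def is_pinned(model_id: str) -> bool:
    """Heuristic: does the model identifier carry an explicit version suffix?"""
    return model_id != "latest" and bool(PINNED.search(model_id))
```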

Phase 1 output


Present a two-part report:

Part 1 — Summary table

| Domain | Evidence found | Gaps identified | Rating |
|---|---|---|---|
| A. Transparency | {findings} | {gaps} | Compliant / Partial / Non-compliant / N/A |
| B. Data protection | {findings} | {gaps} | ... |
| C. Security | {findings} | {gaps} | ... |
| D. Testing | {findings} | {gaps} | ... |
| E. Documentation | {findings} | {gaps} | ... |
| F. Monitoring | {findings} | {gaps} | ... |
| G. Vendor management | {findings} | {gaps} | ... |

Part 2 — Gap detail (required for every Non-compliant or Partial rating)

For each domain rated Non-compliant or Partial, write a dedicated subsection that includes:
  1. The exact code path — file path(s), line number(s), and the relevant code snippet showing where the gap exists. Do not paraphrase; quote the actual code.
  2. Why it matters in this specific app — explain the concrete risk in the context of this codebase (e.g. which tools could be abused, which data flows are exposed, what an attacker or regulator would find).
  3. What is missing — a precise description of the control or code that should exist but does not (e.g. "a span attribute processor that hashes `user_email` before the OTLP exporter fires", not just "add PII redaction").

Minimum one subsection per Non-compliant/Partial domain. Do not omit this section — it is the primary value of the audit for engineering teams.

Then proceed directly to Phase 2.

Phase 2: Compliance checklist


Using the Phase 1 findings and the template in references/compliance-checklist-template.md, generate a tailored compliance checklist.

Rules for checklist generation


  1. Only include relevant sections. If the user is US-only, skip GDPR-specific items. If not healthcare, skip HIPAA. If not hiring in NYC, skip LL144.
  2. Mark items from Phase 1. Items where evidence was found: mark as Compliant. Items with gaps: mark as Non-compliant with a concrete remediation suggestion.
  3. Prioritise correctly. Critical = enforcement risk or system prohibition. High = required by regulation. Medium = recommended by framework. Low = best practice.
  4. Be specific in remediation. Instead of "implement input validation", say "add a guardrail library like `guardrails-ai` to validate LLM inputs and outputs against your content policy".
  5. Include the instrumentation cross-reference table from the template. If Arize tracing is not set up, flag this as a Critical gap — audit trails are required by EU Art. 12 and NIST MAN-2.1.

Final report


Present a single consolidated report with four sections:
Section 1 — Audit scope (Phase 0 summary)
  • Frameworks selected, use case, risk tier, applicable regulations
Section 2 — Codebase findings (Phase 1 summary table)
  • The domain table (A–G) with evidence, gaps, and ratings
Section 3 — Gap detail (Phase 1 expanded)
  • One subsection per Non-compliant or Partial domain, each containing: exact file paths and line numbers, quoted code snippets, app-specific risk explanation, and a precise description of what is missing. This section is mandatory — never omit it.
Section 4 — Compliance checklist (Phase 2)
  • The tailored checklist with status and remediation suggestions, instrumentation cross-reference table, priority summary, and recommended next steps
When the user asks for a report file, write a single markdown file to `/tmp/<app-name>-compliance-audit-<YYYY-MM-DD>.md` containing all four sections.
After presenting the report, offer Phase 3 remediation.

Phase 3: Remediation (optional)


After presenting the checklist, offer to implement specific fixes. Always use the `AskUserQuestion` tool to confirm before making any changes.

Remediation categories


Add dependencies — offer to install:
  • Guardrail libraries for input/output validation (e.g., `guardrails-ai`, `nemo-guardrails`)
  • PII detection/redaction packages (e.g., `presidio-analyzer`, `scrubadub`)
  • Content safety classifiers

Insert code — offer to add:
  • AI disclosure strings in user-facing output (templates, API responses)
  • PII redaction filters on span attributes before export to Arize
  • Input validation/sanitisation on AI endpoints
  • User ID attributes on trace spans for data subject request support

Create documentation templates — offer to scaffold:
  • Model card template (markdown file with standard sections)
  • Incident response plan template
  • Data processing record template

Configure monitoring — offer to set up via related skills:
  • Arize evaluators for bias detection and content safety (via `arize-evaluator` skill)
  • Tracing for audit trail coverage (via `arize-instrumentation` skill)
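One concrete shape for the "PII redaction filters on span attributes" item is a function that replaces email-like values with a short, stable hash before the value reaches an exporter. This is a stdlib-only sketch; in a real setup the logic would live in an OpenTelemetry span processor, and the regex and function name here are illustrative:

```python
import hashlib
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_emails(value: str) -> str:
    """Replace each email address with a short, stable hash so traces
    stay correlatable without carrying the raw address."""
    def _hash(m: re.Match) -> str:
        digest = hashlib.sha256(m.group(0).lower().encode()).hexdigest()[:12]
        return f"<email:{digest}>"
    return EMAIL.sub(_hash, value)
```

Because the hash is deterministic, the same user still correlates across spans, which preserves the audit-trail value while removing the raw identifier.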

Remediation rules


  • Present each remediation as a discrete, confirmable action. Never batch-apply changes.
  • Show exactly what will change (file, code diff concept), then use the `AskUserQuestion` tool to get confirmation before applying.
  • Follow existing code style and project conventions.
  • Never embed credentials — always use environment variables.
  • Test that the application still builds after changes.

Skill orchestration


When gaps identified in Phase 1 or 2 require capabilities from other Arize skills, offer to invoke them. Always use the `AskUserQuestion` tool to ask before invoking another skill, and explain why it is relevant to the compliance gap.

| Gap | Skill to invoke | Why |
|---|---|---|
| No tracing / incomplete audit trail | `arize-instrumentation` | EU Art. 12 and NIST MAN-2.1 require event logging; Arize tracing provides this |
| No bias or safety evaluation | `arize-evaluator` | Create LLM-as-judge evaluators for fairness, content safety, or quality monitoring |
| Need trace export for compliance evidence | `arize-trace` | Export spans for regulatory documentation or incident investigation |
| Need human review for high-risk decisions | `arize-annotation` | Set up annotation queues for human oversight per EU Art. 14 |
| Need deep link to share compliance evidence | `arize-link` | Generate URLs to specific traces, spans, or evaluations for stakeholder review |

Instrumentation cross-reference


If Arize tracing is already set up, verify it meets compliance requirements:

| Compliance need | Required trace data | What to check |
|---|---|---|
| Audit trail for AI decisions | All LLM spans with input/output | Verify all LLM client calls are instrumented, not just some |
| Data subject access requests | User ID attribute on spans | Check for `user.id` or a custom user identifier attribute |
| PII in traces | Sensitive data in `input.value` / `output.value` | Check if PII passes through unredacted — flag if so |
| Incident investigation | Error spans with full context | Check for exception tracking and error status on spans |
| Retention requirements | Trace data retained for required period | EU: appropriate period (min 6 months for high-risk); HIPAA: 6 years |
| Bias monitoring | Demographic or group attributes | Check for metadata attributes that enable fairness analysis |

If Arize tracing is not set up, this is a significant compliance gap. Offer: "Shall I run the `arize-instrumentation` skill to set up audit-trail tracing? Regulatory frameworks (EU AI Act Art. 12, NIST AI RMF MAN-2.1) require event logging for AI systems."

Reference links


Reference files


  • references/eu-ai-act-gpai.md — EU AI Act and GPAI Code of Practice developer guide
  • references/us-ai-compliance.md — US compliance landscape (NIST AI RMF, Colorado, NYC LL144, HIPAA)
  • references/iso-42001.md — ISO/IEC 42001:2023 AI Management Systems developer guide (technically-auditable controls only)
  • references/compliance-checklist-template.md — Reusable checklist template for Phase 2 output