arize-compliance-audit
Arize Compliance Audit Skill
Use this skill when the user wants to audit their AI agent or LLM application for regulatory compliance. The skill scans the codebase for compliance gaps, cross-references Arize instrumentation for audit trail coverage, and produces a tailored checklist with optional remediation.
Triggers: "audit my app for compliance", "EU AI Act requirements", "NIST AI RMF checklist", "GDPR for AI", "is my AI app compliant", "compliance checklist", "regulatory audit", "ISO 42001", "AI management system", "AIMS certification".
Disclaimer
Before doing anything else, present this disclaimer verbatim to the user:
⚠️ Legal disclaimer
This audit is for guidance only and does not constitute legal advice or a complete compliance assessment. It identifies common technical patterns and gaps based on publicly available regulatory frameworks, but cannot assess your organisation's specific legal obligations, contractual commitments, data processing agreements, or operational processes.
Do not rely on this output as a substitute for qualified legal counsel. Regulatory compliance is a complex, jurisdiction-specific, and fact-dependent determination. Always engage a qualified attorney or compliance specialist for binding assessments.
Core principles
核心原则
- Prefer inspection over mutation — understand the codebase before suggesting changes.
- Be practical, not legal — produce developer-actionable items, not legal opinions.
- Tailor to jurisdiction and use case — a chatbot has different obligations than a hiring tool. Do not dump the entire regulatory framework.
- Cross-reference instrumentation — compliance requires audit trails; check whether Arize tracing captures what regulators expect.
- Offer remediation, always confirm — after presenting the checklist, offer to implement specific fixes, but never modify code without explicit user confirmation.
- Keep output concise and production-focused — do not generate extra documentation or summary files unless requested.
- Never embed literal credential values — always reference environment variables.
Phase 0: Framework selection and use case
Before scanning code, determine which compliance frameworks apply.
Step 1 — Framework selection
Use the `AskUserQuestion` tool to ask the user which frameworks apply. Do not infer or auto-select — always ask explicitly.
Ask:
Which compliance frameworks should this audit cover?
Select all that apply (reply with numbers, e.g. "1, 3"):
1. EU frameworks — EU AI Act, GPAI Code of Practice, GDPR
(choose if end-users or data subjects are located in the EU)
2. US frameworks — NIST AI RMF, state laws (Colorado AI Act, NYC LL144),
HIPAA (if processing health data)
(choose if operating in the United States)
3. ISO 42001 — International AI Management System standard
(choose if pursuing ISO 42001 certification, operating globally,
or wanting an internationally recognised baseline)
You can select any combination. If unsure, select all that seem relevant
and we can narrow down during the audit.
Based on the selection:
- 1 selected — EU AI Act, GPAI Code of Practice, GDPR apply. See references/eu-ai-act-gpai.md.
- 2 selected — NIST AI RMF, Colorado AI Act, NYC LL144, HIPAA may apply. See references/us-ai-compliance.md.
- 3 selected — ISO 42001 AIMS controls apply. See references/iso-42001.md. Note: ISO 42001 is an organisational management system — the audit will cover technically-auditable controls only; purely organisational clauses (leadership review, internal audits) are flagged separately.
- Multiple selected — all selected frameworks apply; the audit covers the union of requirements, with cross-references where frameworks overlap.
Step 2 — Determine use case category
Use the `AskUserQuestion` tool to ask: What does your AI application do?
- General chatbot / assistant — Limited risk (EU), general obligations (US)
- Hiring / HR — High risk (EU Art. 6, Annex III); Colorado AI Act applies; NYC LL144 applies if NYC
- Healthcare — High risk (EU); HIPAA applies if processing PHI
- Credit / financial — High risk (EU); Colorado AI Act applies
- Education — High risk (EU)
- Content generation — Limited risk (EU Art. 50 transparency); general obligations (US)
- GPAI model provider — GPAI Code of Practice applies (EU)
Step 3 — Determine risk tier
Based on the use case and selected frameworks:
- EU selected: Classify as Unacceptable / High / Limited / Minimal per references/eu-ai-act-gpai.md
- US selected: Classify as High-risk (consequential decisions per Colorado AI Act) or General
- ISO 42001 selected: Risk tier is not a formal classification in ISO 42001, but note whether the system is high-stakes (which elevates the priority of impact assessment and bias controls)
Phase 0 output
Present a brief summary:
Frameworks selected: {EU / US / ISO 42001 / combination}
Use case: {category}
Risk tier: {EU tier if applicable} / {US tier if applicable}
Applicable: {list of specific regulations and standards}
ISO 42001 note: {if selected} Audit covers technically-auditable controls only;
organisational clauses will be flagged but not code-audited.
Then proceed directly to Phase 1.
Phase 1: Codebase audit (read-only)
Do not write any code or create any files during this phase.
Systematically scan the codebase for evidence of compliance and gaps across seven domains. For each domain, run the listed searches and record findings.
A. Transparency and disclosure
What to look for:
- User-facing strings disclosing AI involvement: search for terms like `powered by`, `AI`, `artificial intelligence`, `automated`, `bot`, `machine learning`, `generated by` in UI templates, API responses, and user-facing code
- Content labelling: markers on AI-generated output (text, images, audio)
- Terms of service, privacy policy references in the codebase
Signals of concern: Absence of any AI disclosure in user-facing code, especially if the application generates content or makes recommendations.
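The disclosure search above can be sketched as a small scan. The demo directory, file, and term list here are hypothetical stand-ins; in a real audit, point the scan at the project's actual templates and user-facing code:

```python
import re
from pathlib import Path

# Hypothetical demo tree; in a real audit, point ROOT at the project's UI templates.
ROOT = Path("/tmp/audit-demo")
ROOT.mkdir(parents=True, exist_ok=True)
(ROOT / "footer.html").write_text(
    "<footer>Responses are generated by AI and may contain errors.</footer>\n"
)

# Case-insensitive disclosure terms drawn from the list above (extend as needed).
DISCLOSURE_RE = re.compile(
    r"powered by|artificial intelligence|generated by|machine learning", re.I
)

# Collect (filename, line number) for every line containing a disclosure term.
hits = [
    (path.name, n)
    for path in ROOT.rglob("*")
    if path.is_file()
    for n, line in enumerate(path.read_text().splitlines(), start=1)
    if DISCLOSURE_RE.search(line)
]
print(hits or "no AI disclosure found")  # an empty result is the signal of concern
```

An empty hit list on an app that generates content or makes recommendations is exactly the gap this domain flags.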
B. Data protection and privacy
What to look for:
- PII field names in code: `name`, `email`, `phone`, `ssn`, `social_security`, `date_of_birth`, `address` in prompts, context, or retrieved documents
- PII in trace span attributes: check if `input.value` or `output.value` could contain personal data sent to Arize without redaction
- Consent mechanisms: `consent`, `opt-in`, `opt-out`, `gdpr`, `ccpa` references
- DPIA or privacy assessment references
- Data retention and deletion handlers
- Data subject rights: `right_to_access`, `right_to_erasure`, `data_subject_request`, `data_protection_officer`
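Where `input.value` or `output.value` can carry personal data, remediation usually scrubs attributes before export. A minimal sketch of that scrubbing logic (the field set, hash truncation, and email regex are illustrative assumptions to adapt per codebase; the OpenTelemetry processor wiring that would call it is omitted):

```python
import hashlib
import re

# Illustrative PII field names; extend to match what the Phase 1 scan finds.
PII_KEYS = {"name", "email", "phone", "ssn", "social_security", "date_of_birth", "address"}
# Illustrative email pattern for scrubbing free-text attribute values.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_attributes(attributes: dict) -> dict:
    """Return a copy of span attributes safe to export: PII keys hashed, emails masked."""
    safe = {}
    for key, value in attributes.items():
        if key.lower() in PII_KEYS:
            # Hash rather than drop, so traces stay joinable per data subject.
            safe[key] = "sha256:" + hashlib.sha256(str(value).encode()).hexdigest()[:12]
        elif isinstance(value, str):
            safe[key] = EMAIL_RE.sub("[REDACTED_EMAIL]", value)
        else:
            safe[key] = value
    return safe

print(redact_attributes({"input.value": "Contact jane@example.com for help",
                         "email": "jane@example.com"}))
```

Hashing (rather than deleting) the identifier keeps the audit trail usable for data subject access requests while keeping raw PII out of the exported spans.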
C. Security
What to look for:
- Prompt injection defences: input validation, guardrail libraries (`guardrails-ai`, `nemo-guardrails`, `rebuff`, `lakera`), content filtering, system prompt protection
- Data loss prevention: output scanning before returning to users, sensitive data detection
- Tool/function calling controls: permission boundaries, allowlists, sandboxing for tool execution
- Rate limiting and authentication on AI endpoints
- Hardcoded secrets: `api_key`, `secret`, `password`, `token` literals in source files (not env var references)
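The hardcoded-secrets check can be approximated with a pattern like the following. The regex and the env-var allowlist are illustrative starting points, not a complete secret scanner:

```python
import re

# Flags key-like names assigned a quoted literal of 8+ characters.
SECRET_RE = re.compile(
    r"""(?i)\b(api_key|secret|password|token)\b\s*[:=]\s*["'][^"']{8,}["']"""
)

def scan_line(line: str) -> bool:
    """True if the line looks like a hardcoded secret; env-var references are allowed."""
    if "os.environ" in line or "getenv" in line:
        return False
    return bool(SECRET_RE.search(line))

print(scan_line('API_KEY = "sk-abc123def456"'))           # True  (literal credential)
print(scan_line('api_key = os.environ["ARIZE_API_KEY"]')) # False (env var reference)
```

Run this over source files only; lockfiles and vendored dependencies produce noise and belong on an exclusion list.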
D. Testing and evaluation
What to look for:
- Bias and fairness testing: references to demographic parity, impact ratios, fairness metrics
- Red teaming or adversarial test suites: prompt injection tests, jailbreak tests
- Evaluation frameworks: Arize evaluators, custom eval scripts, `pytest`-based evals, experiment infrastructure
- A/B testing or model comparison infrastructure
E. Documentation
What to look for:
- Model cards: `MODEL_CARD.md`, `model_card.json`, `model_card.yaml`, or similar
- System architecture documentation
- Change logs or version tracking for prompts and model updates
- Incident response documentation
F. Monitoring and observability
What to look for:
- Arize tracing setup: `arize-otel`, `register()`, `TracerProvider`, `opentelemetry`, `openinference` imports
- If tracing exists, check coverage:
- All LLM calls traced (not just some)
- Session IDs for conversation continuity
- User IDs for data subject request support
- Error tracking and exception spans
- Alerting and drift detection configuration
- Trace retention configuration
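For context when reading findings in this domain, a minimal Arize tracing bootstrap follows the `arize-otel` `register()` pattern. Treat this as a sketch and verify parameter names against current Arize docs; the project name is a placeholder, and the `arize-instrumentation` skill handles full setup:

```python
import os

# Sketch only: requires the arize-otel and openinference-instrumentation-openai packages.
from arize.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Credentials come from the environment, never from literals (see Core principles).
tracer_provider = register(
    space_id=os.environ["ARIZE_SPACE_ID"],
    api_key=os.environ["ARIZE_API_KEY"],
    project_name="compliance-audited-app",  # placeholder project name
)

# Instrument every LLM client in use, not just one: partial coverage is an audit-trail gap.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```

If the codebase registers a tracer but only instruments some of its LLM clients, rate coverage as Partial rather than Compliant.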
G. Vendor management
What to look for:
- Third-party AI API usage: OpenAI, Anthropic, Google, Azure, Bedrock, Cohere imports or client instantiation
- Model versioning: are specific model versions pinned (e.g., `gpt-4-0613`) or using `latest` / unversioned identifiers
- Fallback and failover logic between providers
Phase 1 output
Present a two-part report:
Part 1 — Summary table
| Domain | Evidence found | Gaps identified | Rating |
|---|---|---|---|
| A. Transparency | {findings} | {gaps} | Compliant / Partial / Non-compliant / N/A |
| B. Data protection | {findings} | {gaps} | ... |
| C. Security | {findings} | {gaps} | ... |
| D. Testing | {findings} | {gaps} | ... |
| E. Documentation | {findings} | {gaps} | ... |
| F. Monitoring | {findings} | {gaps} | ... |
| G. Vendor management | {findings} | {gaps} | ... |
Part 2 — Gap detail (required for every Non-compliant or Partial rating)
For each domain rated Non-compliant or Partial, write a dedicated subsection that includes:
- The exact code path — file path(s), line number(s), and the relevant code snippet showing where the gap exists. Do not paraphrase; quote the actual code.
- Why it matters in this specific app — explain the concrete risk in the context of this codebase (e.g. which tools could be abused, which data flows are exposed, what an attacker or regulator would find).
- What is missing — a precise description of the control or code that should exist but does not (e.g. "a span attribute processor that hashes `user_email` before the OTLP exporter fires", not just "add PII redaction").
Minimum one subsection per Non-compliant/Partial domain. Do not omit this section — it is the primary value of the audit for engineering teams.
Then proceed directly to Phase 2.
Phase 2: Compliance checklist
Using the Phase 1 findings and the template in references/compliance-checklist-template.md, generate a tailored compliance checklist.
Rules for checklist generation
- Only include relevant sections. If the user is US-only, skip GDPR-specific items. If not healthcare, skip HIPAA. If not hiring in NYC, skip LL144.
- Mark items from Phase 1. Items where evidence was found: mark as `Compliant`. Items with gaps: mark as `Non-compliant` with a concrete remediation suggestion.
- Prioritise correctly. Critical = enforcement risk or system prohibition. High = required by regulation. Medium = recommended by framework. Low = best practice.
- Be specific in remediation. Instead of "implement input validation", say "add a guardrail library like `guardrails-ai` to validate LLM inputs and outputs against your content policy".
- Include the instrumentation cross-reference table from the template. If Arize tracing is not set up, flag this as a Critical gap — audit trails are required by EU Art. 12 and NIST MAN-2.1.
Final report
Present a single consolidated report with four sections:
Section 1 — Audit scope (Phase 0 summary)
- Frameworks selected, use case, risk tier, applicable regulations
Section 2 — Codebase findings (Phase 1 summary table)
- The domain table (A–G) with evidence, gaps, and ratings
Section 3 — Gap detail (Phase 1 expanded)
- One subsection per Non-compliant or Partial domain, each containing: exact file paths and line numbers, quoted code snippets, app-specific risk explanation, and a precise description of what is missing. This section is mandatory — never omit it.
Section 4 — Compliance checklist (Phase 2)
- The tailored checklist with status and remediation suggestions, instrumentation cross-reference table, priority summary, and recommended next steps
When the user asks for a report file, write a single markdown file to `/tmp/<app-name>-compliance-audit-<YYYY-MM-DD>.md` containing all four sections.
After presenting the report, offer Phase 3 remediation.
Phase 3: Remediation (optional)
After presenting the checklist, offer to implement specific fixes. Always use the `AskUserQuestion` tool to confirm before making any changes.
Remediation categories
Add dependencies — offer to install:
- Guardrail libraries for input/output validation (e.g., `guardrails-ai`, `nemo-guardrails`)
- PII detection/redaction packages (e.g., `presidio-analyzer`, `scrubadub`)
- Content safety classifiers
Insert code — offer to add:
- AI disclosure strings in user-facing output (templates, API responses)
- PII redaction filters on span attributes before export to Arize
- Input validation/sanitisation on AI endpoints
- User ID attributes on trace spans for data subject request support
Create documentation templates — offer to scaffold:
- Model card template (markdown file with standard sections)
- Incident response plan template
- Data processing record template
Configure monitoring — offer to set up via related skills:
- Arize evaluators for bias detection and content safety (via `arize-evaluator` skill)
- Tracing for audit trail coverage (via `arize-instrumentation` skill)
Remediation rules
- Present each remediation as a discrete, confirmable action. Never batch-apply changes.
- Show exactly what will change (file, code diff concept) then use the `AskUserQuestion` tool to get confirmation before applying.
- Follow existing code style and project conventions.
- Never embed credentials — always use environment variables.
- Test that the application still builds after changes.
Skill orchestration
When gaps identified in Phase 1 or 2 require capabilities from other Arize skills, offer to invoke them. Always use the `AskUserQuestion` tool to ask before invoking another skill and explain why it is relevant to the compliance gap.
| Gap | Skill to invoke | Why |
|---|---|---|
| No tracing / incomplete audit trail | `arize-instrumentation` | EU Art. 12 and NIST MAN-2.1 require event logging; Arize tracing provides this |
| No bias or safety evaluation | `arize-evaluator` | Create LLM-as-judge evaluators for fairness, content safety, or quality monitoring |
| Need trace export for compliance evidence | | Export spans for regulatory documentation or incident investigation |
| Need human review for high-risk decisions | | Set up annotation queues for human oversight per EU Art. 14 |
| Need deep link to share compliance evidence | | Generate URLs to specific traces, spans, or evaluations for stakeholder review |
Instrumentation cross-reference
If Arize tracing is already set up, verify it meets compliance requirements:
| Compliance need | Required trace data | What to check |
|---|---|---|
| Audit trail for AI decisions | All LLM spans with input/output | Verify all LLM client calls are instrumented, not just some |
| Data subject access requests | User ID attribute on spans | Check that spans carry a user ID attribute so traces can be located per user |
| PII in traces | Sensitive data in `input.value` / `output.value` | Check if PII passes through unredacted — flag if so |
| Incident investigation | Error spans with full context | Check for exception tracking and error status on spans |
| Retention requirements | Trace data retained for required period | EU: appropriate period (min 6 months for high-risk); HIPAA: 6 years |
| Bias monitoring | Demographic or group attributes | Check for metadata attributes that enable fairness analysis |
If Arize tracing is not set up, this is a significant compliance gap. Offer: "Shall I run the `arize-instrumentation` skill to set up audit-trail tracing? Regulatory frameworks (EU AI Act Art. 12, NIST AI RMF MAN-2.1) require event logging for AI systems."
Reference links
| Resource | URL |
|---|---|
| EU AI Act full text | https://eur-lex.europa.eu/eli/reg/2024/1689/oj |
| GPAI Code of Practice | https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai |
| Code of Practice portal | https://code-of-practice.ai/ |
| NIST AI RMF | https://www.nist.gov/artificial-intelligence/ai-risk-management-framework |
| Colorado AI Act (SB24-205) | https://leg.colorado.gov/bills/sb24-205 |
| NYC Local Law 144 | https://www.nyc.gov/site/dca/about/automated-employment-decision-tools.page |
| HIPAA | https://www.hhs.gov/hipaa/index.html |
| ISO/IEC 42001:2023 | https://www.iso.org/standard/42001.html |
| Arize AX Docs | https://arize.com/docs/ax |
Reference files
- references/eu-ai-act-gpai.md — EU AI Act and GPAI Code of Practice developer guide
- references/us-ai-compliance.md — US compliance landscape (NIST AI RMF, Colorado, NYC LL144, HIPAA)
- references/iso-42001.md — ISO/IEC 42001:2023 AI Management Systems developer guide (technically-auditable controls only)
- references/compliance-checklist-template.md — Reusable checklist template for Phase 2 output