model-debugging
Model Debugging Skill
Use this skill when:
- Investigating model failures, high error rates, or service issues
- Finding users affected by errors (402 billing, 403 permissions, 500 backend)
- Analyzing Tinybird/Cloudflare logs for patterns
- Diagnosing specific request failures
Related skill: Use `tier-management` to upgrade users or check balances after identifying issues here.
Understanding Model Monitor Error Rates
Why does the Model Monitor show high error rates when models work fine manually?
The Model Monitor at https://monitor.pollinations.ai shows all real-world traffic, including:
- 401 errors: Anonymous users without API keys (most common)
- 402 errors: Users with insufficient pollen balance or exhausted API key budget
- 403 errors: Users denied access to specific models (API key restrictions)
- 400 errors: Invalid request parameters (e.g., `openai-audio` without the `modalities` param)
- 429 errors: Rate-limited requests
- 500/504 errors: Actual backend failures (investigate these)
When you test manually with a valid secret key (`sk_`), you bypass auth/quota issues, so models appear to work fine.
Key insight: High 401/402/403/400 rates are expected from real-world usage. Focus investigation on 500/504 errors.
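The triage rule above can be sketched as a small shell function. This is an illustration only: the function name `classify_status` and the bucket labels are hypothetical, and the mapping simply mirrors the status codes listed in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: bucket an HTTP status code the way the list above does.
classify_status() {
  case "$1" in
    400|401|402|403) echo "expected" ;;      # bad params, no key, no balance, no access
    429)             echo "rate-limited" ;;  # throttled requests
    500|504)         echo "investigate" ;;   # actual backend failures
    *)               echo "other" ;;         # anything not covered by the list
  esac
}

# Print the bucket for a few sample codes.
for code in 401 429 500; do
  printf '%s -> %s\n' "$code" "$(classify_status "$code")"
done
```

Only the `investigate` bucket should normally trigger a deeper look at backend logs.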
Data Flow Architecture
User Request → enter.pollinations.ai (Cloudflare Worker)
↓
Logs to Cloudflare Workers Observability
↓
Events stored in D1 database
↓
Batched to Tinybird (async, 100-500 events)
↓
Model Monitor queries Tinybird (model_health.pipe)
Structured Logging: enter.pollinations.ai uses LogTape with:
- `requestId`: Unique per request (passed to downstream via `x-request-id` header)
- `status`, `body`: Full error response from downstream services
- Context: `method`, `routePath`, `userAgent`, `ipAddress`
Quick Diagnostics
1. Check Model Monitor
View current model health at: https://monitor.pollinations.ai
2. Query Recent Errors from D1 Database
bash
# Via enter.pollinations.ai worker (requires wrangler)
cd enter.pollinations.ai
npx wrangler d1 execute pollinations-db --remote --command "SELECT model_requested, response_status, error_message, COUNT(*) as count FROM event WHERE response_status >= 400 AND created_at > datetime('now', '-1 hour') GROUP BY model_requested, response_status, error_message ORDER BY count DESC LIMIT 20"
3. Capture Live Logs
enter.pollinations.ai (Cloudflare Worker)
bash
cd enter.pollinations.ai
wrangler tail --format json | tee logs.jsonl
# Or with formatting:
wrangler tail --format json | npx tsx scripts/format-logs.ts
image.pollinations.ai (EC2 systemd)
bash
# Real-time logs
ssh enter-services "sudo journalctl -u image-pollinations.service -f"
# Last 3 minutes
ssh enter-services "sudo journalctl -u image-pollinations.service --since '3 minutes ago' --no-pager" > image-service-logs.txt
# Recent errors only
ssh enter-services "sudo journalctl -u image-pollinations.service -p err -n 50"
text.pollinations.ai (EC2 systemd)
bash
# Real-time logs
ssh enter-services "sudo journalctl -u text-pollinations.service -f"
# Last 3 minutes
ssh enter-services "sudo journalctl -u text-pollinations.service --since '3 minutes ago' --no-pager" > text-service-logs.txt
---
Common Error Patterns
Azure Content Safety DNS Failure
Error: `getaddrinfo ENOTFOUND gptimagemain1-resource.cognitiveservices.azure.com`
Cause: Azure Content Safety resource deleted or misconfigured
Impact: Fail-open (content proceeds without safety check)
Fix: Create new Azure Content Safety resource and update `.env`:
AZURE_CONTENT_SAFETY_ENDPOINT=https://<new-resource>.cognitiveservices.azure.com/
AZURE_CONTENT_SAFETY_API_KEY=<new-key>
Azure Kontext Content Filter
Error: `Content rejected due to sexual/hate/violence content detection`
Cause: Azure's content moderation blocking prompts/images
Impact: 400 error returned to user
Fix: User error - prompt violates content policy
Vertex AI Invalid Image
Error: `Provided image is not valid`
Cause: User passing unsupported image URL (e.g., Google Drive links)
Impact: 400 error returned to user
Fix: User error - need direct image URL
Translation Service Down
Error: `No active translate servers available`
Cause: Translation service unavailable
Impact: Prompts not translated (non-fatal)
Fix: Check translation service status
OpenAI Audio Invalid Voice
Error: `Invalid value for audio.voice`
Cause: User requesting unsupported voice name
Impact: 400 error returned to user
Fix: User error - use supported voices: alloy, echo, fable, onyx, nova, shimmer, coral, verse, ballad, ash, sage, etc.
Veo No Video Data
Error: `No video data in response`
Cause: Vertex AI returned empty video response
Impact: 500 error
Fix: Check Vertex AI quota/status, may be transient
Environment Variables to Check
image.pollinations.ai
bash
ssh enter-services "cat /home/ubuntu/pollinations/image.pollinations.ai/.env | grep -E 'AZURE|GOOGLE|CLOUDFLARE'"
Key variables:
- `AZURE_CONTENT_SAFETY_ENDPOINT` - Azure Content Safety API endpoint
- `AZURE_CONTENT_SAFETY_API_KEY` - Azure Content Safety API key
- `GOOGLE_PROJECT_ID` - Google Cloud project for Vertex AI
- `AZURE_MYCELI_FLUX_KONTEXT_ENDPOINT` - Azure Kontext model endpoint
text.pollinations.ai
bash
ssh enter-services "cat /home/ubuntu/pollinations/text.pollinations.ai/.env | grep -E 'AZURE|OPENAI|GOOGLE'"
Updating Secrets
Secrets are stored encrypted with SOPS:
- `image.pollinations.ai/secrets/env.json`
- `text.pollinations.ai/secrets/env.json`
To update:
bash
# Decrypt, edit, re-encrypt
sops image.pollinations.ai/secrets/env.json
# Deploy to server
sops --output-type dotenv -d image.pollinations.ai/secrets/env.json > /tmp/image.env
scp /tmp/image.env enter-services:/home/ubuntu/pollinations/image.pollinations.ai/.env
rm /tmp/image.env
# Restart service
ssh enter-services "sudo systemctl restart image-pollinations.service"
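Before copying a decrypted dotenv file to the server, it can help to confirm that the expected keys are present. The following is a minimal sketch, not part of the repository: the function name `missing_env_keys` is hypothetical, and the demo values are placeholders.

```shell
# Hypothetical pre-deploy check: list required keys that a dotenv file lacks.
# Usage: missing_env_keys <dotenv-file> KEY...
missing_env_keys() {
  file="$1"; shift
  for key in "$@"; do
    # A key counts as present when a line starts with "KEY=".
    grep -q "^${key}=" "$file" || echo "$key"
  done
}

# Demo with placeholder values only (no real secrets).
env_file=$(mktemp)
printf 'AZURE_CONTENT_SAFETY_ENDPOINT=https://example.invalid/\nGOOGLE_PROJECT_ID=demo\n' > "$env_file"
missing_env_keys "$env_file" AZURE_CONTENT_SAFETY_ENDPOINT AZURE_CONTENT_SAFETY_API_KEY GOOGLE_PROJECT_ID
rm -f "$env_file"
```

In the demo, the only key printed is `AZURE_CONTENT_SAFETY_API_KEY`, because it is the one absent from the sample file.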
---
Log Analysis Commands
bash
# Count errors by type
grep -i "error" image-service-logs.txt | grep -oE "(Azure Flux Kontext|Vertex AI|No active translate|getaddrinfo ENOTFOUND)" | sort | uniq -c | sort -rn
# Find content filter rejections
grep -i "Content rejected" image-service-logs.txt | sort | uniq -c
# Check DNS resolution on server
ssh enter-services "nslookup gptimagemain1-resource.cognitiveservices.azure.com"
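The "count errors by type" pipeline can be wrapped so it works on any captured log file. This is a sketch under assumptions: the function name `count_error_types` is hypothetical, and the sample log lines in the demo are made up for illustration rather than taken from a real service.

```shell
# Hypothetical wrapper around the "count errors by type" pipeline.
# Usage: count_error_types <log-file>
# Output: one "<count><TAB><pattern>" line per matched pattern, highest count first.
count_error_types() {
  grep -i "error" "$1" \
    | grep -oE "(Azure Flux Kontext|Vertex AI|No active translate|getaddrinfo ENOTFOUND)" \
    | sort | uniq -c | sort -rn \
    | awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n "\t" $0 }'
}

# Demo on a throwaway file with made-up log lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Error: Vertex AI returned 400
error calling Vertex AI: Provided image is not valid
Error: getaddrinfo ENOTFOUND gptimagemain1-resource.cognitiveservices.azure.com
info: request ok
EOF
count_error_types "$sample"
rm -f "$sample"
```

The trailing `awk` step only normalizes the padded output of `uniq -c` into tab-separated columns, which is easier to feed into other tools.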
---
Model-Specific Debugging
| Model | Backend | Common Issues |
|---|---|---|
| | Azure/Replicate | Rate limits, content filter |
| | Azure Flux Kontext | Content filter (strict) |
| | Vertex AI Gemini | Invalid image URLs, content filter |
| | ByteDance ARK | NSFW filter, API key issues |
| | Vertex AI | Quota, empty responses |
| | Azure OpenAI | Invalid voice names |
| | DeepSeek API | Rate limits, API key |
Cloudflare Workers Observability API
The enter.pollinations.ai worker has structured logging enabled. You can query logs programmatically via the Cloudflare Workers Observability API.
Prerequisites
1. Get Account ID
bash
# From wrangler.toml
grep account_id enter.pollinations.ai/wrangler.toml
# Or from existing .env
grep CLOUDFLARE_ACCOUNT_ID image.pollinations.ai/.env
2. Create API Token with Workers Observability Permission
Via Cloudflare Dashboard:
- Go to https://dash.cloudflare.com/profile/api-tokens
- Click Create Token
- Click Create Custom Token
- Configure:
  - Token name: `Workers Observability Read`
  - Permissions:
    - Account → Workers Scripts → Read
    - Account → Workers Observability → Edit (required for query API)
  - Account Resources: Include → Your Account
- Click Continue to summary → Create Token
- Copy the token immediately (shown only once)
3. Store Token Securely
The token is stored in SOPS-encrypted secrets:
- Location: `enter.pollinations.ai/secrets/env.json`
- Key: `CLOUDFLARE_OBSERVABILITY_TOKEN`
To add/update:
bash
# Step 1: Decrypt to temp file
cd /path/to/pollinations
sops -d enter.pollinations.ai/secrets/env.json > /tmp/env.json
# Step 2: Add the token (use jq)
jq '. + {"CLOUDFLARE_OBSERVABILITY_TOKEN": "your_token"}' /tmp/env.json > /tmp/env_updated.json
# Step 3: Re-encrypt (must rename to match .sops.yaml pattern)
cp /tmp/env_updated.json /tmp/env.json
sops -e /tmp/env.json > enter.pollinations.ai/secrets/env.json
# Step 4: Cleanup
rm /tmp/env.json /tmp/env_updated.json
# Verify
sops -d enter.pollinations.ai/secrets/env.json | jq 'keys'
**Note**: The `.sops.yaml` config requires filenames matching `env.json$` pattern.
API Endpoint
POST https://api.cloudflare.com/client/v4/accounts/{account_id}/workers/observability/telemetry/query
Query Examples
Setup: Get Credentials from SOPS
bash
# Extract credentials from encrypted secrets
ACCOUNT_ID=$(sops -d enter.pollinations.ai/secrets/env.json | jq -r '.CLOUDFLARE_ACCOUNT_ID')
API_TOKEN=$(sops -d enter.pollinations.ai/secrets/env.json | jq -r '.CLOUDFLARE_OBSERVABILITY_TOKEN')
List Available Log Keys (Working)
This endpoint works and shows what fields are available:
bash
curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/observability/telemetry/keys" \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"timeframe": {"from": '$(( $(date +%s) - 86400 ))'000, "to": '$(date +%s)'000}, "datasets": ["workers"]}' | jq '.result[:10]'
Query Recent Errors (Last 15 Minutes)
Note: The `/query` endpoint requires a saved `queryId`. For ad-hoc queries, use the Cloudflare Dashboard Query Builder or `wrangler tail`.
bash
# This format requires a saved query ID
# Query errors with status >= 400
curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/observability/telemetry/query" \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "timeframe": { "from": '$(( $(date +%s) - 900 ))'000, "to": '$(date +%s)'000 }, "parameters": { "datasets": ["workers"], "filters": [ {"key": "$workers.scriptName", "operation": "eq", "type": "string", "value": "enter-pollinations-ai"}, {"key": "$metadata.statusCode", "operation": "gte", "type": "number", "value": 400} ], "calculations": [{"operator": "count"}], "groupBys": [ {"type": "string", "value": "$metadata.statusCode"}, {"type": "string", "value": "$metadata.error"} ], "limit": 50 } }' | jq '.result.events.events[:20]'
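The query examples in this section build their `timeframe` inline by appending `000` to epoch seconds. As a sketch, that arithmetic could be factored into a helper; the name `timeframe_json` is hypothetical, and the request body printed at the end only mirrors the shape of the `/telemetry/keys` payload used in this section.

```shell
# Hypothetical helper: print a {"from": ..., "to": ...} timeframe in epoch
# milliseconds covering the last N seconds (default 900 = 15 minutes).
timeframe_json() {
  window="${1:-900}"
  now=$(date +%s)
  # Appending "000" converts whole seconds to milliseconds.
  printf '{"from": %s000, "to": %s000}' "$(( now - window ))" "$now"
}

# Example: request body for a one-hour window over the "workers" dataset.
body=$(printf '{"timeframe": %s, "datasets": ["workers"]}' "$(timeframe_json 3600)")
echo "$body"
```

The resulting string can be passed to `curl` with `-d "$body"` in place of the hand-assembled JSON.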
Query Errors by Model
bash
curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/observability/telemetry/query" \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"timeframe": {
"from": '$(( $(date +%s) - 3600 ))'000,
"to": '$(date +%s)'000
},
"parameters": {
"datasets": ["workers"],
"filters": [
{"key": "$workers.scriptName", "operation": "eq", "type": "string", "value": "enter-pollinations-ai"},
{"key": "$metadata.statusCode", "operation": "gte", "type": "number", "value": 400}
],
"calculations": [{"operator": "count"}],
"groupBys": [
{"type": "string", "value": "model"},
{"type": "string", "value": "$metadata.statusCode"}
],
"limit": 100
}
}' | jq '.result.calculations[0].aggregates'
Get Raw Error Events with Full Details
bash
curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/observability/telemetry/query" \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"timeframe": {
"from": '$(( $(date +%s) - 900 ))'000,
"to": '$(date +%s)'000
},
"parameters": {
"datasets": ["workers"],
"filters": [
{"key": "$workers.scriptName", "operation": "eq", "type": "string", "value": "enter-pollinations-ai"},
{"key": "$metadata.statusCode", "operation": "gte", "type": "number", "value": 500}
],
"limit": 20
}
}' | jq '.result.events.events[] | {
timestamp: .timestamp,
statusCode: ."$metadata".statusCode,
error: ."$metadata".error,
message: ."$metadata".message,
requestId: ."$workers".requestId,
url: ."$metadata".url
}'
List Available Log Keys
bash
curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/observability/telemetry/keys" \
-H "Authorization: Bearer $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"timeframe": {
"from": '$(( $(date +%s) - 3600 ))'000,
"to": '$(date +%s)'000
},
"datasets": ["workers"],
"filters": [
{"key": "$workers.scriptName", "operation": "eq", "type": "string", "value": "enter-pollinations-ai"}
]
}' | jq '.result.keys'
Structured Logging in enter.pollinations.ai
The worker uses LogTape for structured logging with these key fields:
- requestId: Unique ID per request (first 8 chars shown in logs)
- method: HTTP method (GET, POST)
- routePath: Request URL
- status: Response status code
- duration: Request duration in ms
Downstream errors are logged with:
typescript
log.warn("Chat completions error {status}: {body}", {
status: response.status,
body: responseText,
});
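Because only the first 8 characters of `requestId` appear in log lines, a captured log file can be searched by that short prefix. The sketch below assumes logs were saved as line-oriented text (for example with `wrangler tail --format json | tee logs.jsonl`) and that the short ID appears verbatim in each relevant line; the function name `find_request` is hypothetical.

```shell
# Hypothetical helper: print captured log lines that mention a request,
# matching on the first 8 characters of its ID.
# Usage: find_request <request-id> <log-file>
find_request() {
  short=$(printf '%s' "$1" | cut -c1-8)
  # -F treats the ID as a fixed string, not a regular expression.
  grep -F -- "$short" "$2"
}

# Demo with made-up log lines.
log=$(mktemp)
printf '{"msg":"[abcd1234] GET /x"}\n{"msg":"[ffff0000] GET /y"}\n' > "$log"
find_request "abcd1234-5678-90ab" "$log"
rm -f "$log"
```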
Tinybird Analytics (Alternative)
For aggregated model health stats, query Tinybird directly:
bash
# Get model health stats (last 5 minutes)
curl "https://api.europe-west2.gcp.tinybird.co/v0/pipes/model_health.json?token=$TINYBIRD_TOKEN" | jq '.data'
# Get detailed error breakdown
curl "https://api.europe-west2.gcp.tinybird.co/v0/pipes/model_errors.json?token=$TINYBIRD_TOKEN" | jq '.data'
The Tinybird token is a read-only public token found in:
- `apps/model-monitor/src/hooks/useModelMonitor.js`
---
Debugging Workflow
- Check Model Monitor - https://monitor.pollinations.ai
  - Identify which models have high error rates
  - Note the error code breakdown (401, 402, 403, 400, 500, etc.)
- Query Cloudflare Logs - Use the API queries above
  - Get raw error events with full details
  - Look for patterns in error messages
- Correlate with Request ID - If you have a specific request ID:
  bash
  # Filter by request ID
  curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/workers/observability/telemetry/query" \
    -H "Authorization: Bearer $API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
      "timeframe": {"from": '$(( $(date +%s) - 86400 ))'000, "to": '$(date +%s)'000},
      "parameters": {
        "datasets": ["workers"],
        "filters": [
          {"key": "$workers.requestId", "operation": "eq", "type": "string", "value": "REQUEST_ID_HERE"}
        ],
        "limit": 100
      }
    }' | jq '.result.events.events'
- Check Backend Logs - If error is from downstream service:
  bash
  # Image service
  ssh enter-services "sudo journalctl -u image-pollinations.service --since '5 minutes ago'"
  # Text service
  ssh enter-services "sudo journalctl -u text-pollinations.service --since '5 minutes ago'"
- Test Model Directly - Verify if model is actually broken:
  bash
  TOKEN=$(grep ENTER_API_TOKEN_REMOTE enter.pollinations.ai/.testingtokens | cut -d= -f2)
  # Test text model
  curl -s 'https://gen.pollinations.ai/v1/chat/completions' \
    -H "Authorization: Bearer $TOKEN" \
    -H 'Content-Type: application/json' \
    -d '{"model": "MODEL_NAME", "messages": [{"role": "user", "content": "Test"}]}' \
    -w "\nHTTP: %{http_code}\n"
  # Test image model
  curl -s 'https://gen.pollinations.ai/image/test?model=MODEL_NAME&width=256&height=256' \
    -H "Authorization: Bearer $TOKEN" \
    -w "\nHTTP: %{http_code}\n" -o /dev/null
Current Status & Limitations
Cloudflare Observability API
What works:
- `/telemetry/keys` - List available log fields ✅
- `/telemetry/values` - Get unique values for a field ✅
- Token stored in SOPS: `enter.pollinations.ai/secrets/env.json` ✅
Limitations:
- `/telemetry/query` requires a saved `queryId` from the dashboard
- For ad-hoc queries, use Cloudflare Dashboard → Workers & Pages → pollinations-enter → Observability → Investigate
- Or use `wrangler tail` for real-time logs
Alternative: Tinybird (Recommended for Aggregates)
Tinybird provides pre-aggregated model health stats and raw event data.
Token Locations
- Public read-only token (for pipes only): `apps/model-monitor/src/hooks/useModelMonitor.js`
- Admin token (for raw SQL queries): `enter.pollinations.ai/observability/.tinyb` (in `token` field)
Basic Queries (Public Token)
bash
# Public read-only token from apps/model-monitor
TINYBIRD_TOKEN="p.eyJ1IjogImFjYTYzZjc5LThjNTYtNDhlNC05NWJjLWEyYmFjMTY0NmJkMyIsICJpZCI6ICJmZTRjODM1Ni1iOTYwLTQ0ZTYtODE1Mi1kY2UwYjc0YzExNjQiLCAiaG9zdCI6ICJnY3AtZXVyb3BlLXdlc3QyIn0.Wc49vYoVYI_xd4JSsH_Fe8mJk7Oc9hx0IIldwc1a44g"
# Get model health (last 5 min)
curl -s "https://api.europe-west2.gcp.tinybird.co/v0/pipes/model_health.json?token=$TINYBIRD_TOKEN" | jq '.data'
undefinedcurl -s "https://api.europe-west2.gcp.tinybird.co/v0/sql?token=$TINYBIRD_ADMIN_TOKEN"
--data-urlencode "q=SELECT user_github_username, model_requested, error_message, count() as error_count FROM generation_event WHERE response_status >= 500 AND start_time > now() - interval 24 hour GROUP BY user_github_username, model_requested, error_message ORDER BY error_count DESC LIMIT 20"
--data-urlencode "q=SELECT user_github_username, model_requested, error_message, count() as error_count FROM generation_event WHERE response_status >= 500 AND start_time > now() - interval 24 hour GROUP BY user_github_username, model_requested, error_message ORDER BY error_count DESC LIMIT 20"
Raw SQL Queries (Admin Token)
For querying the raw `generation_event` datasource, use the admin token from `.tinyb`:
```bash
# Get admin token from .tinyb file
TINYBIRD_ADMIN_TOKEN=$(jq -r '.token' enter.pollinations.ai/observability/.tinyb)

# Find users with frequent 403 errors (last 24 hours)
curl -s "https://api.europe-west2.gcp.tinybird.co/v0/sql?token=$TINYBIRD_ADMIN_TOKEN" \
  --data-urlencode "q=SELECT user_id, user_github_username, user_tier, count() as error_403_count FROM generation_event WHERE response_status = 403 AND start_time > now() - interval 24 hour AND user_id != '' AND user_id != 'undefined' GROUP BY user_id, user_github_username, user_tier ORDER BY error_403_count DESC LIMIT 20"

# Find users with 500 errors (actual backend issues)
curl -s "https://api.europe-west2.gcp.tinybird.co/v0/sql?token=$TINYBIRD_ADMIN_TOKEN" \
  --data-urlencode "q=SELECT user_github_username, model_requested, error_message, count() as error_count FROM generation_event WHERE response_status >= 500 AND start_time > now() - interval 24 hour GROUP BY user_github_username, model_requested, error_message ORDER BY error_count DESC LIMIT 20"

# Check specific user's recent errors
curl -s "https://api.europe-west2.gcp.tinybird.co/v0/sql?token=$TINYBIRD_ADMIN_TOKEN" \
  --data-urlencode "q=SELECT start_time, response_status, model_requested, error_message FROM generation_event WHERE user_github_username = 'USERNAME_HERE' AND start_time > now() - interval 24 hour ORDER BY start_time DESC LIMIT 50"
```
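The per-user query embeds the username directly in a SQL string literal, so it can help to validate the value before substituting it for `USERNAME_HERE`. The following is a minimal sketch, not part of the repository: the function name `build_user_errors_query` is hypothetical, and it assumes usernames contain only letters, digits, and hyphens.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): build the per-user error query
# after checking that both inputs are safe to embed in the SQL text.
build_user_errors_query() {
  local username="$1"
  local hours="${2:-24}"

  # Assumption: usernames consist of letters, digits, and hyphens only.
  if [[ ! "$username" =~ ^[A-Za-z0-9-]+$ ]]; then
    echo "invalid username: $username" >&2
    return 1
  fi

  # The lookback window must be a positive integer number of hours.
  if [[ ! "$hours" =~ ^[1-9][0-9]*$ ]]; then
    echo "invalid hours: $hours" >&2
    return 1
  fi

  printf '%s' "SELECT start_time, response_status, model_requested, error_message FROM generation_event WHERE user_github_username = '${username}' AND start_time > now() - interval ${hours} hour ORDER BY start_time DESC LIMIT 50"
}

# Usage (requires TINYBIRD_ADMIN_TOKEN to be set):
#   curl -s "https://api.europe-west2.gcp.tinybird.co/v0/sql?token=$TINYBIRD_ADMIN_TOKEN" \
#     --data-urlencode "q=$(build_user_errors_query superbrainai 24)"
```

Rejecting unexpected characters up front keeps a mistyped or pasted value from changing the shape of the query.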
Datasource Schema
The `generation_event` datasource is defined in `enter.pollinations.ai/observability/datasources/generation_event.datasource` and includes:
- `user_id`, `user_github_username`, `user_tier`
- `response_status`, `error_message`, `error_response_code`
- `model_requested`, `model_used`
- `total_price`, `total_cost`
- `start_time`, `end_time`, `response_time`
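As an illustration of how these columns combine, here is a sketch of a query that ranks models by their share of 5xx responses. It is an assumption-laden example rather than an existing pipe or script: the function name `build_model_error_rate_query` is hypothetical, `response_status` is assumed to be numeric (as the `>= 500` comparisons in the raw queries suggest), and `countIf` is the ClickHouse aggregate that Tinybird SQL inherits.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): build a per-model 5xx rate query
# from the generation_event columns listed above.
build_model_error_rate_query() {
  local hours="${1:-24}"

  # The lookback window must be a positive integer number of hours.
  if [[ ! "$hours" =~ ^[1-9][0-9]*$ ]]; then
    echo "invalid hours: $hours" >&2
    return 1
  fi

  printf '%s' "SELECT model_requested, count() AS total_requests, countIf(response_status >= 500) AS backend_errors, round(countIf(response_status >= 500) / count(), 4) AS backend_error_rate FROM generation_event WHERE start_time > now() - interval ${hours} hour GROUP BY model_requested ORDER BY backend_error_rate DESC LIMIT 20"
}

# Usage (requires TINYBIRD_ADMIN_TOKEN to be set):
#   curl -s "https://api.europe-west2.gcp.tinybird.co/v0/sql?token=$TINYBIRD_ADMIN_TOKEN" \
#     --data-urlencode "q=$(build_model_error_rate_query 24)"
```

Dividing by the total request count separates a model that is genuinely failing from one that simply receives more traffic.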
Scripts
Helper scripts for common debugging tasks. Run from repo root.
Find Users with 403 Errors (Quota Issues)
```bash
# Find users with >10 403 errors in last 24 hours
.claude/skills/model-debugging/scripts/find-403-users.sh 24 10

# Filter by tier (e.g., only spore users)
.claude/skills/model-debugging/scripts/find-403-users.sh 24 10 spore
```
Find 500 Errors (Backend Issues)
```bash
# Find 500+ errors grouped by user/model/message
.claude/skills/model-debugging/scripts/find-500-errors.sh 24
```
Check Specific User's Errors
```bash
# See a user's recent errors
.claude/skills/model-debugging/scripts/check-user-errors.sh superbrainai 24
```
---
Notes
- 401 errors: User authentication issues (no API key) - expected from anonymous traffic
- 402 errors: Pollen/billing issues (user ran out of credits or key budget) - expected
- 403 errors: Permission issues (model not allowed for API key) - expected
- 400 errors: Usually user input errors (bad prompts, invalid params) - expected
- 500 errors: Backend/infrastructure issues - investigate these
- 504 errors: Timeouts (model too slow or hung) - investigate these
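The triage rule in these notes can be written as a small shell function for use when scanning query output. This is a sketch only: the name `classify_status` and the three labels are illustrative choices, and any status the notes do not mention falls into `other`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): map an HTTP status code to the
# triage category described in the notes above.
classify_status() {
  case "$1" in
    400|401|402|403) echo "expected" ;;     # client-side: auth, billing, permissions, bad input
    5[0-9][0-9])     echo "investigate" ;;  # backend failures and timeouts (500, 504, ...)
    *)               echo "other" ;;        # anything the notes do not classify
  esac
}

# Usage:
#   classify_status 402   # prints "expected"
#   classify_status 504   # prints "investigate"
```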
Tested Models (All Working as of 2025-12-22)
| Model | Type | Endpoint | Status |
|---|---|---|---|
| | text | POST /v1/chat/completions | ✅ |
| | text | POST /v1/chat/completions | ✅ |
| | text | POST /v1/chat/completions | ✅ |
| | text | GET /text/{prompt}?model=openai-audio&voice=alloy | ✅ (MP3) |
| | text | POST /v1/chat/completions | ✅ |
| | text | POST /v1/chat/completions | ✅ |
| | image | GET /image/{prompt} | ✅ |
| | image | GET /image/{prompt} | ✅ |
| | image | GET /image/{prompt} | ✅ |
| | video | GET /image/{prompt} | ✅ (MP4) |