galileo-python-sdk


Galileo Python SDK


The Galileo Python SDK (`galileo`) provides a unified interface for the Galileo AI platform, enabling evaluation, observability, and runtime guardrails for GenAI applications. It supports automatic tracing of LLM calls, custom span logging, evaluation experiments, and production-grade guardrails.
Additional references:
  • Framework Integrations — OpenAI, Anthropic, LangChain, LangGraph, CrewAI, PydanticAI, and more
  • Guardrail Metrics Reference — Hallucination Index, Context Adherence, Toxicity, PII, and all available metrics
  • Advanced Evaluation Patterns — Experiments, eval sets, prompt optimization, and scoring

Installation


```bash
pip install galileo
```

For evaluation features with the legacy prompt engineering interface:

```bash
pip install promptquality
```

For runtime guardrails:

```bash
pip install galileo-protect
```

Quick Start


```python
import os
from galileo import galileo_context
from galileo.openai import openai

galileo_context.init(project="my-project", log_stream="my-log-stream")

client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Explain quantum computing in one sentence."}],
    model="gpt-4o",
)

print(response.choices[0].message.content)

galileo_context.flush()
```

Authentication


Set the following environment variables in a `.env` file or your shell environment:

```bash
GALILEO_API_KEY="your-api-key"                # Required: from the Galileo console
GALILEO_CONSOLE_URL="https://app.galileo.ai"  # Console URL (or self-hosted URL)
GALILEO_PROJECT="my-project"                  # Optional: default project
GALILEO_LOG_STREAM="my-log-stream"            # Optional: default log stream
GALILEO_LOGGING_DISABLED="false"              # Optional: set "true" to disable logging
```
For the legacy `promptquality` package, authenticate programmatically:

```python
import promptquality as pq

pq.login("https://app.galileo.ai")
```


Observability and Tracing


Initializing the Galileo Context


```python
from galileo import galileo_context

galileo_context.init(project="my-project", log_stream="my-log-stream")
```

Wrapped OpenAI Client (Auto-Logging)


Import the Galileo-wrapped OpenAI client to automatically trace all calls:

```python
from galileo.openai import openai

client = openai.OpenAI()
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="gpt-4o",
)
```

The `@log` Decorator

Use `@log` to create spans for your functions. Supported span types: `workflow`, `llm`, `retriever`, and `tool`.
```python
from galileo import log

@log
def my_workflow():
    result = call_openai()
    return result

@log(span_type="retriever")
def retrieve_documents(query: str):
    docs = vector_store.search(query)
    return docs

@log(span_type="tool")
def search_web(query: str):
    return web_api.search(query)
```

Nested Workflows


```python
from galileo import log
from galileo.openai import openai

@log
def agent_pipeline(user_input: str):
    context = retrieve_documents(user_input)
    tool_result = search_web(user_input)
    response = generate_response(user_input, context, tool_result)
    return response

@log(span_type="retriever")
def retrieve_documents(query: str):
    return ["doc1", "doc2"]

@log(span_type="tool")
def search_web(query: str):
    return "search result"

@log
def generate_response(query: str, context: list, tool_result: str):
    client = openai.OpenAI()
    return client.chat.completions.create(
        messages=[{"role": "user", "content": query}],
        model="gpt-4o",
    )
```

Context Manager


Scope logging to a specific block and auto-flush on exit:

```python
from galileo import galileo_context

with galileo_context(project="my-project", log_stream="my-log-stream"):
    result = my_workflow()
    print(result)
```

Flushing Traces


Upload captured traces to Galileo:

```python
galileo_context.flush()
```

Evaluation


Running Experiments with `promptquality`

```python
import promptquality as pq

pq.login("https://app.galileo.ai")

template = "Explain {{topic}} to me like I'm a 5 year old"
data = {"topic": ["Quantum Physics", "Politics", "Large Language Models"]}

pq.run(
    project_name="my-first-project",
    template=template,
    dataset=data,
    settings=pq.Settings(
        model_alias="ChatGPT (16K context)",
        temperature=0.8,
        max_tokens=400,
    ),
)
```

Evaluation Runs with Scorers


```python
from promptquality import EvaluateRun
import promptquality as pq

pq.login()

metrics = [pq.Scorers.context_adherence_plus, pq.Scorers.prompt_injection]

evaluate_run = EvaluateRun(
    run_name="my_run",
    project_name="my_project",
    scorers=metrics,
)

eval_set = ["What are hallucinations?", "What are intrinsic hallucinations?"]
for input_text in eval_set:
    output = llm.call(input_text)  # llm is your own model client (placeholder)
    evaluate_run.add_single_step_workflow(
        input=input_text,
        output=output,
        model="gpt-4o",
    )

evaluate_run.finish()
```

See Advanced Evaluation Patterns for more.

Guardrails / Protect


Creating a Protection Stage


```python
from galileo import GalileoMetrics
from galileo.stages import create_protect_stage
from galileo_core.schemas.protect.rule import Rule, RuleOperator
from galileo_core.schemas.protect.ruleset import Ruleset
from galileo_core.schemas.protect.stage import StageType

rule = Rule(
    metric=GalileoMetrics.input_toxicity,
    operator=RuleOperator.gt,
    target_value=0.1,
)

ruleset = Ruleset(rules=[rule])

stage = create_protect_stage(
    name="toxicity-guard",
    stage_type=StageType.central,
    prioritized_rulesets=[ruleset],
    description="Block toxic input.",
)
```
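To build intuition for when a rule like the one above fires, here is a plain-Python sketch of the comparison a `gt` rule expresses. This is illustrative only, not the SDK's actual evaluation logic:

```python
import operator

# Illustrative subset of rule operators (the SDK's RuleOperator enum has more).
OPERATORS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq}

def rule_fires(metric_value: float, op: str, target_value: float) -> bool:
    """Return True when the guardrail condition is met, e.g. toxicity > 0.1."""
    return OPERATORS[op](metric_value, target_value)
```

With the stage above, an input scoring 0.3 on toxicity trips the rule, while 0.05 passes through.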

Invoking Runtime Protection


```python
from galileo.protect import invoke_protect, ainvoke_protect
from galileo_core.schemas.protect.payload import Payload

payload = Payload(input="User message to check.")

response = invoke_protect(payload=payload, stage_name="toxicity-guard")
```

Async variant

```python
response = await ainvoke_protect(payload=payload, stage_name="toxicity-guard")
```

Stage Types


  • Central stages — Created and managed by governance teams; rulesets defined at creation time
  • Local stages — Created without rulesets; rulesets supplied at runtime by application teams
See Guardrail Metrics Reference for all available metrics.

Common Patterns


Multi-Turn Conversations


```python
from galileo import log
from galileo.openai import openai

client = openai.OpenAI()

@log
def chat(messages: list):
    response = client.chat.completions.create(
        messages=messages,
        model="gpt-4o",
    )
    return response.choices[0].message.content

messages = []
messages.append({"role": "user", "content": "What is RAG?"})
reply = chat(messages)
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "How do I implement it?"})
reply = chat(messages)
```

RAG Pipeline with Retriever Spans


```python
from galileo import log
from galileo.openai import openai

client = openai.OpenAI()

@log(span_type="retriever")
def retrieve(query: str):
    results = vector_db.similarity_search(query, k=5)  # vector_db is your vector store client (placeholder)
    return [doc.page_content for doc in results]

@log
def rag_pipeline(question: str):
    context = retrieve(question)
    prompt = f"Context: {context}\n\nQuestion: {question}"
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model="gpt-4o",
    )
    return response.choices[0].message.content
```

Agent Tool Calling


```python
from galileo import log

@log(span_type="tool")
def calculator(a: float, b: float, op: str) -> str:
    if op == "add":
        return str(a + b)
    elif op == "multiply":
        return str(a * b)
    raise ValueError(f"Unknown op: {op}")

@log(span_type="tool")
def web_search(query: str):
    return search_api.query(query)  # search_api is your search client (placeholder)

@log
def agent(user_input: str):
    plan = plan_actions(user_input)  # plan_actions/synthesize are your own helpers (placeholders)
    results = []
    for action in plan:
        if action.tool == "calculator":
            results.append(calculator(*action.input))  # action.input holds an (a, b, op) tuple
        elif action.tool == "web_search":
            results.append(web_search(action.input))
    return synthesize(results)
```

Best Practices


  1. Always set environment variables for `GALILEO_API_KEY` and `GALILEO_CONSOLE_URL` rather than hardcoding credentials.
  2. Organize projects and log streams by application, environment, or team to keep traces manageable.
  3. Call `galileo_context.flush()` at the end of each request or batch to ensure traces are uploaded. In web servers, flush at the end of each request handler.
  4. Use the context manager (`with galileo_context(...)`) for scoped logging that auto-flushes on exit.
  5. Use specific span types (`retriever`, `tool`, `llm`, `workflow`) to get the most out of Galileo's trace visualization.
  6. Handle errors gracefully: wrap `flush()` calls in try/except to prevent logging failures from crashing your application.
  7. Use the wrapped OpenAI client (`from galileo.openai import openai`) for zero-config automatic tracing of all OpenAI calls.
  8. Leverage guardrail metrics in production to catch hallucinations, toxic content, and PII before they reach end users.
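Practices 3 and 6 can be combined into a small helper. `safe_flush` is our own name, not an SDK function; the injectable `flush` parameter just makes the sketch easy to test:

```python
import sys

def safe_flush(flush=None):
    """Flush Galileo traces without letting a logging failure crash the app."""
    if flush is None:
        from galileo import galileo_context  # deferred so the sketch imports cleanly
        flush = galileo_context.flush
    try:
        flush()
        return True
    except Exception as exc:
        # A failed upload is worth logging, never worth a 500 to the user.
        print(f"Galileo flush failed (ignored): {exc}", file=sys.stderr)
        return False
```

In a web server, call `safe_flush()` at the end of each request handler.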

Resources
