Loading...
Loading...
Teach AI agents how to query data warehouses accurately using ktx - an executable context layer with skills, memory, and a semantic layer
npx skill4agent add aradotso/ai-agent-skills ktx-context-layer-data-agentsSkill by ara.so — AI Agent Skills collection.
npm install -g @kaelio/ktxnpm install --save-dev @kaelio/ktxktx setupktx statusktx project: /home/user/analytics
Project ready: yes
LLM ready: yes (claude-sonnet-4-6)
Embeddings ready: yes (text-embedding-3-small)
Databases configured: yes (warehouse)
Context sources configured: yes (dbt_main)
ktx context built: yes
Agent integration ready: yes (codex:project)my-project/
├── ktx.yaml # Project configuration
├── semantic-layer/<connection-id>/ # YAML semantic sources
├── wiki/global/ # Shared business context
├── wiki/user/<user-id>/ # User-scoped notes
├── raw-sources/<connection-id>/ # Ingest artifacts and reports
└── .ktx/ # Local state and secrets (git-ignored)ktx.yamlsemantic-layer/wiki/.ktx/version: 1
project_id: analytics_project
llm_provider: anthropic
embedding_provider: openai
connections:
- id: warehouse
type: postgres
config:
host: localhost
port: 5432
database: analytics
user: readonly_user
password_env: POSTGRES_PASSWORD
ssl: false
context_sources:
- id: dbt_main
type: dbt
config:
manifest_path: ./dbt/target/manifest.json
catalog_path: ./dbt/target/catalog.json
- id: looker_metrics
type: looker
config:
base_url_env: LOOKER_BASE_URL
client_id_env: LOOKER_CLIENT_ID
client_secret_env: LOOKER_CLIENT_SECRET
agent_integrations:
- type: codex
scope: project# LLM Provider
export ANTHROPIC_API_KEY=sk-ant-...
# or for Vertex AI
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
# Embedding Provider
export OPENAI_API_KEY=sk-...
# Database Credentials
export POSTGRES_PASSWORD=yourpassword
export SNOWFLAKE_PASSWORD=yourpassword
# Tool Integrations
export LOOKER_BASE_URL=https://company.looker.com
export LOOKER_CLIENT_ID=your_client_id
export LOOKER_CLIENT_SECRET=your_secret
export NOTION_TOKEN=secret_...ktx ingestktx ingest --connection warehousektx ingest --context-source dbt_mainktx sl "revenue"
ktx sl "customer lifetime value"
ktx sl "monthly active users"Found 3 semantic sources matching "revenue":
1. metric.monthly_recurring_revenue
Type: metric
Connection: warehouse
Description: Sum of all active subscription values in a given month
SQL: SUM(subscriptions.monthly_value)
2. dimension.revenue_tier
Type: dimension
Entity: customer
Description: Customer revenue bracket (low/medium/high)
3. entity.revenue_events
Type: entity
Table: prod.revenue_events
Primary key: event_idktx wiki "refund policy"
ktx wiki "customer segmentation"ktx query "SELECT customer_tier, COUNT(*) FROM customers GROUP BY customer_tier"ktx query --use-metrics "SELECT monthly_recurring_revenue FROM time WHERE month = '2024-01'"ktx mcp startktx mcp start --project-dir /path/to/projectktx_search_semantic_layerktx_search_wikiktx_queryktx_get_schemaktx_get_metric_definition# semantic-layer/warehouse/metrics/mrr.yaml
name: monthly_recurring_revenue
type: metric
description: Sum of all active subscription values in a given month
entity: subscription
sql: SUM(subscriptions.monthly_value)
filters:
- sql: subscriptions.status = 'active'
dimensions:
- customer_tier
- plan_type
time_dimension: subscription_start_date# semantic-layer/warehouse/entities/customer.yaml
name: customer
type: entity
table: prod.customers
primary_key: customer_id
description: Customer master table
dimensions:
- name: customer_tier
type: categorical
sql: tier
- name: signup_date
type: time
sql: created_at# semantic-layer/warehouse/joins/customer_subscription.yaml
from_entity: customer
to_entity: subscription
type: one_to_many
join_sql: customers.customer_id = subscriptions.customer_id# Navigate to analytics directory
cd ~/analytics
# Initialize ktx
ktx setup
# Follow prompts to configure:
# - LLM provider (Anthropic API recommended)
# - Embedding provider (OpenAI recommended)
# - Database connection (read-only credentials)
# - dbt integration (point to manifest.json)
# Verify setup
ktx status
# Build initial context
ktx ingestUser: What was our MRR in January 2024?
Agent uses ktx:
1. ktx sl "monthly recurring revenue" → finds metric definition
2. ktx query --use-metrics "SELECT monthly_recurring_revenue FROM time WHERE month = '2024-01'"
3. Returns accurate result using canonical metric logic# Create global wiki page
mkdir -p wiki/global
cat > wiki/global/refund-policy.md << 'EOF'
# Refund Policy
Customers can request refunds within 30 days of purchase.
## Refund Eligibility
- Full refund if < 7 days
- Prorated refund if 7-30 days
- No refund if > 30 days
## Accounting Treatment
Refunds are recorded as negative revenue in the month issued, not the original purchase month.
EOF
# Ingest wiki content
ktx ingest# Ensure dbt artifacts exist
cd dbt-project
dbt compile
dbt docs generate
# Configure in ktx.yaml
cat >> ktx.yaml << 'EOF'
context_sources:
- id: dbt_main
type: dbt
config:
manifest_path: ./dbt-project/target/manifest.json
catalog_path: ./dbt-project/target/catalog.json
EOF
# Ingest dbt metrics
ktx ingest --context-source dbt_main
# Search for dbt metrics
ktx sl "customers"# Install during setup or manually
ktx setup
# Select "codex" when prompted for agent integration
# Or install explicitly
npx skills add Kaelio/ktx --skill ktx# ktx setup handles this automatically
# Adds MCP server config to Claude Code settings{
"mcpServers": {
"ktx": {
"command": "ktx",
"args": ["mcp", "start", "--project-dir", "/path/to/project"]
}
}
}import { KtxProject } from '@kaelio/ktx';
// Load existing project
const project = await KtxProject.load('/path/to/project');
// Search semantic layer
const results = await project.searchSemanticLayer('revenue');
console.log(results);
// Search wiki
const wikiResults = await project.searchWiki('refund policy');
console.log(wikiResults);
// Execute query
const queryResult = await project.query(
'warehouse',
'SELECT * FROM customers LIMIT 10'
);
console.log(queryResult.rows);
// Get metric definition
const metric = await project.getMetric('monthly_recurring_revenue');
console.log(metric.sql);# Check status
ktx status
# Start MCP server manually
ktx mcp start --project-dir /path/to/project
# Check logs
tail -f ~/.ktx/logs/mcp.logktx ingest# Test database connection
ktx test-connection warehouse
# Verify environment variables
env | grep -E 'POSTGRES|SNOWFLAKE|ANTHROPIC|OPENAI'
# Check ktx.yaml syntax
ktx validate
# Ingest with verbose logging
ktx ingest --verbosektx sl "metric_name"# Re-ingest context
ktx ingest --force
# Check semantic layer directory
ls -la semantic-layer/warehouse/metrics/
# Manually create metric if needed
mkdir -p semantic-layer/warehouse/metrics
cat > semantic-layer/warehouse/metrics/my_metric.yaml << 'EOF'
name: my_metric
type: metric
description: My custom metric
entity: my_entity
sql: COUNT(*)
EOF
# Rebuild context
ktx ingest# Verify API key
echo $ANTHROPIC_API_KEY
# Test LLM connection
ktx test-llm
# Switch provider if needed
ktx setup
# Select different LLM provider
# Check rate limits and quota
# Anthropic: https://console.anthropic.com
# OpenAI: https://platform.openai.com/usage# Check directory permissions
ls -la .
# Fix ownership
sudo chown -R $USER:$USER .
# Or specify writable project directory
ktx setup --project-dir ~/my-ktx-project# Review contradictions report
cat raw-sources/warehouse/contradictions.json
# Resolve by editing semantic sources
vim semantic-layer/warehouse/metrics/revenue.yaml
# Or update source (dbt, Looker, etc.)
# Then re-ingest
ktx ingest --force# ktx.yaml
connections:
- id: warehouse
type: postgres
config:
host: localhost
database: analytics
sample_strategy: adaptive
max_sample_rows: 10000
min_sample_rows: 100# ktx.yaml
semantic_layer:
join_detection:
min_confidence: 0.8
sample_size: 1000
detect_fan_traps: true
detect_chasm_traps: truewiki/
├── global/
│ ├── metrics/
│ │ ├── revenue-definitions.md
│ │ └── user-engagement.md
│ ├── policies/
│ │ ├── data-retention.md
│ │ └── refund-policy.md
│ └── glossary/
│ └── business-terms.md
└── user/
└── <user-id>/
└── scratch.mdktx ingestktx query--project-dirKTX_PROJECT_DIRktx.yamlktx ingest --project-dir /opt/analytics
ktx mcp start --project-dir /opt/analytics