Search: dpo - AI Agent Skills

AI & Machine Learning | davila7/claude-code-templ...

simpo-training

Simple Preference Optimization (SimPO) for LLM alignment. A reference-free alternative to DPO with better reported performance (up to +6.4 points on AlpacaEval 2.0). Because no reference model is needed, it is more efficient than DPO. Use for preference alignment when you want simpler, faster training than DPO or PPO.

94

Platform Services | hairyf/blockchain-skills

tron

TRON (java-tron) - account model, DPoS, resources, system contracts, TVM, TRC-10/TRC-20, DEX, APIs, events, TronGrid.

6

Security & Compliance | alien-id/agent-id

agent-id-auth

DPoP-signed (RFC 9449) authenticated calls to Alien-aware services. Discover any Alien-aware service's manifest at /.well-known/alien-agent-id.json, render its operations as actionable markdown, emit DPoP headers for one request, or one-shot a signed HTTP call with the agent's identity attached. Use when the user gives you a URL on an Alien-aware service (alien-api.com, alien.org, agent-sso.*), asks to call an Alien-aware endpoint, asks what an Alien-aware service can do, or mentions DPoP, agent-bound access tokens, or `cnf.jkt`.

6

Backend Development | auth0/agent-skills

auth0-springboot-api

Use when securing Spring Boot API endpoints with JWT Bearer token validation, scope-based authorization, or DPoP proof-of-possession - integrates com.auth0:auth0-springboot-api SDK for REST APIs receiving access tokens from frontends or mobile apps. Triggers on Auth0AuthenticationFilter, Spring Boot API auth, JWT validation, SecurityFilterChain, hasAuthority SCOPE.

6

Backend Development | auth0/agent-skills

auth0-fastapi-api

Use when securing FastAPI API endpoints with JWT Bearer token validation, scope/permission checks, or stateless auth - integrates auth0-fastapi-api for REST APIs receiving access tokens from SPAs, mobile apps, or other clients. Also handles DPoP proof-of-possession token binding. Triggers on: Auth0FastAPI, FastAPI API auth, JWT validation, require_auth, DPoP.

5

Backend Development | auth0/agent-skills

express-oauth2-jwt-bearer

Use when adding Auth0 token validation to Express or Node.js APIs - integrates the express-oauth2-jwt-bearer SDK to protect Node.js API endpoints with JWT Bearer authentication, scope-based RBAC, claim validation, and optional DPoP support.

4

8 scripts/Checked

AI & Machine Learning | aradotso/ai-agent-skills

awesome-adaptation-agentic-ai

Curated research collection on adaptation strategies for agentic AI systems, covering agent and tool adaptation methods with RL, SFT, and DPO approaches

4

Data Processing | plurigrid/asi

acsets-relational-thinking

ACSets (Attributed C-Sets) for categorical database design and DPO rewriting

3

AI & Machine Learning | huggingface/skills

huggingface-llm-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

14

7 scripts/Checked

AI & Machine Learning | huggingface/skills

trl-training

Train and fine-tune transformer language models using TRL (Transformer Reinforcement Learning). Supports SFT, DPO, GRPO, KTO, RLOO, and reward model training via CLI commands.

11

AI & Machine Learning | huggingface/skills

hugging-face-model-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

8

6 scripts/Checked

AI & Machine Learning | togethercomputer/skills

together-fine-tuning

LoRA, full fine-tuning, DPO preference tuning, VLM training, function-calling tuning, reasoning tuning, and BYOM uploads on Together AI. Reach for it whenever the user wants to adapt a model on custom data rather than only run inference, evaluate outputs, or host an existing model.

8

5 scripts/Checked

Search Results: dpo

simpo-training

tron

agent-id-auth

auth0-springboot-api

auth0-fastapi-api

express-oauth2-jwt-bearer

awesome-adaptation-agentic-ai

acsets-relational-thinking

huggingface-llm-trainer

trl-training

hugging-face-model-trainer

together-fine-tuning
