搜索： trl - AI Agent Skills

AI & Machine Learninghuggingface/skills

huggingface-llm-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

🇺🇸|EnglishTranslated

19

7 scripts/Checked

AI & Machine Learninghuggingface/skills

trl-training

Train and fine-tune transformer language models using TRL (Transformers Reinforcement Learning). Supports SFT, DPO, GRPO, KTO, RLOO and Reward Model training via CLI commands.

🇺🇸|EnglishTranslated

13

AI & Machine Learninghuggingface/skills

hugging-face-model-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.

🇺🇸|EnglishTranslated

8

6 scripts/Checked

AI & Machine Learningsickn33/antigravity-aweso...

hugging-face-model-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for...

🇺🇸|EnglishTranslated

7

AI & Machine Learningvuralserhat86/antigravity...

model_finetuning

Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF, align model with preferences, or train from human feedback. Works with HuggingFace Transformers.

🇺🇸|EnglishTranslated

6

AI & Machine Learningkiterlin/intelligent-dete...

grpo-rl-training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training

🇺🇸|EnglishTranslated

6

2 scripts/Attention

AI & Machine Learningkiterlin/intelligent-dete...

fine-tuning-with-trl

Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when need RLHF, align model with preferences, or train from human feedback. Works with HuggingFace Transformers.

🇺🇸|EnglishTranslated

5

Search Results: trl

huggingface-llm-trainer

trl-training

hugging-face-model-trainer

hugging-face-model-trainer

model_finetuning

grpo-rl-training

fine-tuning-with-trl

Search Results: trl

huggingface-llm-trainer

trl-training

hugging-face-model-trainer

hugging-face-model-trainer

model_finetuning

grpo-rl-training

fine-tuning-with-trl