Search: llm-fine-tuning - AI Agent Skills

AI & Machine Learning | sickn33/antigravity-aweso...

hugging-face-model-trainer

This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward modeling training methods, plus GGUF conversion for...

7

AI & Machine Learning | sundial-org/skills

tinker

Fine-tune LLMs using the Tinker API. Covers supervised fine-tuning, reinforcement learning, LoRA training, vision-language models, and both high-level Cookbook patterns and low-level API usage.

7

AI & Machine Learning | tondevrel/scientific-agen...

transformers

State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. Provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio. The industry standard for Large Language Models (LLMs) and foundation models in science.

6

AI & Machine Learning | sundial-org/skills

training-data-curation

Guidelines for creating high-quality datasets for LLM post-training (SFT, DPO, RLHF). Use when preparing data for fine-tuning, evaluating data quality, or designing data collection strategies.

6

AI & Machine Learning | vuralserhat86/antigravity...

model_finetuning

Fine-tune LLMs with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or want to train from human feedback. Works with Hugging Face Transformers.

6

AI & Machine Learning | kiterlin/intelligent-dete...

grpo-rl-training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training.

6

2 scripts/Attention

AI & Machine Learning | kiterlin/intelligent-dete...

fine-tuning-with-trl

Fine-tune LLMs with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or want to train from human feedback. Works with Hugging Face Transformers.

5

AI & Machine Learning | pluginagentmarketplace/cu...

fine-tuning

LLM fine-tuning with LoRA, QLoRA, and instruction tuning for domain adaptation.

4

1 scripts/Checked
