Search Results: reinforcement-learning

Found 17 Skills

AI & Machine Learning | omer-metin/skills-for-ant...

reinforcement-learning

Use when implementing RL algorithms, training agents with rewards, or aligning LLMs with human feedback. Covers policy gradients, PPO, Q-learning, RLHF, and GRPO.

AI & Machine Learning | pluginagentmarketplace/cu...

reinforcement-learning

Q-learning, DQN, PPO, A3C, policy gradient methods, multi-agent systems, and Gym environments. Use for training agents, game AI, robotics, or decision-making systems.

1 script/Checked

AI & Machine Learning | aznatkoiny/zai-skills

cpp-reinforcement-learning

C++ Reinforcement Learning best practices using libtorch (PyTorch C++ frontend) and modern C++17/20. Use when:
- Implementing RL algorithms in C++ for performance-critical applications
- Building production RL systems with libtorch
- Creating replay buffers and experience storage
- Optimizing RL training with GPU acceleration
- Deploying RL models with ONNX Runtime

AI & Machine Learning | k-dense-ai/claude-scienti...

stable-baselines3

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.

3 scripts/Checked

AI & Machine Learning | k-dense-ai/claude-scienti...

pufferlib

High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.

2 scripts/Checked

AI & Machine Learning | qodex-ai/ai-agent-skills

autonomous-agent-gaming

Build autonomous game-playing agents using AI and reinforcement learning. Covers game environments, agent decision-making, strategy development, and performance optimization. Use when creating game-playing bots, testing game AI, strategic decision-making systems, or game theory applications.

10 scripts/Checked

AI & Machine Learning | davila7/claude-code-templ...

pufferlib

This skill should be used when working with reinforcement learning tasks including high-performance RL training, custom environment development, vectorized parallel simulation, multi-agent systems, or integration with existing RL environments (Gymnasium, PettingZoo, Atari, Procgen, etc.). Use this skill for implementing PPO training, creating PufferEnv environments, optimizing RL performance, or developing policies with CNNs/LSTMs.

2 scripts/Checked

AI & Machine Learning | sundial-org/skills

tinker

Fine-tune LLMs using the Tinker API. Covers supervised fine-tuning, reinforcement learning, LoRA training, vision-language models, and both high-level Cookbook patterns and low-level API usage.

AI & Machine Learning | kiterlin/intelligent-dete...

slime-rl-training

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.

AI & Machine Learning | davila7/claude-code-templ...

stable-baselines3

Use this skill for reinforcement learning tasks including training RL agents (PPO, SAC, DQN, TD3, DDPG, A2C, etc.), creating custom Gym environments, implementing callbacks for monitoring and control, using vectorized environments for parallel training, and integrating with deep RL workflows. This skill should be used when users request RL algorithm implementation, agent training, environment design, or RL experimentation.

3 scripts/Checked

AI & Machine Learning | kiterlin/intelligent-dete...

torchforge-rl-training

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms. Use when you want clean RL abstractions, easy algorithm experimentation, or scalable training with Monarch and TorchTitan.

AI & Machine Learning | kiterlin/intelligent-dete...

grpo-rl-training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training.

2 scripts/Attention