Overview

Welcome to AReaL’s documentation!

Version History
    Key Milestones

Getting Started with AReaL-lite
    Running GRPO on GSM8K Dataset

Tutorial
    Installation
    Quickstart
    OpenAI-Compatible Workflows
    Agentic Reinforcement Learning
    Evaluation
    Fine-tuning Large MoE Models
    Configurations

Best Practices
    Diagnosing RL Performance
    Debugging Guide
    Handling OOM Issues
    Performance Profiling

Customization
    Dataset
    Rollout and Agentic RL
    Training Algorithm

References
    Benchmark Guide
    Reproduction Guide

Algorithms
    Asynchronous RL
    Group Relative Policy Optimization (GRPO)
    REINFORCE Leave-One-Out (RLOO)
    Decoupled Clip and Dynamic Sampling Policy Optimization (DAPO)
    Group Sequence Policy Optimization (GSPO)
    Group Relative Policy Optimization Done Right (Dr.GRPO)
    Lite-PPO