Our paper 'SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware RL' has been accepted to NeurIPS 2025.
The SRPO code has been open-sourced on GitHub.