Our paper 'SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware RL' will be available on arXiv soon.
The SRPO code will be open-sourced on GitHub soon.
We plan to publish the SRPO model on Hugging Face.