TopPaper
Today's Paper

Toward Training Superintelligent Software Agents through Self-Play SWE-RL

Yuxiang Wei et al. (from Meta FAIR, Meta TBD Lab, UIUC, CMU)

Current LLM-based software engineering agents rely heavily on human-curated data (e.g., GitHub issues, pull requests) and environments (e.g., test suites), which limits their path to superintelligence. The paper introduces Self-play SWE-RL (SSR), a reinforcement learning framework that trains a single LLM agent in a self-play loop under minimal assumptions: it needs only access to sandboxed real-world code repositories (source code and dependencies), with no human-labeled issues or pre-existing tests.
