Artificial Intelligence
3 Papers
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
Large language models (LLMs) are post-trained to adopt a default "helpful Assistant" persona, but this identity is fragile. The paper explores the internal "persona space" in model activations, discovering a dominant linear direction called the Assistant Axis, the primary axis …
Toward Training Superintelligent Software Agents through Self-Play SWE-RL
Current LLM-based software engineering agents rely heavily on human-curated data (e.g., GitHub issues, pull requests) and environments (e.g., test suites), which limits their path to superintelligence. The paper introduces Self-Play SWE-RL (SSR), a reinforcement learning framework that trains a single …
Training AI Co-Scientists Using Rubric Rewards
AI "co-scientists" (LLM-based assistants) can help researchers by generating detailed research plans from given goals and constraints. However, current models often produce plans that violate implicit requirements due to the open-ended nature of scientific planning and the lack of fast, …