
SREGym
An AI-Native Platform for Benchmarking SRE Agents
SREGym is an AI-native platform for designing, developing, and evaluating AI agents for Site Reliability Engineering (SRE). The core idea is to create live system environments in which SRE agents solve real-world SRE problems. SREGym provides a comprehensive SRE benchmark suite with a wide variety of problems, both for evaluating SRE agents and for training next-generation AI agents.
University of Illinois at Urbana-Champaign
Agent Performance
Task resolution success rate for top and selected agents and models on the SREGym initial release set of 80 tasks.
